More detailed environment configuration information #1

Closed
TPF2017 opened this issue Jan 31, 2024 · 5 comments

Comments

TPF2017 commented Jan 31, 2024

Hello, I downloaded the code and installed the corresponding packages, but it still will not run directly. Could you provide more detailed environment configuration information, such as the Python version and CUDA version?
Traceback (most recent call last):
  File "/home/tianpengfei1/Time-LLM/run_main.py", line 132, in <module>
    model = TimeLLM.Model(args).float()
  File "/home/tianpengfei1/Time-LLM/models/TimeLLM.py", line 53, in __init__
    self.llama = LlamaModel.from_pretrained(
  File "/home/tianpengfei1/anaconda3/envs/llmtime/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2256, in from_pretrained
    quantization_config, kwargs = BitsAndBytesConfig.from_dict(
  File "/home/tianpengfei1/anaconda3/envs/llmtime/lib/python3.9/site-packages/transformers/utils/quantization_config.py", line 189, in from_dict
    config = cls(**config_dict)
  File "/home/tianpengfei1/anaconda3/envs/llmtime/lib/python3.9/site-packages/transformers/utils/quantization_config.py", line 118, in __init__
    self.post_init()
  File "/home/tianpengfei1/anaconda3/envs/llmtime/lib/python3.9/site-packages/transformers/utils/quantization_config.py", line 144, in post_init
    if self.load_in_4bit and not version.parse(importlib.metadata.version("bitsandbytes")) >= version.parse(
  File "/home/tianpengfei1/anaconda3/envs/llmtime/lib/python3.9/importlib/metadata.py", line 569, in version
    return distribution(distribution_name).version
  File "/home/tianpengfei1/anaconda3/envs/llmtime/lib/python3.9/importlib/metadata.py", line 542, in distribution
    return Distribution.from_name(distribution_name)
  File "/home/tianpengfei1/anaconda3/envs/llmtime/lib/python3.9/importlib/metadata.py", line 196, in from_name
    raise PackageNotFoundError(name)
importlib.metadata.PackageNotFoundError: bitsandbytes
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 9021) of binary: /home/tianpengfei1/anaconda3/envs/llmtime/bin/python
Traceback (most recent call last):
  File "/home/tianpengfei1/anaconda3/envs/llmtime/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/tianpengfei1/anaconda3/envs/llmtime/lib/python3.9/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/home/tianpengfei1/anaconda3/envs/llmtime/lib/python3.9/site-packages/accelerate/commands/launch.py", line 932, in launch_command
    multi_gpu_launcher(args)
  File "/home/tianpengfei1/anaconda3/envs/llmtime/lib/python3.9/site-packages/accelerate/commands/launch.py", line 627, in multi_gpu_launcher
    distrib_run.run(args)
  File "/home/tianpengfei1/anaconda3/envs/llmtime/lib/python3.9/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/home/tianpengfei1/anaconda3/envs/llmtime/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/tianpengfei1/anaconda3/envs/llmtime/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
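
Note: the traceback shows transformers resolving the bitsandbytes version while building a BitsAndBytesConfig, i.e. the model is being loaded with quantization enabled but bitsandbytes is not installed in that environment. A minimal pre-flight check along these lines (just a sketch using the same importlib.metadata lookup the traceback goes through; not part of the repository) confirms whether the package is present before launching:

# Sketch: check for bitsandbytes the same way transformers does (via importlib.metadata).
# If it is missing, either install it or disable quantized loading in models/TimeLLM.py.
from importlib import metadata

try:
    print("bitsandbytes version:", metadata.version("bitsandbytes"))
except metadata.PackageNotFoundError:
    print("bitsandbytes is not installed in this environment")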

m6129 commented Feb 3, 2024

Joining in. I also need more detailed documentation. I ran into problems installing the libraries from requirements.txt as well, though different ones.
I found that Python no higher than 3.9.x and a GPU are required, but there were still problems afterward.

akbism commented Feb 4, 2024

For me, installation of the Time-LLM package works when I select Python 3.8.
However, I get the following error when I try to execute bash ./scripts/TimeLLM_ETTh1.sh:

RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

KimMeen (Owner) commented Feb 5, 2024

@TPF2017 and @m6129, could you try the following configuration and see whether it works in your local environments? (A quick version-check sketch follows the dependency lists below.)

  • Python=3.8.5
  • PyTorch=2.0.1
  • CUDA and pytorch-cuda=11.7
  • accelerate=0.21.0
  • transformers=4.29.2
  • deepspeed=0.10.0

Other dependencies:

  • numpy=1.24.3
  • pandas=1.5.3
  • scikit_learn=1.2.2
  • reformer_pytorch=1.4.4
  • tqdm=4.65.0
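
As a quick sanity check, the installed versions can be compared against the lists above. The sketch below is not part of the repository and assumes the standard pip distribution names (e.g. scikit-learn, reformer-pytorch):

# Sketch: compare locally installed package versions against the recommended list.
from importlib import metadata

expected = {
    "torch": "2.0.1",
    "accelerate": "0.21.0",
    "transformers": "4.29.2",
    "deepspeed": "0.10.0",
    "numpy": "1.24.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "reformer-pytorch": "1.4.4",
    "tqdm": "4.65.0",
}

for name, want in expected.items():
    try:
        have = metadata.version(name)
    except metadata.PackageNotFoundError:
        have = "not installed"
    marker = "" if have == want else "  <-- mismatch"
    print(f"{name}: expected {want}, found {have}{marker}")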

KimMeen (Owner) commented Feb 5, 2024

For me, installation of the Time-LLM package works when I select Python 3.8. However, I get the following error when I try to execute bash ./scripts/TimeLLM_ETTh1.sh:

RuntimeError: CUDA error: invalid device ordinal. CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

@akbism Could you first check whether num_process=8 is compatible with your local environment? This value should typically correspond to the number of GPUs utilized. Let me and @kwuking know if this issue persists.
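
An "invalid device ordinal" error typically means the launcher asked for a device index beyond what is visible on the machine. A minimal check (assuming the usual one-process-per-GPU setup with accelerate's multi-GPU launcher) is to compare num_process against what PyTorch can actually see:

# Sketch: report how many GPUs PyTorch can see; num_process in the launch
# script should not exceed this number (assuming one process per GPU).
import torch

print("CUDA available:", torch.cuda.is_available())
print("visible GPUs:", torch.cuda.device_count())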

KimMeen (Owner) commented Feb 8, 2024

This issue has been closed as there were no further questions.
