Cuda is not available after using accelerate launch to run script #2626

Closed
2 of 4 tasks
MagicianWu opened this issue Apr 5, 2024 · 2 comments

Comments

@MagicianWu

System Info

- `Accelerate` version: 0.28.0
- Platform: Linux-5.4.0-173-generic-x86_64-with-glibc2.31
- Python version: 3.10.14
- Numpy version: 1.26.4
- PyTorch version (GPU?): 2.2.1+cu121 (True)
- PyTorch XPU available: False
- PyTorch NPU available: False
- System RAM: 2015.53 GB
- GPU type: NVIDIA A800-SXM4-80GB
- `Accelerate` default config:
        - compute_environment: LOCAL_MACHINE
        - distributed_type: MULTI_GPU
        - mixed_precision: bf16
        - use_cpu: False
        - debug: True
        - num_processes: 8
        - machine_rank: 0
        - num_machines: 1
        - gpu_ids: [0,1,2,3,4,5,6,7]
        - rdzv_backend: static
        - same_network: True
        - main_training_function: main
        - downcast_bf16: no
        - tpu_use_cluster: False
        - tpu_use_sudo: F

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • My own task or dataset (give details below)

Reproduction

import torch
from accelerate import Accelerator

def main():
    accelerator = Accelerator()
    accelerator.print(torch.cuda.is_available())

if __name__ == "__main__":
    main()

Expected behavior

torch.cuda.is_available() should return True

@SunMarc
Member

SunMarc commented Apr 15, 2024

Hi @MagicianWu, I'm unable to reproduce this behavior. Did you try reinstalling torch with CUDA? Does this happen only after you initialize Accelerator()? I see that you posted something related here before.
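
One way to answer the second question is to print what PyTorch sees both before and after `Accelerator()` is created. The script below is only a sketch of such a check, not an official diagnostic: the helper name `cuda_report` is made up for illustration, and it assumes that `CUDA_VISIBLE_DEVICES` and the CUDA version torch was built with (`torch.version.cuda`) are the relevant things to inspect.

```python
import os


def cuda_report(stage):
    """Return a dict describing what PyTorch sees at this point in the script.

    `cuda_report` is a hypothetical helper written for this thread.
    """
    report = {
        "stage": stage,
        # The launcher or the shell may restrict visible GPUs via this variable.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
    }
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment at all.
        report["torch"] = None
        return report
    report["torch"] = torch.__version__
    # None here means a CPU-only build of torch is installed.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


def main():
    print(cuda_report("before Accelerator()"))
    try:
        from accelerate import Accelerator
    except ImportError:
        return
    accelerator = Accelerator()
    # Plain print so that every process reports, not only the main one.
    print(f"[rank {accelerator.process_index}]", cuda_report("after Accelerator()"))


if __name__ == "__main__":
    main()
```

Running it once with plain `python` and once with `accelerate launch` should show whether the result changes with the launcher, with the `Accelerator()` call, or is already `False` in a bare interpreter, which would point at the torch install rather than at Accelerate.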


github-actions bot commented May 9, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
