Version 1.3 no longer supporting Tesla K40m? #30532

@JamesOwers

🐛 Bug

I am using a Tesla K40m with PyTorch 1.3 installed via conda, using CUDA 10.1.

To Reproduce

Steps to reproduce the behavior:

  1. Have a box with a Tesla K40m
  2. conda install pytorch cudatoolkit -c pytorch
  3. Show that CUDA is available:
python -c 'import torch; print(torch.cuda.is_available());'
>>> True
  4. Instantiate a model and call .forward():
Traceback (most recent call last):
  File "./baselines/get_results.py", line 395, in <module>
    main(args)
  File "./baselines/get_results.py", line 325, in main
    log_info = eval_main(eval_args)
  File "/mnt/cdtds_cluster_home/s0816700/git/midi_degradation_toolkit/baselines/eval_task.py", line 165, in main
    log_info = trainer.test(0, evaluate=True)
  File "/mnt/cdtds_cluster_home/s0816700/git/midi_degradation_toolkit/mdtk/pytorch_trainers.py", line 110, in test
    evaluate=evaluate)
  File "/mnt/cdtds_cluster_home/s0816700/git/midi_degradation_toolkit/mdtk/pytorch_trainers.py", line 220, in iteration
    model_output = self.model.forward(input_data, input_lengths)
  File "/mnt/cdtds_cluster_home/s0816700/git/midi_degradation_toolkit/mdtk/pytorch_models.py", line 49, in forward
    self.hidden = self.init_hidden(batch_size, device=device)
  File "/mnt/cdtds_cluster_home/s0816700/git/midi_degradation_toolkit/mdtk/pytorch_models.py", line 40, in init_hidden
    return (torch.randn(1, batch_size, self.hidden_dim, device=device),
RuntimeError: CUDA error: no kernel image is available for execution on the device
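
The traceback above fails on a plain torch.randn call, so the problem can probably be reproduced without the model at all. The following is a minimal sketch, not part of the original report: try_cuda_kernel is a hypothetical helper name, and it assumes that launching one small kernel on device 0 is enough to trigger the same error.

```python
def try_cuda_kernel():
    """Try to launch one small CUDA kernel; return (ok, message)."""
    try:
        import torch
    except ImportError:
        return False, "torch is not installed"

    if not torch.cuda.is_available():
        return False, "torch.cuda.is_available() returned False"

    # Report which device and compute capability we are testing.
    name = torch.cuda.get_device_name(0)
    major, minor = torch.cuda.get_device_capability(0)
    label = "{} (compute capability {}.{})".format(name, major, minor)

    try:
        # Same kind of call that fails in init_hidden in the traceback.
        torch.randn(1, device="cuda")
        torch.cuda.synchronize()
    except RuntimeError as err:
        return False, "{}: {}".format(label, err)
    return True, "{}: kernel launch succeeded".format(label)


if __name__ == "__main__":
    ok, message = try_cuda_kernel()
    print("OK" if ok else "FAILED", "-", message)
```

Running this under both the 1.3 and 1.2 environments should show whether the failure depends only on the installed binary and not on the model code.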

I first tried downgrading to cudatoolkit=10.0, which exhibited the same issue.

The code runs fine if you repeat the steps above but instead run conda install pytorch=1.2 cudatoolkit=10.0 -c pytorch.

Expected behavior

If a specific GPU is no longer supported, please bomb out on load with a useful error message.
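
As an illustration of the kind of check being requested, here is a rough user-side sketch. It is not how PyTorch itself performs any such check. check_gpu_supported and ASSUMED_MIN_CAPABILITY are hypothetical names, and the (3, 7) threshold is a placeholder assumption, not a confirmed property of the 1.3 binaries. The Tesla K40m has compute capability 3.5.

```python
def check_gpu_supported(capability, min_capability):
    """Raise RuntimeError if `capability` is lower than `min_capability`.

    Both arguments are (major, minor) tuples, the format returned by
    torch.cuda.get_device_capability().
    """
    capability = tuple(capability)
    min_capability = tuple(min_capability)
    # Tuples compare element by element, so (3, 5) < (3, 7) is True.
    if capability < min_capability:
        raise RuntimeError(
            "GPU compute capability {}.{} is below the minimum {}.{} "
            "assumed for this build; kernels may fail to launch.".format(
                *capability, *min_capability
            )
        )


if __name__ == "__main__":
    # ASSUMPTION: placeholder threshold, not taken from PyTorch documentation.
    ASSUMED_MIN_CAPABILITY = (3, 7)
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        # Fail early, before any model code runs.
        check_gpu_supported(
            torch.cuda.get_device_capability(0), ASSUMED_MIN_CAPABILITY
        )
```

Calling this once at program start would turn the late "no kernel image is available" failure into an early, readable error, provided the threshold matches what the installed binary actually supports.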

Environment

Unfortunately, I ran your script after I 'fixed' the problem, so the PyTorch version shown here is 1.2. The issue was encountered with version 1.3.

Collecting environment information...
PyTorch version: 1.2.0
Is debug build: No
CUDA used to build PyTorch: 10.0.130

OS: Scientific Linux release 7.6 (Nitrogen)
GCC version: (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)
CMake version: version 2.8.12.2

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Tesla K40m
Nvidia driver version: 430.50
cuDNN version: /usr/lib64/libcudnn.so.6.5.18

Versions of relevant libraries:
[pip3] numpy==1.16.3
[pip3] numpydoc==0.8.0
[conda] blas                      1.0                         mkl  
[conda] mkl                       2019.4                      243  
[conda] mkl-service               2.3.0            py37he904b0f_0  
[conda] mkl_fft                   1.0.15           py37ha843d7b_0  
[conda] mkl_random                1.1.0            py37hd6b4f25_0  
[conda] pytorch                   1.2.0           py3.7_cuda10.0.130_cudnn7.6.2_0    pytorch
[conda] torchvision               0.4.0                py37_cu100    pytorch

cc @ezyang @gchanan @zou3519 @jerryzh168 @ngimel

Labels

module: binaries (Anything related to official binaries that we release to users)
module: cuda (Related to torch.cuda, and CUDA support in general)
module: docs (Related to our documentation, both in docs/ and docblocks)
triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)