cutorch fails on getDeviceCount() #782

Open
sergei-sh opened this issue Jun 20, 2017 · 9 comments

sergei-sh commented Jun 20, 2017

Installed Torch and CUDA.

test.py:

import lutorpy as lua
lua.require('cutorch')

Output:

THCudaCheck FAIL file=/home/serj/torch/extra/cutorch/lib/THC/THCGeneral.c line=66 error=30 : unknown error
Traceback (most recent call last):
  File "test.py", line 4, in <module>
    lua.require('cutorch')
  File "/home/serj/work/beautytorch/venv/local/lib/python2.7/site-packages/lutorpy/__init__.py", line 112, in require
    ret = luaRuntime.require(module_name)
  File "lutorpy/_lupa.pyx", line 318, in lutorpy._lupa.LuaRuntime.require (lutorpy/_lupa.c:6189)
  File "lutorpy/_lupa.pyx", line 1658, in lutorpy._lupa.call_lua (lutorpy/_lupa.c:25877)
  File "lutorpy/_lupa.pyx", line 1667, in lutorpy._lupa.execute_lua_call (lutorpy/_lupa.c:25987)
  File "lutorpy/_lupa.pyx", line 1620, in lutorpy._lupa.raise_lua_error (lutorpy/_lupa.c:25297)
lutorpy._lupa.LuaError: cuda runtime error (30) : unknown error at /home/serj/torch/extra/cutorch/lib/THC/THCGeneral.c:66
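
For what it's worth, the same failure should reproduce in plain Lua, since the error is raised in cutorch's C initialization rather than in lutorpy. A minimal check, assuming Torch's th interpreter is available:

-- check_cutorch.lua: run with `th check_cutorch.lua`
require 'cutorch'                -- raises the same THCudaCheck error without a usable CUDA device
print(cutorch.getDeviceCount())  -- reached only if the CUDA runtime initializes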

Related code, from THCGeneral.c:

{
  if (!state->cudaDeviceAllocator) {
    state->cudaDeviceAllocator = &defaultDeviceAllocator;
  }
  if (!state->cudaHostAllocator) {
    state->cudaHostAllocator = &THCudaHostAllocator;
  }
  if (!state->cudaUVAAllocator) {
    state->cudaUVAAllocator = &THCUVAAllocator;
  }

  int numDevices = 0;
  THCudaCheck(cudaGetDeviceCount(&numDevices)); // FAILS HERE!!!

Maybe the problem relates to the fact that my GPU is not NVIDIA, but cutorch is needed as part of a third-party project which I'm trying to use.

@sergei-sh sergei-sh changed the title cutorch fails right after installation cutorch fails on getDeviceCount() Jun 20, 2017

albanD commented Jun 20, 2017

Please do not post the same question in multiple places. It is useless and creates a lot of noise.

cutorch/cunn only supports NVIDIA CUDA devices, as explained here

sergei-sh (Author) commented:

Thanks for your comment.

> cutorch/cunn only supports NVIDIA CUDA devices, as explained here

Sure. Can I avoid the crash when importing it as part of a bigger project that sometimes runs on unsupported devices?

albanD commented Jun 20, 2017

Do not require it? You won't be able to use anything from it anyway.
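
A minimal sketch of that, assuming the init failure surfaces as a catchable Lua error (which the traceback above suggests it does):

-- guard the require so a failed CUDA init doesn't abort the program
local ok, err = pcall(require, 'cutorch')
if not ok then
  print('cutorch unavailable, staying on CPU: ' .. tostring(err))
end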

sergei-sh (Author) commented:

I would still use other parts of the enclosing project. Do you mean it should fail in this situation by design? I would suggest some other control path.

sergei-sh (Author) commented:

Any workaround?

albanD commented Jun 26, 2017

Make sure the library does not require cutorch, cunn or cudnn when gpu_mode is set to false.
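
A minimal sketch of that gating; opts.gpu_mode is a hypothetical flag standing in for whatever the project actually uses:

require 'torch'
require 'nn'

local opts = { gpu_mode = false }  -- hypothetical config flag; set true only on CUDA machines
if opts.gpu_mode then
  require 'cutorch'
  require 'cunn'
  require 'cudnn'
end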

sergei-sh commented Jun 26, 2017

The problem is that I do need those requires. If I can ultimately do without them, there will be no issue (I don't want to waste your time).

This is the example code. It converts a GPU model to a CPU model so that I can run it without an NVIDIA GPU.

-- Convert GPU model to CPU compatible version
-- Author: Eren Golge - erengolge@gmail.com

require 'torch'
require 'cudnn'
require 'nn'
require 'cunn'
require "models/dropresnet"

cmd = torch.CmdLine()
cmd:text()
cmd:text()
cmd:text('convert GPU model to CPU version')
cmd:text()
cmd:text('Options')
cmd:option('-loadPath','','Path to load GPU model.')
cmd:option('-savePath','', 'Path to save CPU model')
cmd:text()

params = cmd:parse(arg)

print(params.loadPath)
local model = torch.load(params.loadPath)
model_cpu = cudnn.convert(model, nn):float()
torch.save(params.savePath, model_cpu)

As you can see, the task doesn't require the hardware (does it?). At the same time, it fails on torch.load() without cudnn and cunn:

/home/serj/torch/install/share/lua/5.1/torch/File.lua:343: unknown Torch class <cudnn.SpatialConvolution>
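
That error is torch.load resolving serialized class names against classes registered at require time: cudnn.SpatialConvolution only becomes known once require 'cudnn' has run. A minimal sketch of the failure mode ('gpu_model.t7' is a placeholder path):

require 'torch'
-- no require 'cudnn' here, so its classes are never registered
local ok, res = pcall(torch.load, 'gpu_model.t7')
if not ok then
  print(res)  -- e.g. "unknown Torch class <cudnn.SpatialConvolution>"
end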

albanD commented Jun 26, 2017

The thing is that if you have a GPU model (with modules built to run on the GPU and parameters stored in CUDA tensors), you need cutorch and cunn to load it. Unfortunately there is no workaround: you need a CUDA device to load a Torch GPU model.
The simplest thing to do is to ask whoever gave you the GPU model (or someone with a GPU) to run the conversion script locally and give you the CPU model directly.

sergei-sh (Author) commented:

Thanks a lot, this is very helpful.
