This repository has been archived by the owner on Dec 11, 2020. It is now read-only.

AttributeError: Can't get attribute '_rebuild_tensor_v2' #5

Closed · roy7 opened this issue May 3, 2018 · 15 comments

Comments

roy7 commented May 3, 2018

This error happens when trying to read the weights file with an older version of PyTorch. I assume this is why you say PyTorch needs to be built from source. However, I've tried that all evening and I can't find a way to navigate all of the nvcc/gcc/CUDA incompatibilities to get it to compile. There are many errors, all of which are common when I google them, with lots of workarounds, but all of them only partially work. Fundamentally it seems to be some sort of std::tuple issue with CUDA/nvcc which Nvidia acknowledges but says it won't fix until the next CUDA release.

Is there any chance you could save your weights file out in an older PyTorch format? Then I could just install python-pytorch-cuda-0.3.1-2 for my version of Linux and be up and running in moments. ELF itself compiled fine, and it runs with python-pytorch-cuda-0.3.1-2... it just can't read the weights file. pytorch/pytorch#5729 states it's because of the newer file format.
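
(For reference, a workaround that circulates for this exact AttributeError is to monkey-patch the missing helper into the older torch before calling torch.load. This is only a sketch and assumes the weights are only needed for inference, so the extra metadata the newer format records, requires_grad and backward hooks, can be dropped:)

import torch
import torch._utils

# PyTorch 0.3.x has no _rebuild_tensor_v2, which checkpoints written by newer
# PyTorch reference when they are unpickled. Map it onto the old helper and
# drop the extra arguments the new serialization format passes along.
if not hasattr(torch._utils, '_rebuild_tensor_v2'):
    def _rebuild_tensor_v2(storage, storage_offset, size, stride,
                           requires_grad=False, backward_hooks=None):
        return torch._utils._rebuild_tensor(storage, storage_offset, size, stride)
    torch._utils._rebuild_tensor_v2 = _rebuild_tensor_v2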

Thanks!

roy7 (Author) commented May 3, 2018

I did try going to https://pytorch.org/ and choosing Linux / pip / Python 3.6 / CUDA 9.2 which gives me these commands:

pip3 install http://download.pytorch.org/whl/cu91/torch-0.4.0-cp36-cp36m-linux_x86_64.whl
pip3 install torchvision

That doesn't give me the error above, but instead I get:

RuntimeError: Error(s) in loading state_dict for Model_PolicyValue:
Unexpected key(s) in state_dict: "init_conv.1.num_batches_tracked", "pi_final_conv.1.num_batches_tracked", "value_final_conv.1.num_batches_tracked", "resnet.resnet.0.conv_lower.1.num_batches_tracked", "resnet.resnet.0.conv_upper.1.num_batches_tracked", "resnet.resnet.1.conv_lower.1.num_batches_tracked", "resnet.resnet.1.conv_upper.1.num_batches_tracked", "resnet.resnet.2.conv_lower.1.num_batches_tracked", "resnet.resnet.2.conv_upper.1.num_batches_tracked", "resnet.resnet.3.conv_lower.1.num_batches_tracked", "resnet.resnet.3.conv_upper.1.num_batches_tracked", "resnet.resnet.4.conv_lower.1.num_batches_tracked", "resnet.resnet.4.conv_upper.1.num_batches_tracked", "resnet.resnet.5.conv_lower.1.num_batches_tracked", "resnet.resnet.5.conv_upper.1.num_batches_tracked", "resnet.resnet.6.conv_lower.1.num_batches_tracked", "resnet.resnet.6.conv_upper.1.num_batches_tracked", "resnet.resnet.7.conv_lower.1.num_batches_tracked", "resnet.resnet.7.conv_upper.1.num_batches_tracked", "resnet.resnet.8.conv_lower.1.num_batches_tracked", "resnet.resnet.8.conv_upper.1.num_batches_tracked", "resnet.resnet.9.conv_lower.1.num_batches_tracked", "resnet.resnet.9.conv_upper.1.num_batches_tracked", "resnet.resnet.10.conv_lower.1.num_batches_tracked", "resnet.resnet.10.conv_upper.1.num_batches_tracked", "resnet.resnet.11.conv_lower.1.num_batches_tracked", "resnet.resnet.11.conv_upper.1.num_batches_tracked", "resnet.resnet.12.conv_lower.1.num_batches_tracked", "resnet.resnet.12.conv_upper.1.num_batches_tracked", "resnet.resnet.13.conv_lower.1.num_batches_tracked", "resnet.resnet.13.conv_upper.1.num_batches_tracked", "resnet.resnet.14.conv_lower.1.num_batches_tracked", "resnet.resnet.14.conv_upper.1.num_batches_tracked", "resnet.resnet.15.conv_lower.1.num_batches_tracked", "resnet.resnet.15.conv_upper.1.num_batches_tracked", "resnet.resnet.16.conv_lower.1.num_batches_tracked", "resnet.resnet.16.conv_upper.1.num_batches_tracked", "resnet.resnet.17.conv_lower.1.num_batches_tracked", "resnet.resnet.17.conv_upper.1.num_batches_tracked", "resnet.resnet.18.conv_lower.1.num_batches_tracked", "resnet.resnet.18.conv_upper.1.num_batches_tracked", "resnet.resnet.19.conv_lower.1.num_batches_tracked", "resnet.resnet.19.conv_upper.1.num_batches_tracked".
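
One workaround people use for this particular mismatch (a sketch only; model and the file name are placeholders, and it assumes the checkpoint is a plain state_dict rather than wrapped by the ELF loader) is to strip the num_batches_tracked entries, which newer PyTorch adds to its BatchNorm layers, before loading:

# Sketch, not the official ELF loading path.
state_dict = torch.load('pretrained_weights.bin')  # placeholder file name
state_dict = {k: v for k, v in state_dict.items()
              if not k.endswith('num_batches_tracked')}
model.load_state_dict(state_dict)
# PyTorch 0.4.0 also accepts strict=False, which simply ignores unexpected keys:
# model.load_state_dict(state_dict, strict=False)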

bochen2027 commented May 3, 2018

Hey guys, can you provide a binary instead? Failing that, how about a VM/OVA with a one-click compile, or at least some step-by-step instructions?

@yuandong-tian @qucheng @jma127 @shubho

ghost commented May 3, 2018

https://github.com/pytorch/ELF#dependencies

"You also need to install PyTorch from source"

https://github.com/pytorch/pytorch#from-source

roy7 (Author) commented May 3, 2018

I know. The problem is that after a few hours of trying, I gave up on compiling PyTorch from source. I tried all the tricks I could find online: using various gcc versions, making changes to CUDA include files, etc. Part of it is a bug in nvcc that Nvidia won't fix until the next CUDA release, but I just couldn't find the right combination of tricks to make it compile.

There's no way I'll be alone in this. :)

ghost commented May 3, 2018

If you require tricks, your environment is set up wrong, or perhaps your hardware is old.

I have been using a simple Ubuntu 16.04 environment for the last year and building PyTorch from source automatically every single day, on a 1080 x3 setup.

@bochen2027

I don't think he is using Ubuntu. I recommend using Clonezilla to dd the hard drive, wipe it, install the Ubuntu version that ELF used, and then try again.

@zchen0211 (Contributor)

@roy7 Setting track_running_stats=True in the operators may help.
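
(For context, track_running_stats is a constructor flag on the BatchNorm modules; a minimal illustration, with a made-up channel count:)

import torch.nn as nn

# Illustrative only: with track_running_stats=True the layer keeps
# running_mean/running_var (and, in newer PyTorch, num_batches_tracked).
bn = nn.BatchNorm2d(64, track_running_stats=True)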

soumith (Member) commented May 3, 2018

Hey everyone, I'll get you a pytorch-nightly binary by tomorrow, so that you don't have to go through the trouble of installing from source.

roy7 (Author) commented May 3, 2018

@soumith Thanks! :)

The python-pytorch-cuda maintainer has been working on getting a newer package out, but has had trouble getting it to compile as well. I realize 0.4.0 might also be too old and that only the master branch is compatible. Down the road, of course, I'm sure a 0.4.1 will come out and packages will get made. It just takes time.

jma127 (Contributor) commented May 3, 2018

@roy7 many thanks for braving other distros with our project, and thanks @soumith for preparing the binary!

I'll keep this issue open as I'm very curious to hear about how it goes after the PyTorch issue is resolved.

soumith (Member) commented May 4, 2018

If you use Anaconda Python, you can now use:

conda install -c pytorch pytorch-nightly

soumith (Member) commented May 4, 2018

If you are using a Volta GPU, use conda install -c pytorch pytorch-nightly cuda90 (README updated now).

@14327319

Could you please make the saved models backward compatible with older versions of PyTorch?

jma127 (Contributor) commented Jun 21, 2018

Backwards compatibility with old PyTorch versions is not currently on our radar.
