undefined symbol: THPVariableClass #6

Open
jatentaki opened this issue May 1, 2018 · 14 comments

Comments

@jatentaki

  • OS: Ubuntu 16.04
  • PyTorch version: 0.4.0
  • How you installed PyTorch (conda, pip, source): conda
  • Python version: 3.6
  • CUDA/cuDNN version: 9.0
  • GPU models and configuration: Titan XP

I built an extension based on this tutorial and it used to work. I then did some refactoring and fixes (in the CUDA/C++ code), and afterwards it started failing at runtime:

/home/jatentaki/anaconda3/lib/python3.6/site-packages/sort2_cuda-0.0.0-py3.6-linux-x86_64.egg/lltm_cpp.cpython-36m-x86_64-linux-gnu.so: undefined symbol: THPVariableClass

(for both the CUDA and C++ versions). I then checked whether the original example still worked and, to my surprise, it no longer did.

Timeline:

  • My initial success was on some 0.4.0 pre-release source build for cuda8.0.
  • I broke it
  • Trying to troubleshoot, I reinstalled conda and torch for the release 0.4.0 version, with cuda9.0
  • Neither my code nor your original example works

I believe the error just means I am not linking against some static library, but I don't see when and how I could have introduced that change.

@goldsborough
Contributor

goldsborough commented May 1, 2018

This often occurs when you import the extension before import torch. Are you sure the order you are importing is:

import torch
import your_extension

Also, does this error occur when you import torch or import your_extension?
Or does it fail when compiling the extension?

@jatentaki
Author

Ok, I won't be able to test on the same machine before tomorrow, but the fix works on my personal laptop. Perhaps this should be mentioned in the tutorial? Maybe it's common setuptools knowledge, but it caught me off guard.

@goldsborough
Contributor

It says it in the tutorial -- there is a line saying

Just be sure to import torch first, as this will resolve some symbols that the dynamic linker must see

It doesn't have anything to do with setuptools, it's just a dynamic linking issue. The torch module is a shared (dynamic) library which defines certain symbols that are unresolved in the extension library. To make these symbols available, the library containing the symbols (torch) must be imported before the library using them (your_extension) so that the dynamic linker can match the symbols with those from the torch library.
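One way to make the ordering hard to get wrong is to let the package that wraps the compiled module do the imports itself. The following is a minimal sketch, not something taken from the tutorial: `load_extension` is a hypothetical helper, and it assumes the only requirement is that `torch` is loaded before the extension.

```python
import importlib
import sys


def load_extension(ext_name, prerequisite="torch"):
    """Import `prerequisite` first, then the compiled extension `ext_name`.

    Hypothetical helper: loading the prerequisite first lets the dynamic
    linker see its symbols before the extension library is opened.
    """
    if prerequisite not in sys.modules:
        importlib.import_module(prerequisite)
    return importlib.import_module(ext_name)
```

With this in place, `lltm_cpp = load_extension("lltm_cpp")` would behave like `import torch` followed by `import lltm_cpp`, regardless of what the caller imported beforehand.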

@ezyang

ezyang commented Sep 6, 2018

I helped another user who made the same mistake. Maybe we can figure out a good way to give a better error message.

@goldsborough
Contributor

@ezyang I'll think of something
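As a rough illustration of what a friendlier message could look like, here is a sketch that post-processes the `ImportError` text. It is not how PyTorch handles this; `explain_import_error` and the hint wording are made up for the example, and the second hint only reflects causes suggested elsewhere in this thread.

```python
def explain_import_error(exc, torch_loaded):
    """Return the ImportError message, plus a hint for undefined-symbol failures.

    `torch_loaded` should be `"torch" in sys.modules` at the time of the failure.
    """
    msg = str(exc)
    if "undefined symbol" not in msg:
        # Not a dynamic-linking failure; leave the message untouched.
        return msg
    if not torch_loaded:
        return msg + "\nHint: import torch before importing the extension."
    return (
        msg
        + "\nHint: torch was already imported; the extension may have been"
        " built against a different PyTorch version or C++ ABI."
    )
```

A wrapper around the extension import could catch `ImportError`, call this with `"torch" in sys.modules`, and re-raise with the extended message.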

goldsborough reopened this Sep 6, 2018
@Spandan-Madan

I'm seeing a similar error, and importing torch before the extension doesn't solve it. Version info and the error stack follow.
Version Info:

Pytorch version: 0.4.1
CUDA version: 8.0
GCC version: 5.2.0

Error stack:-

>>> import torch
>>> import modules
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/graphics/toyota-pytorch/inplace_abn/modules/__init__.py", line 2, in <module>
    from .bn import ABN, InPlaceABN, InPlaceABNSync
  File "/data/graphics/toyota-pytorch/inplace_abn/modules/bn.py", line 10, in <module>
    from .functions import *
  File "/data/graphics/toyota-pytorch/inplace_abn/modules/functions.py", line 17, in <module>
    extra_cuda_cflags=["--expt-extended-lambda"])
  File "/afs/csail.mit.edu/u/s/smadan/miniconda3/envs/test/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 494, in load
    with_cuda=with_cuda)
  File "/afs/csail.mit.edu/u/s/smadan/miniconda3/envs/test/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 670, in _jit_compile
    return _import_module_from_library(name, build_directory)
  File "/afs/csail.mit.edu/u/s/smadan/miniconda3/envs/test/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 753, in _import_module_from_library
    return imp.load_module(module_name, file, path, description)
  File "/afs/csail.mit.edu/u/s/smadan/miniconda3/envs/test/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/afs/csail.mit.edu/u/s/smadan/miniconda3/envs/test/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: /tmp/torch_extensions/inplace_abn/inplace_abn.so: undefined symbol: _ZN2at5ErrorC1ENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
>>> 

Code base which I'm trying to run when the error occurs:-
https://github.com/mapillary/inplace_abn

Any leads on what I should try?

@soumith
Member

soumith commented Sep 27, 2018

@Spandan-Madan this is basically flaking on an ABI incompatibility (gcc > 5.1 binaries have a different std::string ABI than gcc <= 5.1 binaries).

For this, we (pytorch) have a patch in 0.4.1 that sets a flag to compile the cpp-extension with _GLIBCXX_USE_CXX11_ABI=0 (see pytorch/pytorch@f08f222 ).

Did you build the extension with pytorch-master and switch back to pytorch-0.4.1 (or something of that sort)?
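A quick way to tell which `std::string` ABI an unresolved symbol was compiled against is to look at the mangled name: libstdc++'s newer ABI places `std::string` in the inline namespace `std::__cxx11`, so that marker shows up in the symbol. The helper below is only a sketch and `symbol_uses_cxx11_abi` is a made-up name.

```python
def symbol_uses_cxx11_abi(mangled_symbol):
    """Return True if the mangled name refers to the std::__cxx11 namespace.

    That namespace appears only in code compiled with
    _GLIBCXX_USE_CXX11_ABI=1 (the newer libstdc++ ABI).
    """
    return "__cxx11" in mangled_symbol


# The unresolved symbol reported in the traceback above.
symbol = (
    "_ZN2at5ErrorC1ENS_14SourceLocationE"
    "NSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE"
)
print(symbol_uses_cxx11_abi(symbol))
```

If this reports `True`, the extension was most likely compiled with the newer ABI while the installed PyTorch library exports only the older-ABI variant of the symbol. On recent PyTorch releases `torch.compiled_with_cxx11_abi()` reports how PyTorch itself was built; whether that call is available in 0.4.1 is not something this thread confirms.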

@Spandan-Madan

Thanks for the reply @soumith.

I am using an extension present in the folder modules here in this repo: https://github.com/mapillary/inplace_abn

I installed Pytorch using conda (both normal and your channel), but I get this error in both.

Any leads on what I should try would be helpful. I've tried both GCC 4.8 and 5.2, and the error persists.

Thanks in advance :)

@ChujunWhu

@Spandan-Madan Hi, have you solved the problem yet?
I hit the same problem and tried gcc 4.8, gcc 4.9 and gcc 5.4, but all failed. The error still exists.
My PyTorch version is 0.4.1.

@etoilestar

> This often occurs when you import the extension before import torch. Are you sure the order you are importing is:
>
> import torch
> import your_extension
>
> Also, does this error occur when you import torch or import your_extension?
> Or does it fail when compiling the extension?

Hello, I hit the same problem. I import torch before importing _C, but the error still occurs. Could you help me?

@heiner

heiner commented Jul 20, 2020

I suspect the underlying error is pytorch/pytorch#38122.

@monajalal

Could you please check
daniilidis-group/neural_renderer#92
and
daniilidis-group/neural_renderer#93

I was able to reproduce this error for two repos.

$ python
Python 3.7.6 (default, Jan  8 2020, 19:59:22) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.6.0'
>>> torch.version.cuda
'10.1'
>>> torch.cuda.is_available()
True


$ gcc --version
gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.1 LTS
Release:	20.04
Codename:	focal



@monajalal

@goldsborough

Here is the code I am trying to run:


"""
Example 1. Drawing a teapot from multiple viewpoints.
"""
import os
import argparse

import torch
import numpy as np
import tqdm
import imageio

import neural_renderer as nr

Not sure why it throws this error:

(base) mona@mona:~/research/3danimals/neural_renderer/examples$ python example1.py 
Traceback (most recent call last):
  File "example1.py", line 12, in <module>
    import neural_renderer as nr
  File "/home/mona/anaconda3/lib/python3.7/site-packages/neural_renderer/__init__.py", line 3, in <module>
    from .load_obj import load_obj
  File "/home/mona/anaconda3/lib/python3.7/site-packages/neural_renderer/load_obj.py", line 8, in <module>
    import neural_renderer.cuda.load_textures as load_textures_cuda
ImportError: /home/mona/anaconda3/lib/python3.7/site-packages/neural_renderer/cuda/load_textures.cpython-37m-x86_64-linux-gnu.so: undefined symbol: THPVariableClass

daniilidis-group/neural_renderer#93
