importing model error #3

Closed
homink opened this issue Jan 7, 2019 · 5 comments

Comments

@homink

homink commented Jan 7, 2019

Hi Jinserk,

I hit an error when importing a model and added some logging to find the root cause, but I can't work out what is going wrong. Any suggestions?

...
try:
    print('models: ' + " ".join(models))
    print('input model: ' + model)
    print('trying importlib.import_module(f"asr.models.{model}")')
    m = importlib.import_module(f"asr.models.{model}")
    m.train(argv)
except:
    raise
...

python -V
Python 3.6.3 :: Anaconda custom (64-bit)

ls /home/kwon/EXP/ted_pytorch -alF
total 20940
drwxr-xr-x    5 kwon domain users      202 Jan  7 13:42 ./
drwxr-xr-x    3 kwon domain users       33 Jan  7 13:38 ../
drwxr-xr-x   10 kwon domain users      241 Jan  7 13:42 dev/
-rw-r--r--    1 kwon domain users     2022 Jan  7 13:42 dev_convert.txt
-rw-r--r--    1 kwon domain users   109533 Jan  7 13:42 dev.csv
drwxr-xr-x   13 kwon domain users      320 Jan  7 13:42 test/
-rw-r--r--    1 kwon domain users     2093 Jan  7 13:42 test_convert.txt
-rw-r--r--    1 kwon domain users   246191 Jan  7 13:42 test.csv
drwxr-xr-x 1497 kwon domain users    53248 Jan  7 13:40 train/
-rw-r--r--    1 kwon domain users   372408 Jan  7 13:42 train_convert.txt

ls -alF
total 88
drwxr-xr-x  4 kwon domain users   317 Jan  7 15:20 ./
drwxr-xr-x 18 kwon domain users  4096 Jan  7 10:09 ../
drwxr-xr-x  7 kwon domain users   128 Jan  7 11:47 asr/
-rw-r--r--  1 kwon domain users   455 Jan  7 10:10 batch_train.py
drwxr-xr-x  8 kwon domain users   211 Jan  7 10:10 .git/
-rw-r--r--  1 kwon domain users  1339 Jan  7 10:10 .gitignore
-rw-r--r--  1 kwon domain users 35147 Jan  7 10:10 LICENSE
-rw-r--r--  1 kwon domain users   451 Jan  7 10:10 predict.py
-rw-r--r--  1 kwon domain users   473 Jan  7 10:10 prepare.py
-rw-r--r--  1 kwon domain users  5381 Jan  7 10:10 README.md
-rw-r--r--  1 kwon domain users   547 Jan  7 10:10 requirements.txt
-rw-r--r--  1 kwon domain users   448 Jan  7 10:10 test.py
-rwxr-xr-x  1 kwon domain users   670 Jan  7 10:10 train_deepspeech.sh*
-rwxr-xr-x  1 kwon domain users   737 Jan  7 10:10 train_las.sh*
-rw-r--r--  1 kwon domain users   592 Jan  7 15:20 train.py

python train.py deepspeech_ctc --data-path /home/kwon/EXP/ted_pytorch
models: densenet deepspeech_ce deepspeech_var resnet_ce resnet_ctc resnet_split convnet ssvae capsule1 deepspeech_ctc capsule2 resnet_split_ce las densenet_ctc
input model: deepspeech_ctc
trying importlib.import_module(f"asr.models.{model}")
Segmentation fault (core dumped)

python train.py las --data-path /home/kwon/EXP/ted_pytorch
models: deepspeech_var las ssvae capsule2 convnet densenet densenet_ctc resnet_split resnet_split_ce deepspeech_ctc capsule1 resnet_ctc resnet_ce deepspeech_ce
input model: las
trying importlib.import_module(f"asr.models.{model}")
Segmentation fault (core dumped)

[kwon@ssi-dnn-slave-001 pytorch-asr]$ python train.py deepspeech_var --data-path /home/kwon/EXP/ted_pytorch
models: las convnet resnet_split densenet resnet_split_ce capsule2 ssvae deepspeech_ce deepspeech_ctc densenet_ctc resnet_ctc deepspeech_var resnet_ce capsule1
input model: deepspeech_var
trying importlib.import_module(f"asr.models.{model}")
Segmentation fault (core dumped)
@jinserk
Owner

jinserk commented Jan 8, 2019

Hi @homink,

Hmm, that's odd. I guess importlib is having trouble loading the modules. What OS are you using?
Could you check where the segfault is raised by following this?
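
The link target is not preserved in this thread. One standard way to locate a segfault raised during an import, offered here as a sketch rather than as the exact instructions that were linked, is Python's built-in `faulthandler` module. It can be switched on without code changes via `python -X faulthandler train.py ...`, or in code before the import. `import_with_traceback` is a hypothetical helper name, not part of this repository.

```python
import faulthandler
import importlib
import sys


def import_with_traceback(module_name, log_file=None):
    """Import `module_name` with faulthandler enabled.

    If the import crashes with a fatal signal such as SIGSEGV, the
    interpreter dumps the Python stack of every thread to `log_file`
    (stderr by default) before the process dies.
    """
    if log_file is None:
        log_file = sys.stderr
    # Install handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL.
    faulthandler.enable(file=log_file, all_threads=True)
    return importlib.import_module(module_name)
```

Calling `import_with_traceback(f"asr.models.{model}")` in place of the plain `importlib.import_module(...)` call should show the chain of imports that leads to the crash.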

@homink
Author

homink commented Jan 8, 2019

CentOS 7. I found that importing _torch_sox fails on my system. The following issues may describe similar symptoms, but reinstalling pytorch/audio, either with pip or by cloning and installing from source, doesn't work.

pytorch/audio#62
pytorch/audio#68

cat /etc/os-release 
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
python train.py deepspeech_ctc --data-path /home/kwon/EXP/ted_pytorch
models: convnet deepspeech_ce resnet_ce ssvae las capsule1 capsule2 deepspeech_var densenet_ctc resnet_ctc densenet resnet_split resnet_split_ce deepspeech_ctc
input model: deepspeech_ctc
trying importlib.import_module(f"asr.models.{model}")
Fatal Python error: Segmentation fault

Current thread 0x00007f2d90162740 (most recent call first):
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 922 in create_module
  File "<frozen importlib._bootstrap>", line 571 in module_from_spec
  File "<frozen importlib._bootstrap>", line 658 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 955 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 971 in _find_and_load
  File "/home/kwon/anaconda3/lib/python3.6/site-packages/torchaudio/__init__.py", line 5 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 678 in exec_module
  File "<frozen importlib._bootstrap>", line 665 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 955 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 971 in _find_and_load
  File "/home/kwon/3rdParty/pytorch-asr/asr/utils/dataset.py", line 15 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 678 in exec_module
  File "<frozen importlib._bootstrap>", line 665 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 955 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 971 in _find_and_load
  File "/home/kwon/3rdParty/pytorch-asr/asr/models/deepspeech_ctc/train.py", line 10 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 678 in exec_module
  File "<frozen importlib._bootstrap>", line 665 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 955 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 971 in _find_and_load
  File "/home/kwon/3rdParty/pytorch-asr/asr/models/deepspeech_ctc/__init__.py", line 1 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 678 in exec_module
  File "<frozen importlib._bootstrap>", line 665 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 955 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 971 in _find_and_load
  File "<frozen importlib._bootstrap>", line 994 in _gcd_import
  File "/home/kwon/anaconda3/lib/python3.6/importlib/__init__.py", line 126 in import_module
  File "train.py", line 25 in <module>
Segmentation fault (core dumped)
echo $CPLUS_INCLUDE_PATH
/usr/include/sox:
which sox
/usr/bin/sox
which th
/usr/local/torch/install/bin/th
ls /home/kwon/anaconda3/lib/python3.6/site-packages/*.so -hal
-rwxr-xr-x 2 kwon domain users 185K Sep 17  2017 /home/kwon/anaconda3/lib/python3.6/site-packages/_cffi_backend.cpython-36m-x86_64-linux-gnu.so
-rwxr-xr-x 2 kwon domain users 539K Sep 18  2017 /home/kwon/anaconda3/lib/python3.6/site-packages/gmpy2.cpython-36m-x86_64-linux-gnu.so
-rwxr-xr-x 2 kwon domain users  36K Sep 18  2017 /home/kwon/anaconda3/lib/python3.6/site-packages/greenlet.cpython-36m-x86_64-linux-gnu.so
-rwxrwxr-x 2 kwon domain users  93K Jul  5  2018 /home/kwon/anaconda3/lib/python3.6/site-packages/pycosat.cpython-36m-x86_64-linux-gnu.so
-rwxr-xr-x 2 kwon domain users 137K Sep 18  2017 /home/kwon/anaconda3/lib/python3.6/site-packages/pycurl.cpython-36m-x86_64-linux-gnu.so
-rwxr-xr-x 2 kwon domain users 154K Sep 18  2017 /home/kwon/anaconda3/lib/python3.6/site-packages/pyodbc.cpython-36m-x86_64-linux-gnu.so
-rwxr-xr-x 2 kwon domain users 121K Sep 18  2017 /home/kwon/anaconda3/lib/python3.6/site-packages/sip.so
-rwxr-xr-x 1 kwon domain users 6.0M Jan  7 11:07 /home/kwon/anaconda3/lib/python3.6/site-packages/_torch_sox.cpython-36m-x86_64-linux-gnu.so
-rwxr-xr-x 2 kwon domain users 228K Sep 18  2017 /home/kwon/anaconda3/lib/python3.6/site-packages/_yaml.cpython-36m-x86_64-linux-gnu.so
python
Python 3.6.3 |Anaconda custom (64-bit)| (default, Oct 13 2017, 12:02:49) 
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import _torch_sox
Segmentation fault (core dumped)
python
Python 3.6.3 |Anaconda custom (64-bit)| (default, Oct 13 2017, 12:02:49) 
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import _torch_sox
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/kwon/anaconda3/lib/python3.6/site-packages/_torch_sox.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN2at19UndefinedTensorImpl10_singletonE
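
The two interpreter sessions above differ only in whether torch was imported first. A minimal sketch for repeating such checks without losing the interpreter each time is to run every import in a child process and classify how it ended. `probe_import` is a hypothetical helper, and the signal check assumes a POSIX system.

```python
import subprocess
import sys


def probe_import(statements):
    """Run `statements` in a child interpreter and classify the outcome.

    Returns "ok", "killed by signal N" (POSIX only), or "failed: <last
    line of stderr>" for an ordinary Python exception.
    """
    result = subprocess.run(
        [sys.executable, "-c", statements],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if result.returncode == 0:
        return "ok"
    if result.returncode < 0:
        # On POSIX a negative return code means the child died from a
        # signal; SIGSEGV is usually signal 11.
        return f"killed by signal {-result.returncode}"
    lines = result.stderr.decode(errors="replace").strip().splitlines()
    detail = lines[-1] if lines else f"exit code {result.returncode}"
    return "failed: " + detail
```

For example, `probe_import("import torch; import _torch_sox")` and `probe_import("import _torch_sox")` would separate the segfault case from the ImportError case in one script.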

@jinserk
Owner

jinserk commented Jan 8, 2019

This could be a silly question, but did you install sox-devel using yum, or the sox package in the anaconda env?
Please try reinstalling after manually deleting _torch_sox. If that doesn't help, consider installing sox from source.
I don't use the anaconda env myself, but it looks like the sox linked on your system is the system package rather than the anaconda package, which could cause an inconsistency. In my experience, torch 1.0 has some ABI-related issues.
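
The undefined symbol in the ImportError above is a mangled C++ name. As an illustration only, the sketch below decodes the plain nested-name form of the Itanium C++ ABI mangling (`_ZN`, then length-prefixed components, then `E`), which is enough for this symbol; a real tool such as `c++filt` handles the full grammar. `demangle_nested_name` is a hypothetical helper.

```python
def demangle_nested_name(symbol):
    """Decode a mangled name of the form _ZN<len><name>...E into a::b::c.

    Only this plain nested-name form is supported: no templates,
    qualifiers, or function parameters.
    """
    if not (symbol.startswith("_ZN") and symbol.endswith("E")):
        raise ValueError(f"unsupported mangled name: {symbol}")
    body = symbol[3:-1]
    parts = []
    pos = 0
    while pos < len(body):
        # Each component is a decimal length followed by that many characters.
        digits_end = pos
        while digits_end < len(body) and body[digits_end].isdigit():
            digits_end += 1
        if digits_end == pos:
            raise ValueError(f"expected a length prefix at offset {pos}")
        length = int(body[pos:digits_end])
        name = body[digits_end:digits_end + length]
        if len(name) != length:
            raise ValueError("component is shorter than its length prefix")
        parts.append(name)
        pos = digits_end + length
    return "::".join(parts)


print(demangle_nested_name("_ZN2at19UndefinedTensorImpl10_singletonE"))
# prints: at::UndefinedTensorImpl::_singleton
```

The `at` namespace is ATen, PyTorch's tensor library, so the extension can only resolve this symbol when a compatible torch build is already loaded in the process. The session that hit the ImportError imported _torch_sox without importing torch first, so that particular error may simply reflect torch's libraries not being loaded; the segfault after `import torch` is the more telling symptom of a mismatch.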

@homink
Author

homink commented Feb 7, 2019

Anaconda does not seem to work fully with pytorch. pyenv works perfectly.

homink closed this as completed Feb 7, 2019
@jmlemercier

jmlemercier commented Apr 20, 2020

Hello there, I would like to reopen this issue, as I get the same error when trying to use torchaudio.
My OS is CentOS 7, and I am using a Conda environment defined by the following .yml file:

name: audinet_env
channels:
  - defaults
  - conda-forge
  - pytorch
dependencies:
  - python = 3.7
  - numpy = 1.18.*
  - scipy = 1.4.*
  - pytorch
  - torchaudio
  - librosa
  - matplotlib
  - pytest
  - scikit-learn
  - progressbar2
  - IPython
  - pip
  - pip:
    - pystoi

The installed versions of the packages of interest are then:
- torch = 1.4.0
- torchaudio = 0.4.0
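
Since importing torchaudio is the step that fails, the installed versions can be read from package metadata without importing either package. This is a minimal sketch, assuming the distributions are registered under the names `torch` and `torchaudio`; `installed_version` is a hypothetical helper, and on Python 3.7 it relies on the `importlib_metadata` backport being installed.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:  # Python 3.7 and older: use the PyPI backport
    import importlib_metadata as metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


for name in ("torch", "torchaudio"):
    print(name, installed_version(name))
```

Running this inside the activated environment confirms which pair of builds is actually installed before any compiled extension is loaded.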

When I use torchaudio in my main function, I get the error Undefined symbol: ... when torchaudio's __init__.py tries to import _torch_sox.

I had the same problem on Ubuntu 18.04, which I solved by downgrading torchaudio to 0.3.1, but the same manoeuvre does not work here.

I tried:
- Downgrading torchaudio and pytorch
- Building torchaudio from source, but I hit the sox.h build issue listed [HERE], which I cannot solve since I do not have sudo rights in this session
- Installing and updating from binary releases (pip and conda)

I haven't yet tried (because it is a real drag to get out of conda and switch everything to pipenv):
- Using pipenv instead of conda virtual environments

No success so far; I would appreciate a little help.
