This repository has been archived by the owner on Dec 11, 2020. It is now read-only.

Illegal instruction (core dumped) happens when import rlpytorch in container #141

Closed
Breath123 opened this issue Feb 18, 2019 · 5 comments

Comments

Breath123 commented Feb 18, 2019

Hello,
First, I built the image from the Dockerfile in the project root and ran it.
Second, I followed the "Training a Go bot" section of the README. When I executed step 5, the script start_server.sh called python (2.7) rather than python3.7, so I changed start_server.sh to use python3.7.
Then I re-executed step 5 and got an "Illegal instruction (core dumped)" error when train.py imports rlpytorch.

The environment on the host:
A Dell server with 20 cores,
Host OS version: Red Hat 4.8.3-9
One Tesla V100 GPU

Could you help to investigate this issue? Thanks a lot.

qucheng commented Feb 18, 2019

You need to compile the Python library with Python 3.7 too.

Breath123 (Author) commented

@qucheng Thank you for answering. I found that the Python version was not 3.7 because the line "RUN bash -c "source activate base && make -j4"" in the Dockerfile does not take effect when I run "docker run -it elf".
So I executed "source activate base && make -j4" after starting the elf container, and python now points to python3.7. But the same issue described above still happens.
How do I compile the Python library? I have already tried "make" and "make -j4".

qucheng commented Feb 19, 2019

Can you paste the full error?
You can specify pythonpath and pythonlib when running make.
Also, gcc/g++ needs to be 7.x.
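
As a rough sketch of how both points could be checked before running make: the snippet below prints the include and library directories of the running interpreter using the standard sysconfig module, and parses the major version from the output of `gcc -dumpversion`. The helper names (`python_build_paths`, `gcc_major`, `installed_gcc_major`) are hypothetical. How these paths are passed to the ELF build is not stated in this thread, so that step is left out.

```python
import shutil
import subprocess
import sysconfig


def python_build_paths():
    """Return the include and library directories of the running interpreter."""
    return {
        "include": sysconfig.get_paths()["include"],
        # LIBDIR can be None on platforms that do not define it.
        "libdir": sysconfig.get_config_var("LIBDIR"),
    }


def gcc_major(version_text):
    """Parse the major version from `gcc -dumpversion` output, e.g. '7.3.0' -> 7."""
    return int(version_text.strip().split(".")[0])


def installed_gcc_major(compiler="gcc"):
    """Return the major version of the compiler on PATH, or None if it is missing."""
    if shutil.which(compiler) is None:
        return None
    out = subprocess.run(
        [compiler, "-dumpversion"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    ).stdout
    return gcc_major(out)


if __name__ == "__main__":
    print(python_build_paths())
    print("gcc major version:", installed_gcc_major())
```

Running this inside the container after `source activate base` should show whether the reported paths belong to the Python 3.7 installation and whether the compiler is a 7.x release.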

Breath123 (Author) commented

@qucheng The commands I used and their outputs are below:
1. Run the container:
docker run --runtime=nvidia --name elf -it elf bash
2. Set PYTHONPATH in the container:
source scripts/devmode_set_pythonpath.sh
3. Use python3.7:
source activate base
4. Check the environment:
(base) root@fbcb15f0af15:/go-elf/ELF/scripts/elfgames/go# echo $PYTHONPATH
/go-elf/ELF/src_py/:/go-elf/ELF/build/elf/:/go-elf/ELF/build/elfgames/go/:
(base) root@fbcb15f0af15:/go-elf/ELF/scripts/elfgames/go# which python
/root/miniconda3/bin/python
(base) root@fbcb15f0af15:/go-elf/ELF/scripts/elfgames/go# python --version
Python 3.7.1
5. Build:
make
6. Try to import rlpytorch:
cd scripts/elfgames/go/
python -c "import rlpytorch"
The "Illegal instruction" error then occurs, with this output:
(base) root@fbcb15f0af15:/go-elf/ELF/scripts/elfgames/go# python -c "import rlpytorch"
Illegal instruction (core dumped)
It also generated a core dump file named core.30197, which I uploaded to https://pan.baidu.com/s/1JWs9F30yhMGRhs1dotzwZw
In addition, you can pull the Docker image I used from here.
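
To narrow down which import triggers the crash, one option is to run the import in a child process with Python's built-in faulthandler enabled. The sketch below is a minimal illustration, not something used in this thread. The helper names `try_import` and `describe` are hypothetical. faulthandler prints the Python-level traceback to stderr when the process receives SIGILL (among other fatal signals), which shows the import chain at the time of the crash but not the offending machine instruction.

```python
import signal
import subprocess
import sys


def try_import(module_name):
    """Import a module in a child process so a crash does not kill this one.

    Returns (returncode, stderr). On POSIX, a negative return code -N means
    the child was killed by signal N.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", "import " + module_name],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stderr


def describe(returncode):
    """Turn a subprocess return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        return "killed by " + signal.Signals(-returncode).name
    return "exited with code %d" % returncode


if __name__ == "__main__":
    code, err = try_import("rlpytorch")
    print(describe(code))
    print(err)
```

The same helper can be pointed at individual dependencies (for example `try_import("torch")`) to see whether the crash comes from rlpytorch itself or from a module it imports.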

Breath123 commented Mar 21, 2019

@qucheng Thanks a lot for your kind help. I switched to another server and ran the container with host networking, and the issue disappeared, though I don't know what made it work.
I will close this issue.
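
An "Illegal instruction" that goes away on a different machine is often a sign that a compiled library uses CPU instructions the first host's processor does not support, although nothing in this thread confirms that as the cause here. A minimal sketch for comparing two hosts, assuming an x86 Linux system that exposes a "flags" line in /proc/cpuinfo; the function names and the default list of flags to check are assumptions chosen for illustration:

```python
import os


def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, wanted=("sse4_1", "sse4_2", "avx", "avx2", "fma")):
    """Return the wanted flags that the CPU does not report, in the given order."""
    have = parse_cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in have]


if __name__ == "__main__":
    path = "/proc/cpuinfo"
    if os.path.exists(path):
        with open(path) as f:
            print("missing:", missing_flags(f.read()))
    else:
        print(path, "not found; this check only works on Linux")
```

Running it on both the failing and the working server and comparing the output would show whether the two CPUs differ in the instruction sets they report.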
