
Segmentation Fault (core dumped) #56

Open
akshayklr057 opened this issue May 31, 2023 · 3 comments

@akshayklr057

Hi Team,
As suggested here for evaluating the model on the CoNLL-2003 dataset, I ran the command CUDA_VISIBLE_DEVICES=0 python train.py --config config/conll_03_english.yaml --test to check that the setup works. However, when doing so I get the error below:
[Screenshot 2023-05-31 at 10 12 29 AM]
I tried debugging it as well but couldn't find a way around this.
My system configurations are:
Ubuntu: 20.04
RAM: 32GB
GPU: NVIDIA GeForce RTX 3080 Ti

@wangxinyu0922
Member

I have not encountered this kind of problem before. It seems the problem occurs while loading the embeddings. Maybe there is not enough CPU memory.

@akshayklr057
Author

After digging into the issue, I can see that it happens because PyTorch is not able to access CUDA.
Also, the recommended PyTorch version (1.3.1) is not listed among the official releases on the PyTorch website, but it is available on PyPI.
This is the snippet where torch fails to put a variable on CUDA:
[Screenshot 2023-06-01 at 9 34 11 AM]
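For reference, a minimal check (independent of this repository) that shows whether the installed torch build can reach the GPU at all; the tensor and the device call below are only illustrative:

```python
# Quick, repo-independent check of the torch/CUDA pairing.
import torch

print(torch.__version__)          # the installed torch build
print(torch.version.cuda)         # the CUDA version torch was compiled against
print(torch.cuda.is_available())  # should be True if the build and driver match

x = torch.zeros(1)
x = x.to("cuda")                  # the same kind of call that fails in the screenshot above
print(x)
```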

Moreover, transformers' "from_pretrained" is not able to load the pre-trained models, which is what throws the "Segmentation fault".
[Screenshot 2023-06-01 at 9 35 08 AM]
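To isolate this, from_pretrained can be called on its own, outside train.py; the model name below is only an example and not necessarily the one used by config/conll_03_english.yaml:

```python
# Minimal sketch: does transformers' from_pretrained segfault on its own?
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")  # example model name
model = AutoModel.from_pretrained("bert-base-cased")
print(type(model))
```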

Apart from this, the flair code threw the same error in "embeddings.py", in the constructor of TransformerWordEmbeddings, when calling the parent transformer class. Attached is a screenshot of the place in the code where the issue happened.

[Screenshot 2023-06-01 at 9 37 42 AM]
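A minimal flair-level reproduction would look roughly like the sketch below; TransformerWordEmbeddings calls from_pretrained internally, so it hits the same point (the model name is again only an example):

```python
# Minimal sketch reproducing the failure at the flair level.
from flair.data import Sentence
from flair.embeddings import TransformerWordEmbeddings

embeddings = TransformerWordEmbeddings("bert-base-cased")  # example model name
sentence = Sentence("The grass is green .")
embeddings.embed(sentence)          # loads and runs the underlying transformer
print(sentence[0].embedding.shape)  # per-token embedding if everything worked
```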

Could you tell me which CUDA and NVIDIA driver versions you ran it with? I was just trying to set up the repository and run the evaluation to confirm that the setup was successful.

@MintMerlot

I had the same problem.
My CUDA version is 11.3, so I updated torch to 1.11.0, and the problem was solved.
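For anyone verifying the same fix: after installing a torch build that matches the local CUDA toolkit (e.g. a cu113 wheel of 1.11.0), a quick sanity check might look like this:

```python
# Confirm that the upgraded torch build and the local CUDA install agree.
import torch

print(torch.__version__)          # e.g. 1.11.0+cu113
print(torch.version.cuda)         # e.g. 11.3
print(torch.cuda.is_available())  # True once the build matches the driver
print(torch.zeros(1).to("cuda"))  # should no longer fail
```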
