error while running examples - segmentation fault #7
Comments
I just saw that thread #4 had a similar issue and that the CUDA drivers were the problem. I have tried reinstalling them. I have CUDA v10.2: NVIDIA-SMI 440.59, Driver Version: 440.59, CUDA Version: 10.2. I've never had such CUDA issues on other PyTorch projects, so I don't really know how to troubleshoot this problem. Update: I also tried with the --no_cuda arg, but I am still getting the same segmentation fault.
The problem happens while loading the model. Do you have any unavailable GPUs? Can you try to specify GPUs by adding
Yes, there are 10 GPUs on the server and some are in use. The fix you proposed did not change anything. I hard-coded a free GPU (8) in run_finetune.py to check whether that was the problem and still got the same error. Could you please explain what the local_rank arg does? I am not sure I understand it correctly. Here's the error I get:
01/09/2021 18:29:41 - WARNING - main - Process rank: -1, device: cuda:8, n_gpu: 1, distributed training: False, 16-bits training: False
============================================================
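As an alternative to hard-coding a device index in run_finetune.py, a process can be limited to specific GPUs through the standard CUDA_VISIBLE_DEVICES environment variable. The sketch below is only an illustration and is not necessarily what was suggested above; the helper name restrict_visible_gpus is hypothetical.

```python
import os


def restrict_visible_gpus(gpu_ids, env=None):
    """Set CUDA_VISIBLE_DEVICES so only the given physical GPUs are exposed.

    This has to run before the first CUDA call in the process. Afterwards
    the selected GPUs are renumbered starting from 0, so physical GPU 8
    is addressed as "cuda:0" inside the process.
    """
    # Hypothetical helper: `env` defaults to the real process environment,
    # but any dict-like object can be passed instead (handy for testing).
    target = os.environ if env is None else env
    value = ",".join(str(gpu_id) for gpu_id in gpu_ids)
    target["CUDA_VISIBLE_DEVICES"] = value
    return value


# Example: expose only physical GPU 8 before importing torch.
# restrict_visible_gpus([8])
```

The same effect is available from the shell by prefixing the launch command with CUDA_VISIBLE_DEVICES=8. Note that in this thread the crash also occurred with --no_cuda, so GPU selection was not the cause here.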
@alexandremarcil this might help:
Thanks @hkmztrk. I downgraded sentencepiece to 0.1.91 and I no longer get the segmentation fault, but I have other issues :( Here's the fix for anyone else having this issue:
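The exact command used for the downgrade is not preserved in this thread. As a rough sketch, the installed sentencepiece version can be compared against 0.1.91 (the version reported to work above) before launching fine-tuning. The function names below are hypothetical, and the version parser is deliberately simplistic (leading numeric parts only).

```python
import re
from importlib import metadata

PINNED = "0.1.91"  # version reported in this thread to avoid the segfault


def parse_version(text):
    """Turn '0.1.91' into (0, 1, 91), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_newer_than_pinned(installed, pinned=PINNED):
    """Return True if the installed version is strictly newer than the pin."""
    return parse_version(installed) > parse_version(pinned)


def installed_sentencepiece_version():
    """Return the installed sentencepiece version string, or None if absent."""
    try:
        return metadata.version("sentencepiece")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_sentencepiece_version()
    if version is None:
        print("sentencepiece is not installed")
    elif is_newer_than_pinned(version):
        # Assumption: pinning with pip, e.g. `pip install sentencepiece==0.1.91`
        print(f"sentencepiece {version} is newer than {PINNED}; consider pinning it")
    else:
        print(f"sentencepiece {version} is at or below {PINNED}")
```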
Hi,
I'm trying to run the example. I created the dnabert env and downloaded the packages and files. At step 3.3, while trying to run "Fine-tune with pre-trained model" (DNABERT6), I get the following error message:
<class 'transformers.tokenization_dna.DNATokenizer'>
01/05/2021 17:08:16 - INFO - transformers.tokenization_utils - loading file https://raw.githubusercontent.com/jerryji1993/DNABERT/master/src/transformers/dnabert-config/bert-config-6/vocab.txt from cache at /home/mcb/users/zipcode/.cache/torch/transformers/ea1474aad40c1c8ed4e1cb7c11345ddda6df27a857fb29e1d4c901d9b900d32d.26f8bd5a32e49c2a8271a46950754a4a767726709b7741c68723bc1db840a87e
01/05/2021 17:08:16 - INFO - transformers.modeling_utils - loading weights file /home/mcb/users/zipcode/code/DNABERT/6-new-12w-0/pytorch_model.bin
Segmentation fault (core dumped)
I have tried redownloading the pretrained model, but got the same error. Strangely, I do not get this error locally on my Mac, but without a GPU it would take too long to run there. I only get this error on a Linux server.
Any ideas on how to fix this? Thanks!
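A bare "Segmentation fault (core dumped)" gives no hint of which Python call triggered the crash. One general way to narrow it down, sketched here with Python's standard faulthandler module, is to have the interpreter write the Python traceback to a file when a fatal signal arrives. The helper name and log path are hypothetical, and this is a debugging aid rather than a fix.

```python
import faulthandler
import os
import tempfile


def enable_crash_traceback(log_path):
    """Dump Python tracebacks of all threads to log_path on fatal signals.

    Returns the open file object. It must stay open while faulthandler is
    enabled, because faulthandler keeps using its file descriptor.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    # Hypothetical log location; enable this before the model is loaded.
    path = os.path.join(tempfile.gettempdir(), "segfault_traceback.log")
    log = enable_crash_traceback(path)
    print("faulthandler enabled:", faulthandler.is_enabled())
```

Without changing any code, the same handler can be switched on by running the script with python -X faulthandler, which prints the traceback to stderr.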