Training seems not to begin #1

Closed
ghpu opened this issue Jul 10, 2019 · 2 comments

ghpu commented Jul 10, 2019

I am trying to reproduce the experiment, but it looks as if the training process is stuck right at the start:

2019-07-09 17:15:45,951 - INFO - allennlp.training.trainer - Beginning training.
2019-07-09 17:15:45,951 - INFO - allennlp.training.trainer - Epoch 0/79
2019-07-09 17:15:45,951 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 19202.28
2019-07-09 17:15:46,225 - INFO - allennlp.training.trainer - GPU 0 memory usage MB: 1694
2019-07-09 17:15:46,226 - INFO - allennlp.training.trainer - GPU 1 memory usage MB: 37
2019-07-09 17:15:46,231 - INFO - allennlp.training.trainer - Training
0%| | 0/46617 [00:00<?, ?it/s]

After a night, the progress bar has not moved at all.

CPU usage is at 100% on one core, memory use is slowly increasing, and the GPUs are idle.

Could you please indicate which versions of Python, AllenNLP, and PyTorch you are using?

Mine are
python=3.6
allennlp==0.8.4
pytorch-pretrained-bert==0.6.1
pytorch=1.0.0
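
For anyone comparing environments, here is a minimal sketch that prints the installed versions from inside the interpreter. The helper name `installed_version` is made up for illustration, and the distribution names are assumptions: PyTorch is usually registered as `torch` when installed with pip, so adjust the names to match your setup.

```python
import sys


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        # Standard library since Python 3.8.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Older interpreters (such as the 3.6 listed above) fall back to setuptools.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("python", sys.version.split()[0])
# Assumed distribution names; a missing package is reported as None.
for name in ("allennlp", "pytorch-pretrained-bert", "torch"):
    print(name, installed_version(name))
```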

ghpu commented Jul 10, 2019

When I truncate train.conllu to a dozen sentences, training works, so I presume a night of waiting was not enough to preprocess the whole UD training corpus.
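
A rough sketch of how such a truncated file could be produced, in case it helps others run the same sanity check. The function name `truncate_conllu` and the output path are hypothetical; the only format assumption is the CoNLL-U convention that each sentence ends with a blank line.

```python
def truncate_conllu(src_path, dst_path, max_sentences=12):
    """Copy the first `max_sentences` sentences of a CoNLL-U file to `dst_path`.

    Returns the number of complete sentences written.
    """
    written = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if written >= max_sentences:
                break
            dst.write(line)
            # In CoNLL-U, a blank line terminates each sentence.
            if not line.strip():
                written += 1
    return written
```

For example, `truncate_conllu("train.conllu", "train.small.conllu", max_sentences=12)` writes a small file that the training config can point to for a quick check before committing to the full corpus.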

ghpu closed this as completed Jul 10, 2019

Hyperparticle (Owner) commented

Yes, make sure to check your RAM usage. This is the stage where the training set is loaded into memory, and if you don't have enough RAM, it may appear to wait forever. I believe the cause is AllenNLP being very inefficient here, creating several objects per line, but I haven't investigated far enough to verify this or to see exactly how it can be fixed.
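
One way to check RAM usage from within a Python process is sketched below, using only the standard library. The helper name `peak_rss_mb` is hypothetical, and the `resource` module is Unix-only; this is an illustration, not something taken from the project's code.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of the current process in MB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


print("Peak memory usage MB: %.2f" % peak_rss_mb())
```

If this number keeps climbing toward the machine's total RAM while the progress bar stays at 0, the dataset is probably still being loaded, or the machine has started swapping.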
