Training speed #3
That speed sounds about right. RNNs are very slow for long sequences, unfortunately. In the "Experiments" section of the paper we note that we found it expedient to start training with highly-truncated sequences, then increase the maximum sequence length over the course of training. Edit: This is now in the README.
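The truncation curriculum described above can be sketched as a simple step schedule. This is only an illustration, not the settings from the paper: the step boundaries and lengths below are hypothetical, and in practice you would feed the returned length into the batcher's max encoder steps.

```python
# Hypothetical truncation curriculum: train on short sequences first,
# then lengthen them as training progresses. The (start_step, max_len)
# pairs here are illustrative values, not the ones used in the paper.
def truncation_schedule(step, schedule=((0, 50), (10000, 100), (30000, 400))):
    """Return the max encoder sequence length to use at a given training step."""
    max_len = schedule[0][1]
    for start_step, length in schedule:
        if step >= start_step:
            max_len = length  # schedule is sorted, so the last match wins
    return max_len
```

For example, `truncation_schedule(0)` gives 50 tokens, while `truncation_schedule(50000)` gives the full 400. Since RNN step time grows with sequence length, the early short-sequence phase gives much faster iterations while the model learns the basics.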
Thank you!
Indeed, the code uses cpu 0: https://github.com/abisee/pointer-generator/blob/master/run_summarization.py#L109
@StevenLOL It uses the GPU for the main computations: https://github.com/abisee/pointer-generator/blob/master/model.py#L294 You can see which ops are performed on which device by looking at the "graph" in TensorBoard. You can change any of these to fit your needs.
@StevenLOL @bugtig Same here. As tianjianjiang mentioned, we also found that an Nvidia 1080 goes almost entirely unused when the device is set to CPU: GPU load stays close to 0%, which is problematic.
Hello, thank you for your work.
With the default settings on a 1080 and TF 1.0, I'm getting about 13 seconds per batch of size 16, which would mean one epoch takes about 3 days, which seems clearly off. Do you have any ideas what may be causing the slowdown?
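For reference, the back-of-envelope math behind the 3-day estimate works out as follows, assuming the roughly 287k-example CNN/DailyMail training set (the exact count is an assumption here):

```python
import math

# Rough epoch-time estimate from the reported numbers.
num_examples = 287227   # assumed CNN/DailyMail training set size
batch_size = 16
secs_per_batch = 13.0   # reported timing on a 1080 with TF 1.0

batches_per_epoch = math.ceil(num_examples / batch_size)
epoch_days = batches_per_epoch * secs_per_batch / 86400  # 86400 s per day
```

This gives about 2.7 days per epoch, so the "about 3 days" figure is internally consistent with 13 s/batch; the question is whether 13 s/batch itself is reasonable for sequences this long.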