
Auto-calculate the batch size? #7

Closed
ajratner opened this issue Aug 4, 2015 · 3 comments

@ajratner
Contributor

ajratner commented Aug 4, 2015

@raphaelhoffmann Especially since CoreNLP's load time is so (relatively) long, it seems like a better default than batch_size=1000 would be to divide the number of lines by the number of cores (or cores * nodes for distributed runs). For example, I just got a 2x speedup on an EC2 node really easily this way (I had forgotten to set the batch size the first time around...). Thoughts?
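The default suggested above could be sketched roughly like this — a minimal, hypothetical helper (the function name and parameters are illustrative, not from the actual codebase) that divides the input lines evenly across cores, or cores * nodes for a distributed run:

```python
import math
import os

def auto_batch_size(num_lines, num_nodes=1, cores_per_node=None):
    """Pick a batch size so each worker core gets roughly one batch,
    instead of using a fixed default like batch_size=1000.

    cores_per_node defaults to the local CPU count; for a distributed
    run, pass num_nodes so batches are split across cores * nodes.
    """
    if cores_per_node is None:
        cores_per_node = os.cpu_count() or 1
    workers = max(1, cores_per_node * num_nodes)
    # Ceiling division so every line lands in some batch.
    return max(1, math.ceil(num_lines / workers))
```

For example, 10,000 lines on two 4-core nodes would give batches of 1,250 lines each, so each core pays the CoreNLP startup cost exactly once.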

@raphaelhoffmann
Contributor

That's a great idea! Please check this in.

@ajratner
Contributor Author

ajratner commented Aug 4, 2015

Sure, will do tomorrow


@ajratner ajratner self-assigned this Aug 5, 2015
@ajratner
Contributor Author

ajratner commented Aug 5, 2015

Will push my new wrapper functions (in the fabfile) once I'm done processing everything...
