Too Slow #2

Closed
domoench opened this issue Sep 4, 2014 · 1 comment

Comments


domoench commented Sep 4, 2014

Make it multithreaded so you aren't blocking on network I/O for such a large fraction of runtime.
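
As a rough illustration of the idea (not this project's actual code), here is a minimal sketch of a crawl loop that hands the blocking fetches to a thread pool. The names `crawl` and `fetch_links` are hypothetical: `fetch_links(url)` stands in for whatever function does the network request and returns the links found on that page.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def crawl(start_url, fetch_links, max_workers=8):
    """Return the set of URLs reachable from start_url.

    fetch_links is a hypothetical callable: it takes a URL, performs the
    blocking network request, and returns an iterable of linked URLs.
    """
    # Only the main thread touches `seen`, so no lock is needed.
    seen = {start_url}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = {pool.submit(fetch_links, start_url)}
        while pending:
            # Block until at least one in-flight fetch has finished.
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                for link in future.result():
                    if link not in seen:
                        seen.add(link)
                        # Worker threads wait on the network in parallel.
                        pending.add(pool.submit(fetch_links, link))
    return seen
```

Keeping the bookkeeping in the main thread and only the fetches in the workers avoids shared-state bugs, at the cost of the main thread being the single point that schedules new work.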

domoench added a commit that referenced this issue Sep 5, 2014
Works for small sites, still has bugs though.

domoench commented Sep 6, 2014

Things are much faster now with multithreading.

Below are 3 experiments: one on a tiny website, one on a medium-ish website, and one on a largish website. Each measures how the crawl time varies with the number of threads executing.

For the tiny site, once I had more than 1 thread, adding more didn't gain any speed.
For the medium and larger sites, I got major speed gains that level off at around 30 threads.

graph1
graph2
graph3
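
For anyone who wants to run a similar comparison, here is a minimal sketch of a timing harness. It is not the script used for the graphs above, and it times fetching a fixed list of URLs rather than a full crawl. `time_fetches` and `fake_fetch` are hypothetical names, and `fake_fetch` simulates network latency with a short sleep instead of making a real request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_fetches(fetch, urls, thread_counts):
    """Return {thread_count: elapsed_seconds} for fetching every URL once."""
    timings = {}
    for n in thread_counts:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=n) as pool:
            # list() forces all fetches to complete before the clock stops.
            list(pool.map(fetch, urls))
        timings[n] = time.perf_counter() - start
    return timings


def fake_fetch(url):
    """Stand-in for a network request (assumption: 10 ms of latency)."""
    time.sleep(0.01)
    return url


if __name__ == "__main__":
    urls = ["page-%d" % i for i in range(50)]
    for n, seconds in time_fetches(fake_fetch, urls, [1, 5, 10, 30]).items():
        print("%3d threads: %.3f s" % (n, seconds))
```

With a real fetch function, the point where the timings stop improving depends on the site, the network, and the machine, so the thread count at which they flatten out will vary.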
