Segmentation fault when CPU load is high #311

Open
ivan opened this issue Feb 22, 2016 · 1 comment
Labels
bug

Comments

@ivan ivan commented Feb 22, 2016

I'm seeing wpull 1.2.3 on Python 3.4.3 crash reproducibly when it runs on a machine with high CPU load. I've observed this on both Ubuntu 14.04 and 15.10, on different machines: the 14.04 machine has a Core i3 from 2011 and the 15.10 machine has a 4-core 4790K.

You can probably reproduce this by running this web server: https://github.com/ludios/crawl-destroyer

and then starting a lot of crawls with

(mkdir j-1 && cd j-1 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-2 && cd j-2 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-3 && cd j-3 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-4 && cd j-4 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-5 && cd j-5 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-6 && cd j-6 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-7 && cd j-7 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-8 && cd j-8 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-9 && cd j-9 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-10 && cd j-10 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-11 && cd j-11 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-12 && cd j-12 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-13 && cd j-13 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-14 && cd j-14 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-15 && cd j-15 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-16 && cd j-16 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-17 && cd j-17 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-18 && cd j-18 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-19 && cd j-19 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-20 && cd j-20 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-21 && cd j-21 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-22 && cd j-22 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-23 && cd j-23 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-24 && cd j-24 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-25 && cd j-25 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-26 && cd j-26 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-27 && cd j-27 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-28 && cd j-28 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-29 && cd j-29 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-30 && cd j-30 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-31 && cd j-31 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
(mkdir j-32 && cd j-32 && ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2> log) &
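
The 32 commands above differ only in the directory number, so they can also be generated with a loop. This is a minimal sketch, assuming a POSIX shell. The function name `start_crawls` and its two arguments are hypothetical, and it uses `2>&1` in place of the second `2> log` so that stdout and stderr share one file descriptor instead of opening `log` twice:

```shell
# Hypothetical helper: start COUNT crawls, each in its own j-<i> directory.
# Usage: start_crawls COUNT [PATH_TO_WPULL]
start_crawls() {
  count=$1
  wpull=${2:-"$HOME/.local/bin/wpull"}
  i=1
  while [ "$i" -le "$count" ]; do
    (
      # Same options as the commands above; only the directory changes.
      mkdir "j-$i" && cd "j-$i" &&
      "$wpull" --quiet --output-file wpull.log --delete-after \
        --concurrent 5 --warc-file warc --recursive \
        http://127.0.0.1:3000/ > log 2>&1
    ) &
    i=$((i + 1))
  done
}
```

Running `start_crawls 32` from an empty directory should start the same set of crawls as the list above, and a following `wait` blocks until all of them have exited.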

After about 10-30 minutes on a 4-core 4790K, I see at least one process crash with

[11]    segmentation fault  ( mkdir j-11 && cd j-11 && ~/.local/bin/wpull --quiet --output-file wpull.log)

It may take longer to crash on other processors. A different number of wpull processes may work better, but I think the load average needs to be above 25 for a good chance of a crash.

This happens both with and without cchardet installed.

This one might be tricky to track down because the heap corruption probably happens some time before the crash. A Mozilla person suggested I use http://rr-project.org/ to try to track it down, but warned that the overhead might not be acceptable for Python. I will continue investigating.
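
Alongside rr, one lower-overhead option may be Python's built-in faulthandler (available since Python 3.3), which dumps the Python traceback to stderr when the interpreter receives a fatal signal such as SIGSEGV. This is a minimal sketch, assuming a POSIX shell. The wrapper name `run_with_faulthandler` is hypothetical, and the traceback only shows where the crash surfaced, which may be well after the corruption itself:

```shell
# Hypothetical wrapper: run a command with faulthandler enabled through the
# PYTHONFAULTHANDLER environment variable, and allow core dumps if permitted.
run_with_faulthandler() {
  (
    # Runs in a subshell so the ulimit change does not leak into the caller.
    ulimit -c unlimited 2>/dev/null || true
    PYTHONFAULTHANDLER=1 "$@"
  )
}
```

For example, `run_with_faulthandler ~/.local/bin/wpull --quiet --output-file wpull.log --delete-after --concurrent 5 --warc-file warc --recursive http://127.0.0.1:3000/ > log 2>&1` would leave any traceback in `log`, since faulthandler writes to stderr.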

@ivan ivan commented Feb 22, 2016

My next step might be to check if sqlalchemy is to blame, by verifying the heap with gc.collect() before and after calls to sqlalchemy.

@chfoo chfoo added the bug label Feb 22, 2016