Longevity & system freeze with Ubuntu 16 #26

Closed
randrews01 opened this issue May 2, 2017 · 34 comments

Comments

@randrews01
Contributor

Hi,

I'm facing a rather perplexing problem with this script. After some period of time (hours), my Ubuntu 16 desktop-flavor VM freezes completely: it neither sends nor responds to network traffic, and the console is frozen, so I am forced to reset the VM. I checked the various logs in /var/log, but there wasn't anything of interest in them. The usual fixes suggested for Ubuntu freezes (such as editing grub) didn't help. Has anyone else encountered this issue, or does anyone have thoughts on where to look?

FWIW, this is a really 'clean' install of Ubuntu 16. I grabbed the latest ISO, built a VM, installed the required packages to run this script, and that's about it.

@essandess
Owner

  1. All python and phantomjs upgraded to latest stable releases?
  2. Pulled latest commit?
  3. Successfully run the SSCCE?
  4. Not running in debug mode? The debug output is voluminous and will fill up available buffers and caches. That shouldn't crash a box, emphasis on "should," but stuff happens.

@randrews01
Contributor Author

randrews01 commented May 2, 2017 via email

@essandess
Owner

Perhaps a memory consumption issue.

I coded the script to shut down and restart phantomjs if its footprint grows larger than a gigabyte.

Would a 1 GB process cripple your VM?

I'll add a command line argument to set this maximum memory usage as a parameter. Please watch for the next commit.

@randrews01
Contributor Author

I have plenty of memory available. I added another 8GB to the VM last week hoping it would make a difference. Thanks for the update.

@essandess
Owner

Try

python isp_data_pollution.py -mm 256

Or to relaunch phantomjs for every GET:

python isp_data_pollution.py -mm 0
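
The restart rule behind this option can be sketched roughly as follows. This is an illustration only, not the script's actual code: `rss_mb` and `needs_restart` are hypothetical helpers, the resident set size is assumed to come from `ps -o rss=` (reported in KiB), and a limit of 0 is treated as "relaunch after every GET" to mirror the `-mm 0` behavior described here.

```python
import subprocess


def rss_mb(pid):
    """Return the resident set size of `pid` in MB, as reported by ps (in KiB)."""
    out = subprocess.check_output(["ps", "-o", "rss=", "-p", str(pid)], text=True)
    return int(out.strip()) / 1024.0


def needs_restart(current_mb, max_mb):
    """Return True when the browser process should be shut down and relaunched."""
    if max_mb <= 0:
        # Hypothetical convention: a zero limit means "restart after every GET".
        return True
    return current_mb > max_mb
```

A caller would check `needs_restart(rss_mb(browser_pid), limit)` after each page load, where `browser_pid` and `limit` are whatever the surrounding script tracks.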

@randrews01
Contributor Author

I tried with -mm 0; still freezing after hours of usage.

@essandess
Owner

Okay, thanks. I've wrapped all the calls to phantomjs methods with timeouts that should return control to the script if phantomjs hangs—emphasis on should.
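
One generic way to get this kind of "return control on a hang" behavior is to run the call in a daemon thread and stop waiting after a deadline. The sketch below shows that pattern under stated assumptions; it is not necessarily how the script implements its timeouts, and `call_with_timeout` and `CallTimedOut` are hypothetical names.

```python
import threading


class CallTimedOut(Exception):
    """Raised when the wrapped call does not finish in time."""


def call_with_timeout(func, timeout_s, *args, **kwargs):
    """Run func in a daemon thread; give up waiting after timeout_s seconds."""
    outcome = {}

    def worker():
        try:
            outcome["value"] = func(*args, **kwargs)
        except Exception as exc:  # hand the error back to the caller
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        # The worker is still stuck, but the caller regains control.
        raise CallTimedOut(f"call exceeded {timeout_s} s")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

Note the limitation: the stuck call keeps running in the background, so a caller would still need to kill and relaunch the hung browser process after catching `CallTimedOut`.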

I'm not seeing this myself. Would you please go back to debug mode and post the last few lines where the hang occurs to determine which method might be the culprit? Perhaps do this a few times to see if the problem is consistent.

Hopefully that would point the way to a fix, or at least to where more verbose output is required.

@randrews01
Contributor Author

Here is a screenshot from the time it froze:
image

@essandess
Owner

Would you please add a debug -g to the call to help isolate the block where this occurs.

Also, if you know how, call ps for the phantomjs process and report its state (R == running, Z == zombie, etc.).

@randrews01
Contributor Author

Hi,

I can't issue a ps because the entire system is frozen. Here is a screen capture from the moment it froze (about 30 minutes ago).
image

@randrews01
Contributor Author

Here's another one where I ran watch for phantomjs and python3.
image

@randrews01
Contributor Author

Here is the latest freeze:
image

@essandess
Owner

Thanks—that's very helpful. I hadn't seen one of those exceptions before.

I'm going to go puzzle over this and install the script on my own VM and see if I see the same issue, and search to see if others have had issues with phantomjs crashing their box.

I have a speculation-based request: would you please comment out the os.nice command and see if this resolves the issue? That shouldn't matter, but I can imagine timing issues with two asynchronous processes way down in the priority queue.

@essandess
Owner

Another candidate area is inadequate garbage collection after opening and closing jillions of phantomjs driver objects. I'll look into whether my code or selenium's has this issue.
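
A common guard against leaked driver objects is to tie each driver's lifetime to a context manager, so that `quit()` runs even when a page load raises. The sketch below illustrates the idea; it is not the script's code. `managed_driver` is a hypothetical helper, and `FakeDriver` is a stand-in used only so the example runs without selenium installed.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with `factory` and always call quit() afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        driver.quit()  # release the browser process even if the body raised


class FakeDriver:
    """Hypothetical stand-in for a selenium driver, used only for this demo."""

    def __init__(self):
        self.closed = False

    def get(self, url):
        raise RuntimeError(f"simulated failure loading {url}")

    def quit(self):
        self.closed = True
```

With a real driver, the factory would be whatever callable constructs the phantomjs driver in the surrounding code.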

@randrews01
Contributor Author

I'm testing with that commented out. Will report back. Thanks!

@randrews01
Contributor Author

Commenting out os.nice hasn't helped. Latest freeze:
image

@essandess
Owner

Thanks for trying. I'll start looking into possible garbage collection issues and try to replicate on my own VM.

@essandess
Owner

This phantomjs issue is relevant, and comments are consistent with the macOS versus Ubuntu differences.

ariya/phantomjs#14028

@randrews01
Contributor Author

Maybe I missed it, but isn't that discussing a freeze of phantomjs itself and not an entire system freeze? It's possible they are related. It seems like a rather nasty bug for a bad app to kill the whole OS like that.

@essandess
Owner

The OS has an issue if a hanging process brings it down.

I'm just cross-referencing relevant phantomjs issues looking for a solution. This issue is also potentially relevant:

ariya/phantomjs#13210

@randrews01
Contributor Author

Interesting comment here: ariya/phantomjs#14972

@essandess
Owner

Interesting comment here: ariya/phantomjs#14972

This issue is covered with the -mm 0 setting, which restarts phantomjs for each GET.

Also, I checked a couple of the URLs you show during a freeze above, and do not see anything unusual about those pages. For example, this link has only 35 elements and loads just fine:

url = 'https://www.foolfunds.com/contact/?source=ifufungb0010006'

I'm investigating the garbage collection possibility.

You're just seeing a single phantomjs instance running at any time, correct?

For reference, here's a command that will show the process status of all running phantomjs processes:

for pid in `ps -ef | grep phantomjs | grep -v grep | gawk '{print $2}' | tr '\n' ' '`; do ps -O pid,%cpu,rss,state,nice,time -p $pid; done
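
For logging the same information from Python instead of the shell, a rough equivalent is sketched below. It assumes a `ps` that accepts `-eo pid,ppid,rss,state,comm` (as procps on Linux does); `parse_ps` and `find_processes` are hypothetical helper names, not part of the script.

```python
import subprocess


def parse_ps(text, name):
    """Parse `ps -eo pid,ppid,rss,state,comm` output; keep rows matching `name`."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 4)
        if len(parts) == 5 and name in parts[4]:
            pid, ppid, rss, state, comm = parts
            rows.append({"pid": int(pid), "ppid": int(ppid),
                         "rss_kib": int(rss), "state": state,
                         "comm": comm.strip()})
    return rows


def find_processes(name="phantomjs"):
    """Run ps and return the matching processes as a list of dicts."""
    out = subprocess.check_output(
        ["ps", "-eo", "pid,ppid,rss,state,comm"], text=True)
    return parse_ps(out, name)
```

Calling `find_processes()` periodically and appending the result to a file would leave a record of the last known process state before a freeze.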

@randrews01
Contributor Author

image

@essandess
Owner

See ariya/phantomjs#14990.

@essandess
Owner

Having posted this I see that my own error handler is the culprit in the stack overflow, which could also cause a system freeze if Python's max recursion depth exceeds system resources.

I believe that I've isolated and fixed this issue.

Please pull the latest and let me know what happens.

And thanks for pushing on this thread.
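
As a general illustration of the failure mode: an error handler that retries by calling itself can recurse without bound when the error persists, while a loop with a fixed attempt budget cannot. The sketch below shows the bounded form. It is not the fix that was committed, and `fetch_with_retries` is a hypothetical name.

```python
def fetch_with_retries(action, max_attempts=3):
    """Call `action` until it succeeds or the attempt budget is spent."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return action()
        except Exception as exc:
            # Remember the failure and loop again instead of recursing.
            last_error = exc
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

Because the retry count is a loop bound rather than call depth, a persistently failing page load ends in one exception instead of a growing stack.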

@randrews01
Contributor Author

I appreciate your willingness to help fix this :)

I took the last commit but alas, still hit a freeze:

image

@essandess
Owner

Thanks for trying. Throwing rocks at the code is the only way to make it bulletproof, to mix metaphors.

Please try this before we throw this over the fence to PhantomJS.

  1. Download and install the latest binary from phantomjs.org. Delete whatever apt-get version you're using, or make sure your path is set up so that you're using the phantomjs.org binary.
  2. Upgrade everything: OS, Python, everything.
  3. Pull the latest.
  4. Run in debug mode and hope we catch the failing block or phantomjs call.
  5. Perhaps do this on a non-headless box to confirm, and actually watch what's going on.

@randrews01
Contributor Author

1 - that's how I installed
2 - already done
3 - pulled today's commit and am testing with that now
4 - that's turned on
5 - I'm running the desktop version of Ubuntu so I can watch what's going on

@essandess
Owner

Did this work for you?

I've been running smoothly for over a week now. Here are a couple bash commands to check the process status of both the phantomjs process(es) and the parent python script process:

for pid in `ps -ef | grep phantomjs | grep -v grep | gawk '{print $2}' | tr '\n' ' '`; do ps -O ppid,%cpu,rss,state,nice,time -p $pid; done
ps -O ppid,%cpu,rss,state,nice,start -p `ps -ef | grep phantomjs | grep -v grep | tail -1 | awk '{print $3}'`

@randrews01
Contributor Author

I grabbed the latest commit. It lasted about an hour before it froze.
image

@essandess
Owner

I've tested on a CentOS VM and it's solid there.

I believe this is some combination of phantomjs interacting with the OS and stack on your box.

Would you please initiate an issue for this at https://github.com/ariya/phantomjs/issues/?

@randrews01
Contributor Author

Confirming I have over 12 hours of up time on CentOS. There is just something broken with Ubuntu.

@essandess
Owner

Great. Thanks again for your time and comments. Though we weren't able to address the Ubuntu issue, thinking about it helped to point out and address other robustness issues, so it was definitely effort well spent.
