
Huge number of threads spin up for lots of message processing. #28

Closed
caliloo opened this issue Jul 11, 2018 · 11 comments
Comments

@caliloo
Contributor

caliloo commented Jul 11, 2018

Hi,

I'm using pysher, which works for me (so thanks!), but with some caveats. I'm connected to a Pusher channel sending tens or hundreds of messages per second, each from a few bytes to 0.5 MB in size (bigger messages are rarer, following a log rule).

I observe fairly high CPU usage in the process using the library, and that has caused me trouble on low-end machines (Atom N2800 @ 1.8 GHz); higher-end machines (i3) take the load without problems.

So, looking at CPU usage and profiling a bit, I find two things fairly odd. One, it seems two threads are spun off for each message callback and die right afterwards. But the callback is not processed in the context of either of those two threads; it seems to be processed in the context of the main websocket app thread.
OK, that's fair enough, why not.

I'm not sure why those two threads are spun off, what they do, or why they then die. The other thing is that these threads are definitely created in the websocket module, as far as I can make out.

I'm hoping to reduce the CPU load pysher uses to receive messages. I'm not sure whether that will be achieved by changing the threading situation or something else, but I thought the thread behaviour was worth mentioning, as I found it unusual (I quickly get into tens of thousands of threads created and destroyed after a few minutes) and it certainly seems related to CPU consumption.
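To illustrate the pattern I mean, here is a minimal sketch (a simplified illustration, not pysher's actual code) of how resetting a connection/activity timeout with a fresh threading.Timer on every received message produces exactly this kind of short-lived thread churn:

    # Illustrative only: per-message threading.Timer resets create a new
    # thread on every message, which then usually just gets cancelled/dies.
    import threading

    class TimerPerMessage:
        def __init__(self, activity_timeout=120):
            self.activity_timeout = activity_timeout
            self._timer = None

        def _on_timeout(self):
            print("no activity, would reconnect here...")

        def on_message(self, message):
            # Cancel the previous timer and start a new one for every message;
            # each start() spawns a brand-new, short-lived thread.
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.activity_timeout, self._on_timeout)
            self._timer.daemon = True
            self._timer.start()
            # ... dispatch the message to the registered callbacks here ...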

It's late for me, so I'm signing off for today. I'm happy to work on solving this later, but with this message I'm hoping to get some kind of hint or architectural description that may at least help me start on it. (And who knows, maybe it'll get solved while I sleep :) )

Cheers.

@berentn
Contributor

berentn commented Jul 31, 2018

Hi caliloo,

I addressed this issue today by using a scheduler, so the timer threads are no longer needed. This greatly reduces the CPU load and increases the message processing speed. It is a very rough implementation and needs some refactoring and formal testing, but so far it seems to work.

However, I'm new to using Git(Hub), so I'm not sure how to share it with others (or anything else you're supposed to do on GitHub). But you can find the updated file here:
https://github.com/berentn/Pysher/blob/master/pysher/connection.py
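The idea, roughly (this sketch is not the exact code in the file above), is to replace the per-timeout threading.Timer objects with a single sched.scheduler driven by one long-lived daemon thread, which only gets restarted if it exited because its queue ran empty:

    # Rough sketch of the scheduler approach; names and structure are illustrative.
    import sched
    import threading
    import time

    class TimeoutScheduler:
        def __init__(self):
            self.scheduler = sched.scheduler(time.time, time.sleep)
            self._thread = None

        def _ensure_thread(self):
            # One daemon thread runs all scheduled timeouts instead of one
            # thread per threading.Timer. scheduler.run() returns when the
            # queue is empty, hence the is_alive() check before restarting.
            if self._thread is None or not self._thread.is_alive():
                self._thread = threading.Thread(
                    target=self.scheduler.run, daemon=True, name="TimeoutScheduler")
                self._thread.start()

        def schedule(self, delay, callback):
            event = self.scheduler.enter(delay, 1, callback)
            self._ensure_thread()
            return event

        def cancel(self, event):
            try:
                self.scheduler.cancel(event)
            except ValueError:
                pass  # event already ran or was cancelled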

@deepbrook
Owner

Hey @berentn,
check the FAQ - I'd love to merge your changes!
https://help.github.com/articles/creating-a-pull-request/

@caliloo
Contributor Author

caliloo commented Jul 31, 2018

Hi, I've tested the proposed merge.

While I no longer see huge numbers of threads, which is a good thing, I don't see a significant drop in CPU usage (I'm going from 85/95% to 75/85%; I was expecting a drop in excess of 50-80%, to get to 45/15% after the change...).

Cheers

@berentn
Contributor

berentn commented Jul 31, 2018

@nlsdfnbch
Thanks, I gave it a try :)

@caliloo
I also get an absolute reduction of about 10% on my CPU. However, that is a reduction of ~40% in my case. From my profiler I see that stopping the timers is now ~4 times faster and starting a timer is ~40 times faster.

Maybe your callback functions are using a lot of resources? Did you check what happens to your CPU load when you use a simple callback function that discards the messages?
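Something like this, assuming the usual subscribe/bind usage from the README (the app key, channel and event names are placeholders):

    # Bind a do-nothing callback to isolate pysher's own CPU cost.
    import time
    import pysher

    pusher = pysher.Pusher("YOUR_APP_KEY")

    def discard(*args, **kwargs):
        pass  # drop the message without any processing

    def on_connect(data):
        channel = pusher.subscribe("my-channel")
        channel.bind("my-event", discard)

    pusher.connection.bind("pusher:connection_established", on_connect)
    pusher.connect()

    while True:
        time.sleep(1)  # keep the main thread alive while measuring CPU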

@caliloo
Contributor Author

caliloo commented Jul 31, 2018

Yep, I have tried a callback that discards the messages; it has next to no effect on what I observe, maybe a 1-2% drop in the numbers I see. It's as if all the load was generated by pysher rather than my callback.

In my normal case, I have actually moved my processing (which currently amounts to disk I/O and not much else) to another process; the callback only packs stuff into a string and posts it to a queue, which is little load AFAIK.
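Roughly this shape (a simplified sketch; the file name and helper names are made up), with the callback only enqueueing a string and the disk I/O living in a separate process:

    import multiprocessing as mp

    def writer(queue):
        # Separate process: drain the queue and do the (slow) disk I/O here.
        with open("messages.log", "a") as fh:
            while True:
                item = queue.get()
                if item is None:  # sentinel to shut down cleanly
                    break
                fh.write(item + "\n")

    def make_callback(queue):
        def callback(*args, **kwargs):
            # Keep the websocket thread light: just stringify and enqueue.
            queue.put(str(args[0]) if args else "")
        return callback

    if __name__ == "__main__":
        queue = mp.Queue()
        mp.Process(target=writer, args=(queue,), daemon=True).start()
        callback = make_callback(queue)
        # ... bind `callback` to the pusher channel as usual ...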

What profiler do you use to figure out which thread does what? I'm willing to give it a try.

@berentn
Contributor

berentn commented Jul 31, 2018

I constructed a test/benchmark file to compare the timeout scheduling methods using cProfile and pstats. I'm not knowledgeable enough to create usable profiles/stats when running the pusher client itself.
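Roughly along these lines (the benchmark body itself is a placeholder for the timeout start/stop loop under test):

    import cProfile
    import pstats

    def benchmark():
        # start/stop a large number of timeouts with the method under test
        pass

    cProfile.run("benchmark()", "bench.prof")
    stats = pstats.Stats("bench.prof")
    stats.sort_stats("cumulative").print_stats(20)  # top 20 by cumulative time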

@caliloo
Contributor Author

caliloo commented Jul 31, 2018

OK, too bad, I was hoping to profile the pusher client...

@caliloo
Contributor Author

caliloo commented Aug 8, 2018

OK, I think I finally cracked this one. After some profiling, I figured out that the websocket module does dumb UTF-8 validation (byte by byte!!) unless you have wsaccel installed. So, one pip install wsaccel later, it looks much better.
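A quick way to check that the fast validator is actually being picked up (as far as I can tell, websocket-client falls back to a pure-Python validator when this import fails):

    # pip install wsaccel
    try:
        from wsaccel.utf8validator import Utf8Validator  # noqa: F401
        print("wsaccel available: C-accelerated UTF-8 validation")
    except ImportError:
        print("wsaccel missing: byte-by-byte pure-Python validation")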

To sum up, my process was taking about 100-110% CPU (it was crashing regularly for lack of cycles and falling behind); the CPU contributors were:

  • ssl: about 10%
  • dumb UTF-8 validation: about 60% (vs wsaccel, which is maybe 5-10%)
  • crazy thread spin-up: 15% (vs sched: 0%)
  • the rest: 10-15%

So with ssl, wsaccel, and the scheduler instead of threads, my process shows about 25-30% CPU usage; that's a 4x reduction in CPU usage, which is nice :)
I'd definitely merge the timer scheduler patch if possible, it's much neater.

@deepbrook
Owner

I've uploaded a new preview version (0.6.0b1). Since tests are currently broken, it would be great if you guys could give that release a spin and see if anything breaks your current code.

@caliloo
Contributor Author

caliloo commented Aug 15, 2018

Thanks,
I'm giving it a try right now; it has started and run properly for the last 15 minutes. I'll give it a few days, see how it behaves, and report back (I don't expect big stuff to show up, as most of connection.py hasn't changed from berentn's proposal, but pusher.py has seen some churn, so hey :) )

Could you add thread names to the various threads that are spun up? I found that it helped with debugging.

connection.py:72:

    self.name = "PysherEventLoop"

connection.py:197:

    if not self.timeout_scheduler_thread:
        self.timeout_scheduler_thread = Thread(target=self.timeout_scheduler.run, daemon=True, name="PysherScheduler")
        self.timeout_scheduler_thread.start()

    elif not self.timeout_scheduler_thread.is_alive():
        self.timeout_scheduler_thread = Thread(target=self.timeout_scheduler.run, daemon=True, name="PysherScheduler")
        self.timeout_scheduler_thread.start()

(I may be wrong, but there may be one other thread flying around; I haven't looked into it further, as this served my purpose.)
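As a small illustration of why the names help, a thread dump like this becomes readable once the threads are named:

    import threading

    def dump_threads():
        # With named threads this prints e.g. "PysherEventLoop" and
        # "PysherScheduler" instead of a wall of "Thread-123" entries.
        for t in threading.enumerate():
            print(f"{t.name:<20} daemon={t.daemon} alive={t.is_alive()}")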

@caliloo
Contributor Author

caliloo commented Aug 18, 2018

I've checked my logs for the past 3 days and couldn't see anything that was a problem. This is good from my point of view.

@caliloo caliloo closed this as completed Aug 18, 2018