
Huge number of threads spin up for lots of message processing. #28

Closed
caliloo opened this issue Jul 11, 2018 · 11 comments
Comments

@caliloo
Contributor

caliloo commented Jul 11, 2018

Hi,

I'm using pysher, which works for me (so thanks!), but with some caveats. I'm connected to a Pusher channel sending tens or hundreds of messages per second, each from a few bytes to 0.5 MB in size (bigger messages are rarer, following a log rule).

I observe fairly high CPU usage in the process using the library, and that has caused me trouble on low-end machines (Atom N2800 @ 1.8 GHz); higher-end machines (i3) take the load without problems.

So, looking at CPU usage and profiling a bit, I find two things fairly odd. One, it seems two threads are spun off for each message callback and die right afterwards. But the callback is not processed in the context of either of those two threads; it seems to be processed in the context of the main websocket app thread.
OK, that's fair enough, why not.

I'm not sure why those two threads are spun off, what they do, or why they then die. The other thing is that these threads are definitely created in the websocket module, as far as I can make out.

I'm hoping to reduce the CPU load pysher uses to receive messages. I'm not sure whether that will be achieved by changing the threading situation or something else, but I thought the thread behaviour was worth mentioning, as I found it unusual (I quickly get into tens of thousands of threads created and destroyed after a few minutes) and it certainly seems related to CPU consumption.
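To illustrate the pattern I mean, here is a minimal sketch (a simplified illustration, not pysher's actual code) of how resetting a connection/activity timeout with a fresh threading.Timer on every received message produces exactly this kind of short-lived thread churn:

    # Illustrative only: per-message threading.Timer resets create a new
    # thread on every message, which then usually just gets cancelled/dies.
    import threading

    class TimerPerMessage:
        def __init__(self, activity_timeout=120):
            self.activity_timeout = activity_timeout
            self._timer = None

        def _on_timeout(self):
            print("no activity, would reconnect here...")

        def on_message(self, message):
            # Cancel the previous timer and start a new one for every message;
            # each start() spawns a brand-new, short-lived thread.
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.activity_timeout, self._on_timeout)
            self._timer.daemon = True
            self._timer.start()
            # ... dispatch the message to the registered callbacks here ...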

It's late for me, so I'm signing off for today. I'm happy to work on solving this later, but with this message I'm hoping to get some kind of hint or architectural description that may at least help me start on it. (And who knows, maybe it'll get solved while I sleep :) )

Cheers.

@berentn
Contributor

berentn commented Jul 31, 2018

Hi caliloo,

I addressed this issue today by using a scheduler, so the timer threads are no longer needed. This greatly reduces the CPU load and increases the message processing speed. It is a very rough implementation and needs some refactoring and formal testing, but so far it seems to work.

However, I'm new to using Git(Hub), so I'm not sure how to share it with others (or anything else you're supposed to do on GitHub). But you can find the updated file here:
https://github.com/berentn/Pysher/blob/master/pysher/connection.py
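The idea, roughly (this sketch is not the exact code in the file above), is to replace the per-timeout threading.Timer objects with a single sched.scheduler driven by one long-lived daemon thread, which only gets restarted if it exited because its queue ran empty:

    # Rough sketch of the scheduler approach; names and structure are illustrative.
    import sched
    import threading
    import time

    class TimeoutScheduler:
        def __init__(self):
            self.scheduler = sched.scheduler(time.time, time.sleep)
            self._thread = None

        def _ensure_thread(self):
            # One daemon thread runs all scheduled timeouts instead of one
            # thread per threading.Timer. scheduler.run() returns when the
            # queue is empty, hence the is_alive() check before restarting.
            if self._thread is None or not self._thread.is_alive():
                self._thread = threading.Thread(
                    target=self.scheduler.run, daemon=True, name="TimeoutScheduler")
                self._thread.start()

        def schedule(self, delay, callback):
            event = self.scheduler.enter(delay, 1, callback)
            self._ensure_thread()
            return event

        def cancel(self, event):
            try:
                self.scheduler.cancel(event)
            except ValueError:
                pass  # event already ran or was cancelled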

@deepbrook
Owner

Hey @berentn,
check the FAQ - I'd love to merge your changes!
https://help.github.com/articles/creating-a-pull-request/

@caliloo
Contributor Author

caliloo commented Jul 31, 2018

Hi, I've tested the proposed merge.

While I no longer see huge numbers of threads, which is a good thing, I don't see a significant drop in CPU usage (I'm going from 85/95% to 75/85%; I was expecting a drop in excess of 50-80%, to get to 45/15% after the change...).

Cheers

@berentn
Contributor

berentn commented Jul 31, 2018

@nlsdfnbch
Thanks, I gave it a try :)

@caliloo
I also get an absolute reduction of about 10% on my CPU. However, that is a reduction of ~40% in my case. From my profiler I see that stopping the timers is now ~4 times faster and starting a timer is ~40 times faster.

Maybe your callback functions are using a lot of resources? Did you check what happens to your CPU load when you use a simple callback function that discards the messages?
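Something like this, assuming the usual subscribe/bind usage from the README (the app key, channel and event names are placeholders):

    # Bind a do-nothing callback to isolate pysher's own CPU cost.
    import time
    import pysher

    pusher = pysher.Pusher("YOUR_APP_KEY")

    def discard(*args, **kwargs):
        pass  # drop the message without any processing

    def on_connect(data):
        channel = pusher.subscribe("my-channel")
        channel.bind("my-event", discard)

    pusher.connection.bind("pusher:connection_established", on_connect)
    pusher.connect()

    while True:
        time.sleep(1)  # keep the main thread alive while measuring CPU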

@caliloo
Contributor Author

caliloo commented Jul 31, 2018

Yep, I have tried a callback that discards the messages; it has next to no effect on what I observe, maybe a 1-2% drop in the numbers I see. It's as if all the load was generated by pysher rather than my callback.

In my normal case, I have actually moved my processing (which currently amounts to disk I/O and not much else) to another process; the callback only packs stuff into a string and posts it to a queue, which is little load AFAIK.
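Roughly this shape (a simplified sketch; the file name and helper names are made up), with the callback only enqueueing a string and the disk I/O living in a separate process:

    import multiprocessing as mp

    def writer(queue):
        # Separate process: drain the queue and do the (slow) disk I/O here.
        with open("messages.log", "a") as fh:
            while True:
                item = queue.get()
                if item is None:  # sentinel to shut down cleanly
                    break
                fh.write(item + "\n")

    def make_callback(queue):
        def callback(*args, **kwargs):
            # Keep the websocket thread light: just stringify and enqueue.
            queue.put(str(args[0]) if args else "")
        return callback

    if __name__ == "__main__":
        queue = mp.Queue()
        mp.Process(target=writer, args=(queue,), daemon=True).start()
        callback = make_callback(queue)
        # ... bind `callback` to the pusher channel as usual ...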

What profiler do you use to figure out which thread does what? I'm willing to give it a try.

@berentn
Contributor

berentn commented Jul 31, 2018

I constructed a test/benchmark file to compare the timeout scheduling methods using cProfile and pstats. I'm not knowledgeable enough to create usable profiles/stats when running the pusher client itself.
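Roughly along these lines (the benchmark body itself is a placeholder for the timeout start/stop loop under test):

    import cProfile
    import pstats

    def benchmark():
        # start/stop a large number of timeouts with the method under test
        pass

    cProfile.run("benchmark()", "bench.prof")
    stats = pstats.Stats("bench.prof")
    stats.sort_stats("cumulative").print_stats(20)  # top 20 by cumulative time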

@caliloo
Contributor Author

caliloo commented Jul 31, 2018

OK, too bad, I was hoping to profile the pusher client...

@caliloo
Contributor Author

caliloo commented Aug 8, 2018

OK, I think I finally cracked this one. After some profiling, I figured out that the websocket module does dumb UTF-8 validation (byte by byte!!) unless you have wsaccel installed. So, one pip install wsaccel later, it looks much better.
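A quick way to check that the fast validator is actually being picked up (as far as I can tell, websocket-client falls back to a pure-Python validator when this import fails):

    # pip install wsaccel
    try:
        from wsaccel.utf8validator import Utf8Validator  # noqa: F401
        print("wsaccel available: C-accelerated UTF-8 validation")
    except ImportError:
        print("wsaccel missing: byte-by-byte pure-Python validation")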

To sum up, my process was taking about 100-110% CPU (it was crashing regularly for lack of cycles and falling behind); the CPU contributors were:

  • ssl: about 10%
  • dumb UTF-8 validation: about 60% (vs wsaccel, which is maybe 5-10%)
  • crazy thread spin-up: 15% (vs sched: 0%)
  • the rest: 10-15%

So with ssl, wsaccel, and the scheduler instead of threads, my process shows about 25-30% CPU usage; that's a 4x reduction in CPU usage, which is nice :)
I'd definitely merge the timer scheduler patch if possible, it's much neater.

@deepbrook
Owner

I've uploaded a new preview version (0.6.0b1). Since tests are currently broken, it would be great if you guys could give that release a spin and see if anything breaks your current code.

@caliloo
Contributor Author

caliloo commented Aug 15, 2018

Thanks,
I'm giving it a try right now; it has started and run properly for the last 15 minutes. I'll give it a few days, see how it behaves, and report back (I don't expect big stuff to show up, as most of connection.py hasn't changed from berentn's proposal, but pusher.py has seen some churn, so hey :) )

Could you add thread names to the various threads that are spun up? I found that it helped with debugging.

connection.py:72:

    self.name = "PysherEventLoop"

connection.py:197:

    if not self.timeout_scheduler_thread:
        self.timeout_scheduler_thread = Thread(target=self.timeout_scheduler.run, daemon=True, name="PysherScheduler")
        self.timeout_scheduler_thread.start()

    elif not self.timeout_scheduler_thread.is_alive():
        self.timeout_scheduler_thread = Thread(target=self.timeout_scheduler.run, daemon=True, name="PysherScheduler")
        self.timeout_scheduler_thread.start()

(I may be wrong, but there may be one other thread flying around; I haven't looked into it further, as this served my purpose.)
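As a small illustration of why the names help, a thread dump like this becomes readable once the threads are named:

    import threading

    def dump_threads():
        # With named threads this prints e.g. "PysherEventLoop" and
        # "PysherScheduler" instead of a wall of "Thread-123" entries.
        for t in threading.enumerate():
            print(f"{t.name:<20} daemon={t.daemon} alive={t.is_alive()}")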

@caliloo
Contributor Author

caliloo commented Aug 18, 2018

I've checked my logs for the past 3 days and couldn't see anything that was a problem. This is good from my point of view.

@caliloo caliloo closed this as completed Aug 18, 2018