
Segment causing celery workers to hang in django #51

Closed
gitumarkk opened this issue Jun 16, 2015 · 13 comments
@gitumarkk

We use the django post save signal to trigger segment analytics tracking asynchronously using celery. However, when multiple events are created (about 350 in 20 seconds), all the celery workers consistently hang at the following output.

[2015-06-12 22:41:34,245: INFO/Worker-2] Starting new HTTPS connection (1): api.segment.io
[2015-06-12 22:41:34,574: DEBUG/Worker-2] "POST /v1/batch HTTP/1.1" 200 21
[2015-06-12 22:41:34,578: DEBUG/Worker-2] data uploaded successfully

When the analytics tracking is commented out, the workers function as expected. When the celery rate limit is set to "600/m", the celery workers run without hanging. We have a celery hard time limit of 30 seconds to prevent segment from hanging the workers. We found that at a higher rate limit, the hard time limit was hit at high frequency and the analytics tracking was not sent through.

Not sure why the segment library is causing this to happen; please advise.

@leotsem

leotsem commented Jul 13, 2015

Possibly related: celery/celery#2429

@23doors

23doors commented Sep 15, 2015

I am also experiencing something similar, and in my case it was also narrowed down to the analytics library. I haven't tried higher rate limits, but it still blocks after a while even with mild traffic.

@sperand-io

hey all, thanks so much for the report and apologies for the delay in getting back on this.

@calvinfo @f2prateek any idea here?

@shredding

Updates?

@calvinfo
Contributor

Unfortunately I haven't gotten a chance to deeply investigate here.

One question though: how are you using celery (multiprocessing, eventlet, gevent)? Since the library itself does its own threading, there might be some issues when it is used with other multi-threaded libraries. We keep one thread per client instance (so it's non-blocking), so you won't want to constantly create new clients in the workers.

We might move this to a coroutine approach, since other libraries might not play nicely when it comes to the threading.
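The "one thread per client instance" point above can be sketched with stdlib pieces only. The `BackgroundClient` name and its internals are illustrative assumptions, not the actual analytics-python implementation; the shape of the pattern is what matters: each instance owns a consumer thread, so clients should be created once and reused, not constructed inside every task.

```python
import atexit
import queue
import threading

class BackgroundClient:
    """Illustrative stand-in for a client that owns one consumer thread.

    Each instance starts its own daemon thread, so constructing a new
    client per Celery task leaks a thread per task; reuse one instance.
    """

    def __init__(self):
        self.queue = queue.Queue()
        self.thread = threading.Thread(target=self._consume, daemon=True)
        self.thread.start()
        atexit.register(self.flush)

    def _consume(self):
        while True:
            item = self.queue.get()
            # A real client would batch items here and POST them
            # to the ingestion endpoint.
            self.queue.task_done()

    def track(self, user_id, event):
        # Enqueue and return immediately: the call itself never blocks.
        self.queue.put((user_id, event))

    def flush(self):
        # Block until the consumer thread has drained the queue.
        self.queue.join()

# Module-level singleton: import and reuse this, rather than
# constructing a new client inside every worker task.
client = BackgroundClient()
```

A worker task would then call `client.track(...)` on the shared instance; the enqueue is cheap, and only `flush()` blocks.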


@23doors

23doors commented Jan 21, 2016

We are using multiprocessing with 1 thread per process. In our case, the workaround was subclassing the Client class and changing the queue to a JoinableQueue (from the multiprocessing module).
We haven't investigated much, but this may be related to how celery handles multiprocessing (through billiard, https://github.com/celery/billiard): it seems the queue object is shared across processes, which causes issues, as queue.Queue is only thread-safe, not multiprocessing-safe.
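As a sketch of why that swap can help: `multiprocessing.JoinableQueue` keeps the same `put`/`get`/`task_done`/`join` surface as `queue.Queue` but is backed by a pipe plus locks, so it can be shared with forked workers. The snippet below only demonstrates that interface; the consumer is a thread purely to keep the sketch runnable in one process, and the actual subclassing of the analytics `Client` depends on library internals and isn't shown.

```python
import multiprocessing
import threading

# queue.Queue is only thread-safe; multiprocessing.JoinableQueue is
# process-safe while exposing the same put/get/task_done/join interface
# that the subclassing workaround relies on.
q = multiprocessing.JoinableQueue()
consumed = []

def consume(jq, out):
    while True:
        item = jq.get()
        if item is None:          # sentinel: stop consuming
            jq.task_done()
            break
        out.append(item)          # stands in for "batch and upload"
        jq.task_done()

# A thread here for runnability; under Celery's billiard-based pool the
# producer and consumer could live in different processes.
threading.Thread(target=consume, args=(q, consumed), daemon=True).start()

for event in ["identify", "track", "flush"]:
    q.put(event)
q.put(None)
q.join()   # returns once every put() has been matched by a task_done()
```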

@shredding

I think a proper solution would be to make the threading part optional. The idea behind introducing celery is the same - push the processing away from the request/response cycle - so it's double trouble to have both of them in the way.

@calvinfo
Contributor

I'd be supportive of that. We could end up passing in an option to force synchronous calls in the case where you're using a queue or workers.
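A minimal sketch of what such an option might look like. The `sync_mode` flag, the `Client` class, and `post_batch` here are hypothetical illustrations of the proposal, not the library's real API: when the flag is set, events are uploaded inline and no background thread is started, which is the safe shape inside a worker pool.

```python
import queue
import threading

def post_batch(batch):
    """Stand-in for the HTTPS upload; a real client would POST the
    batch to api.segment.io/v1/batch here."""
    post_batch.sent.append(list(batch))

post_batch.sent = []

class Client:
    def __init__(self, sync_mode=False):
        # Hypothetical flag: True means "no background thread,
        # upload inline" - slower per call, but queue/worker friendly.
        self.sync_mode = sync_mode
        if not sync_mode:
            self.queue = queue.Queue()
            threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            msg = self.queue.get()
            post_batch([msg])
            self.queue.task_done()

    def track(self, user_id, event):
        msg = {"userId": user_id, "event": event}
        if self.sync_mode:
            post_batch([msg])      # blocks the caller, but no threads
        else:
            self.queue.put(msg)    # returns immediately

client = Client(sync_mode=True)
client.track("u1", "purchased")
```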


@kmctown

kmctown commented Nov 4, 2016

Hey all, I'm new to segment and looking to use this package from the server. We already have a celery/rabbitmq task queue pattern, and I am concerned about bringing this package in based on this thread. Is this still an issue worth worrying about? If so, should we consider just making standard REST calls?
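For the "standard REST calls" fallback: Segment exposes an HTTP tracking API, so a task can build and send the request itself with only the stdlib. A sketch under the assumption that the endpoint and auth scheme match Segment's public HTTP API docs (the write key is the Basic-auth username with an empty password); the network call itself is left commented out so the sketch stays self-contained.

```python
import base64
import json
import urllib.request

def build_track_request(write_key, user_id, event, properties=None):
    """Build a urllib Request for Segment's HTTP tracking API."""
    body = json.dumps({
        "userId": user_id,
        "event": event,
        "properties": properties or {},
    }).encode("utf-8")
    # Basic auth: write key as username, empty password.
    token = base64.b64encode(f"{write_key}:".encode()).decode()
    return urllib.request.Request(
        "https://api.segment.io/v1/track",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )

req = build_track_request("WRITE_KEY", "u1", "purchased", {"sku": "x1"})
# From a Celery task, sending is one line with its own timeout, so a
# hung connection hits the task time limit instead of a shared thread:
# urllib.request.urlopen(req, timeout=5)
```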

@shredding

shredding commented Nov 4, 2016

We ended up not triggering segment from within the workers, but from within the request. I don't like that solution, but this package is not exactly under what you'd call active development.

@kmctown

kmctown commented Nov 4, 2016

Gotcha. Thanks for the info, @shredding. No worries about potentially dropping events there? Are you just trusting the internal queueing mechanism?

@shredding

Not really ...

@f2prateek
Contributor

Merging into #101
