[druid] Fixing issue 3894 multi-processing w/ Gunicorn #3895

john-bodley · 2017-11-17T03:49:16Z

This PR fixes issue #3894.

It seems that using multi-processing for fetching Druid datasource metadata asynchronously fails under a Gunicorn environment. Switching to multi-threading seems to remedy the problem.
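For readers following along, here is a minimal sketch of the kind of swap described above. It is not the actual diff: `fetch_metadata` and `refresh_all` are hypothetical stand-ins, and the worker count of 4 is an assumption taken from the note below. `multiprocessing.pool.ThreadPool` offers the same `map()` interface as `multiprocessing.Pool`, but its workers are threads in the current process rather than forked child processes.

```python
from multiprocessing.pool import ThreadPool


def fetch_metadata(name):
    # Hypothetical stand-in for a per-datasource metadata request.
    return {"datasource": name, "columns": []}


def refresh_all(names, workers=4):
    # Same map() call shape as multiprocessing.Pool, but no child
    # processes are forked; the workers are ordinary threads.
    pool = ThreadPool(workers)
    try:
        return pool.map(fetch_metadata, names)
    finally:
        pool.close()
        pool.join()
```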

Note that previously Pool() created only 4 processes, and I'm not certain why this part of the code needs multi-processing from a performance standpoint, i.e., I would be supportive of removing the multi-threading/processing logic entirely.

Also it seems like we're using the synchronous PyClient and thus refresh_async seems like a misnomer to me.

to: @mistercrunch @Mogball @xrmx

Mogball · 2017-11-17T04:19:53Z

Multithreaded execution was introduced because refreshing a cluster with many datasources took... a long time. Metadata queries used to be issued sequentially; issuing them from multiple threads makes refreshing lots of datasources much faster.

john-bodley · 2017-11-17T04:23:02Z

Thanks @Mogball for the explanation. Are you opposed to using multi-threading rather than multi-processing? Also could you provide context on the refresh_async method name?

Mogball · 2017-11-17T04:28:39Z

Looks okay to me. I'm pretty sure it should be thread-safe.

refresh_async was the name of the function when it was supposed to be actually asynchronous, but I never changed the name afterwards (i.e., it's a bit of a misnomer).

john-bodley · 2017-11-17T18:57:19Z

@Mogball per @xrmx's comment in the issue this may actually be a Gevent issue. I'm not certain whether we want to disable that or go with the approach outlined in this PR, i.e., moving from multi-processing to multi-threading.

xrmx · 2017-11-17T20:39:22Z

If you are just waiting on Druid I/O, threads instead of processes should be fine. To avoid possibly creating a ton of threads, you can initialize the pool outside the function so that the number is fixed.
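A minimal sketch of that suggestion, assuming a module-level `concurrent.futures.ThreadPoolExecutor`. The names `_METADATA_POOL`, `fetch_metadata`, and `refresh`, and the size of 4, are all hypothetical and not taken from the Superset code base.

```python
from concurrent.futures import ThreadPoolExecutor

# Created once at import time, so every refresh reuses the same
# bounded set of worker threads. The size of 4 is an assumption.
_METADATA_POOL = ThreadPoolExecutor(max_workers=4)


def fetch_metadata(name):
    # Hypothetical stand-in for a per-datasource metadata request.
    return name.upper()


def refresh(names):
    # No pool is created here; calls share the module-level pool.
    return list(_METADATA_POOL.map(fetch_metadata, names))
```

Because the pool lives outside `refresh`, calling it repeatedly does not grow the number of threads beyond `max_workers`.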

john-bodley · 2017-11-18T00:59:38Z

@mistercrunch @Mogball @xrmx I'm not overly familiar with this portion of the code base or its interactions with Gevent; I'm merely trying to fix an issue that is plaguing us in production.

I can't fathom the scale of your Druid datasources, @Mogball, but for us this request takes mere seconds, and thus I'm unsure whether the (maximum) 4x speedup of a multi-threaded environment outweighs the additional complexity, potential thread-safety issues, etc.

Mogball · 2017-11-18T01:13:45Z

The speedup is more than just 4x (or whatever the number of available threads is). Most of the time is spent waiting for Druid to respond, which means that the metadata requests are issued more or less simultaneously for each datasource and then processed as the responses come back. This makes a difference of roughly 40 seconds versus 3 seconds (when hard refreshing all datasources).
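The effect described here can be illustrated with a small, self-contained sketch. `time.sleep` stands in for waiting on a Druid response, and `fake_metadata_request` and the 50 ms delay are hypothetical. The sketch also assumes a pool large enough to issue every request at once; with a smaller pool, the number of requests in flight is bounded by the pool size.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_metadata_request(name, delay=0.05):
    # Hypothetical stand-in: sleeping simulates waiting on I/O,
    # during which other threads are free to run.
    time.sleep(delay)
    return name


def timed(fn):
    # Run fn() and return (result, elapsed seconds).
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


names = ["ds_%d" % i for i in range(8)]

# One request after another: total wait is roughly the sum of the delays.
sequential, t_seq = timed(lambda: [fake_metadata_request(n) for n in names])

# All requests in flight together: total wait is roughly one delay.
with ThreadPoolExecutor(max_workers=len(names)) as pool:
    threaded, t_thr = timed(lambda: list(pool.map(fake_metadata_request, names)))

print("sequential: %.2fs, threaded: %.2fs" % (t_seq, t_thr))
```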

This should be thread-safe, since during refresh there is no interaction outside of the datasource object or with the Superset backend.

mistercrunch · 2017-11-18T04:42:50Z

Heads up that there are caveats with thread-safety around SQLAlchemy

Mogball · 2017-11-18T05:04:47Z

latest_metadata doesn't make any calls to SQLAlchemy 👍

[druid] Fixing issue 3894 multi-processing w/ Gunicorn

3f20d60

mistercrunch merged commit 4bfe08d into apache:master Nov 19, 2017

michellethomas pushed a commit to michellethomas/panoramix that referenced this pull request May 24, 2018

[druid] Fixing issue 3894 multi-processing w/ Gunicorn (apache#3895)

e581fa1

wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018

[druid] Fixing issue 3894 multi-processing w/ Gunicorn (apache#3895)

295f615

mistercrunch added the 🏷️ bot and 🚢 0.21.0 labels Feb 27, 2024
