
Support async key retrieval #7

Open
jhamman opened this issue Feb 25, 2020 · 9 comments · Fixed by #12

Comments

@jhamman
Contributor

jhamman commented Feb 25, 2020

We are currently using FastAPI as xpublish's web framework / api engine. FastAPI supports async out of the box (details here: https://fastapi.tiangolo.com/async/).

Some initial applications indicate that we're not getting the expected async behavior. I suspect this is somehow related to how we're using dask to fetch chunks of data. Here's what we have now:

GET chunk endpoint:

https://github.com/jhamman/xpublish/blob/044e9b7b07202f57302a9368b16c2db4cddaa79a/xpublish/rest.py#L140-L143

Which calls the get_key method:

https://github.com/jhamman/xpublish/blob/044e9b7b07202f57302a9368b16c2db4cddaa79a/xpublish/rest.py#L93-L105

Within get_data_chunk, we call compute() on individual chunks:

https://github.com/jhamman/xpublish/blob/044e9b7b07202f57302a9368b16c2db4cddaa79a/xpublish/rest.py#L254-L255

Is this the best way to do this? Do we need to modify how dask's scheduler sees these tasks, or configure it to use async?
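One option, sketched below with standard-library pieces only: keep the endpoint async but push the blocking compute() call onto a thread pool with run_in_executor, so the event loop stays free while a chunk materializes. blocking_compute here is a stand-in for the real chunk.compute() call, not xpublish's actual code.

```python
import asyncio
import time

def blocking_compute(key):
    # Stand-in for chunk.compute(): a synchronous call that would
    # otherwise block the event loop for its full duration.
    time.sleep(0.2)
    return f"data-for-{key}"

async def get_chunk(key):
    # Run the blocking call in the default thread-pool executor so the
    # event loop can keep serving other requests concurrently.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_compute, key)

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(*(get_chunk(f"/air/0.0.{i}") for i in range(5)))
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
# Five 0.2 s computes overlap in threads, so wall time is well under the 1 s serial total.
print(results[0], round(elapsed, 2))
```

In a FastAPI endpoint the same pattern would be `await loop.run_in_executor(None, get_data_chunk, ...)`; whether that is enough, or the dask scheduler itself still serializes the work, is the open question above.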

@jhamman jhamman mentioned this issue Feb 25, 2020
@andersy005
Contributor

@TomAugspurger and I are looking into this issue. We changed get_data_chunk to an awaited call without any success. With Tom's help, I will keep looking into this for the next few days.

@jhamman
Contributor Author

jhamman commented Feb 27, 2020

Thanks @andersy005, much appreciated! I've been thinking it may be worth constructing a new test case for this. The test case we have now has very small chunks and I'm not sure we're effectively exercising the server-side cluster to an extent that would fully illuminate the issue here.

@TomAugspurger

One question I had: what level should the parallelism be happening at? Will it just be at the request level, or will each request be getting results in parallel?

IIUC, with a typical async webserver, you're serving many requests simultaneously on a single thread, since most of the time you're just waiting around on IO.

In this case, a user filters to some subset which is eventually mapped to some set of keys like ['/air/0.0.0', '/air/0.0.1', ...] and they make many requests. So we would hope that those n requests are served in parallel. As a small test, we tried something roughly like the following.

import requests
import dask

# requests needs an explicit scheme on the URL
responses = [dask.delayed(requests.get, pure=False)("http://localhost:9000/air/0.0.0") for i in range(100)]
dask.compute(responses)

Anderson can correct me if I'm wrong, but I think we didn't observe any parallelism there.
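A thread-pool harness is another way to run this check, since dask.delayed itself adds a scheduler layer to the measurement. The sketch below uses a sleep stub in place of requests.get so it is self-contained; against a real server you would swap the stub for an actual HTTP call to the hypothetical localhost:9000 endpoints and compare wall time to single-request latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Stand-in for requests.get(url).status_code; replace with a real
    # HTTP call when pointing at a running xpublish server.
    time.sleep(0.1)
    return 200

urls = [f"http://localhost:9000/air/0.0.{i}" for i in range(20)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=20) as pool:
    statuses = list(pool.map(fetch, urls))
elapsed = time.perf_counter() - start

# If the server handles requests concurrently, wall time stays near one
# request's latency rather than the 2 s serial total.
print(len(statuses), round(elapsed, 2))
```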

@jhamman
Contributor Author

jhamman commented Feb 27, 2020

@TomAugspurger - I think we're on the same page here.

Dask question: Can two threads call chunk_data.compute() in parallel? And will the cluster block on the first before moving on to the second? I suspect the answers are yes and yes. In which case we aren't going to get much parallelization since the cluster will only be computing one chunk at a time.

I was trying to avoid adding any dask-distributed functionality in xpublish but it seems we may need to support something like client.get(..., asynchronous=True). This is a part of the distributed api that I am entirely unfamiliar with so I'm shooting in the dark...
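Whether dask's scheduler really serializes concurrent compute() calls is exactly the open question, but the timing signature of "yes and yes" is easy to model. The toy below stands a threading.Lock in for a scheduler that admits one graph at a time; it is an illustration of the suspected behavior, not dask's actual implementation.

```python
import threading
import time

scheduler_lock = threading.Lock()

def compute_chunk(key):
    # Model a scheduler that runs one graph at a time: every compute()
    # call holds the lock for the full duration of the work.
    with scheduler_lock:
        time.sleep(0.1)
        return f"chunk-{key}"

results = []
start = time.perf_counter()
threads = [
    threading.Thread(target=lambda k=k: results.append(compute_chunk(k)))
    for k in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Four threads submit concurrently, but the lock serializes them:
# wall time is ~0.4 s, not ~0.1 s.
print(len(results), round(elapsed, 2))
```

If real measurements against the server show this serial signature, that would support moving to an asynchronous distributed client.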

@TomAugspurger

TomAugspurger commented Feb 29, 2020

I suspect the answers are yes and yes.

I'm not sure either, but I'd also guess yes and yes.

async clients seem like they might help.

I did have one thing that's confusing me. Why do we ever get a Dask array back in get_data_chunk? Shouldn't that always be a chunk for a single key, so should it just be an ndarray?

@andersy005
Contributor

While looking at the dashboard for the server, I am noticing that only one worker is active. It is likely that #12 didn't fully address the asynchronous fetching:

(Dask dashboard screenshots from Mar 3, 2020, showing a single active worker.)

@andersy005
Contributor

Not sure if this is a sign that we still need an async client on the server side.

@andersy005
Contributor

I did have one thing that's confusing me. Why do we ever get a Dask array back in get_data_chunk? Shouldn't that always be a chunk for a single key, so should it just be an ndarray?

I believe we are passing an entire variable array to get_data_chunk(), and this could be a numpy or dask array:

https://github.com/jhamman/xpublish/blob/d59221e45d2a46bb922e4d2dac0180bab7c53294/xpublish/rest.py#L101-L104
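So the helper has to handle both cases. A sketch of how that branch could look is below; da.blocks is real dask.array API for addressing a single block, but the function shape here is illustrative, not xpublish's actual code.

```python
def get_data_chunk(da, chunk_id):
    """Return the in-memory block of ``da`` addressed by ``chunk_id``
    (e.g. "0.0.1"), whether ``da`` is a dask or a numpy array.

    Hypothetical helper mirroring xpublish's get_data_chunk; the real
    signature may differ.
    """
    ikeys = tuple(int(i) for i in chunk_id.split("."))
    if hasattr(da, "blocks"):
        # dask array: select one block, then compute only that block
        return da.blocks[ikeys].compute()
    # numpy array: already in memory, so it forms a single chunk and
    # only an all-zeros chunk id is valid
    if any(i != 0 for i in ikeys):
        raise ValueError(f"invalid chunk id {chunk_id} for an in-memory array")
    return da
```

Computing a single block this way avoids materializing the whole variable per request, which matters once the concurrency questions above are sorted out.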

@jhamman jhamman reopened this Mar 11, 2020
@jhamman
Contributor Author

jhamman commented Mar 11, 2020
