
Automatic retrieval of multiple regions #100

Open
facusapienza21 opened this issue Jul 9, 2020 · 0 comments

facusapienza21 commented Jul 9, 2020

The problem

The current implementation of icepyx requires accessing each region independently: logging into Earthdata and then downloading individual h5 files for every request. Even though a for loop can issue all of these requests (see https://github.com/ICESAT-2HackWeek/data-access/blob/master/ICESat-2Hackweek_tutorial_locations.ipynb), this approach has two difficulties: (i) logging into Earthdata for each individual request, and (ii) downloading each request into separate h5 files.
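For concreteness, the loop-based workaround looks roughly like the sketch below. The icepyx calls are commented out because they need network access and Earthdata credentials, and their exact names are assumptions, not the confirmed icepyx API; only the request bookkeeping runs as written:

```python
# Illustrative sketch of the current one-region-at-a-time workflow.
# Region bounding boxes and date ranges below are made-up examples.

regions = [
    ((-70.0, -55.0, -69.0, -54.0), ["2019-02-20", "2019-02-28"]),
    ((-69.5, -49.0, -68.5, -48.0), ["2019-02-20", "2019-02-28"]),
]

requests = []
for bbox, date_range in regions:
    # region = icepyx.Query("ATL03", list(bbox), date_range)  # assumed API
    # region.earthdata_login(uid, email)   # (i) login repeated per request
    # region.download_granules("./data")   # (ii) separate h5 files per request
    requests.append({"spatial_extent": bbox, "date_range": date_range})

print(len(requests))  # one independent NSIDC request per region
```

Each pass through the loop is a fully independent NSIDC transaction, which is exactly the overhead this issue proposes to remove.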

A sketch of a solution

We (@fperez, @lheagy, @espg, @tsnow03, @mrsiegfried, @alicecima, @jonathan-taylor) think there is a way to partially bypass (i) by storing our credentials, but even so we still have to make multiple calls to NSIDC, which is time-consuming and does not solve (ii). The bottleneck therefore appears to be at the NSIDC API level (@asteiker may have some ideas here?), not just in the icepyx code.
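On the credential-storage side, NASA's Earthdata login already honors a `.netrc` file, so a single stored entry can remove the interactive login from every request. A minimal sketch using Python's standard-library `netrc` parser; the machine name `urs.earthdata.nasa.gov` is the real Earthdata host, but the credentials and temporary file path here are placeholders (in practice the entry lives in `~/.netrc` with `0600` permissions):

```python
import netrc
import os
import tempfile

# Write a .netrc-style entry to a temp file for illustration only.
content = "machine urs.earthdata.nasa.gov login jdoe password hunter2\n"
with tempfile.NamedTemporaryFile("w", suffix=".netrc", delete=False) as f:
    f.write(content)
    path = f.name

# Any tool that reads .netrc can now authenticate without prompting.
login, _, password = netrc.netrc(path).authenticators("urs.earthdata.nasa.gov")
print(login)  # jdoe
os.unlink(path)
```

This only addresses (i); the repeated NSIDC calls and per-request h5 files remain.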

A different workflow could be something like this:

import icepyx

icepyx.login(email, password)  # authenticate once per session

request_list = []
for lat, lon, date in regions:
    request_list.append(
        icepyx.request(polygon(lat, lon), date)
    )

# This should do some smart parsing: figure out
# which files have common data.
data = icepyx.request(request_list)
# First pass: metadata query --
#   loop over the requests and figure out which h5 files are needed.
# Then request only the needed files.

Here, we download the required h5 files with a single call to NSIDC, implemented efficiently enough that different regions can be stored in the same h5 file. This would be an important contribution for the case where we want to look at ATL03 data in many localized regions scattered across a large area, without having to retrieve the full dataset for that area.

I am aware that there are many challenges in solving this problem, but it could be a great contribution to icepyx, and I am happy to help on this front.

@JessicaS11 added the enhancement (New feature or request) and help wanted (Extra attention is needed) labels Jul 9, 2020