create widget for more interactive variable subsetting #52

Open
JessicaS11 opened this issue May 15, 2020 · 7 comments
Labels
enhancement New feature or request help wanted Extra attention is needed IS2HW_2022 Potential project contributions for the 2022 ICESat-2 hackweek participants

Comments

@JessicaS11
Member

Variable subsetting is fully implemented, but the long list of available variables is cumbersome to interact with (especially for new ICESat-2 data users). Jupyter environments such as JupyterHub support widgets, which give users a more interactive interface. This issue marks our desire to create a variable selection widget within icepyx to make the process of variable selection more interactive.
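
One way a widget could make the long list less unwieldy is to group variables by their top-level group, so that each group can back its own dropdown or multi-select (for example, an ipywidgets `SelectMultiple`). The following is only a minimal sketch, assuming variable names are slash-separated paths; the function name `group_variables` and the sample paths are illustrative and are not part of icepyx.

```python
from collections import defaultdict


def group_variables(paths):
    """Group 'group/subgroup/name' style paths by their top-level group.

    Hypothetical helper for illustration only; not an icepyx function.
    """
    grouped = defaultdict(list)
    for path in paths:
        top, _, rest = path.partition("/")
        if rest:
            grouped[top].append(rest)
        else:
            # Paths without a "/" are top-level variables; file them under "".
            grouped[""].append(top)
    return dict(grouped)


# Sample paths chosen for illustration (assumed, not read from a real product).
example_paths = [
    "gt1l/land_ice_segments/h_li",
    "gt1l/land_ice_segments/latitude",
    "gt2r/land_ice_segments/h_li",
    "orbit_info/sc_orient",
]
menus = group_variables(example_paths)
```

Each key of `menus` could then become one selection widget, with the values as its options, rather than presenting every path in a single flat list.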

@JessicaS11 JessicaS11 added enhancement New feature or request help wanted Extra attention is needed labels May 15, 2020
@weiji14
Member

weiji14 commented May 19, 2020

Something like XrViz would be nice. It's for visualizing n-dimensional xarray data and is built on top of hvPlot and Panel. Example taken from their GitHub page:

[Image: XrViz dashboard]

@JessicaS11
Member Author

Thanks @weiji14 for pointing us towards this example!

@tsutterley
Member

@JessicaS11 I can definitely help out with this

@JessicaS11
Member Author

@tsutterley Excellent! Can one of you (@tsutterley, @liuzheng-arctic, @weiji14) take the lead on this?

Regardless, I'd like to continue the discussion here to build a brief summary of the specific goals and the steps needed to reach them, including notes on how the new functionality would fit within the existing code and what capabilities, if any, would need to be added to make this work. Thanks!

@weiji14
Member

weiji14 commented Jun 3, 2020

Question: do we want the variable subsetting to happen before downloading from NSIDC, or after, when the file is already available locally and we just want to select a few variables?

Either way, the goal might be a user-friendly, visually pleasing way for people to find the data they need. It would be nice to leverage existing tools like the Intake GUI built on Panel and Bokeh (or at least use them as guides) to do this work, rather than reinventing the wheel. But I'm sure there are other creative ways to do the same thing 😄

As an aside, I know NSIDC has an OPeNDAP Hyrax server at https://n5eil02u.ecs.nsidc.org/opendap/ATLAS/ATL06.003/ (requires login), which would support variable subsetting (for the subset-before-download case). The benefit of OPeNDAP is that we wouldn't have to hardcode all of the variables for each of the ATLxx datasets, since the metadata is already there (although there is room for improvement). The main issue currently is that it's tricky to get the authentication working properly (see pydap/pydap#188).

@JessicaS11
Member Author

The short answer to your question is: "yes". Ideally, we'd like people to take advantage of subsetting before downloading so that they obtain smaller files (and shift that level of processing to the NSIDC API). Obviously, you may still end up with downloaded files that contain a few more variables than you will ultimately use (or want loaded into local memory), so subsetting after the fact is also useful. To that end, the variables class of icepyx has been set up to handle both. The class object contains both available and wanted lists of variables, and functions to add to or remove from the wanted list. The difference will be where the available list of variables comes from, and which instance of the variables class you are interacting with through the parent icesat2data object. Currently, there is an attribute for the pre-download variables (region_a.order_vars) and one for variables from a file (region_a.file_vars), but the latter has not been fully implemented yet (for instance, we still need to add functionality to bring in local files).
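
To make the available/wanted idea concrete, here is a rough sketch of the pattern described above. It is not the icepyx variables class: the class name `VariableSelection`, its method names, and the sample variable paths are all assumptions made for illustration.

```python
class VariableSelection:
    """Hypothetical stand-in for a variables class; NOT the icepyx API."""

    def __init__(self, available):
        # Full list of variable paths offered by the source (API or local file).
        self.available = list(available)
        # Subset the user has chosen so far, kept in selection order.
        self.wanted = []

    def append(self, names):
        """Add names to the wanted list, rejecting any that are not available."""
        unknown = [n for n in names if n not in self.available]
        if unknown:
            raise ValueError(f"not available: {unknown}")
        for name in names:
            if name not in self.wanted:
                self.wanted.append(name)

    def remove(self, names):
        """Drop names from the wanted list; names not present are ignored."""
        self.wanted = [n for n in self.wanted if n not in names]


# Two instances that differ only in where the available list comes from.
# The paths below are illustrative placeholders.
order_vars = VariableSelection(
    ["gt1l/land_ice_segments/h_li", "gt1l/land_ice_segments/latitude"]
)
file_vars = VariableSelection(["gt1l/land_ice_segments/h_li"])

order_vars.append(["gt1l/land_ice_segments/h_li", "gt1l/land_ice_segments/latitude"])
order_vars.remove(["gt1l/land_ice_segments/latitude"])
```

Because both instances expose the same two lists and the same add/remove calls, a GUI could be written once against that interface and pointed at either the pre-download or the local-file instance.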

I don't know much about Intake GUI, but I agree that we should leverage existing tools, and it looks like a great one to use. My hope was that by setting things up as a variables class it would be relatively easy to plug into a GUI to give users a prettier way of interacting with the variables list. I know captoolkit also has some command line functionality for reducing HDF5 files, so ideally we can leverage those scripts to do the local file trimming, with the GUI providing a non-command-line method of feeding the required inputs into those scripts.

I'd encourage you to have a dig through the icepyx source code, if you haven't already, and check out the associated subsetting example notebook.

It sounds like a good next step might be to try integrating the variables class into an Intake GUI. Is that something you would feel comfortable taking on, @weiji14? I could add you to the organization's developer team so that you can create a branch to push to. I know @liuzheng-arctic is also very interested in working on this, but both he and I are going to be pretty busy for the next two weeks with the University of Washington e-Science Institute's ICESat-2 Cryospheric Hackweek, so please forgive us if we're slow to respond!

@asteiker
Collaborator

asteiker commented Jun 3, 2020

Variable subsetting (before download) is supported through the NSIDC API that icepyx is currently leveraging, and it is something that, at least from NSIDC's perspective, we promote and support because it reduces the data volume and the processing steps required on the user end. Yes, NSIDC also supports an OPeNDAP Hyrax server across our DAAC data holdings, but there are current limitations in int64 support (though the OPeNDAP team is actively working on a fix), so I encourage icepyx to leverage our API, with its more robust and heavily tested service options.

@JessicaS11 JessicaS11 added the IS2HW_2022 Potential project contributions for the 2022 ICESat-2 hackweek participants label Mar 4, 2022

5 participants