create widget for more interactive variable subsetting #52
Comments
Thanks @weiji14 for pointing us towards this example!
@JessicaS11 I can definitely help out with this
@tsutterley Excellent! Can one of you (@tsutterley, @liuzheng-arctic, @weiji14) take the lead on this? Regardless, I'd like to continue the discussion here to build a brief summary/description of what the specific goals and steps to get there might be, including some notes about how the new functionality would fit within the existing code and what capabilities, if any, might need to be added to make this work possible. Thanks!
Question: do we want the variable subsetting to happen before downloading from NSIDC, or after, when the file is already available locally and we just want to select a few variables? Either way, the goal might be to give people a user-friendly, visually pleasing way to find the data they need. It would be nice to leverage existing tools like the Intake GUI built on Panel and Bokeh (or at least use them as guides) to do this work, rather than reinventing the wheel. But I'm sure there are other creative ways to do the same thing 😄 As an aside, I know NSIDC has an OPeNDAP Hyrax server at https://n5eil02u.ecs.nsidc.org/opendap/ATLAS/ATL06.003/ (requires login), which would support variable subsetting (for the subset-before-download case). The benefit of OPeNDAP is that we won't have to hardcode all of the variables for each of the ATLxx datasets, since the metadata is already there (although there is room for improvement). The main issue currently is that it's tricky to get the authentication working properly (see pydap/pydap#188).
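To make the subset-before-download idea concrete, here is a minimal sketch of how a DAP2 request URL with a projection (a comma-separated variable list after `?`) could be assembled. The function name `build_dap2_subset_url`, the example host, and the variable names are assumptions for illustration only; the names a Hyrax server actually exposes for ATL06 granules may differ, and authentication is not handled here.

```python
from urllib.parse import quote


def build_dap2_subset_url(dataset_url, variables, response_suffix=".dods"):
    """Return a DAP2 request URL that asks only for the given variables.

    dataset_url: URL of one dataset on an OPeNDAP server (hypothetical here).
    variables: variable names as the server lists them (assumed, not verified).
    response_suffix: DAP2 response type, e.g. ".dods" (binary) or ".ascii".
    """
    if not variables:
        raise ValueError("at least one variable is required")
    # A DAP2 projection is a comma-separated list of names after the '?'.
    projection = ",".join(quote(name, safe="/_.") for name in variables)
    return f"{dataset_url}{response_suffix}?{projection}"


# Hypothetical dataset URL and variable names, for illustration only.
url = build_dap2_subset_url(
    "https://example.org/opendap/granule.h5",
    ["latitude", "longitude", "h_li"],
)
print(url)
```

The point of the sketch is only that the server does the trimming: the client names the variables it wants and receives nothing else.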
The short answer to your question is: "yes". Ideally, we'd like people to take advantage of subsetting before downloading so that they obtain smaller files (and shift that level of processing to the NSIDC API). Obviously, you may still end up with downloaded files that contain a few more variables than you will ultimately use (or want loaded into local memory), so subsetting after the fact is also useful. To that end, the variables class of icepyx has been set up to handle both easily. The class object contains both available and wanted lists of variables, and functions to add to and remove from the wanted list. The difference is where the available list of variables comes from, and which instance of the variables class you are interacting with through the parent icesat2data object. Currently, there is an attribute for the pre-download variables ( I don't know much about Intake GUI, but I agree that we should leverage existing tools, and it looks like a great one to use. My hope was that setting things up as a variables class would make it relatively easy to plug into a GUI and give users a prettier way of interacting with the variables list. I know captoolkit also has some command line functionality for reducing hdf5 files, so my hope is that we can leverage those scripts to do the local file trimming, with the GUI providing a non-command-line method of feeding the required inputs into those scripts. I'd encourage you to dig through the icepyx source code, if you haven't already, and check out the associated subsetting example notebook. It sounds like a good next step might be to try integrating the variables class into an Intake GUI. Is that something you would feel comfortable taking on, @weiji14? I could add you to the organization's developer team so that you can create a branch to push to.
I know @liuzheng-arctic is also very interested in working on this, but both he and I are going to be pretty busy the next two weeks with the University of Washington e-Science Institute's ICESat-2 Cryospheric Hackweek, so please forgive us if we're slow to respond!
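For readers who haven't looked at the source yet, the available/wanted pattern described above can be sketched roughly as follows. This is a hypothetical stand-in, not the icepyx variables class: the class name `VariableSelection` and its methods `append` and `remove` are assumptions made for illustration, and the variable paths are just examples of what a listing might contain.

```python
class VariableSelection:
    """Hypothetical stand-in for a variables class (not the icepyx API)."""

    def __init__(self, available):
        # Everything the source (API listing or local file) offers.
        self.available = list(available)
        # What the user has chosen so far; starts empty.
        self.wanted = []

    def append(self, names):
        """Add names to the wanted list, rejecting anything unavailable."""
        unknown = [n for n in names if n not in self.available]
        if unknown:
            raise ValueError(f"not available: {unknown}")
        for n in names:
            if n not in self.wanted:  # avoid duplicates
                self.wanted.append(n)

    def remove(self, names):
        """Drop names from the wanted list; unknown names are ignored."""
        self.wanted = [n for n in self.wanted if n not in names]


# Two instances of the same class: one fed by a pre-download listing,
# one fed by the contents of a local file (both lists are made up here).
order_side = VariableSelection(
    ["gt1l/land_ice_segments/h_li", "gt1l/land_ice_segments/latitude"]
)
local_side = VariableSelection(["gt1l/land_ice_segments/h_li"])

order_side.append(["gt1l/land_ice_segments/h_li"])
print(order_side.wanted)
```

Because both cases share one interface, a GUI would only need to read `available` and write `wanted`, regardless of where the available list came from.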
Variable subsetting (before download) is supported through the NSIDC API that icepyx currently leverages, and, at least from NSIDC's perspective, it is something we promote and support because it reduces the data volume and processing steps required on the user end. Yes, NSIDC also supports an OPeNDAP Hyrax server across our DAAC data holdings, but it currently has limitations around int64 support (though the OPeNDAP team is actively working on that fix), so I encourage icepyx to leverage our API, which has more robust and heavily tested service options.
Variable subsetting is fully implemented, but the long list of available variables is cumbersome to interact with (especially for new ICESat-2 data users). JupyterHub enables the creation of widgets, providing users a more interactive interface. This issue marks our desire to create a variable selection widget within icepyx to make the process of variable selection more interactive.
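As a starting point for discussion, here is a rough sketch of what a selection widget could look like. It is not part of icepyx: the helper names `group_variables` and `build_selector` are hypothetical, the variable paths are examples, and the widget part assumes `ipywidgets` is installed in the notebook environment. The idea is to break the long flat list into top-level groups and show one multi-select box per group.

```python
from collections import defaultdict


def group_variables(paths):
    """Group 'group/.../name' paths by their top-level group."""
    groups = defaultdict(list)
    for path in paths:
        top, _, rest = path.partition("/")
        # A path with no '/' is filed under its own name.
        groups[top].append(rest or top)
    return dict(groups)


def build_selector(paths, group):
    """Return a multi-select widget listing the variables of one group."""
    import ipywidgets as widgets  # lazy import; needs ipywidgets installed

    names = group_variables(paths)[group]
    return widgets.SelectMultiple(
        options=names,
        description=group,
        rows=min(10, len(names)),
    )


# Example paths, for illustration only.
paths = [
    "gt1l/land_ice_segments/h_li",
    "gt1l/land_ice_segments/latitude",
    "ancillary_data/atlas_sdp_gps_epoch",
]
print(group_variables(paths))
```

In a notebook, `selector = build_selector(paths, "gt1l")` followed by `display(selector)` would show the box, and `selector.value` would then hold a tuple of the selected names, which could be passed on to the wanted list.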