## Subsetting ICESat-2 Data with the NSIDC Subsetter
### How to Use the NSIDC Subsetter Example Notebook
This notebook illustrates the use of icepyx for subsetting ICESat-2 data ordered through the NSIDC DAAC. We'll show how to find out what subsetting options are available and how to specify the subsetting options for your order.

For more information on using icepyx to find, order, and download data, see our complementary [ICESat-2_DAAC_DataAccess_Example Notebook](https://github.com/icesat2py/icepyx/blob/master/doc/examples/ICESat-2_DAAC_DataAccess_Example.ipynb).

Questions? Be sure to check out the FAQs throughout this notebook, indicated as italic headings.

#### Credits
* notebook by: Jessica Scheick and Zheng Liu
* some source material: [NSIDC Data Access Notebook](https://github.com/ICESAT-2HackWeek/ICESat2_hackweek_tutorials/tree/master/03_NSIDCDataAccess_Steiker) by Amy Steiker and Bruce Wallin

### _What is SUBSETTING anyway?_

Anyone who's worked with geospatial data has probably encountered subsetting. Typically, we search for data wherever it is stored and download the chunks (aka granules, scenes, passes, swaths, etc.) that contain something we are interested in. Then, we have to extract from each chunk the pieces we actually want to analyze. Those pieces might be geospatial (e.g. an area of interest), temporal (e.g. certain months of a time series), and/or certain variables. This process of extracting the data we are going to use is called subsetting.
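
As a toy illustration of the three kinds of subsetting described above, the sketch below filters a few made-up in-memory records by area, time window, and variable name. The records, field names, and the `subset` helper are hypothetical and only stand in for the contents of a real data chunk.

```python
from datetime import date

# Hypothetical records standing in for the contents of one downloaded chunk.
records = [
    {"lon": -52.0, "lat": 69.5, "date": date(2019, 2, 23), "height": 12.1, "quality": 0},
    {"lon": -40.0, "lat": 75.0, "date": date(2019, 2, 23), "height": 30.4, "quality": 1},
    {"lon": -50.5, "lat": 70.2, "date": date(2019, 3, 5), "height": 8.7, "quality": 0},
]


def subset(records, bbox, start, end, variables):
    """Keep records inside bbox and [start, end], then keep only the named variables."""
    west, south, east, north = bbox
    kept = []
    for rec in records:
        in_area = west <= rec["lon"] <= east and south <= rec["lat"] <= north
        in_time = start <= rec["date"] <= end
        if in_area and in_time:
            kept.append({name: rec[name] for name in variables})
    return kept


result = subset(records, [-55, 68, -48, 71],
                date(2019, 2, 22), date(2019, 2, 28), ["lat", "height"])
print(result)
```

Only the first record survives all three filters: the second lies outside the area and the third lies outside the time window.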

In the case of ICESat-2 data coming from the NSIDC DAAC, we can do this subsetting step on the data prior to download, reducing our number of data processing steps and resulting in smaller, faster downloads and lower storage requirements.

### Import packages, including icepyx

In [4]:
from icepyx import icesat2data as ipd

import numpy as np
import xarray as xr
import pandas as pd

import h5py
import os,json
from pprint import pprint

In [56]:
%load_ext autoreload
from icepyx import icesat2data as ipd
%autoreload 2
# To use the alias "as ipd", you need `%autoreload 2`, which automatically reloads every module except those excluded with %aimport -[module]

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


### Create an icesat2data object and log in to Earthdata

For this example, we'll be working with an atmospheric dataset (ATL09) for an area along West Greenland (Disko Bay).

In [5]:
region_a = ipd.Icesat2Data('ATL09', [-55, 68, -48, 71], ['2019-02-22', '2019-02-28'],
                           start_time='00:00:00', end_time='23:59:59')



In [None]:
region_a.earthdata_login('username','email')

In [6]:
region_a.earthdata_login('jessica.scheick','jessica.scheick@maine.edu')

Earthdata Login password:  ········


### Discover Subsetting Options

You can see what subsetting options are available for a given dataset by calling `show_custom_options()`. The options are presented as a series of headings followed by available values in square brackets. Headings are:
* **Subsetting Options**: whether or not temporal and spatial subsetting are available for the dataset
* **Data File Formats (Reformatting Options)**: return the data in a format other than the native hdf5 (requested with the `format` keyword argument, e.g. `order_granules(format='NetCDF4-CF')`)
* **Data File (Reformatting) Options Supporting Reprojection**: return the data in a reprojected reference frame. These will be available for gridded ICESat-2 L3B datasets.
* **Data File (Reformatting) Options NOT Supporting Reprojection**: data file formats that cannot be delivered with reprojection
* **Data Variables (also Subsettable)**: a dictionary of variable name keys and the paths to those variables available in the dataset

In [7]:
region_a.show_custom_options(dictview=True)

Subsetting options
[{'id': 'ICESAT2',
  'maxGransAsyncRequest': '2000',
  'maxGransSyncRequest': '100',
  'spatialSubsetting': 'true',
  'spatialSubsettingShapefile': 'true',
  'temporalSubsetting': 'true',
  'type': 'both'}]
Data File Formats (Reformatting Options)
['TABULAR_ASCII', 'NetCDF4-CF', 'NetCDF-3']
Reprojection Options
[]
Data File (Reformatting) Options Supporting Reprojection
['TABULAR_ASCII', 'NetCDF4-CF', 'NetCDF-3', 'No reformatting']
Data File (Reformatting) Options NOT Supporting Reprojection
[]
Data Variables (also Subsettable)
{'a_m1': ['ancillary_data/atmosphere/a_m1'],
 'a_m2': ['ancillary_data/atmosphere/a_m2'],
 'aclr_true': ['profile_1/high_rate/aclr_true',
               'profile_2/high_rate/aclr_true',
               'profile_3/high_rate/aclr_true'],
 'aclr_use_atlas': ['ancillary_data/atmosphere/aclr_use_atlas'],
 'alpha_day_pce1': ['ancillary_data/atmosphere/alpha_day_pce1'],
 'alpha_day_pce2': ['ancillary_data/atmosphere/alpha_day_pce2'],
 'alpha_day_pce3'

By default, spatial and temporal subsetting based on your initial inputs is applied to your order unless you specify `subset=False` to `order_granules()` or `download_granules()`. Additional subsetting options must be specified as keyword arguments to the order/download functions.

Although some file format conversions and reprojections are possible using the `format`, `projection`, and `projection_parameters` keywords, the rest of this tutorial will focus on variable subsetting, which is provided with the `Coverage` keyword.

### _Why do I have to provide spatial bounds to icepyx even if I don't use them to subset my data order?_

Because they're needed for the metadata search. The spatial information you provide is what determines which granules might contain data over your area of interest. Even if you download entire granules, you still need to provide some limits on what geographic area you'd like data for.
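
To make the idea concrete, here is a minimal sketch of the kind of comparison a metadata search relies on: the area of interest is checked against each granule's summary extent. The granule names and extents are made up for illustration, and the real search is performed by the data center's metadata service, not by this function.

```python
def boxes_overlap(a, b):
    """Return True if two [west, south, east, north] boxes overlap.

    Simplified: does not handle boxes that cross the antimeridian.
    """
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


area_of_interest = [-55, 68, -48, 71]

# Hypothetical granule names and summary extents, for illustration only.
granule_extents = {
    "granule_A": [-60, 60, -50, 80],
    "granule_B": [10, 60, 20, 80],
}

matches = [name for name, extent in granule_extents.items()
           if boxes_overlap(area_of_interest, extent)]
print(matches)
```

Without an area of interest there is nothing to compare the extents against, which is why the bounds are required even when you do not subset spatially.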

## About Data Variables in an Icesat2Data object
There are two possible variable parameters associated with each `Icesat2Data` object.
1. `order_vars`, which is for interacting with variables during data querying, ordering, and downloading activities. `order_vars.wanted` holds the user's list of variables to submit to the NSIDC subsetter, so that a smaller, reproducible dataset is downloaded.
2. `file_vars`, which is for interacting with variables associated with local files [not yet implemented].

Each variables parameter (which is actually an associated Variables class object) has methods to:
* get available variables, either available from the NSIDC or the file (`get_avail()` method).
* append new variables to the wanted list (`append()` method).
* remove variables from the wanted list (`remove()` method).

Each variables instance also has a set of attributes, including `avail` and `wanted`. `avail` is the list of variables that is available (immutable, or unchangeable, as it is based on the input dataset specifications or files), and `wanted` is the list of variables that the user would like extracted (updateable with the `append` and `remove` methods). We'll showcase the use of all of these methods and attributes below.
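
The relationship between a fixed `avail` list and an editable `wanted` list can be sketched with a toy class. `ToyVariables` is hypothetical and is not the icepyx Variables class: it keeps `wanted` as a flat list of paths, whereas the outputs in this notebook show a dictionary, and its `append` and `remove` take paths directly.

```python
class ToyVariables:
    """Hypothetical stand-in for a variables object; not the icepyx class."""

    def __init__(self, avail):
        self._avail = tuple(avail)  # fixed once the object is created
        self.wanted = None          # nothing requested yet

    @property
    def avail(self):
        # Return a copy so callers cannot change the stored list.
        return list(self._avail)

    def append(self, paths):
        """Add paths to the wanted list, rejecting anything not available."""
        unknown = [p for p in paths if p not in self._avail]
        if unknown:
            raise ValueError(f"Not available: {unknown}")
        if self.wanted is None:
            self.wanted = []
        for p in paths:
            if p not in self.wanted:  # avoid duplicates
                self.wanted.append(p)

    def remove(self, paths):
        """Drop paths from the wanted list; do nothing if the list is empty."""
        if self.wanted is None:
            return
        self.wanted = [p for p in self.wanted if p not in paths]


toy = ToyVariables(["orbit_info/rgt", "orbit_info/sc_orient"])
toy.append(["orbit_info/rgt"])
toy.append(["orbit_info/rgt", "orbit_info/sc_orient"])
toy.remove(["orbit_info/rgt"])
print(toy.wanted)
```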

### ICESat-2 data variables
ICESat-2 data is natively stored in a nested file format called hdf5. Much like a directory-file system on a computer, each variable (file) has a unique path through the hierarchy (directories) within the file. As a result, some variable names (e.g. 'lat', 'lon') appear at multiple paths (one for each of the six beams in most datasets).

To increase readability, some display options (2 and 3, below) show the 200+ variable + path combinations as a dictionary where the keys are variable names and the values are the paths to that variable.
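
The dictionary view can be pictured with a few lines of plain Python. This is a minimal sketch of the grouping idea, not the icepyx implementation; `group_paths_by_variable` is a hypothetical helper, and the sample paths are copied from the outputs shown in this notebook.

```python
from collections import defaultdict
import posixpath


def group_paths_by_variable(paths):
    """Map each variable name (the last path component) to every full path ending in it."""
    grouped = defaultdict(list)
    for path in paths:
        grouped[posixpath.basename(path)].append(path)
    return dict(grouped)


sample_paths = [
    "orbit_info/sc_orient",
    "profile_1/high_rate/latitude",
    "profile_1/low_rate/latitude",
    "profile_2/high_rate/latitude",
]

grouped = group_paths_by_variable(sample_paths)
print(grouped["latitude"])
```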

### Determine what variables are available
There are multiple ways to get a complete list of available variables.

1. `region_a.order_vars.avail`, a list of all valid path+variable strings
2. `region_a.show_custom_options(dictview=True)`, all available subsetting options
3. `region_a.order_vars.parse_var_list(region_a.order_vars.avail)`, a dictionary of variable:paths key:value pairs

In [8]:
region_a.order_vars.avail

['ds_surf_type',
 'ancillary_data/atlas_sdp_gps_epoch',
 'ancillary_data/control',
 'ancillary_data/data_end_utc',
 'ancillary_data/data_start_utc',
 'ancillary_data/end_cycle',
 'ancillary_data/end_delta_time',
 'ancillary_data/end_geoseg',
 'ancillary_data/end_gpssow',
 'ancillary_data/end_gpsweek',
 'ancillary_data/end_orbit',
 'ancillary_data/end_region',
 'ancillary_data/end_rgt',
 'ancillary_data/granule_end_utc',
 'ancillary_data/granule_start_utc',
 'ancillary_data/qa_at_interval',
 'ancillary_data/release',
 'ancillary_data/start_cycle',
 'ancillary_data/start_delta_time',
 'ancillary_data/start_geoseg',
 'ancillary_data/start_gpssow',
 'ancillary_data/start_gpsweek',
 'ancillary_data/start_orbit',
 'ancillary_data/start_region',
 'ancillary_data/start_rgt',
 'ancillary_data/version',
 'ancillary_data/atmosphere/aclr_use_atlas',
 'ancillary_data/atmosphere/alpha_day_pce1',
 'ancillary_data/atmosphere/alpha_day_pce2',
 'ancillary_data/atmosphere/alpha_day_pce3',
 'ancillary_dat

### _Why not just download all the data and subset locally? What if I need more variables/granules?_

Taking advantage of the NSIDC subsetter is a great way to reduce your download size and thus your download time and the amount of storage required, especially if you're storing your data locally during analysis. By downloading your data using `icepyx`, it is easy to go back and get additional data with the same, similar, or different parameters (e.g. you can keep the same spatial and temporal bounds but change the variable list). Related tools (e.g. [`captoolkit`](https://github.com/fspaolo/captoolkit)) will let you easily merge files if you're uncomfortable merging them during read-in for processing.

### Building your wanted variable list

Now that you know which variables are available for your dataset, you need to build a list of the ones you'd like included in your dataset. There are several options for generating your initial list as well as modifying it, giving the user complete control over the list submitted.

The options for building your initial list are:
1. Use a default list for the dataset (not yet fully implemented across all datasets. Have a default variable list for your field/dataset? Submit a pull request or post it as an issue on GitHub!)
2. Provide a list of variable names
3. Provide a list of profiles/beams or other path keywords, where "keywords" are simply the unique subdirectory names contained in the full variable paths of the dataset. A full list of available keywords for the dataset is displayed in the error message upon entering `keyword_list=['']` into the function (see below for an example)

Note: all datasets have a short list of "mandatory" variables/paths (containing spacecraft orientation and time information needed to convert the data's `delta_time` to a readable datetime) that are automatically added to any built list. If you have any recommendations for other variables that should always be included (e.g. uncertainty information), please let us know!

Examples of using each method to build and modify your wanted variable list are below.

In [9]:
region_a.order_vars.wanted

In [10]:
region_a.order_vars.append(defaults=True)
pprint(region_a.order_vars.wanted)

{'apparent_surf_reflec': ['profile_1/high_rate/apparent_surf_reflec',
                          'profile_2/high_rate/apparent_surf_reflec',
                          'profile_3/high_rate/apparent_surf_reflec'],
 'atlas_sdp_gps_epoch': ['ancillary_data/atlas_sdp_gps_epoch'],
 'bsnow_con': ['profile_1/high_rate/bsnow_con',
               'profile_1/low_rate/bsnow_con',
               'profile_2/high_rate/bsnow_con',
               'profile_2/low_rate/bsnow_con',
               'profile_3/high_rate/bsnow_con',
               'profile_3/low_rate/bsnow_con'],
 'bsnow_dens': ['profile_1/high_rate/bsnow_dens',
                'profile_2/high_rate/bsnow_dens',
                'profile_3/high_rate/bsnow_dens'],
 'bsnow_h': ['profile_1/high_rate/bsnow_h',
             'profile_1/low_rate/bsnow_h',
             'profile_2/high_rate/bsnow_h',
             'profile_2/low_rate/bsnow_h',
             'profile_3/high_rate/bsnow_h',
             'profile_3/low_rate/bsnow_h'],
 'bsnow_od': ['profile_1/h

The keywords available for this dataset are shown in the error message upon entering an empty string in `keyword_list`, as seen in the next cell. Note that the list includes both path keywords and beam/profile names.

In [11]:
region_a.order_vars.append(keyword_list=[''])

ValueError: Invalid keyword: . Please select from this list: ancillary_data, atmosphere, bckgrd_atlas, high_rate, low_rate, none, orbit_info, profile_1, profile_2, profile_3, quality_assessment

### Modifying your wanted variable list

Your variable request list is stored in `region_a.order_vars.wanted`, and you generate and modify it with the `append` and `remove` functions. The input options to `append` are as follows (the full documentation for this function can be found by executing `help(region_a.order_vars.append)`).
* `defaults` (default False) - include the default variable list for your dataset (not yet fully implemented for all datasets; please submit your default variable list for inclusion!)
* `inclusive` (default False) - include ALL path/variable combinations that include ANY of the given variable, keyword, or beam list inputs. For instance, if you want to add longitude for only profile 1, you would specify inputs to `append` as `(var_list=['longitude'], beam_list=['profile_1'])`. If you instead specified `(var_list=['longitude'], beam_list=['profile_1'], inclusive=True)`, you would add ALL `longitude` variables and ALL path/variable combinations that included `profile_1`.

* `var_list` (default None) - list of variables (entered as strings)
* `beam_list` (default None) - list of beams/profiles (entered as strings)
* `keyword_list` (default None) - list of keywords (entered as strings); use `keyword_list=['']` to obtain a list of available keywords
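
The difference the `inclusive` flag makes, as described above, can be sketched with plain Python. This is an illustration of the described behavior under our own assumptions, not icepyx's code: `select_paths` is a hypothetical helper, and it treats values within one list as alternatives while combining the lists with "all" (default) or "any" (`inclusive=True`).

```python
def select_paths(avail, var_list=None, beam_list=None, keyword_list=None, inclusive=False):
    """Pick paths from avail that match the given lists (hypothetical helper)."""
    criteria = []
    if var_list:
        # The variable name is the last component of the path.
        criteria.append(lambda parts: parts[-1] in var_list)
    if beam_list:
        # Beams/profiles appear among the directory components.
        criteria.append(lambda parts: any(p in beam_list for p in parts[:-1]))
    if keyword_list:
        # Keywords are also matched against the directory components.
        criteria.append(lambda parts: any(p in keyword_list for p in parts[:-1]))
    if not criteria:
        return []
    combine = any if inclusive else all
    return [path for path in avail
            if combine(test(path.split("/")) for test in criteria)]


avail = [
    "orbit_info/rgt",
    "profile_1/high_rate/latitude",
    "profile_1/high_rate/longitude",
    "profile_2/high_rate/longitude",
]

narrow = select_paths(avail, var_list=["longitude"], beam_list=["profile_1"])
broad = select_paths(avail, var_list=["longitude"], beam_list=["profile_1"], inclusive=True)
print(narrow)
print(broad)
```

Here `narrow` holds only the `profile_1` longitude path, while `broad` holds every longitude path plus every path under `profile_1`.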

Similarly, the options for `remove` are:
* `all` (default False) - reset `region_a.order_vars.wanted` to None
* `inclusive` (as above)
* `var_list` (as above)
* `beam_list` (as above)
* `keyword_list` (as above)

In [12]:
region_a.order_vars.remove(all=True)
pprint(region_a.order_vars.wanted)

None


### Examples
Below is a series of examples showing how you can use `append` and `remove` to modify your wanted variable list. For clarity, `region_a.order_vars.wanted` is cleared at the start of many examples. However, multiple `append` and `remove` commands can be called in succession to build your wanted variable list (see Examples 3 and later).

#### Example 1: choose variables
Add all `latitude` and `longitude` variables

In [13]:
region_a.order_vars.append(var_list=['latitude','longitude'])
pprint(region_a.order_vars.wanted)

{'atlas_sdp_gps_epoch': ['ancillary_data/atlas_sdp_gps_epoch'],
 'data_end_utc': ['ancillary_data/data_end_utc'],
 'data_start_utc': ['ancillary_data/data_start_utc'],
 'end_delta_time': ['ancillary_data/end_delta_time'],
 'granule_end_utc': ['ancillary_data/granule_end_utc'],
 'granule_start_utc': ['ancillary_data/granule_start_utc'],
 'latitude': ['profile_1/high_rate/latitude',
              'profile_1/low_rate/latitude',
              'profile_2/high_rate/latitude',
              'profile_2/low_rate/latitude',
              'profile_3/high_rate/latitude',
              'profile_3/low_rate/latitude'],
 'longitude': ['profile_1/high_rate/longitude',
               'profile_1/low_rate/longitude',
               'profile_2/high_rate/longitude',
               'profile_2/low_rate/longitude',
               'profile_3/high_rate/longitude',
               'profile_3/low_rate/longitude'],
 'sc_orient': ['orbit_info/sc_orient'],
 'start_delta_time': ['ancillary_data/start_delta_time']}


#### Example 2: specify beams/profiles and variable
Add `latitude` for only `profile_1` and `profile_2`

In [57]:
region_a.order_vars.remove(all=True)
pprint(region_a.order_vars.wanted)

None


In [58]:
var_dict = region_a.order_vars.append(beam_list=['profile_1','profile_2'], var_list=['latitude'])
pprint(region_a.order_vars.wanted)

{'atlas_sdp_gps_epoch': ['ancillary_data/atlas_sdp_gps_epoch'],
 'data_end_utc': ['ancillary_data/data_end_utc'],
 'data_start_utc': ['ancillary_data/data_start_utc'],
 'end_delta_time': ['ancillary_data/end_delta_time'],
 'granule_end_utc': ['ancillary_data/granule_end_utc'],
 'granule_start_utc': ['ancillary_data/granule_start_utc'],
 'latitude': ['profile_1/high_rate/latitude',
              'profile_1/low_rate/latitude',
              'profile_2/high_rate/latitude',
              'profile_2/low_rate/latitude'],
 'sc_orient': ['orbit_info/sc_orient'],
 'start_delta_time': ['ancillary_data/start_delta_time']}


#### Example 3: add/remove selected beams+variables
Add `latitude` for `profile_3` and remove it for `profile_2`

In [59]:
region_a.order_vars.append(beam_list=['profile_3'],var_list=['latitude'])
region_a.order_vars.remove(beam_list=['profile_2'], var_list=['latitude'])
pprint(region_a.order_vars.wanted)

{'atlas_sdp_gps_epoch': ['ancillary_data/atlas_sdp_gps_epoch'],
 'data_end_utc': ['ancillary_data/data_end_utc'],
 'data_start_utc': ['ancillary_data/data_start_utc'],
 'end_delta_time': ['ancillary_data/end_delta_time'],
 'granule_end_utc': ['ancillary_data/granule_end_utc'],
 'granule_start_utc': ['ancillary_data/granule_start_utc'],
 'latitude': ['profile_1/high_rate/latitude',
              'profile_1/low_rate/latitude',
              'profile_3/high_rate/latitude',
              'profile_3/low_rate/latitude'],
 'sc_orient': ['orbit_info/sc_orient'],
 'start_delta_time': ['ancillary_data/start_delta_time']}


#### Example 4: `keyword_list`
Add `latitude` for all profiles and with keyword `low_rate`

In [60]:
region_a.order_vars.append(var_list=['latitude'],keyword_list=['low_rate'])
pprint(region_a.order_vars.wanted)

{'atlas_sdp_gps_epoch': ['ancillary_data/atlas_sdp_gps_epoch'],
 'data_end_utc': ['ancillary_data/data_end_utc'],
 'data_start_utc': ['ancillary_data/data_start_utc'],
 'end_delta_time': ['ancillary_data/end_delta_time'],
 'granule_end_utc': ['ancillary_data/granule_end_utc'],
 'granule_start_utc': ['ancillary_data/granule_start_utc'],
 'latitude': ['profile_1/high_rate/latitude',
              'profile_1/low_rate/latitude',
              'profile_3/high_rate/latitude',
              'profile_3/low_rate/latitude',
              'profile_2/low_rate/latitude'],
 'sc_orient': ['orbit_info/sc_orient'],
 'start_delta_time': ['ancillary_data/start_delta_time']}


#### Example 5: target a specific variable + path
Remove `'profile_1/high_rate/latitude'` (but keep `'profile_3/high_rate/latitude'`)

In [61]:
region_a.order_vars.remove(beam_list=['profile_1'], var_list=['latitude'], keyword_list=['high_rate'])
pprint(region_a.order_vars.wanted)

ValueError: list.remove(x): x not in list

#### Example 6: add variables not specific to beams/profiles
Add `rgt` under `orbit_info`.

In [62]:
region_a.order_vars.append(keyword_list=['orbit_info'],var_list=['rgt'])
pprint(region_a.order_vars.wanted)

{'atlas_sdp_gps_epoch': ['ancillary_data/atlas_sdp_gps_epoch'],
 'data_end_utc': ['ancillary_data/data_end_utc'],
 'data_start_utc': ['ancillary_data/data_start_utc'],
 'end_delta_time': ['ancillary_data/end_delta_time'],
 'granule_end_utc': ['ancillary_data/granule_end_utc'],
 'granule_start_utc': ['ancillary_data/granule_start_utc'],
 'latitude': ['profile_1/low_rate/latitude',
              'profile_3/high_rate/latitude',
              'profile_3/low_rate/latitude',
              'profile_2/low_rate/latitude'],
 'rgt': ['orbit_info/rgt'],
 'sc_orient': ['orbit_info/sc_orient'],
 'start_delta_time': ['ancillary_data/start_delta_time']}


#### Example 7: add all variables+paths of a group (using `inclusive` flag)
In addition to adding specific variables and paths, we can filter all variables with a specific keyword as well. Here, we add all variables under `orbit_info`. Note that paths already in `region_a.order_vars.wanted`, such as `'orbit_info/rgt'`, are not duplicated.

In [63]:
region_a.order_vars.append(keyword_list=['orbit_info'],inclusive=True)
pprint(region_a.order_vars.wanted)

{'atlas_sdp_gps_epoch': ['ancillary_data/atlas_sdp_gps_epoch'],
 'crossing_time': ['orbit_info/crossing_time'],
 'cycle_number': ['orbit_info/cycle_number'],
 'data_end_utc': ['ancillary_data/data_end_utc'],
 'data_start_utc': ['ancillary_data/data_start_utc'],
 'end_delta_time': ['ancillary_data/end_delta_time'],
 'granule_end_utc': ['ancillary_data/granule_end_utc'],
 'granule_start_utc': ['ancillary_data/granule_start_utc'],
 'lan': ['orbit_info/lan'],
 'latitude': ['profile_1/low_rate/latitude',
              'profile_3/high_rate/latitude',
              'profile_3/low_rate/latitude',
              'profile_2/low_rate/latitude'],
 'orbit_number': ['orbit_info/orbit_number'],
 'rgt': ['orbit_info/rgt'],
 'sc_orient': ['orbit_info/sc_orient'],
 'sc_orient_time': ['orbit_info/sc_orient_time'],
 'start_delta_time': ['ancillary_data/start_delta_time']}


#### Example 8: add all possible values for variables+paths (using `inclusive` flag)
Append all `longitude` paths and all variables/paths with keyword `high_rate`

In [64]:
region_a.order_vars.append(var_list=['longitude'],keyword_list=['high_rate'], inclusive=True)
pprint(region_a.order_vars.wanted)

{'atlas_sdp_gps_epoch': ['ancillary_data/atlas_sdp_gps_epoch'],
 'crossing_time': ['orbit_info/crossing_time'],
 'cycle_number': ['orbit_info/cycle_number'],
 'data_end_utc': ['ancillary_data/data_end_utc'],
 'data_start_utc': ['ancillary_data/data_start_utc'],
 'end_delta_time': ['ancillary_data/end_delta_time'],
 'granule_end_utc': ['ancillary_data/granule_end_utc'],
 'granule_start_utc': ['ancillary_data/granule_start_utc'],
 'lan': ['orbit_info/lan'],
 'latitude': ['profile_1/low_rate/latitude',
              'profile_3/high_rate/latitude',
              'profile_3/low_rate/latitude',
              'profile_2/low_rate/latitude'],
 'longitude': ['profile_1/high_rate/longitude',
               'profile_2/high_rate/longitude',
               'profile_3/high_rate/longitude'],
 'orbit_number': ['orbit_info/orbit_number'],
 'rgt': ['orbit_info/rgt'],
 'sc_orient': ['orbit_info/sc_orient'],
 'sc_orient_time': ['orbit_info/sc_orient_time'],
 'start_delta_time': ['ancillary_data/start_delta

#### Example 9: remove all variables+paths associated with a beam (using `inclusive` flag)
Remove all paths for `profile_1` and `profile_3`

In [43]:
region_a.order_vars.remove(beam_list=['profile_1','profile_3'], inclusive=True)
pprint(region_a.order_vars.wanted)

TypeError: argument of type 'NoneType' is not iterable

#### Example 10: generate a default list for the rest of the tutorial
Generate a reasonable variable list prior to download

In [44]:
region_a.order_vars.remove(all=True)
region_a.order_vars.append(defaults=True)
pprint(region_a.order_vars.wanted)

{'apparent_surf_reflec': ['profile_1/high_rate/apparent_surf_reflec',
                          'profile_2/high_rate/apparent_surf_reflec',
                          'profile_3/high_rate/apparent_surf_reflec'],
 'atlas_sdp_gps_epoch': ['ancillary_data/atlas_sdp_gps_epoch'],
 'bsnow_con': ['profile_1/high_rate/bsnow_con',
               'profile_1/low_rate/bsnow_con',
               'profile_2/high_rate/bsnow_con',
               'profile_2/low_rate/bsnow_con',
               'profile_3/high_rate/bsnow_con',
               'profile_3/low_rate/bsnow_con'],
 'bsnow_dens': ['profile_1/high_rate/bsnow_dens',
                'profile_2/high_rate/bsnow_dens',
                'profile_3/high_rate/bsnow_dens'],
 'bsnow_h': ['profile_1/high_rate/bsnow_h',
             'profile_1/low_rate/bsnow_h',
             'profile_2/high_rate/bsnow_h',
             'profile_2/low_rate/bsnow_h',
             'profile_3/high_rate/bsnow_h',
             'profile_3/low_rate/bsnow_h'],
 'bsnow_od': ['profile_1/h

## Applying variable subsetting to your order and download

To include your wanted variable list in your order, pass it as the `Coverage` keyword argument to `order_granules` or `download_granules` (which calls `order_granules` under the hood if you have not already placed your order).

In [None]:
region_a.order_granules(Coverage=region_a.order_vars.wanted)

In [None]:
# You do not need to include the 'Coverage' kwarg here if you already submitted it with your order.
region_a.download_granules('/home/jovyan/icepyx/dev-notebooks/vardata')
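
The wanted dictionary groups paths by variable name, whereas a subsetting request ultimately needs a flat list of paths. The sketch below shows one way such a flattening could look. It assumes the subsetter accepts a comma-separated string of slash-prefixed paths, and `flatten_wanted` is a hypothetical helper, not an icepyx function. In the order cell above you pass the dictionary itself, so you do not need to do this by hand.

```python
def flatten_wanted(wanted):
    """Join every path in a {variable: [paths]} dict into one comma-separated string.

    Assumption: each path is prefixed with '/' in the request string.
    """
    paths = [path for path_list in wanted.values() for path in path_list]
    return ",".join("/" + path for path in paths)


wanted_example = {
    "latitude": ["profile_1/high_rate/latitude", "profile_1/low_rate/latitude"],
    "sc_orient": ["orbit_info/sc_orient"],
}

coverage_string = flatten_wanted(wanted_example)
print(coverage_string)
```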

### _Why does the subsetter say no matching data was found?_
Sometimes, chunks (granules) returned in our initial search end up not containing any data in our specified area of interest. This is because the initial search is completed using summary metadata for a chunk. You've likely encountered this before when viewing available imagery online: your spatial search turns up a bunch of images with only a few border or corner pixels, maybe even in no data regions, in your area of interest. Thus, when you go to extract the data from the area you want (i.e. spatially subset it), you don't get any usable data from that image.
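
A small sketch shows how this can happen. The track below is made up: its bounding extent overlaps the area of interest, so a search on summary metadata would match it, yet none of its points actually fall inside the area.

```python
def points_in_box(points, bbox):
    """Return the (lon, lat) points that fall inside a [west, south, east, north] box."""
    west, south, east, north = bbox
    return [(lon, lat) for lon, lat in points
            if west <= lon <= east and south <= lat <= north]


area_of_interest = [-55, 68, -48, 71]

# Hypothetical ground track that cuts diagonally past one corner of the area.
track = [(-49.5, 66.0), (-47.5, 69.0), (-45.5, 72.0)]

# Summary extent of the track, as a metadata record might store it.
lons = [lon for lon, _ in track]
lats = [lat for _, lat in track]
track_extent = [min(lons), min(lats), max(lons), max(lats)]

west, south, east, north = area_of_interest
extent_overlaps = (track_extent[0] <= east and west <= track_extent[2]
                   and track_extent[1] <= north and south <= track_extent[3])

print("extent overlaps area of interest:", extent_overlaps)
print("points inside area of interest:", points_in_box(track, area_of_interest))
```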

## Check the variable list in your downloaded file

Compare the variables available in the full dataset with those in your downloaded data file.

In [None]:
# put the full filepath to a data file here. You can get this in JupyterHub by navigating to the file,
# right clicking, and selecting copy path. Then you can paste the path in the quotes below.
fn = ''

#### Check the downloaded dataset
Get all `latitude` variables in your downloaded file:

In [None]:
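varname = 'latitude'

varlist = []
def IS2h5walk(vname, h5node):
    # visititems calls this for every group and dataset; keep only dataset paths
    if isinstance(h5node, h5py.Dataset):
        varlist.append(vname)

with h5py.File(fn, 'r') as h5pt:
    h5pt.visititems(IS2h5walk)

# print every path whose final component matches the variable name
for tvar in varlist:
    vpath, vn = os.path.split(tvar)
    if vn == varname:
        print(tvar)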

#### Compare to the variable paths available in the original data

In [None]:
region_a.order_vars.parse_var_list(region_a.order_vars.avail)[0][varname]
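
Once you have both lists of paths, a set comparison makes the differences easy to see. The sketch below uses made-up path lists in place of the real ones, and `compare_paths` is a hypothetical helper.

```python
def compare_paths(requested, found):
    """Return (missing, extra): requested paths not found, and found paths not requested."""
    requested, found = set(requested), set(found)
    return sorted(requested - found), sorted(found - requested)


# Stand-ins for the paths you asked for and the paths present in the file.
requested_paths = ["profile_1/high_rate/latitude", "profile_1/low_rate/latitude"]
found_paths = ["profile_1/high_rate/latitude", "orbit_info/sc_orient"]

missing, extra = compare_paths(requested_paths, found_paths)
print("missing:", missing)
print("extra:", extra)
```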