# Introduction

This notebook shows how to access MODIS snow cover data with the Python package `earthaccess`. We will access MOD10A1F, a Level 3 product that gives snow cover at 500 m resolution. The "F" means that grid cells obscured by cloud in MOD10A1 (the original snow cover product) are filled by retaining clear-sky views of the surface from previous days. More on this product can be found [here](https://nsidc.org/data/mod10a1f/versions/61). Much of this code and its documentation is adapted from the [data access tutorials](https://github.com/snowex-hackweek/website-2024/tree/main/book/tutorials/Data_access) from the 2024 UW Hackweek.

----------------------------------------------------------------------------------------------------------
Notebook by Lexi Arlen

August 2024

In [8]:
# For searching and accessing NASA data
import earthaccess

# For reading data, analysis and plotting
import xarray as xr
import hvplot.xarray

import pprint  # For nice printing of python objects

import logging

# Authentication
To access data, you need to log in with your Earthdata Login account (searching does not require logging in). If you don't have one, you can make an account [here](https://urs.earthdata.nasa.gov/users/new).

In [11]:
auth = earthaccess.login()

Enter your Earthdata Login username:  alexisarlen
Enter your Earthdata password:  ········
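
The interactive prompt above is one of several login strategies. As a minimal sketch, assuming credentials may be stored in the `EARTHDATA_USERNAME` and `EARTHDATA_PASSWORD` environment variables, the hypothetical helper `choose_login_strategy` (not part of `earthaccess`) picks the strategy name to pass to `earthaccess.login(strategy=...)`:

```python
import os


def choose_login_strategy(env=None):
    """Return the earthaccess login strategy name to try first.

    Hypothetical helper for illustration; it only inspects environment
    variables and never contacts Earthdata Login.
    """
    env = os.environ if env is None else env
    has_credentials = bool(env.get("EARTHDATA_USERNAME")) and bool(
        env.get("EARTHDATA_PASSWORD")
    )
    # "environment" and "interactive" are strategy names accepted by earthaccess.login
    return "environment" if has_credentials else "interactive"


strategy = choose_login_strategy()
print(strategy)
# In the notebook this would be followed by:
# auth = earthaccess.login(strategy=strategy)
```

This avoids typing a password into a shared notebook, at the cost of having to set the two variables before starting Jupyter.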


Next, we'll search for datasets containing the MODIS snow cover product *MOD10A1F*.

# Querying available datasets

In [12]:
query = earthaccess.search_datasets(keyword="MOD10A1F")

Print a summary of up to the first 10 collections in the query. Here the query returns only two datasets.

In [15]:
for collection in query[:10]:
    pprint.pprint(collection.summary(), sort_dicts=True, indent=4)
    print('')  # Add a space between collections for readability

{   'concept-id': 'C1646609734-NSIDC_ECS',
    'file-type': "[{'FormatType': 'Native', 'Format': 'HDF-EOS2', "
                 "'FormatDescription': 'HTTPS'}]",
    'get-data': [   'http://nsidc.org/daac/subscriptions.html',
                    'https://n5eil01u.ecs.nsidc.org/MOST/MOD10A1F.061/',
                    'https://search.earthdata.nasa.gov/search?q=MOD10A1F+V061',
                    'https://nsidc.org/data/data-access-tool/MOD10A1F/versions/61/'],
    'short-name': 'MOD10A1F',
    'version': '61'}

{   'concept-id': 'C2909924695-NSIDC_ECS',
    'file-type': "[{'FormatType': 'Native', 'Format': 'NetCDF', "
                 "'FormatDescription': 'HTTPS'}]",
    'get-data': [   'https://n5eil01u.ecs.nsidc.org/MOST/NSIDC-0791.001/',
                    'https://search.earthdata.nasa.gov/search?q=NSIDC-0791+V001',
                    'https://nsidc.org/data/data-access-tool/NSIDC-0791/versions/1/'],
    'short-name': 'NSIDC-0791',
    'version': '1'}



For each collection, `summary` returns a subset of fields from the collection metadata and Unified Metadata Model (UMM) entry.

- `concept-id` is a unique identifier for the collection that is composed of an alphanumeric code and the provider-id for the DAAC.
- `short-name` is the name of the dataset that appears on the dataset landing page. Short names are generally how different products are referred to (here, `MOD10A1F`).
- `version` is the version of each collection.
- `file-type` gives information about the file format of the collection files.
- `get-data` is a collection of URLs that can be used to access data, dataset landing pages, and tools.
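
To make the structure of `concept-id` concrete, here is a small sketch that splits the identifier printed above into its alphanumeric code and provider-id. The helper `split_concept_id` is hypothetical and only does string handling on the value shown in the summary.

```python
def split_concept_id(concept_id):
    """Split a concept-id into (alphanumeric code, provider-id).

    Hypothetical helper: splits on the first hyphen only, so a
    provider-id that itself contains hyphens stays in one piece.
    """
    code, _, provider = concept_id.partition("-")
    return code, provider


# Value copied from the MOD10A1F summary printed above
code, provider = split_concept_id("C1646609734-NSIDC_ECS")
print(code)
print(provider)
```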

For _cloud-hosted_ data, there is additional information about the location of the S3 bucket that holds the data and where to get credentials to access the S3 buckets. In general, you don't need to worry about this information because `earthaccess` handles S3 credentials for you. Nevertheless, it may be useful for troubleshooting. If you only want to search for data in the cloud, you can add the argument `cloud_hosted=True`.
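
One way to see which collections carry this extra S3 information is to inspect the summary dictionaries directly. The sketch below assumes that cloud-hosted summaries include a `cloud-info` key; that key name is an assumption here and should be checked against the output of `summary()` in your installed version. The dictionaries are trimmed copies of the two summaries printed above.

```python
def has_cloud_info(summary, key="cloud-info"):
    """Return True if a collection summary carries S3 bucket information.

    The key name "cloud-info" is an assumption for illustration.
    """
    return key in summary


# Trimmed copies of the two summaries printed above
summaries = [
    {"short-name": "MOD10A1F", "concept-id": "C1646609734-NSIDC_ECS", "version": "61"},
    {"short-name": "NSIDC-0791", "concept-id": "C2909924695-NSIDC_ECS", "version": "1"},
]

# Keep only the short names of collections that report S3 information
cloud_only = [s["short-name"] for s in summaries if has_cloud_info(s)]
print(cloud_only)
```

Neither summary above lists S3 information, so the filtered list is empty for this query.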

We see that the first dataset in the query is the one we're looking for. Now, let's access the data.
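
Rather than relying on the position of a collection in the query, it can be selected by its short name. This is a minimal sketch on plain dictionaries (trimmed copies of the summaries printed above); `find_by_short_name` is a hypothetical helper, not an `earthaccess` function.

```python
def find_by_short_name(summaries, short_name):
    """Return the first summary whose 'short-name' matches, or None."""
    for summary in summaries:
        if summary.get("short-name") == short_name:
            return summary
    return None


# Trimmed copies of the two summaries printed above
summaries = [
    {"short-name": "MOD10A1F", "concept-id": "C1646609734-NSIDC_ECS", "version": "61"},
    {"short-name": "NSIDC-0791", "concept-id": "C2909924695-NSIDC_ECS", "version": "1"},
]

match = find_by_short_name(summaries, "MOD10A1F")
print(match["concept-id"], match["version"])
```

In the notebook, the same lookup could be applied to `[collection.summary() for collection in query]`.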

# Accessing Data
Above is a query for datasets, but in general we want to search over a specific timeframe and region. The temporal range is given as standard date strings, and the region as the latitude-longitude corners of a bounding box. Polygons, points, and shapefiles can also be specified.

In [None]:
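results = earthaccess.search_data(
    short_name="MOD10A1F",  # the snow cover product found in the dataset query above
    version="61",
    # cloud_hosted=True is left out: the summary above lists no S3 information for MOD10A1F
    bounding_box=(-134.7, 58.9, -133.9, 59.2),  # (west, south, east, north) in degrees
    temporal=("2020-03-01", "2020-04-30"),
    count=100,  # cap on the number of granules returned
)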
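
Before sending a search, it can help to sanity-check the bounding box and date strings used above. The sketch below assumes the bounding box is ordered (west, south, east, north) in decimal degrees and that dates are ISO `YYYY-MM-DD` strings. Both helpers are hypothetical and use only the standard library.

```python
from datetime import date


def check_bounding_box(bbox):
    """Validate a (west, south, east, north) box in decimal degrees."""
    west, south, east, north = bbox
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must lie in [-180, 180]")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must lie in [-90, 90] with south <= north")
    return bbox


def check_temporal(temporal):
    """Parse a (start, end) pair of ISO date strings and check their order."""
    start, end = (date.fromisoformat(t) for t in temporal)
    if start > end:
        raise ValueError("start date must not be after end date")
    return start, end


bbox = check_bounding_box((-134.7, 58.9, -133.9, 59.2))
start, end = check_temporal(("2020-03-01", "2020-04-30"))
n_days = (end - start).days + 1  # inclusive number of days in the search window
print(bbox, start, end, n_days)
```

A swapped latitude/longitude pair is a common mistake, and this check catches it whenever the swap pushes a latitude outside [-90, 90] or puts south above north.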