# Accessing and plotting NASA Earthdata

This notebook was generated by NASA Earthdata Search. It serves as an example of authenticating with Earthdata Login, accessing NASA data, and plotting a variable using the `earthaccess` and `xarray` Python libraries.

When run in the cloud, this notebook is intended to run in the Amazon Web Services (AWS) __us-west-2__ region, where NASA Earthdata Cloud data are hosted.

The notebook was generated using:
- __Collection__: [{{ collectionTitle }}]({{baseUrl}}/search/granules?p={{ collectionId }})
- __Granule__: [{{ granuleTitle }}]({{baseUrl}}/search/granules/granule-details?p={{ granuleId }})
- __Variable__: {{ variable }}

{{#if boundingBox}}
- __Bounding Box__: {{ boundingBox.minLon }}, {{ boundingBox.minLat }}, {{ boundingBox.maxLon }}, {{ boundingBox.maxLat }}
{{/if}}

[View the Earthdata Search query applied at the time this notebook was generated]({{ referrerUrl }}).

*Generated by NASA [Earthdata Search]({{baseUrl}}) on {{generatedTime}}*


## Overview

In the notebook, you will see the following:
- Authenticating and accessing NASA data using `earthaccess`
- Analyzing data using `xarray`
- Plotting data using `matplotlib` via `xarray.DataArray.plot()`

For information about Jupyter Notebooks, visit the [Jupyter Notebook docs](https://jupyter-notebook.readthedocs.io/en/latest/).

To learn about working with NASA Earthdata in the cloud, visit the [Openscapes NASA Earthdata Cloud Cookbook](https://nasa-openscapes.github.io/earthdata-cloud-cookbook/).

### Requirements

- Usage of this notebook requires installation of Python 3.10 or higher. To learn how to install Python, visit the [Python documentation](https://docs.python.org/3/).
- Usage of this notebook requires an Earthdata Login account. To register, create an account on the [Earthdata Login registration page](https://urs.earthdata.nasa.gov/users/new).

### Version information

This notebook was developed and tested using the following versions:

<table style="display: block">
    <tr>
        <th style="text-align: left">Package</th>
        <th style="text-align: left">Version</th>
    </tr>
    <tr>
        <td>Python</td>
        <td>3.12.4</td>
    </tr>
    <tr>
        <td>Jupyter</td>
        <td>4.0.11</td>
    </tr>
    <tr>
        <td>earthaccess</td>
        <td>0.11.0</td>
    </tr>
    <tr>
        <td>xarray</td>
        <td>2024.10.0</td>
    </tr>
    <tr>
        <td>cartopy</td>
        <td>0.24.1</td>
    </tr>
    <tr>
        <td>h5netcdf</td>
        <td>1.3.0</td>
    </tr>
    <tr>
        <td>netCDF4</td>
        <td>1.7.1.post2</td>
    </tr>
</table>


### Troubleshooting

If an error occurs when running a code block, compare the versions in the [Version information](#Version-information) section above to the output from the [Install dependencies](#Install-dependencies) section below to verify that the correct package versions are installed in your Python environment. Running the notebook with different package versions may result in unexpected behavior.
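
The comparison described above can also be scripted. The following is a minimal sketch, not part of the generated notebook: `compare_versions` and `TESTED_VERSIONS` are hypothetical names, and the version numbers are copied from the table in the Version information section. It uses only the standard library `importlib.metadata` module, and Python and Jupyter are left out for brevity.

```python
from importlib import metadata

# Versions copied from the "Version information" table (assumed still current)
TESTED_VERSIONS = {
    "earthaccess": "0.11.0",
    "xarray": "2024.10.0",
    "cartopy": "0.24.1",
    "h5netcdf": "1.3.0",
    "netCDF4": "1.7.1.post2",
}


def compare_versions(tested):
    """Return {package: (tested_version, installed_version_or_None)}."""
    report = {}
    for package, tested_version in tested.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # The package is not installed in this environment
        report[package] = (tested_version, installed)
    return report


# Print one line per package, flagging any version that differs from the table
for package, (tested_version, installed) in compare_versions(TESTED_VERSIONS).items():
    status = "OK" if installed == tested_version else "DIFFERS"
    print(f"{package}: tested {tested_version}, installed {installed} [{status}]")
```

A `DIFFERS` status is not necessarily a problem, since newer releases often work, but it is a reasonable first place to look when a code block fails.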

---

## Install dependencies

Start by installing the required Python dependencies. This should only need to be done once, unless additional dependencies are required.

In [None]:
%pip install earthaccess xarray matplotlib cartopy h5netcdf netCDF4

## Import dependencies

Now that the required dependencies have been downloaded and installed, they must be imported prior to use. This should only need to be done once, unless additional dependencies have been added.

In [2]:
import earthaccess
import xarray as xr
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
from matplotlib.ticker import AutoLocator

## Authenticate with Earthdata Login using `earthaccess.login()`

To download and access files using `earthaccess`, you must first authenticate using `earthaccess.login()`. This step requires an Earthdata Login account.

Find more information about `earthaccess.login()` in the [earthaccess.login documentation](https://earthaccess.readthedocs.io/en/latest/user-reference/api/api/#earthaccess.api.login).

Check out the [earthaccess quick start guide](https://earthaccess.readthedocs.io/en/latest/quick-start/) for additional information about authentication strategies.

In [3]:
auth = earthaccess.login()

if not auth.authenticated:
    auth = earthaccess.login(strategy="all")

if auth.authenticated:
    print("Authentication successful!")
else:
    print("Authentication unsuccessful")

Authentication successful!
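
If authentication is unsuccessful, it can help to check which credentials are configured before retrying. The following is a minimal sketch using only the standard library. `available_credential_sources` is a hypothetical helper and is not part of `earthaccess`. It assumes that the environment strategy reads `EARTHDATA_USERNAME` and `EARTHDATA_PASSWORD`, and that the netrc strategy reads a `~/.netrc` entry for `urs.earthdata.nasa.gov`; confirm both against the quick start guide linked above.

```python
import netrc
import os
from pathlib import Path

# Assumed Earthdata Login host name used for .netrc entries
EDL_HOST = "urs.earthdata.nasa.gov"


def available_credential_sources(env=None, netrc_path=None):
    """List the credential sources that appear to be configured (hypothetical helper)."""
    env = os.environ if env is None else env
    netrc_path = Path.home() / ".netrc" if netrc_path is None else Path(netrc_path)
    sources = []

    # Environment variables assumed to be read by the "environment" strategy
    if env.get("EARTHDATA_USERNAME") and env.get("EARTHDATA_PASSWORD"):
        sources.append("environment")

    # A .netrc entry for the Earthdata Login host, assumed to be read by the "netrc" strategy
    try:
        if netrc.netrc(str(netrc_path)).hosts.get(EDL_HOST):
            sources.append("netrc")
    except (OSError, netrc.NetrcParseError):
        pass  # The .netrc file is missing, unreadable, or malformed

    return sources


print(available_credential_sources())
```

An empty list suggests that `earthaccess.login()` will fall back to prompting for a username and password interactively.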


## Open the file using `earthaccess.open()`

Once authenticated, the `earthaccess.open()` function can be used to open the granule's file and return a file-like object that `xarray` can read for analysis.

Find more information about `earthaccess.open()` and its parameters in the [earthaccess.open documentation](https://earthaccess.readthedocs.io/en/latest/user-reference/api/api/#earthaccess.api.open).

In [None]:
# Search for the granule by the name and provider
search_result = earthaccess.search_data(granule_name="20241017090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1", provider="POCLOUD")

# Get the data for the granule using earthaccess.open
files_array = earthaccess.open(search_result)

# Get the first file in the array which will be the granule
file = files_array[0]

file

## Open the dataset using `xarray.open_datatree()`

After locating and opening the data, you can use `xarray.open_datatree()` to open the file object returned by `earthaccess`. `xarray.DataTree` is a class in `xarray` for working with [hierarchical data](https://docs.xarray.dev/en/stable/user-guide/hierarchical-data.html). Each group in the input file becomes a node, and the variables within that group are represented using the standard `xarray.Dataset` class.

Find more information about `xarray.open_datatree()` and its parameters in the [xarray.open_datatree documentation](https://docs.xarray.dev/en/stable/generated/xarray.open_datatree.html).

In [None]:
# Open the dataset with xarray
dt = xr.open_datatree(
    file, # The granule dataset from earthaccess
    engine='h5netcdf', # The engine required to open NetCDF files
)

dt

## Select a subset of the data using `xarray.DataTree.sel()`

The `xarray.DataTree.sel()` function can be used to return a new dataset which has been indexed to a specific bounding area. For large datasets, this can result in improved performance when doing analysis and plotting.

If a bounding box is applied in Earthdata Search when generating this notebook, the bounding box coordinates will be used below.

Find more information about `xarray.DataTree.sel()` and its parameters in the [xarray.DataTree.sel documentation](https://docs.xarray.dev/en/latest/generated/xarray.DataTree.sel.html).

In [None]:
# Define the bounding area
{{#unless boundingBox}}# Select all data by setting the coordinate variables to cover the entire globe. These values can be updated to subset the data to a smaller area of interest, either manually or by applying a bounding box in Earthdata Search before generating the notebook.
min_lon = -180
min_lat = -90
max_lon =  180
max_lat =  90{{/unless}}
{{#if boundingBox}}# Select the data within the bounding box applied in Earthdata Search at the time of generation.
min_lon = {{boundingBox.minLon}}
min_lat = {{boundingBox.minLat}}
max_lon = {{boundingBox.maxLon}}
max_lat = {{boundingBox.maxLat}}

# To select data for the entire globe instead, remove the variables above and uncomment the following coordinate variable declarations.
# min_lon = -180
# min_lat = -90
# max_lon =  180
# max_lat =  90{{/if}}

# Create a new dataset using the bounding area
dt = dt.sel(lat=slice(min_lat, max_lat), lon=slice(min_lon, max_lon))

dt
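
Because `sel()` silently returns an empty or partial selection when the limits are out of range or swapped, it can be worth checking the bounding area before using it. The following is a minimal sketch: `validate_bounding_box` is a hypothetical helper, and it assumes longitudes in the -180 to 180 range and a box that does not cross the antimeridian. Datasets that use 0 to 360 longitudes would need different limits.

```python
def validate_bounding_box(min_lon, min_lat, max_lon, max_lat):
    """Check that a bounding box uses degrees in the expected ranges (hypothetical helper)."""
    # Assumes longitudes run from -180 to 180 degrees
    if not (-180 <= min_lon <= 180 and -180 <= max_lon <= 180):
        raise ValueError("Longitude values must be between -180 and 180")
    if not (-90 <= min_lat <= 90 and -90 <= max_lat <= 90):
        raise ValueError("Latitude values must be between -90 and 90")
    # Assumes the box does not cross the antimeridian
    if min_lon > max_lon or min_lat > max_lat:
        raise ValueError("Minimum values must not exceed maximum values")
    return min_lon, min_lat, max_lon, max_lat


# The whole globe passes; swapped latitude and longitude limits do not
print(validate_bounding_box(-180, -90, 180, 90))
try:
    validate_bounding_box(-90, -180, 90, 180)
except ValueError as error:
    print(f"Invalid bounding box: {error}")
```

Calling `validate_bounding_box(min_lon, min_lat, max_lon, max_lat)` just before `dt.sel(...)` would surface a mistake as a clear error rather than an unexpectedly empty plot.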

## Generate a plot using `xarray.DataArray.plot()` for the variable `analysed_sst`

Using the `xarray.DataArray.plot()` function provided by `xarray`, plot the variable `analysed_sst`. 

To find more information about `xarray.DataArray.plot()` and the plotting options available in `xarray`, visit the [xarray.DataArray.plot documentation](https://docs.xarray.dev/en/latest/generated/xarray.DataArray.plot.html) or the [Datasets](https://docs.xarray.dev/en/latest/user-guide/plotting.html#datasets) and [Maps](https://docs.xarray.dev/en/latest/user-guide/plotting.html#maps) sections of the [xarray Plotting User Guide](https://docs.xarray.dev/en/latest/user-guide/plotting.html).

In [None]:
# Plot the data
p = dt["{{variable}}"].plot(
    subplot_kws=dict(projection=ccrs.PlateCarree()), # Set the projection
)

# Set the spatial extent and projection of the plot using the provided bounding box
p.axes.set_extent([min_lon, max_lon, min_lat, max_lat], crs=ccrs.PlateCarree())

# Add coastlines to the plot
p.axes.coastlines()

# Add ticks to the x and y axis
p.axes.set_xticks([min_lon, max_lon], crs=ccrs.PlateCarree())
p.axes.set_yticks([min_lat, max_lat], crs=ccrs.PlateCarree())

plt.show()
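
The plot above places ticks only at the edges of the bounding area. If intermediate ticks are preferred, evenly spaced positions can be computed first. The following is a minimal sketch in plain Python: `tick_positions` is a hypothetical helper, and the default of five ticks is an arbitrary choice.

```python
def tick_positions(minimum, maximum, count=5):
    """Return `count` evenly spaced tick values from minimum to maximum (hypothetical helper)."""
    if count < 2:
        raise ValueError("count must be at least 2")
    step = (maximum - minimum) / (count - 1)
    return [minimum + step * index for index in range(count)]


print(tick_positions(-180, 180))  # [-180.0, -90.0, 0.0, 90.0, 180.0]
print(tick_positions(-90, 90))    # [-90.0, -45.0, 0.0, 45.0, 90.0]
```

The resulting lists could be passed in place of the two-element lists used above, for example `p.axes.set_xticks(tick_positions(min_lon, max_lon), crs=ccrs.PlateCarree())`.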

## Close the xarray dataset using `xarray.Dataset.close()`

In order to free up the resources used by xarray, the dataset should be closed.

Find more information about `xarray.Dataset.close()` in the [xarray.Dataset.close documentation](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.close.html).

In [None]:
dt.close()