# Open Science in Action Tutorials: 

## NASA Earthdata Access in the Cloud Using Open-source Libraries

### Summary:

Abstract for reference:

As one of the largest Earth science data repositories, the NASA Earth Observing System Data and Information System (EOSDIS) archives and freely distributes roughly 32 petabytes of satellite, aircraft, and field data, supporting a diverse Earth science user community. These data volumes continue to grow as new, high-resolution remote sensing missions launch in the coming years, requiring new data management approaches to support and reduce barriers to scientific research. To address these needs and advance open science data systems and the data users they support, NASA EOSDIS data and associated discovery and access tools are migrating to the cloud. The EOSDIS Distributed Active Archive Centers (DAACs) are working collaboratively to support researchers as they migrate to a cloud-based data workflow, developing educational materials to teach these new skills using open source programming languages and libraries. This tutorial will walk through some of these new supporting resource materials on how to discover, access, and work with NASA Earthdata in the cloud, highlighting efficient and reproducible pathways to scientific analysis made possible by the cloud and open source technologies.

### Objectives:

### Acknowledgements:
Co-Authors: 
NASA Openscapes Project: Co-hosted by NASA's PO.DAAC, NSIDC DAAC, and LP DAAC, with support from ASDC DAAC and GES DISC
Cloud computing infrastructure by 2i2c

---

# Introduction to NASA Earthdata and Cloud migration


Background info on cloud migration, AWS diagrams, etc.

## Tutorial Use Case:

### Outline:

Combine GeoTIFF and Zarr basic access/usage in the cloud
1. Discover data of interest in Earthdata Search (2 datasets)
2. S3 access of GeoTIFF
3. S3 access of Zarr (via Harmony)
4. open/plot

* Harmonized Landsat Sentinel-2 (HLS) Operational Land Imager Surface Reflectance and TOA Brightness Daily Global 30m v2.0 (L30) ([10.5067/HLS/HLSL30.002](https://doi.org/10.5067/HLS/HLSL30.002))

* Monthly sea surface height from ECCO V4r4 ([10.5067/ECG5D-SSH44](https://doi.org/10.5067/ECG5D-SSH44)). The data are provided as a time series of monthly netCDF files on a 0.5-degree latitude/longitude grid. (From the NetCDF notebook: we will access the data from inside the AWS cloud, specifically the us-west-2 region, and load a time series made of multiple netCDF files into a single xarray dataset. This approach leverages S3 native protocols for efficient access to the data.)


## Requirements

AWS instance running in us-west-2

Earthdata Login

.netrc file
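
Before running the access steps, it can help to confirm that the `.netrc` file actually holds an Earthdata Login entry. The following is a minimal sketch using only the Python standard library. The helper name `has_earthdata_login` is hypothetical, and the host `urs.earthdata.nasa.gov` is assumed to be the Earthdata Login machine name used in the file.

```python
import netrc
from pathlib import Path

# Assumption: Earthdata Login entries are stored under this machine name.
EDL_HOST = "urs.earthdata.nasa.gov"


def has_earthdata_login(netrc_path=None, host=EDL_HOST):
    """Return True if the netrc file holds a login and password for `host`.

    `has_earthdata_login` is a hypothetical helper for this tutorial,
    not part of any NASA library.
    """
    path = Path(netrc_path) if netrc_path else Path.home() / ".netrc"
    if not path.exists():
        return False
    try:
        entry = netrc.netrc(str(path)).authenticators(host)
    except netrc.NetrcParseError:
        # A malformed file is treated the same as a missing entry.
        return False
    # authenticators() returns (login, account, password) or None.
    return entry is not None and bool(entry[0]) and bool(entry[2])
```

Calling `has_earthdata_login()` with no arguments checks `~/.netrc`; a `False` result suggests the file is missing, malformed, or lacks an entry for the Earthdata Login host.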


## Earthdata Search exploration to get S3 URLs

* Reference other tutorials that demonstrate cmr-stac and cmr access
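
As a programmatic counterpart to browsing Earthdata Search, the sketch below queries the CMR granule search endpoint and keeps only the `s3://` links from the response. It is an illustration under stated assumptions, not the method used in the referenced tutorials: the helper names `search_granules` and `direct_access_links` are hypothetical, the response is assumed to follow the UMM-JSON layout (`items` → `umm` → `RelatedUrls` → `URL`), and the standard library is used in place of `requests` to keep the block self-contained.

```python
import json
import urllib.parse
import urllib.request

CMR_GRANULE_URL = "https://cmr.earthdata.nasa.gov/search/granules.umm_json"


def search_granules(short_name, temporal, page_size=10):
    """Query CMR for granules of one collection (hypothetical helper).

    `temporal` is a CMR range string such as
    "2021-05-01T00:00:00Z,2021-05-31T23:59:59Z".
    """
    query = urllib.parse.urlencode(
        {"short_name": short_name, "temporal": temporal, "page_size": page_size}
    )
    with urllib.request.urlopen(f"{CMR_GRANULE_URL}?{query}") as response:
        return json.load(response)


def direct_access_links(umm_json):
    """Collect every s3:// URL listed in a UMM-JSON granule response."""
    links = []
    for item in umm_json.get("items", []):
        for related in item.get("umm", {}).get("RelatedUrls", []):
            url = related.get("URL", "")
            if url.startswith("s3://"):
                links.append(url)
    return links
```

For example, `direct_access_links(search_granules("HLSL30", "2021-05-01T00:00:00Z,2021-05-31T23:59:59Z"))` would list S3 links for HLS L30 granules in that window, assuming `HLSL30` is the collection short name.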

## Import packages

```python
# from COG simple notebook

import os
import requests
import boto3
from osgeo import gdal
import rasterio as rio
from rasterio.session import AWSSession
import rioxarray
import hvplot.xarray
import holoviews as hv
```

```python
# from Harmony notebook

from harmony import BBox, Client, Collection, Request, LinkType
from harmony.config import Environment
import requests
from pprint import pprint
import datetime as dt
import s3fs
import xarray as xr
```

## S3 access of GeoTIFF

* gdal
* rioxarray
* hvplot 
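
One way these pieces can fit together is sketched below: request temporary S3 credentials, hand them to GDAL through a rasterio environment, and read the GeoTIFF with rioxarray. Several things here are assumptions rather than confirmed details of this tutorial: the LP DAAC credentials endpoint URL, the field names in its JSON response, the cookie file location, and the hypothetical helper names `gdal_env_options` and `open_cog`. Third-party imports sit inside `open_cog` so the block can be defined without them installed.

```python
import os

# Assumption: LP DAAC endpoint that issues temporary S3 credentials
# for an authenticated Earthdata Login user (credentials read from ~/.netrc).
S3_CRED_ENDPOINT = "https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials"


def gdal_env_options(cookie_file="~/cookies.txt"):
    """GDAL settings commonly used for reading cloud-hosted GeoTIFFs."""
    cookies = os.path.expanduser(cookie_file)
    return {
        # Skip directory listings when opening a single object.
        "GDAL_DISABLE_READDIR_ON_OPEN": "TRUE",
        "GDAL_HTTP_COOKIEFILE": cookies,
        "GDAL_HTTP_COOKIEJAR": cookies,
    }


def open_cog(s3_url, endpoint=S3_CRED_ENDPOINT):
    """Read a single-band GeoTIFF from S3 into memory (hypothetical helper)."""
    import boto3
    import rasterio as rio
    import requests
    import rioxarray
    from rasterio.session import AWSSession

    response = requests.get(endpoint)
    response.raise_for_status()
    creds = response.json()  # assumed keys: accessKeyId, secretAccessKey, sessionToken

    session = boto3.Session(
        aws_access_key_id=creds["accessKeyId"],
        aws_secret_access_key=creds["secretAccessKey"],
        aws_session_token=creds["sessionToken"],
        region_name="us-west-2",
    )
    # Load inside the environment so the S3 credentials are still active.
    with rio.Env(AWSSession(session), **gdal_env_options()):
        da = rioxarray.open_rasterio(s3_url)
        return da.squeeze("band", drop=True).load()
```

With `hvplot.xarray` imported, the returned array could then be plotted with something like `open_cog(s3_url).hvplot.image(x="x", y="y")`, where `s3_url` is one of the S3 links found through Earthdata Search.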

## S3 access of Zarr

* Harmony
* Mention STAC outputs
* Zarr
* xarray - hvplot 
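
The Harmony-to-Zarr path listed above might look roughly like the following sketch. It assumes that Harmony can return Zarr output for the requested collection, that `Client()` picks up Earthdata Login credentials from `.netrc`, and that `aws_credentials()` returns the three keys shown; the helper names `s3fs_kwargs` and `open_harmony_zarr` are hypothetical. Third-party imports sit inside `open_harmony_zarr` so the block can be defined without them installed.

```python
import datetime as dt


def s3fs_kwargs(creds, region="us-west-2"):
    """Map Harmony's temporary AWS credentials onto s3fs.S3FileSystem arguments."""
    return {
        "key": creds["aws_access_key_id"],
        "secret": creds["aws_secret_access_key"],
        "token": creds["aws_session_token"],
        "client_kwargs": {"region_name": region},
    }


def open_harmony_zarr(collection_id, start, stop):
    """Request Zarr output from Harmony and open each result (hypothetical helper)."""
    import s3fs
    import xarray as xr
    from harmony import Client, Collection, LinkType, Request

    client = Client()  # assumed to read Earthdata Login credentials from ~/.netrc
    request = Request(
        collection=Collection(id=collection_id),
        temporal={"start": start, "stop": stop},
        format="application/x-zarr",
    )
    job_id = client.submit(request)
    client.wait_for_processing(job_id, show_progress=True)

    # Ask for s3:// links rather than https links for in-region access.
    urls = list(client.result_urls(job_id, link_type=LinkType.s3))
    fs = s3fs.S3FileSystem(**s3fs_kwargs(client.aws_credentials()))
    return [xr.open_zarr(fs.get_mapper(url), consolidated=True) for url in urls]
```

A call such as `open_harmony_zarr("ECCO_L4_SSH_05DEG_MONTHLY_V4R4", dt.datetime(2015, 1, 2), dt.datetime(2015, 12, 31))` would request one year of monthly sea surface height, assuming that short name matches the ECCO collection above; the date range is only an example. Each returned dataset can be plotted through `hvplot.xarray` in the same way as the GeoTIFF.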