# LDAS Cloud Use Case

## Demonstrate differences in soil moisture between LDAS model output and CYGNSS satellite retrievals in the Earthdata Cloud

### Audience
Soil moisture researchers who want to see differences between model- and satellite-based products

### Goal
Compare soil moisture from LDAS (NLDAS, GLDAS models) and CYGNSS (satellite retrieval) in different regions of the world (Southern Great Plains, East Africa) using in-region Earthdata Cloud access.

### Assumptions
- A variable of interest is identified
- Data collections of interest are identified
- User has procured access to computing resources in the AWS us-west-2 region

### Preconditions
- User has access to computing resources in the AWS us-west-2 region
- LDAS and CYGNSS data collections are available in the Earthdata Cloud
- Area(s) of interest for area-averaging are identified
    - Southern Great Plains: 100°W to 95°W and 32°N to 37°N
    - East Africa: 43°E to 48°E and 5°N to 10°N
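
The two boxes above can be written down as signed-degree bounds and used to average values that fall inside them. The sketch below is illustrative only: the `Region` class, the `area_weighted_mean` helper, the use of negative values for western longitudes, and the cosine-of-latitude weighting (a common approximation for cell area on a regular latitude-longitude grid) are assumptions, not steps taken from this notebook.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical container for a latitude-longitude bounding box."""
    name: str
    lon_min: float  # degrees east; western longitudes are negative (assumption)
    lon_max: float
    lat_min: float  # degrees north
    lat_max: float

    def contains(self, lon: float, lat: float) -> bool:
        return (self.lon_min <= lon <= self.lon_max
                and self.lat_min <= lat <= self.lat_max)


# Bounds taken from the preconditions listed above
SOUTHERN_GREAT_PLAINS = Region("Southern Great Plains", -100.0, -95.0, 32.0, 37.0)
EAST_AFRICA = Region("East Africa", 43.0, 48.0, 5.0, 10.0)


def area_weighted_mean(points, region):
    """Average (lon, lat, value) samples inside region, weighted by cos(lat).

    Samples outside the box and NaN values are skipped. Returns NaN when
    no valid sample falls inside the region.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lon, lat, value in points:
        if region.contains(lon, lat) and not math.isnan(value):
            weight = math.cos(math.radians(lat))
            weighted_sum += weight * value
            weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else math.nan


# Made-up sample values, for illustration only
samples = [(-97.5, 32.0, 0.2), (-97.5, 37.0, 0.4), (45.0, 7.5, 0.1)]
print(area_weighted_mean(samples, SOUTHERN_GREAT_PLAINS))
print(area_weighted_mean(samples, EAST_AFRICA))
```

If a dataset stores longitude as 0 to 360 degrees instead, the western bounds would need to be converted before the comparison.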
    


<div style="background:#fc9090;border:1px solid #cccccc;padding:5px 10px;"><big><b>Note:  </b>This notebook <em><strong>will only run in an environment with <a href="https://disc.gsfc.nasa.gov/information/glossary?keywords=%22earthdata%20cloud%22&amp;title=AWS%20region">us-west-2 AWS access</a></strong></em>.</big></div>

## Prerequisites

This notebook was written using Python 3.8, and requires these libraries and files: 
- xarray
- cartopy
- matplotlib
- S3FS
  - S3FS documentation: https://s3fs.readthedocs.io/en/latest/install.html
- netrc file with valid Earthdata Login credentials
- Approval to access the GES DISC archives with your Earthdata credentials (https://disc.gsfc.nasa.gov/earthdata-login)
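
Before running the rest of the notebook, it can help to confirm that the prerequisites are in place. The sketch below is one possible check, not part of the original workflow: the helper name `missing_packages` is hypothetical, and the import names (`xarray`, `cartopy`, `matplotlib`, `s3fs`, `requests`) and the `~/.netrc` location are assumed to match a standard installation.

```python
import os
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# Import names assumed for the libraries listed above
required = ["xarray", "cartopy", "matplotlib", "s3fs", "requests"]
print("Missing packages:", missing_packages(required))

# Assumed default location of the netrc file on Linux and macOS
netrc_path = os.path.expanduser("~/.netrc")
print("netrc file found:", os.path.exists(netrc_path))
```

An empty list and `True` suggest the environment is ready. The check does not verify that the stored Earthdata Login credentials are valid or that GES DISC access has been approved.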

## Import Libraries

In [2]:
from netrc import netrc
from subprocess import Popen
from platform import system
from getpass import getpass
import os
import requests
import xarray as xr
import s3fs

## Credentials

### Two credentials are required for in-cloud direct S3 access of Earthdata:
- Earthdata Login username and password
- Temporary S3 access credential for NASA DAAC archives

The S3 credential is generated by a lambda function that uses the Earthdata Login credentials in your <code>.netrc</code> file to create an access key ID, secret access key, and session token for accessing GES DISC S3 buckets. **These credentials last for only one hour.** Once they expire, restart the kernel and rerun the cell that requests the S3 credentials.
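
Because the credentials expire after one hour, it can be useful to note when they were requested. The sketch below is a minimal, hypothetical helper for that bookkeeping: the name `credentials_expired` and the five-minute safety margin are assumptions, and the one-hour lifetime is taken from the text above, not read from the credential response.

```python
from datetime import datetime, timedelta, timezone

# Lifetime stated in this notebook; not read from the credential response
CREDENTIAL_LIFETIME = timedelta(hours=1)


def credentials_expired(fetched_at, now=None,
                        lifetime=CREDENTIAL_LIFETIME,
                        margin=timedelta(minutes=5)):
    """Return True once the credentials are within `margin` of expiring."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - fetched_at >= lifetime - margin


# Record this timestamp right after requesting the S3 credentials
fetched_at = datetime.now(timezone.utc)
print(credentials_expired(fetched_at))
```

Calling `credentials_expired(fetched_at)` before a long-running read gives an early warning that the credentials need to be requested again.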

### STOP: Do you have your Earthdata Login credentials stored in the home directory of this compute system?

If no, run the cell below to store your Earthdata username and password in a <code>.netrc</code> file.
    
If yes, proceed to the next cell.
    


In [3]:
## You may skip this cell if you have already stored your Earthdata Login credentials

urs = 'urs.earthdata.nasa.gov'    # Earthdata URL endpoint for authentication
prompts = ['Enter NASA Earthdata Login Username: ',
           'Enter NASA Earthdata Login Password: ']

netrc_name = ".netrc"

# Determine if netrc file exists, and if so, if it includes NASA Earthdata Login Credentials
netrc_path = os.path.expanduser(f"~/{netrc_name}")
try:
    netrc(netrc_path).authenticators(urs)[0]

# Below, create (or append to) the netrc file and prompt user for NASA Earthdata Login Username and Password
# FileNotFoundError: no netrc file exists; TypeError: the file exists but has no entry for urs
except (FileNotFoundError, TypeError):
    username = getpass(prompt=prompts[0])
    password = getpass(prompt=prompts[1])
    # Write the entry from Python so the three lines are always written in order
    with open(netrc_path, "a") as f:
        f.write(f"\nmachine {urs}\nlogin {username}\npassword {password}\n")
    # Set restrictive permissions
    os.chmod(netrc_path, 0o600)

In [5]:
gesdisc_s3 = "https://data.gesdisc.earthdata.nasa.gov/s3credentials"

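# Define a function for S3 access credentials
def begin_s3_direct_access(url: str=gesdisc_s3):
    # requests reads the Earthdata Login credentials from the netrc file
    response = requests.get(url)
    response.raise_for_status()    # Stop here if the request returned an HTTP error
    credentials = response.json()
    return s3fs.S3FileSystem(key=credentials['accessKeyId'],
                             secret=credentials['secretAccessKey'],
                             token=credentials['sessionToken'],
                             client_kwargs={'region_name':'us-west-2'})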

fs = begin_s3_direct_access(gesdisc_s3)

# Check that the file system is intact as an S3FileSystem object, which means that token is valid
# Common causes of rejected S3 access tokens include incorrect passwords stored in the netrc file, or a non-existent netrc file
type(fs)

s3fs.core.S3FileSystem

## Access Data

## Process Data

## Derive area-averages

## Plot differences