# Choose Your Own Adventure: Direct S3 Access

### Use this notebook to practice the skills and techniques you learned in [`UWG-F2F_S3_Bucket_Access.ipynb`](https://github.com/nasa/gesdisc-cloud-tutorials/blob/main/GES_DISC_Cloud_Notebooks/Cloud_Workshop/UWG/UWG-F2F_S3_Bucket_Access.ipynb).


### We provide the necessary code in the first few blocks. Run these blocks before experimenting with data search and access.



### Import libraries

We included some essentials here, but add to the list if you need additional Python libraries.

In [5]:
import requests
import xarray as xr
import s3fs
import os
from netrc import netrc

# Need more Python libraries? Add them below!
#

### Generate *OR* check for a `.netrc` file

If you successfully ran [`UWG-F2F_S3_Bucket_Access.ipynb`](https://github.com/nasa/gesdisc-cloud-tutorials/blob/main/GES_DISC_Cloud_Notebooks/Cloud_Workshop/UWG/UWG-F2F_S3_Bucket_Access.ipynb), this code will run with no errors.  

In [6]:
from getpass import getpass

urs = 'urs.earthdata.nasa.gov'    # Earthdata URL endpoint for authentication
prompts = ['Enter NASA Earthdata Login Username: ',
           'Enter NASA Earthdata Login Password: ']

netrc_name = ".netrc"
netrcDir = os.path.expanduser(f"~/{netrc_name}")    # Full path to the netrc file

# Determine if netrc file exists, and if so, if it includes NASA Earthdata Login Credentials
try:
    netrc(netrcDir).authenticators(urs)[0]

# Below, create (or append to) the netrc file and prompt user for NASA Earthdata Login Username and Password
# FileNotFoundError: there is no netrc file; TypeError: the file exists but has no entry for urs
except (FileNotFoundError, TypeError):
    username = getpass(prompt=prompts[0])
    password = getpass(prompt=prompts[1])
    # Write the entry directly from Python so the three lines always land in order
    with open(netrcDir, "a") as netrc_file:
        netrc_file.write(f"\nmachine {urs}\nlogin {username}\npassword {password}\n")
    # Set restrictive permissions
    os.chmod(netrcDir, 0o600)

### **YOUR TURN!** 

### Which data collection(s) do you want to work with? How would you like to obtain S3 URLs?

Choose your method of obtaining S3 URLs for your data collection of interest. 

1. Using the "Download Data" option in [Earthdata Search](https://search.earthdata.nasa.gov/search?ff=Available%20from%20AWS%20Cloud&fdc=Goddard%2BEarth%2BSciences%2BData%2Band%2BInformation%2BServices%2BCenter%2B%2528GES%2BDISC%252) and selecting files from the "AWS S3 Access" tab
2. Using the CMR API to perform a collection and/or granule search
3. Using the GES DISC "Subset/Get Data" dialog box *(feature coming soon)*
4. Using knowledge of the structure of granule IDs and the S3 Bucket/Object Prefix of a particular collection
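
As a starting point for option 2, here is a minimal sketch of a CMR granule search that keeps only the `s3://` links from the results. The function names `extract_s3_urls` and `search_s3_urls` are made up for this example, and the short name, version, and date range in the usage note are placeholders to replace with your own collection. It assumes the UMM-JSON response format, where each granule lists its links under `RelatedUrls`.

```python
CMR_GRANULE_URL = "https://cmr.earthdata.nasa.gov/search/granules.umm_json"


def extract_s3_urls(cmr_response):
    """Collect every s3:// link from a CMR UMM-JSON granule search response."""
    s3_urls = []
    for item in cmr_response.get("items", []):
        for related in item.get("umm", {}).get("RelatedUrls", []):
            url = related.get("URL", "")
            if url.startswith("s3://"):
                s3_urls.append(url)
    return s3_urls


def search_s3_urls(short_name, version, temporal, page_size=100):
    """Search CMR for granules of one collection and return their S3 URLs.

    temporal is a "start,end" string of ISO 8601 timestamps.
    """
    # Imported here so extract_s3_urls() can be tried without network access
    import requests

    params = {
        "short_name": short_name,
        "version": version,
        "temporal": temporal,
        "page_size": page_size,
    }
    response = requests.get(CMR_GRANULE_URL, params=params, timeout=60)
    response.raise_for_status()
    return extract_s3_urls(response.json())
```

A call might look like `search_s3_urls("M2T1NXSLV", "5.12.4", "2019-03-13T00:00:00Z,2019-03-15T23:59:59Z")`. The returned list can include S3 links to files other than the data granule itself, so inspect it before passing it on.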

In [11]:
# Try it out here!
#







# Hint: To make a list in Python, put your items inside square brackets, separate them with commas, and give the list a name!
# e.g. groceries = ['bread', 'cheese', 'milk']
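
For option 4, a list of S3 URLs can be built from a date range once you know how a collection names its granules. The sketch below is illustrative only: the bucket, object prefix, and file name pattern are assumptions modeled on a MERRA-2 hourly collection, and the stream number in the file name (`400` here) changes between production periods, so confirm the real pattern for your collection before relying on it. The function name `build_s3_urls` is made up for this example.

```python
from datetime import date, timedelta

# Assumed bucket and object prefix for the collection (verify for your own data)
BUCKET_PREFIX = "s3://gesdisc-cumulus-prod-protected/MERRA2/M2T1NXSLV.5.12.4"
# Assumed granule file name pattern: one file per day
FILE_PATTERN = "MERRA2_400.tavg1_2d_slv_Nx.{day:%Y%m%d}.nc4"


def build_s3_urls(start, end):
    """Return one S3 URL per day from start to end, inclusive."""
    n_days = (end - start).days + 1
    days = [start + timedelta(days=i) for i in range(n_days)]
    return [
        f"{BUCKET_PREFIX}/{day:%Y}/{day:%m}/" + FILE_PATTERN.format(day=day)
        for day in days
    ]


s3_urls = build_s3_urls(date(2019, 3, 13), date(2019, 3, 15))
```

The list comprehension is the same idea as the grocery list above: each date becomes one item in `s3_urls`.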

### Got your single granule S3 URL or list of S3 URLs?

### Choose an appropriate DAAC S3 credential endpoint and start up your S3 File System session!

You can find a list of DAAC S3 credential endpoints here:
- GES DISC: https://data.gesdisc.earthdata.nasa.gov/s3credentials
- GHRC: https://data.ghrc.earthdata.nasa.gov/s3credentials
- LP DAAC: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials
- NSIDC: https://data.nsidc.earthdatacloud.nasa.gov/s3credentials
- ORNL DAAC: https://data.ornldaac.earthdata.nasa.gov/s3credentials
- PO.DAAC: https://archive.podaac.earthdata.nasa.gov/s3credentials
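
If you expect to switch between DAACs, one option is to keep the endpoints above in a dictionary and look them up by name instead of pasting a URL each time. This is only a convenience sketch; the names `S3_CREDENTIAL_ENDPOINTS` and `credential_endpoint` are made up for this example.

```python
# The endpoints listed above, keyed by DAAC name
S3_CREDENTIAL_ENDPOINTS = {
    "GES DISC": "https://data.gesdisc.earthdata.nasa.gov/s3credentials",
    "GHRC": "https://data.ghrc.earthdata.nasa.gov/s3credentials",
    "LP DAAC": "https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials",
    "NSIDC": "https://data.nsidc.earthdatacloud.nasa.gov/s3credentials",
    "ORNL DAAC": "https://data.ornldaac.earthdata.nasa.gov/s3credentials",
    "PO.DAAC": "https://archive.podaac.earthdata.nasa.gov/s3credentials",
}


def credential_endpoint(daac):
    """Return the S3 credential endpoint for a DAAC, with a readable error if unknown."""
    try:
        return S3_CREDENTIAL_ENDPOINTS[daac]
    except KeyError:
        known = ", ".join(sorted(S3_CREDENTIAL_ENDPOINTS))
        raise ValueError(f"Unknown DAAC {daac!r}. Choose one of: {known}") from None
```

For example, `credential_endpoint("GES DISC")` returns the GES DISC URL from the list.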

In [12]:
# Which DAAC S3 credential endpoint to use? Choose from the list, paste the URL below, and uncomment the line.
# Until auth_link is defined, running this cell raises "NameError: name 'auth_link' is not defined".
# auth_link = "__paste URL here ___"

# Define a function that starts an S3 File System session
def begin_s3_direct_access(url: str):
    print(url)
    # requests reads the Earthdata Login credentials from the .netrc file
    response = requests.get(url)
    response.raise_for_status()
    credentials = response.json()
    return s3fs.S3FileSystem(key=credentials['accessKeyId'],
                             secret=credentials['secretAccessKey'],
                             token=credentials['sessionToken'])

# Use the function and specify the auth_link to open the S3 File System
fs = begin_s3_direct_access(auth_link)

### Open up one, or multiple, granules using Python's xarray




In [13]:
# See if you can open a single granule using xarray's open_dataset() function





In [None]:
# Now see if you can write a way to open *multiple* granules using xarray's open_mfdataset() function










# Hint: Each granule still needs to be opened with the S3 File System before it is passed to xarray's open_mfdataset() function. The input to open_mfdataset() can be a list of these opened files.
# Here is how you can apply a function to every item in a list and write the outputs to a new list
# outputs_list = [function(item) for item in input_list]
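
If you get stuck, the hint above can be sketched roughly as follows. This is one possible approach, not the only one: the function names `open_s3_files` and `open_granules` are made up for this example, `fs` is assumed to be the S3 File System session created earlier, and `open_mfdataset()` additionally needs dask installed. No `engine` argument is set, so xarray picks a backend on its own; you may need to choose one explicitly for your file format.

```python
def open_s3_files(fs, s3_urls):
    """Open every S3 URL with the file system session and return the file objects."""
    return [fs.open(url) for url in s3_urls]


def open_granules(fs, s3_urls):
    """Open one granule with open_dataset(), or several with open_mfdataset()."""
    # Imported here so open_s3_files() can be tried on its own
    import xarray as xr

    file_objs = open_s3_files(fs, s3_urls)
    if len(file_objs) == 1:
        return xr.open_dataset(file_objs[0])
    # open_mfdataset() accepts a list of opened files and combines them
    return xr.open_mfdataset(file_objs)
```

With a session and a list in hand, `ds = open_granules(fs, s3_urls)` returns a dataset you can inspect by evaluating `ds` in a cell.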