> From the PO.DAAC Cookbook. To access the GitHub version of the notebook, follow [this link](https://github.com/podaac/tutorials/blob/master/notebooks/SearchDownload_SWOTviaCMR.ipynb).

# Search and Download SWOT Data via `earthaccess`
#### *Author: Cassandra Nickles, PO.DAAC*

## Summary
This notebook programmatically finds and downloads beta pre-validated SWOT hydrology data using the `earthaccess` Python library. For more information about `earthaccess`, visit: https://nsidc.github.io/earthaccess/

## Requirements
### 1. Compute environment 
This tutorial can be run in the following environments:
- **Local compute environment** (e.g. laptop, server)

### 2. Earthdata Login

An Earthdata Login account is required to access data, and to discover restricted data, from the NASA Earthdata system. Please visit https://urs.earthdata.nasa.gov to register and manage your Earthdata Login account. The account is free and takes only a moment to set up.

### Import libraries

In [1]:
import geopandas as gpd
import glob
from pathlib import Path
import pandas as pd
import os
import zipfile
import earthaccess
from earthaccess import Auth, DataCollections, DataGranules, Store

In this notebook, we authenticate with Earthdata Login in the cell below.

In [2]:
auth = earthaccess.login() 

EARTHDATA_USERNAME and EARTHDATA_PASSWORD are not set in the current environment, try setting them or use a different strategy (netrc, interactive)
You're now authenticated with NASA Earthdata Login
Using token with expiration date: 12/22/2023
Using .netrc file for EDL
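
The log output above mentions several login strategies: environment variables, a `.netrc` file, and an interactive prompt. As a rough illustration of what "available" means for the first two, the sketch below checks for them using only the standard library. The helper name `available_strategies` is hypothetical and not part of `earthaccess`; the host name is taken from the registration URL above, and the credentials are placeholders written to a temporary file.

```python
import netrc
import tempfile
from pathlib import Path

EDL_HOST = "urs.earthdata.nasa.gov"  # Earthdata Login host, from the registration URL


def available_strategies(env, netrc_path):
    """Return the non-interactive strategies that look usable (hypothetical helper)."""
    found = []
    # The environment strategy needs both variables named in the log message.
    if env.get("EARTHDATA_USERNAME") and env.get("EARTHDATA_PASSWORD"):
        found.append("environment")
    # The netrc strategy needs an entry for the Earthdata Login host.
    try:
        if netrc.netrc(str(netrc_path)).authenticators(EDL_HOST) is not None:
            found.append("netrc")
    except (FileNotFoundError, netrc.NetrcParseError):
        pass  # no readable .netrc file, so this strategy is unavailable
    return found


# Demonstration with a throwaway .netrc file and placeholder credentials.
with tempfile.TemporaryDirectory() as tmp:
    demo_netrc = Path(tmp) / ".netrc"
    demo_netrc.write_text(f"machine {EDL_HOST} login demo_user password demo_pass\n")
    demo_result = available_strategies({}, demo_netrc)
    print(demo_result)
```

To check your own machine, you could call `available_strategies(os.environ, Path.home() / ".netrc")` after `import os`. This only inspects local configuration; it does not contact Earthdata Login or validate the credentials.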


### Search for SWOT data links
We want to find the SWOT files for a particular pass over North America. 

Each dataset has its own unique shortname, which can be used to search with `earthaccess`. Shortnames can be found on dataset landing pages or in [Earthdata Search](https://search.earthdata.nasa.gov/search) Collections.

#### SWOT Level 2 KaRIn High Rate Version 1.1 Datasets from calibration phase, 4/8 through 4/22:

- **Water Mask Pixel Cloud NetCDF** - SWOT_L2_HR_PIXC_1.1 (DOI: 10.5067/SWOT-PIXC-1.1)
- **Water Mask Pixel Cloud Vector Attribute NetCDF** - SWOT_L2_HR_PIXCVec_1.1 (DOI: 10.5067/SWOT-PIXCVEC-1.1)
- **River Vector Shapefile** - SWOT_L2_HR_RiverSP_1.1 (DOI: 10.5067/SWOT-RIVERSP-1.1)  
- **Lake Vector Shapefile** - SWOT_L2_HR_LakeSP_1.1 (DOI: 10.5067/SWOT-LAKESP-1.1)
- **Raster NetCDF** - SWOT_L2_HR_Raster_1.1 (DOI: 10.5067/SWOT-RASTER-1.1)

Let's start our search for River Vector Shapefiles in April 2023 with a particular pass, pass 013. SWOT river files come in "reach" and "node" versions within the same collection; here we want the 10 km reaches rather than the nodes. We will also restrict the search to North America (continent code 'NA').

In [3]:
# Retrieve granules for the period we want by passing `earthaccess.search_data` the collection shortname, temporal bounds, and a wildcard filter on the granule name; for restricted data, a search count must also be specified
results = earthaccess.search_data(short_name = 'SWOT_L2_HR_RIVERSP_1.1', 
                                  temporal = ('2023-04-08 00:00:00', '2023-04-25 23:59:59'),
                                  granule_name = '*Reach*_013_NA*', # here we filter by Reach files (not node), pass #13 and continent code=NA
                                  count=2000) #for restricted datasets, need to specify count number (1-2000)

Granules found: 17
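
To get a feel for what the `granule_name` wildcard selects, the sketch below applies the same pattern string locally with Python's `fnmatch` module. This is only an approximation: it assumes the search service treats `*` like shell-style globbing (any run of characters), and it is not the code the search actually runs. The first name appears in the directory listing later in this notebook; the other two are hypothetical variants made up to show what the pattern rejects.

```python
from fnmatch import fnmatchcase

PATTERN = "*Reach*_013_NA*"  # same wildcard string passed as granule_name

candidates = [
    # Taken from this notebook's file listing.
    "SWOT_L2_HR_RiverSP_Reach_484_013_NA_20230408T071821_20230408T071832_PIB0_01",
    # Hypothetical: a node file for the same pass (should be rejected).
    "SWOT_L2_HR_RiverSP_Node_484_013_NA_20230408T071821_20230408T071832_PIB0_01",
    # Hypothetical: a reach file for a different pass (should be rejected).
    "SWOT_L2_HR_RiverSP_Reach_484_014_NA_20230408T071821_20230408T071832_PIB0_01",
]

# Keep only the names that match the wildcard pattern (case-sensitive).
matches = [name for name in candidates if fnmatchcase(name, PATTERN)]
for name in matches:
    print(name)
```

Only the reach file for pass 013 over North America survives the filter, which mirrors the intent of the search.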


During the fast sampling orbit for SWOT, the same passes were observed daily, so finding 17 files, roughly one per day over the search window, makes sense! During the science orbit, a pass will be repeated once every 21 days. A particular location may, however, be observed by several different passes within those 21 days. See the [SWOT swath visualizer](https://swot.jpl.nasa.gov/mission/swath-visualizer/) for your location!
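
The daily repeat can be read directly from the granule names. The sketch below parses two file names from this notebook's directory listing and compares their start dates. The field layout in the regular expression is an assumption inferred from the names themselves (in particular, that the three-digit number before the pass is a cycle counter), and `parse_name` is a hypothetical helper, not part of `earthaccess`.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the file names in this notebook:
# ..._<Reach|Node>_<cycle>_<pass>_<continent>_<start>_<end>_...
NAME_RE = re.compile(
    r"_(?P<kind>Reach|Node)_(?P<cycle>\d{3})_(?P<pass_id>\d{3})_(?P<continent>[A-Z]{2})"
    r"_(?P<start>\d{8}T\d{6})_(?P<end>\d{8}T\d{6})_"
)


def parse_name(name):
    """Split a RiverSP file name into its fields (hypothetical helper)."""
    match = NAME_RE.search(name)
    if match is None:
        raise ValueError(f"unrecognized file name: {name}")
    info = match.groupdict()
    # Convert the two timestamps from text into datetime objects.
    for key in ("start", "end"):
        info[key] = datetime.strptime(info[key], "%Y%m%dT%H%M%S")
    return info


names = [
    "SWOT_L2_HR_RiverSP_Reach_484_013_NA_20230408T071821_20230408T071832_PIB0_01.shp",
    "SWOT_L2_HR_RiverSP_Reach_485_013_NA_20230409T070859_20230409T070910_PIB0_01.shp",
]
parsed = [parse_name(n) for n in names]

# Days between the start dates of the two files.
gap_days = (parsed[1]["start"].date() - parsed[0]["start"].date()).days
for info in parsed:
    print(info["cycle"], info["pass_id"], info["continent"], info["start"].date())
print("days between observations:", gap_days)
```

Both files share pass 013 and continent NA, and their start dates are one day apart, consistent with daily revisits during the fast sampling orbit.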

### Download the Data into a folder

In [9]:
earthaccess.download(results, "./datasets/data_downloads/SWOT_files")
folder = Path("./datasets/data_downloads/SWOT_files")

 Getting 17 granules, approx download size: 0.04 GB


QUEUEING TASKS | :   0%|          | 0/17 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/17 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/17 [00:00<?, ?it/s]

### Shapefiles come in a .zip format, and need to be unzipped in the existing folder

In [7]:
for item in os.listdir(folder): # loop through items in dir
    if item.endswith(".zip"): # check for ".zip" extension
        with zipfile.ZipFile(folder / item) as zip_ref: # open the archive; it is closed automatically on exit
            zip_ref.extractall(folder) # extract files to dir
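
After unzipping, the folder holds several sidecar files per granule (`.dbf`, `.prj`, `.shx`, `.shp.xml`) next to each `.shp`. A minimal sketch of an alternative that unzips and then returns only the `.shp` paths, for example to pass to `geopandas` later, could look like the following. The function name `unzip_and_list_shapefiles` and the `demo_reach` files are hypothetical; the demonstration runs against a temporary folder with empty placeholder files rather than real SWOT data.

```python
import tempfile
import zipfile
from pathlib import Path


def unzip_and_list_shapefiles(folder):
    """Extract every .zip in `folder` in place and return the sorted .shp paths."""
    folder = Path(folder)
    for archive in sorted(folder.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:  # closed automatically on exit
            zf.extractall(folder)
    # "*.shp" must match the whole name, so "name.shp.xml" is not included.
    return sorted(folder.glob("*.shp"))


# Demonstration on a temporary folder containing one fake shapefile archive.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    with zipfile.ZipFile(tmp / "demo_reach.zip", "w") as zf:
        for suffix in (".shp", ".shx", ".dbf", ".prj", ".shp.xml"):
            zf.writestr(f"demo_reach{suffix}", b"")  # empty placeholder content
    demo_shapefiles = [p.name for p in unzip_and_list_shapefiles(tmp)]
    print(demo_shapefiles)
```

In this notebook the equivalent call would be `unzip_and_list_shapefiles(folder)`, using the `folder` path defined in the download cell.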

In [8]:
os.listdir(folder)

['SWOT_L2_HR_RiverSP_Reach_484_013_NA_20230408T071821_20230408T071832_PIB0_01.dbf',
 'SWOT_L2_HR_RiverSP_Reach_484_013_NA_20230408T071821_20230408T071832_PIB0_01.prj',
 'SWOT_L2_HR_RiverSP_Reach_484_013_NA_20230408T071821_20230408T071832_PIB0_01.shp',
 'SWOT_L2_HR_RiverSP_Reach_484_013_NA_20230408T071821_20230408T071832_PIB0_01.shp.xml',
 'SWOT_L2_HR_RiverSP_Reach_484_013_NA_20230408T071821_20230408T071832_PIB0_01.shx',
 'SWOT_L2_HR_RiverSP_Reach_484_013_NA_20230408T071821_20230408T071832_PIB0_01.zip',
 'SWOT_L2_HR_RiverSP_Reach_485_013_NA_20230409T070859_20230409T070910_PIB0_01.dbf',
 'SWOT_L2_HR_RiverSP_Reach_485_013_NA_20230409T070859_20230409T070910_PIB0_01.prj',
 'SWOT_L2_HR_RiverSP_Reach_485_013_NA_20230409T070859_20230409T070910_PIB0_01.shp',
 'SWOT_L2_HR_RiverSP_Reach_485_013_NA_20230409T070859_20230409T070910_PIB0_01.shp.xml',
 'SWOT_L2_HR_RiverSP_Reach_485_013_NA_20230409T070859_20230409T070910_PIB0_01.shx',
 'SWOT_L2_HR_RiverSP_Reach_485_013_NA_20230409T070859_20230409T07091