# How to work with AρρEEARS Cloud Optimized GeoTIFF (COG) outputs

## Summary  

This tutorial demonstrates how to access AρρEEARS Cloud Optimized GeoTIFF (COG) outputs in AWS. NASA's Application for Extracting and Exploring Analysis Ready Samples ([AρρEEARS](https://appeears.earthdatacloud.nasa.gov/)) is deployed in NASA's Earthdata Cloud space located in **AWS us-west-2**. This enables users working from cloud instances deployed in **AWS us-west-2** to access outputs directly from an AWS S3 bucket. In this tutorial, we will walk through the process of submitting an area sample and accessing Cloud Optimized GeoTIFF (COG) outputs from AρρEEARS.

This tutorial highlights the Dixie Fire, the second-largest fire in California history. According to [CalFire](https://www.fire.ca.gov/incidents/2021/7/13/dixie-fire/), the fire started on July 13, 2021 and burned more than 963,276 acres. On August 18, the Dixie Fire merged with the Morgan Fire, which had been started by lightning on August 12, close to Lassen National Park. The fire was one hundred percent contained by October 2021.

## Requirements  

- Earthdata Login authentication is required to use the AρρEEARS API and to access AρρEEARS outputs directly.

## Learning Objectives  

- Learn how to access AρρEEARS Cloud Optimized GeoTIFF (COG) outputs


## Tutorial Outline 

1. Setting Up  
2. Submit an area request in AppEEARS  
3. Extract the Direct S3 links  
4. Create a boto3 Refreshable Session  
5. Single COG File In-Region Direct S3 Access   
6. Multiple COG File In-Region Direct S3 Access  
7. Explore the EVI Time Series   


## 1. Set up

Import the required packages.

In [None]:
# Standard library (the `cgi` module was removed in Python 3.13 and is not used here)
import getpass, pprint, time, os, json
from datetime import datetime, timezone
from netrc import netrc
import warnings

# Data access and analysis
import requests
import earthaccess
import geopandas
import pandas
import numpy as np
import numpy.ma as ma
import rioxarray
import xarray
import rasterio
from rasterio.plot import show

# Plotting and interactive maps
import hvplot.xarray
import holoviews
import geoviews
import matplotlib.pyplot as plt
from matplotlib import colors as colors
import ipywidgets as widgets
from IPython.display import display
import folium
from folium import plugins
import branca.colormap as cm

warnings.filterwarnings('ignore')

This tutorial requires a **.netrc** file in your home directory. The function `_validate_netrc` defined in `aws_session` checks whether a properly formatted netrc file exists in your home directory. If the netrc file does not exist, it will prompt you for your Earthdata Login username and password and create one. Please see the **Prerequisites** section in [**README.md**](../README.md).

In [None]:
auth = earthaccess.login()

In [None]:
roi = geopandas.read_file('../../data/co_agriculture.geojson')
roi

Get the center coordinates from the input GeoJSON. These are used later to center the map when plotting.

In [None]:
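# Centroid of the first feature as plain floats (assumes the ROI holds a single feature)
x_center = float(roi.centroid.x.iloc[0])
y_center = float(roi.centroid.y.iloc[0])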

## 2. Submit an area request in AρρEEARS  
In this step, we are going to submit an area request with GeoTIFF as the output format. You can also submit this request using the [AρρEEARS Graphical User Interface (GUI)](https://appeears.earthdatacloud.nasa.gov/task/area) and upload the JSON file provided in the repository (AppEEARS-Data-Resources/Data/Dixie-Fire-request.json). If you already have a completed request, save its `task_id` to a variable, skip this step, and move on to the next step of the tutorial.

Assign the AρρEEARS API endpoint to a variable. 

In [None]:
appeears_API_endpoint = 'https://appeears.earthdatacloud.nasa.gov/api/'

A **Bearer Token** is needed to submit requests to the AρρEEARS API. To generate a token, a `POST` request containing the Earthdata Login credentials stored in the **.netrc** file is submitted to the [`login`](https://appeears.earthdatacloud.nasa.gov/api/#authentication) service.

In [None]:
login_req = requests.post(f'{appeears_API_endpoint}login', auth = (auth.username,auth.password))
login_req

In [None]:
token = login_req.json()['token']                      # Save login token to a variable
head = {'Authorization': 'Bearer {}'.format(token)}    # Create a header to store token information, needed to submit a request

Next, compile a JSON object with the request parameters. The Dixie Fire started on July 13, 2021; however, we are going to extend the search query to include two years to see the time series. A GeoJSON of the Region of Interest (ROI), including the Lassen National Park region, CA, can be downloaded from the repository. For this tutorial, we are requesting the `_500m_16_days_EVI` layer from `MOD13A1.061` to see how the Enhanced Vegetation Index (EVI) varies before and after the fire event. Learn more about the MODIS Vegetation Indices 16-Day Version 6.1 product [here](https://doi.org/10.5067/MODIS/MOD13A1.061). Below we define the AρρEEARS search parameters.

In [None]:
product_req = requests.get(f'{appeears_API_endpoint}product').json()

In [None]:
product_eco = [x for x in product_req if 'ECOSTRESS' in x['Platform']]    # Get ECOSTRESS product information

In [None]:
layer_req = requests.get(f'{appeears_API_endpoint}product/ECO4ESIALEXI.001').json()

In [None]:
layer_req.keys()    # These are the layer names

In [None]:
task_name = "Colorado_Drought"
task_type = 'area'                  # Type of task, area or point
proj = 'geographic'                 # Set output projection 
outFormat = 'geotiff'               # Set output file format type
startDate = '06-01'            # Start of the recurring date range: MM-DD (MM-DD-YYYY applies to non-recurring requests)
endDate = '06-30'              # End of the recurring date range: MM-DD
yearRange = [2020,2021]
ROI =  roi.to_json()
prodLayer = [{'layer': 'EVAPORATIVE_STRESS_INDEX_ALEXI_ESIdaily', 'product': 'ECO4ESIALEXI.001'}]

In [None]:
task = {
    'task_type': task_type,
    'task_name': task_name,
    'params': {
         'dates': [
         {
             'startDate': startDate,
             'endDate': endDate,
             'recurring': True,
             'yearRange': yearRange
         }],
         'layers': prodLayer,
         'output': {
             'format': {'type': outFormat},
             'projection': proj
         },
         'geo': json.loads(ROI),
    }
}
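The nesting of the `output` object is easy to misread: `projection` is a sibling of `format`, not a child of it. As a minimal sketch, the same structure can be assembled by a small helper and inspected before it is posted. The function name `build_area_task` and the empty placeholder geometry are assumptions made for illustration and are not part of the AρρEEARS API.

```python
import json


def build_area_task(name, layers, geo, start, end, years,
                    out_format="geotiff", projection="geographic"):
    """Hypothetical helper: assemble a dict shaped like the `task` object above."""
    return {
        "task_type": "area",
        "task_name": name,
        "params": {
            "dates": [{
                "startDate": start,        # MM-DD because the range is recurring
                "endDate": end,
                "recurring": True,
                "yearRange": years,
            }],
            "layers": layers,
            "output": {
                "format": {"type": out_format},
                "projection": projection,  # sibling of "format", not nested inside it
            },
            "geo": geo,
        },
    }


# Placeholder geometry for illustration only; the tutorial uses json.loads(roi.to_json()).
demo_geo = {"type": "FeatureCollection", "features": []}
demo_task = build_area_task(
    "Colorado_Drought",
    [{"layer": "EVAPORATIVE_STRESS_INDEX_ALEXI_ESIdaily", "product": "ECO4ESIALEXI.001"}],
    demo_geo, "06-01", "06-30", [2020, 2021],
)
print(json.dumps(demo_task["params"]["output"], indent=2))
```

Printing the `output` section before submitting makes it straightforward to confirm that the format and projection landed where intended.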

Next, submit the AρρEEARS request using the `post` function from the `requests` library.

In [None]:
task_response = requests.post(f'{appeears_API_endpoint}task', json=task, headers=head).json()    # Post json to the API task service, return response as json
task_response                                                                                    # Print task response

The `task_id` will be needed to get status information about the request and to later find the AρρEEARS outputs for the request. We will save the `task_id` to a variable and wait until our request is processed and complete. 

In [None]:
task_id = task_response['task_id']
task_id

In [None]:
# Ping API until request is complete, then continue to Section 3
status = requests.get(f'{appeears_API_endpoint}task/{task_id}', headers=head).json()['status']
while status != 'done':
    print(status)                  # Report the current status once per poll
    time.sleep(60)                 # Wait a minute before asking again
    status = requests.get(f'{appeears_API_endpoint}task/{task_id}', headers=head).json()['status']
print(status)
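The loop above keeps polling until the status becomes `done`, so a request that never reaches that state would leave the cell running indefinitely. One possible safeguard is sketched below: a polling helper with an upper time limit. The name `wait_for_task`, its parameters, and the default limits are assumptions made for this illustration. The status lookup is passed in as a function so the sketch can be exercised without contacting the API.

```python
import time


def wait_for_task(get_status, poll_seconds=60, timeout_seconds=6 * 3600, sleep=time.sleep):
    """Hypothetical helper: poll get_status() until it returns 'done' or time runs out."""
    waited = 0
    while True:
        status = get_status()
        print(status)
        if status == "done":
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"task still '{status}' after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Offline demonstration with made-up statuses instead of real API responses.
fake_statuses = iter(["queued", "processing", "done"])
result = wait_for_task(lambda: next(fake_statuses), poll_seconds=0, sleep=lambda s: None)
```

In the notebook, `get_status` could be a lambda wrapping the same `requests.get(...).json()['status']` call used in the cell above.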

## 3. Access Request Results

Once our outputs are ready, we can get the bundle information for the files included in the outputs. If you submitted your request using the AρρEEARS GUI, assign your sample's `task_id` to the variable `task_id` below.

In [None]:
#task_id = 'fdd28cde-de2b-40b4-b3f9-edf33f585649'

`requests.get` is used to get information about our bundle. The bundle information includes `s3_url` in addition to the other information such as output `file_name`, `file_id`, and `file_type`.  

Each output file can be downloaded using the `file_id` (see section 4 in [AppEEARS_API_Area.ipynb](AppEEARS_API_Area.ipynb)). Since AρρEEARS outputs are stored in an S3 bucket, outputs can also be accessed using `s3_url` if you are working from a cloud instance in **AWS us-west-2**.

In [None]:
bundle = requests.get(f'{appeears_API_endpoint}bundle/{task_id}', headers=head).json()  # Call API and return bundle contents for the task_id as json
#bundle

In [None]:
files = {x['file_id']:x['file_name'] for x in bundle['files'] if 'ESIdaily' in x['file_name'] and '.tif' in x['file_name']}
files
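To make the filtering step concrete without a live request, the sketch below applies the same selection to a made-up bundle. The helper name `select_geotiffs`, the file IDs, and the file names are placeholders invented for illustration. Only the field names `file_id`, `file_name`, and `file_type` come from the bundle description above.

```python
import os


def select_geotiffs(bundle, keyword):
    """Hypothetical helper: return {file_id: file_name} for .tif entries containing keyword."""
    return {
        entry["file_id"]: entry["file_name"]
        for entry in bundle["files"]
        if keyword in entry["file_name"] and entry["file_name"].endswith(".tif")
    }


# Made-up bundle contents; real bundles are returned by the bundle service.
mock_bundle = {"files": [
    {"file_id": "id-1", "file_name": "folder/example_ESIdaily_doy2020153181245_aid0001.tif", "file_type": "tif"},
    {"file_id": "id-2", "file_name": "example-ESIdaily-statistics.csv", "file_type": "csv"},
    {"file_id": "id-3", "file_name": "folder/example_other_layer_doy2020153181245_aid0001.tif", "file_type": "tif"},
]}

selected = select_geotiffs(mock_bundle, "ESIdaily")
local_names = [os.path.basename(name) for name in selected.values()]  # drop the folder prefix
print(selected)
print(local_names)
```

Using `endswith('.tif')` rather than a substring test avoids picking up sidecar files whose names merely contain `.tif`, such as `.tif.aux.xml`.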

Download the files to the `data` directory.

In [None]:
if not os.path.exists('data'):
    os.makedirs('data')

In [None]:
for file_id in files:
    dl = requests.get(f'{appeears_API_endpoint}bundle/{task_id}/{file_id}', headers=head, stream=True, allow_redirects=True)    # Get a stream to the bundle file
    if files[file_id].endswith('.tif'):
        filename = files[file_id].split('/')[1]                        # Drop the folder prefix from GeoTIFF names
    else:
        filename = files[file_id]
    filepath = os.path.join('data', filename)                          # Create output file path
    with open(filepath, 'wb') as out_file:                             # Write file to dest dir (name no longer shadows the loop variable)
        for data in dl.iter_content(chunk_size=8192):
            out_file.write(data)

In [None]:
file_list_2020 = [x for x in os.listdir('data') if 'ECO4ESIALEXI' in x and 'doy2020' in x]
file_list_2020.sort()
file_list_2020

In [None]:
file_list_2021 = [x for x in os.listdir('data') if 'ECO4ESIALEXI' in x and 'doy2021' in x]
file_list_2021.sort()
file_list_2021
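The matching cell that follows calls `get_datetime` and `norm_year`, which are not defined anywhere in this notebook. The sketch below is one possible way to define them, not the original implementation. It assumes each file name carries a `doy` token followed by thirteen digits (`YYYYDDDHHMMSS`: year, day of year, and UTC time), and that `norm_year` is meant to put both acquisitions in a common year so their dates and times can be subtracted directly.

```python
import re
from datetime import datetime

# Assumed naming pattern: "doy" followed by YYYYDDDHHMMSS.
DOY_PATTERN = re.compile(r"doy(\d{13})")


def get_datetime(file_name):
    """Parse the acquisition date and time from the 'doy' token in a file name (assumed format)."""
    match = DOY_PATTERN.search(file_name)
    if match is None:
        raise ValueError(f"no doyYYYYDDDHHMMSS token found in {file_name!r}")
    return datetime.strptime(match.group(1), "%Y%j%H%M%S")


def norm_year(dt, year=2000):
    """Move a datetime to a common year so scenes from different years can be compared.

    The default, 2000, is a leap year, so a February 29 acquisition stays valid.
    """
    return dt.replace(year=year)


# Example with a made-up file name that follows the assumed pattern.
example_dt = get_datetime("example_ESIdaily_doy2020153181245_aid0001.tif")
example_norm = norm_year(example_dt)
```

With these definitions, two scenes match when they fall on the same calendar date in their respective years and their acquisition times differ by less than an hour.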

In [None]:
match_list = []

for file1 in file_list_2020:
    ndt1 = norm_year(get_datetime(file1))
    for file2 in file_list_2021:
        ndt2 = norm_year(get_datetime(file2))
        td = abs(ndt1 - ndt2)
        if td.total_seconds()/60 < 60:          # Keep pairs less than 60 minutes apart once the year is normalized
            match_list.append([file1, file2])

print(f'{len(match_list)} matching scenes')

In [None]:
#https://github.com/royalosyin/Overlay-GeoTiff-Raster-with-nodata-On-Interactive-Map/blob/master/scripts/ex2-Overlay%20Raster%20with%20nodata%20on%20Interactive%20Map%20with%20Folium.ipynb

vmin = 0
vmax = 1

#colormap = cm.linear.RdBu_11.scale(vmin, vmax)
colormap = cm.linear.magma.scale(vmin, vmax)
colormap

In [None]:
def mapvalue2color(value, cmap): 
    """
    Map a pixel value of image to a color in the rgba format. 
    As a special case, nans will be mapped totally transparent.
    
    Inputs
        -- value - pixel value of image, could be np.nan
        -- cmap - a linear colormap from branca.colormap.linear
    Output
        -- a color value in the rgba format (r, g, b, a)    
    """
    if np.isnan(value):
        return (1, 0, 0, 0)
    else:
        return colors.to_rgba(cmap(value), 0.1)

In [None]:
def plot_dual_map(file1, file2):
    with rasterio.open(file1) as src1, rasterio.open(file2) as src2:
        data1 = src1.read(1)
        data2 = src2.read(1)

        meta1 = src1.meta
        meta2 = src2.meta

        #cmap_data1 = colorize(data1, 9999.0, cmap='viridis')
        #cmap_data2 = colorize(data2, 9999.0, cmap='viridis')

        #m = folium.plugins.DualMap(location=[src1.bounds[1], src2.bounds[0]], zoom_start=10)
        m = folium.plugins.DualMap(location=[y_center, x_center], zoom_start=10, tiles='Esri.WorldImagery')
        folium.GeoJson(roi).add_to(m.m1)
        folium.raster_layers.ImageOverlay(image=data1, bounds=[[src1.bounds[1], src1.bounds[0]],[src1.bounds[3], src1.bounds[2]]], colormap=lambda value: mapvalue2color(value, colormap), opacity=0.7).add_to(m.m1)
        folium.GeoJson(roi).add_to(m.m2)
        folium.raster_layers.ImageOverlay(image=data2, bounds=[[src2.bounds[1], src2.bounds[0]],[src2.bounds[3], src2.bounds[2]]], colormap=lambda value: mapvalue2color(value, colormap), opacity=0.7).add_to(m.m2)

        folium.LayerControl().add_to(m)

        display(m)

In [None]:
list(enumerate(match_list))    # Show the index of each matched pair

In [None]:
loc=5

infile1 = (f'data/{match_list[loc][0]}')
infile2 = (f'data/{match_list[loc][1]}')

plot_dual_map(infile1, infile2)