# Create TROPESS CRIS-JPSS1 Day Range Plots

## Overview
This notebook will allow you to download [TROPESS](https://tes.jpl.nasa.gov/tropess) CRIS-JPSS1 data and then use that data to create multi-day plots (e.g., weekly, monthly) for a given date range.  
The following suite of plots will be created from standard and summary data products (note that summary products for each species are not
always available):

| Species | Standard Product | Summary Product |
| :------ | :------ | :------ |
| Carbon Monoxide (CO) | TRPSDL2COCRS1FS.1 | TRPSYL2COCRS1FS.1 |
| Ammonia (NH3) | TRPSDL2NH3CRS1FS.1 | TRPSYL2NH3CRS1FS.1 |
| Ozone (O3) | TRPSDL2O3CRS1FS.1 | TRPSYL2O3CRS1FS.1 |
| Methane (CH4) | TRPSDL2CH4CRS1FS.1 | TRPSYL2CH4CRS1FS.1 |
| Peroxyacetyl Nitrate (PAN) | TRPSDL2PANCRS1FS.1 | TRPSYL2PANCRS1FS.1 |
| Atmospheric Temperature (TATM) | TRPSDL2TATMCRS1FS.1 | |
| Water (H2O) | TRPSDL2H2OCRS1FS.1 | |
| Deuterated Water Vapor (HDO) | TRPSDL2HDOCRS1FS.1 |  |

In the table above, the "short name" of each data product is listed (e.g., "TRPSYL2O3CRS1FS.1").  Short names are assigned by a NASA DAAC (in this case, the GES-DISC)
to each data product as a way to look up or refer to a product without using the product's full long name (e.g., "TROPESS CrIS-JPSS1 L2 Ozone for Forward Stream, Summary Product V1").  You'll see these short names in the code below when there are blocks looping through the different products.
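
The short names in the table appear to follow a regular pattern: `TRPSDL2` (standard) or `TRPSYL2` (summary), then the species code, then `CRS1FS.1`.  The sketch below builds them from a species code.  The pattern is inferred from the table rather than taken from an official naming specification, and `build_short_names` and `SPECIES_WITH_SUMMARY` are hypothetical names used only for illustration.

```python
# Species that have a summary product, per the table above.
SPECIES_WITH_SUMMARY = {"CO", "NH3", "O3", "CH4", "PAN"}


def build_short_names(species):
    """Return the short names for one species code, e.g. 'O3'.

    The naming pattern is inferred from the table, not from an official spec.
    """
    names = {"standard": f"TRPSDL2{species}CRS1FS.1"}
    if species in SPECIES_WITH_SUMMARY:
        names["summary"] = f"TRPSYL2{species}CRS1FS.1"
    return names


print(build_short_names("O3"))
print(build_short_names("TATM"))
```

Species without a summary product (TATM, H2O, HDO) get only a `standard` entry, which mirrors the empty cells in the table.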

## Requirements

This notebook uses the [GES-DISC's](https://disc.gsfc.nasa.gov) simple subset service to retrieve requested data.  It requires a [NASA Earthdata login](https://urs.earthdata.nasa.gov/).  Please make sure you have one before attempting to run this notebook.  Also make sure you've set up a `.netrc` file in your home directory with your NASA Earthdata login information.  You can use the following steps to generate that file, if needed (note that the `echo` command overwrites any existing `.netrc` contents):

```sh
cd ~
touch .netrc
echo "machine urs.earthdata.nasa.gov login uid_goes_here password password_goes_here" > .netrc
chmod 0600 .netrc
```
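
If you want to confirm the file is readable before running the rest of the notebook, a minimal check using Python's standard `netrc` module might look like the following.  The helper name `has_earthdata_login` is hypothetical, and the check only confirms that an entry exists, not that the credentials are valid.

```python
import netrc
import os


def has_earthdata_login(netrc_path=None, host="urs.earthdata.nasa.gov"):
    """Return True if the .netrc file has a login and password for `host`."""
    path = netrc_path or os.path.expanduser("~/.netrc")
    try:
        entry = netrc.netrc(path).authenticators(host)
    except (OSError, netrc.NetrcParseError):
        # Missing, unreadable, or malformed file
        return False
    # authenticators() returns (login, account, password) or None
    return entry is not None and bool(entry[0]) and bool(entry[2])


print(has_earthdata_login())
```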

## Import Libraries 

The standard-library modules `datetime` and `os` and the third-party `requests` library (available through pip) are imported below.  You will also need to clone or install the `tropess-website-plots` library if you haven't already.  It is available at [https://github.com/NASA-TROPESS/tropess-website-plots](https://github.com/NASA-TROPESS/tropess-website-plots).

In [None]:
import datetime as dt
import requests
import os

from datetime import date
from tropessplots.io.cris import read_l2summary, read_l2standard
from tropessplots.website_plots.cris import plot_daily_overview

%load_ext autoreload
%autoreload 2

## Setup Global Variables

Please edit the variables in the next block as needed.  After those variables are set, we'll convert the dates
into the format used in later blocks and create any directories that don't exist.  We also define a
list of all the species we want to plot.

In [None]:
# The dates you want to plot in YYYYMMDD format, using a start and end date.
START_DATE = '20250901'
END_DATE = '20250905'

# Where you want to store the data you download.
DOWNLOAD_DIRECTORY = '/tmp/download'

# Where you want to store the plot outputs.
PLOT_DIRECTORY = '/tmp/download'

# A list of the species you want to plot.  All species are listed here
# by default.  Edits to this list (additions or subtractions) will be
# reflected in the SPECIES_DICT variable below.
SPECIES_ARRAY = ['CO', 'NH3', 'O3', 'CH4', 'PAN', 'TATM', 'H2O', 'HDO']

In [None]:
# The short names of the data products you want to plot.  These are automatically
# generated from SPECIES_ARRAY above.  Do not manually edit this dictionary.
# We will use SPECIES_DICT to keep track of the short names for each species, and
# eventually add the files we find for each species.

SPECIES_DICT = {}

if 'CO' in SPECIES_ARRAY:
    SPECIES_DICT['CO'] = {}
    SPECIES_DICT['CO']['standard']      = 'TRPSDL2COCRS1FS.1'
    SPECIES_DICT['CO']['summary']       = 'TRPSYL2COCRS1FS.1'
if 'NH3' in SPECIES_ARRAY:
    SPECIES_DICT['NH3'] = {}
    SPECIES_DICT['NH3']['standard']      = 'TRPSDL2NH3CRS1FS.1'
    SPECIES_DICT['NH3']['summary']       = 'TRPSYL2NH3CRS1FS.1'
if 'O3' in SPECIES_ARRAY:
    SPECIES_DICT['O3'] = {}
    SPECIES_DICT['O3']['standard']      = 'TRPSDL2O3CRS1FS.1'
    SPECIES_DICT['O3']['summary']       = 'TRPSYL2O3CRS1FS.1'
if 'CH4' in SPECIES_ARRAY:
    SPECIES_DICT['CH4'] = {}
    SPECIES_DICT['CH4']['standard']     = 'TRPSDL2CH4CRS1FS.1'
    SPECIES_DICT['CH4']['summary']      = 'TRPSYL2CH4CRS1FS.1'
if 'PAN' in SPECIES_ARRAY:
    SPECIES_DICT['PAN'] = {}
    SPECIES_DICT['PAN']['standard']     = 'TRPSDL2PANCRS1FS.1'
    SPECIES_DICT['PAN']['summary']      = 'TRPSYL2PANCRS1FS.1'
if 'TATM' in SPECIES_ARRAY:
    SPECIES_DICT['TATM'] = {}
    SPECIES_DICT['TATM']['standard']    = 'TRPSDL2TATMCRS1FS.1'
if 'H2O' in SPECIES_ARRAY:
    SPECIES_DICT['H2O'] = {}
    SPECIES_DICT['H2O']['standard']     = 'TRPSDL2H2OCRS1FS.1'
if 'HDO' in SPECIES_ARRAY:
    SPECIES_DICT['HDO'] = {}
    SPECIES_DICT['HDO']['standard']     = 'TRPSDL2HDOCRS1FS.1'

# Formatting dates for use later
start_date_object = dt.datetime.strptime(START_DATE, '%Y%m%d')
end_date_object = dt.datetime.strptime(END_DATE, '%Y%m%d')
iso_start_date = start_date_object.strftime('%Y-%m-%dT00:00:00.000Z')
iso_end_date = end_date_object.strftime('%Y-%m-%dT23:59:59.000Z')

# Create directories if they don't already exist
os.makedirs(DOWNLOAD_DIRECTORY, exist_ok=True)
os.makedirs(PLOT_DIRECTORY, exist_ok=True)
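
The date conversion above can also be wrapped in a small function that rejects a reversed range before any request is sent.  This is a sketch, not part of the notebook's workflow; `to_iso_range` is a hypothetical helper name.

```python
import datetime as dt


def to_iso_range(start_yyyymmdd, end_yyyymmdd):
    """Convert two YYYYMMDD strings to ISO 8601 start/end timestamps."""
    # strptime raises ValueError if a string is not a valid YYYYMMDD date
    start = dt.datetime.strptime(start_yyyymmdd, "%Y%m%d")
    end = dt.datetime.strptime(end_yyyymmdd, "%Y%m%d")
    if end < start:
        raise ValueError(
            f"END_DATE {end_yyyymmdd} is before START_DATE {start_yyyymmdd}"
        )
    return (start.strftime("%Y-%m-%dT00:00:00.000Z"),
            end.strftime("%Y-%m-%dT23:59:59.000Z"))


print(to_iso_range("20250901", "20250905"))
# ('2025-09-01T00:00:00.000Z', '2025-09-05T23:59:59.000Z')
```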

## Search for Data

Now that we are set up, we'll start by downloading data from the GES-DISC.  We'll
make a request to the GES-DISC service to search for the data product and date range we want.
Then we'll check whether the service has finished the search.  If not, we'll keep
checking until it has (if it takes too long, though, we'll give up and report that no files were found).  Once the service returns the
list of files we requested, we'll download and plot them.
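
The check-and-wait pattern described above can be sketched as a generic polling helper with an attempt limit and a pause between attempts.  The names `poll_until` and `fake_status_check` are hypothetical; the stand-in check below replaces the real status request so the example runs without network access.

```python
import time


def poll_until(check, max_attempts=50, delay_seconds=1.0):
    """Call `check()` until it returns True or attempts run out.

    Returns True on success and False if the attempt limit was reached.
    """
    for _ in range(max_attempts):
        if check():
            return True
        time.sleep(delay_seconds)
    return False


# Hypothetical stand-in for a status request: reports success on the third call.
calls = {"count": 0}


def fake_status_check():
    calls["count"] += 1
    return calls["count"] >= 3


print(poll_until(fake_status_check, max_attempts=5, delay_seconds=0.01))  # True
```

Pausing between attempts keeps the loop from sending dozens of status requests within a fraction of a second.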

### Function to Find Files
This is a function we'll reuse as we plot data.  We define it here and call it later, once per data product, so it works when you plot multiple species.  It takes the short name, start date, and end date as inputs and returns a list of file URLs.

In [None]:
import time

def findFiles(short_name, iso_start_date, iso_end_date):
    files = []

    # Submit the search request for this product and date range
    r = requests.post('https://disc.gsfc.nasa.gov/service/subset/jsonwsp', 
                    json={"methodname": "subset", "args": {"role":"subset","start":iso_start_date, 
                                                        "end":iso_end_date,
                                                        "data":[{"datasetId":short_name.replace('.', '_')}]}}, 
                    headers={"Accept": "application/json, text/plain, */*", 
                            "Content-type": "application/json;charset=utf-8"})
    r.raise_for_status()  # Stop here if the service rejected the request
    info = r.json()
    status_counter = 0
    breaker_counter = 0

    # Get the status of the request; if not done, wait briefly and try again a few times
    while status_counter == 0:
        r = requests.post('https://disc.gsfc.nasa.gov/service/subset/jsonwsp', 
                    json={"methodname": "GetStatus", "args": {"jobId": info['result']['jobId'], "sessionId": info['result']['sessionId'] }, 
                                                                "type": "jsonwsp/request", "version": "1.0"}, 
                    headers={"Accept": "application/json, text/plain, */*", 
                            "Content-type": "application/json;charset=utf-8"})
        status_info = r.json()
        if status_info['result']['Status'] == 'Succeeded' and status_info['result']['PercentCompleted'] == 100:    
            status_counter = 1
        elif breaker_counter > 50:
            print('Not finding any files for this range %s to %s' % (iso_start_date, iso_end_date))
            # Give up and return the empty list rather than requesting results
            # for a job that never finished
            return files
        else:
            breaker_counter += 1
            time.sleep(1)  # Pause so we don't flood the service with status requests

    r = requests.post('https://disc.gsfc.nasa.gov/service/subset/jsonwsp', 
                        json={"methodname": "GetResult", "args": {"jobId": info['result']['jobId'], "sessionId": info['result']['sessionId'] }, 
                                                                        "type": "jsonwsp/request", "version": "1.0"}, 
                        headers={"Accept": "application/json, text/plain, */*", 
                                    "Content-type": "application/json;charset=utf-8"})
    link_info = r.json()

    # Extract the individual URLs of each file
    for this_info in link_info['result']['items']:
        if short_name in this_info['label']:
            print('Found file %s' % this_info['link'])
            files.append(this_info['link'])

    return files
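
The `GetStatus` and `GetResult` requests in `findFiles` share the same URL, headers, and body layout, so the body could be built in one place.  The sketch below only assembles the dictionary shown in the function above; `SUBSET_URL`, `JSON_HEADERS`, and `build_job_request` are hypothetical names, and the job identifiers are made up for illustration.

```python
SUBSET_URL = "https://disc.gsfc.nasa.gov/service/subset/jsonwsp"
JSON_HEADERS = {
    "Accept": "application/json, text/plain, */*",
    "Content-type": "application/json;charset=utf-8",
}


def build_job_request(method_name, job_info):
    """Build the JSON body for a follow-up request about an existing job.

    `job_info` is the 'result' dictionary returned by the initial request.
    """
    return {
        "methodname": method_name,
        "args": {"jobId": job_info["jobId"], "sessionId": job_info["sessionId"]},
        "type": "jsonwsp/request",
        "version": "1.0",
    }


# Hypothetical identifiers, for illustration only.
example = build_job_request("GetStatus", {"jobId": "job-123", "sessionId": "session-456"})
print(example["methodname"], example["args"]["jobId"])
```

With this helper, a status check would read `requests.post(SUBSET_URL, json=build_job_request("GetStatus", info['result']), headers=JSON_HEADERS)`.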

### Generate the Plots
Here, we'll do the work we need to do to generate a plot for each species for the given time range.  First we will search for data, generating a list of files and downloading them.  Then, we'll read in any standard and summary files to generate the plots.

In [None]:
for species in SPECIES_DICT.keys():
        # Send a request to the GES-DISC to search for files in the date range.
        # Again, you do need a NASA Earthdata account and a .netrc file in your home
        # directory to eventually download data found in these requests.

        # Find the files        
        SPECIES_DICT[species]['standard_file_list'] = findFiles(SPECIES_DICT[species]['standard'], iso_start_date, iso_end_date)
        if 'summary' in SPECIES_DICT[species]:
            SPECIES_DICT[species]['summary_file_list'] = findFiles(SPECIES_DICT[species]['summary'], iso_start_date, iso_end_date)

        # Download the standard files
        for this_file in SPECIES_DICT[species]['standard_file_list']:
            r = requests.get(this_file, allow_redirects=True)
            r.raise_for_status()  # Stop on a failed download (e.g., a login problem)
            with open(DOWNLOAD_DIRECTORY + '/' + this_file.split('/')[-1], 'wb') as f:
                f.write(r.content)
            print('Downloaded %s' % this_file.split('/')[-1])

        # Download the summary files
        if 'summary' in SPECIES_DICT[species]:
            for this_file in SPECIES_DICT[species]['summary_file_list']:
                r = requests.get(this_file, allow_redirects=True)
                r.raise_for_status()  # Stop on a failed download (e.g., a login problem)
                with open(DOWNLOAD_DIRECTORY + '/' + this_file.split('/')[-1], 'wb') as f:
                    f.write(r.content)
                print('Downloaded %s' % this_file.split('/')[-1])
    
        figure_file = PLOT_DIRECTORY + '/TROPESS_CrIS-JPSS1_' + species + '_' + START_DATE + '-' + END_DATE + '.png'

        local_standard_list = [DOWNLOAD_DIRECTORY + '/' + item.split('/')[-1] for item in SPECIES_DICT[species]['standard_file_list']]

        # Read data.  Only species with a summary product have summary files to read.
        if 'summary' in SPECIES_DICT[species]:
            local_summary_list = [DOWNLOAD_DIRECTORY + '/' + item.split('/')[-1] for item in SPECIES_DICT[species]['summary_file_list']]
            l2summary = read_l2summary(files=local_summary_list,
                                    verbose=0)
        else:
            # No summary product for this species.  Reset the variable so the previous
            # species' summary data is not reused (this assumes plot_daily_overview
            # accepts l2summary=None).
            l2summary = None
        l2standard = read_l2standard(files=local_standard_list,
                                    verbose=0)
        # Run plotting routine
        plot_daily_overview(l2summary=l2summary,
                            l2standard=l2standard,
                            file_out=figure_file)
        print('Produced plot %s' % figure_file)
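
When re-running the notebook for an overlapping date range, it may help to skip files that were already downloaded.  A minimal sketch, assuming the file name is the last path component of the URL (as in the loop above); `local_path_for` and `needs_download` are hypothetical helpers and the URL is a placeholder.

```python
import os
from urllib.parse import urlparse


def local_path_for(url, directory):
    """Return the local path a downloaded file would be saved to."""
    # urlparse(...).path drops any query string before taking the file name
    file_name = os.path.basename(urlparse(url).path)
    return os.path.join(directory, file_name)


def needs_download(url, directory):
    """Return True if the file for `url` is not already in `directory`."""
    return not os.path.exists(local_path_for(url, directory))


# Hypothetical URL, for illustration only.
url = "https://example.com/data/TROPESS_example_file.nc"
print(local_path_for(url, "/tmp/download"))
```

Checking only for existence does not detect a partially written file, so a stricter version might also compare file sizes.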