# Working with GOSIF data
[GOSIF](https://doi.org/10.3390/rs11050517) is a science data product generated by Dr. Jingfeng Xiao's group that estimates SIF with global coverage at 0.05° (~6 km/pixel) resolution on an 8-day cadence. The SIF estimates come from a data-driven approach that combines data from the Moderate Resolution Imaging Spectroradiometer (MODIS) instruments onboard the Terra and Aqua spacecraft with OCO-2 SIF measurements and MERRA-2 meteorological model data. MODIS data from Terra and Aqua are an invaluable resource for climate analysis because they provide a 25-year record of daily global coverage imaging across 36 spectral bands. The researchers used the MODIS data, OCO-2 SIF soundings, and MERRA-2 model outputs to train a Cubist regression tree model that predicts SIF on the MODIS 0.05° grid. Importantly, GOSIF provides these predictions from 2000 onward (data are currently available through 2023), so it includes a 14-year period before OCO-2 even launched.
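
To get a feel for what a 0.05° grid means on the ground, here is a rough sketch that converts the grid spacing to kilometers. It assumes a spherical Earth with a mean radius of 6371 km, so the numbers are approximations; the function name `pixel_size_km` is ours and is not part of any GOSIF tooling.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius
RES_DEG = 0.05            # GOSIF grid spacing in degrees


def pixel_size_km(lat_deg, res_deg=RES_DEG):
    """Return the approximate (north-south, east-west) pixel size in km."""
    km_per_deg = 2 * math.pi * EARTH_RADIUS_KM / 360.0
    north_south = res_deg * km_per_deg
    # Meridians converge toward the poles, so east-west size shrinks with cos(lat)
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


# Number of cells a global grid would need at this spacing
n_cols = round(360 / RES_DEG)
n_rows = round(180 / RES_DEG)

for lat in (0, 40, 60):
    ns, ew = pixel_size_km(lat)
    print(f"lat {lat:>2}°: {ns:.2f} km north-south, {ew:.2f} km east-west")
```

The east-west pixel size narrows away from the equator, which is worth keeping in mind when comparing areas at different latitudes.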

In this exercise, we will download and view GOSIF data, then compare it against direct SIF soundings from OCO-2 or OCO-3. Afterwards, we will see how GOSIF can be used in analysis.

In [None]:
from http.server import HTTPServer, SimpleHTTPRequestHandler
from IPython.display import IFrame
import os
import socket
import sys
import threading

# Add src directory containing helper code to sys.path
sys.path.append(os.path.abspath("../src"))

from pysif import convert_geotiff_to_png, download_unpack_gosif

## I. Downloading GOSIF granules from UNH
First, we will download a GOSIF granule from the University of New Hampshire (UNH) data store maintained by Dr. Xiao's research group. GOSIF products are created at annual, monthly, and 8-day time steps. If you give the function in the cell below only a year (i.e., no month or day), it will download the annual product for that year, if available. Similarly, a year and a month will download the monthly product, and a year, month, and day together will download the closest 8-day product.
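
The selection rule described above can be sketched as follows. This is only an illustration of the logic, not the actual `pysif` implementation: both function names are hypothetical, and the sketch assumes that 8-day periods start on days of year 1, 9, 17, and so on, and that "closest" means the period containing the requested date.

```python
from datetime import date


def gosif_product_type(year, month=None, day=None):
    """Hypothetical helper: which product the given date parts would select."""
    if month is None and day is None:
        return "annual"
    if day is None:
        return "monthly"
    return "8-day"


def containing_8day_start(year, month, day):
    """Hypothetical helper: first day of year of the 8-day period holding a date.

    Assumes periods begin on days of year 1, 9, 17, ...
    """
    doy = date(year, month, day).timetuple().tm_yday
    return 1 + 8 * ((doy - 1) // 8)


print(gosif_product_type(2020))         # year only
print(gosif_product_type(2020, 6))      # year and month
print(gosif_product_type(2020, 6, 15))  # full date
print(containing_8day_start(2020, 6, 15))
```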

In [None]:
# Download June 2020 Monthly Average GOSIF data
year = 2020
month = 6
output_dir = "data/gosif/"

# Pass output_dir by keyword so it cannot be mistaken for a positional date argument
gosif_geotiff = download_unpack_gosif(year, month, output_dir=output_dir)

## II. Transforming a GOSIF granule into a format suitable for viewing
UNH provides GOSIF in GeoTIFF format, a common file format for geospatial data. While it is possible to view colormapped GeoTIFF files in GIS software like QGIS, these files render by default in greyscale, with no transparency layer for regions with no data, such as oceans and waterways. We will therefore convert the granule you downloaded in the previous step into PNG format with a colormap "baked in", meaning the SIF grid values will be quantized to 8 bits. This PNG will be much easier to view in the map viewer in the next step.
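
To make the 8-bit quantization concrete, here is a per-pixel sketch of the idea. It is not the code inside `convert_geotiff_to_png`; the function names are hypothetical, and we assume that raw values above the threshold mark no-data pixels that should become transparent. A real converter would vectorize this over the whole raster rather than loop pixel by pixel.

```python
def to_physical(raw, scale_factor=0.0001):
    """Convert a raw GeoTIFF value to SIF in W/m^2/sr/μm."""
    return raw * scale_factor


def quantize_pixel(raw, vmin=0, vmax=8000, threshold=32765):
    """Map a raw value to (colormap index 0-255, alpha 0 or 255).

    Assumption: values above `threshold` are flags (water, snow/ice),
    so they get alpha 0 and are drawn fully transparent.
    """
    if raw > threshold:
        return 0, 0
    # Clip to the display range, then stretch linearly onto 0..255
    clipped = min(max(raw, vmin), vmax)
    index = round(255 * (clipped - vmin) / (vmax - vmin))
    return index, 255


for raw in (0, 2000, 8000, 9000, 32767):
    print(raw, quantize_pixel(raw))
```

The index would then be looked up in a 256-entry color table to produce the final RGBA pixel. Note that values above `vmax` saturate at the top of the color scale, so the choice of `vmax` controls how much contrast is available in the range you care about.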

In [None]:
# The threshold and scale factor parameters come from the documentation: https://data.globalecology.unh.edu/data/GOSIF_v2/Fair_Data_Use_Policy_and_Readme_GOSIF_v2.pdf
# 32767 = water bodies, 32766 = snow/ice
data_threshold = 32765
# This value tells our code how to convert pixel values in the GeoTIFF images to units of W/m^2/sr/μm
gosif_scale_factor = 0.0001

# Filenames for converting the geotiff to png
gosif_fname_noext = os.path.splitext(gosif_geotiff)[0]
gosif_png         = gosif_fname_noext + ".png"

# The vmax chosen in this example is equivalent to 0.8 W/m^2/sr/μm
convert_geotiff_to_png(
    gosif_geotiff,
    gosif_png,
    vmax=8000,
    # Uncomment to bound the data to the CONUS
    # bounds={"left": -130, "bottom": 22, "right": -65, "top": 50},
    threshold=data_threshold,
    scale_factor=gosif_scale_factor
)

Now we will open the converted image in the image viewing web app. If you would like to view this visualization in a separate tab, open the link that will be printed when you run this cell.
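
If your PNG path contains spaces or other special characters, the link may need percent-encoding before it works in a browser. The sketch below shows one way to build such a link with the standard library. The helper name `build_viewer_url` is hypothetical, and we assume `gosif.html` reads the `file` query parameter and decodes it in the usual way.

```python
from urllib.parse import quote, urlencode


def build_viewer_url(port, png_path, page="gosif.html"):
    """Hypothetical helper: build a viewer URL with an encoded file path."""
    # quote_via=quote encodes spaces as %20; safe="/" keeps path separators readable
    query = urlencode({"file": png_path}, quote_via=quote, safe="/")
    return f"http://localhost:{port}/{page}?{query}"


example_url = build_viewer_url(5500, "data/gosif/my granule.png")
print(example_url)
```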

In [None]:
def is_port_in_use(port):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(('', port))
            return False
        except socket.error:
            return True


def run_server(port):
    if is_port_in_use(port):
        return None
    server_address = ('', port)
    httpd = HTTPServer(server_address, SimpleHTTPRequestHandler)
    thread = threading.Thread(target=httpd.serve_forever)
    thread.daemon = True
    thread.start()
    return httpd

# Start the server
port = 5500
run_server(port)

# gosif.html loaded on an http server, then displayed in an iframe. You can also
# load the page separately in your browser to view with the full window size.
url = f"http://localhost:{port}/gosif.html?file={gosif_png}"
print(f"You can also view this map by copying this address into a new tab: {url}")
IFrame(src=url, width=1200, height=800)

## III. Using GOSIF data to study climate disruptions
Now that we have walked through downloading and displaying a GOSIF product, let's discuss how this data can be used to study the impacts of climate disruptions on agriculture and ecosystems. In particular, we will look at the impact of the [2019 Midwestern Floods](https://en.wikipedia.org/wiki/2019_Midwestern_U.S._floods) on agriculture in the Corn Belt of the United States, an event that was examined in [Yin et al., 2020](https://doi.org/10.1029/2019AV000140). The study used TROPOMI SIF spatially aggregated to a county level, but in this exercise we will use GOSIF data and compare the results to those found by the paper.

We will follow these basic steps:
1. Download 8-day GOSIF data from mid-March through late October (days of year 73 to 297) of 2018 and 2019. The 2018 data will act as a control to compare against 2019, when the flooding occurred.
2. Convert all the data to use the same color scale, from $0.0$ to $0.8\ \mathrm{W/m^2/sr/\mu m}$, so that the images can be compared visually. We will also crop the data to the US Midwest region of interest.
3. Plot the data in an interactive widget to see side-by-side comparisons.
4. Compute the year-over-year (YoY) percent change in SIF for the Corn Belt.
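
Step 4 boils down to a simple formula, sketched below. The function name is ours, and the SIF values are placeholders chosen only to show the arithmetic; they are not real GOSIF measurements. In practice, each value would be the mean SIF over the study area for one time step.

```python
def yoy_percent_change(baseline, current):
    """Percent change of `current` relative to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero to compute a percent change")
    return 100.0 * (current - baseline) / baseline


# Placeholder regional-mean SIF values in W/m^2/sr/μm (illustrative only)
sif_2018 = [0.40, 0.50]
sif_2019 = [0.30, 0.50]

changes = [yoy_percent_change(a, b) for a, b in zip(sif_2018, sif_2019)]
for change in changes:
    print(f"{change:+.1f}%")
```

A negative value means SIF was lower in 2019 than in the same period of 2018.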

In [None]:
# Download the granules from the study period
output_dir = "data/gosif/animation/"
dates: list[tuple[int, int]] = []
# Alternative: monthly products for June through September
# for year in [2018, 2019]:
#     for month in range(6, 10):
#         dates.append((year, month))
# 8-day products, identified by (year, day of year), from day 73 to day 297
for year in [2018, 2019]:
    for doy in range(73, 298, 8):
        dates.append((year, doy))

# Paths of the downloaded GeoTIFF files, in the same order as `dates`
gosif_geotiffs: list[str] = []
for yr, dy in dates:
    gosif_geotiffs.append(download_unpack_gosif(yr, day=dy, output_dir=output_dir))

In [None]:
# The threshold and scale factor parameters come from the documentation: https://data.globalecology.unh.edu/data/GOSIF_v2/Fair_Data_Use_Policy_and_Readme_GOSIF_v2.pdf
# 32767 = water bodies, 32766 = snow/ice
data_threshold = 32765
# This value tells our code how to convert pixel values in the GeoTIFF images to units of W/m^2/sr/μm
gosif_scale_factor = 0.0001
study_area = {"left": -102, "bottom": 31, "right": -80.5, "top": 49}

gtiff_dir = os.path.dirname(gosif_geotiffs[0])
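png_dir = os.path.join(gtiff_dir, "pngs/")
# Make sure the output directory exists before writing PNGs into it
os.makedirs(png_dir, exist_ok=True)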

for gtiff in gosif_geotiffs:
    # Filenames for converting the geotiff to png
    fname_noext = os.path.splitext(os.path.basename(gtiff))[0]
    gpng        = os.path.join(png_dir, fname_noext + ".png")

    # The vmax chosen in this example is equivalent to 0.8 W/m^2/sr/μm
    convert_geotiff_to_png(
        gtiff,
        gpng,
        vmin=0,
        vmax=8000,
        bounds=study_area,
        threshold=data_threshold,
        scale_factor=gosif_scale_factor
    )

In [None]:
def is_port_in_use(port):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(('', port))
            return False
        except socket.error:
            return True


def run_server(port):
    if is_port_in_use(port):
        return None
    server_address = ('', port)
    httpd = HTTPServer(server_address, SimpleHTTPRequestHandler)
    thread = threading.Thread(target=httpd.serve_forever)
    thread.daemon = True
    thread.start()
    return httpd

# Start the server
port = 5500
run_server(port)

# gosif.html is served by the local HTTP server and displayed in an iframe. You can also
# load the page separately in your browser to view with the full window size.
# Note: gpng holds the last PNG written by the conversion loop in the previous cell.
url = f"http://localhost:{port}/gosif.html?file={gpng}"
print(f"You can also view this map by copying this address into a new tab: {url}")
IFrame(src=url, width=1200, height=800)