# GOSIF Data Exploration
GOSIF is a science data product developed by UNH (get doi) that estimates solar-induced fluorescence (SIF) at a 6 km spatial resolution with global coverage on an 8-day temporal cadence. The product is generated with a Cubist regression model trained on OCO-2 SIF soundings, MODIS Terra and Aqua corrected reflectance data, and MERRA-2 climate reanalysis outputs. The statistical techniques used to generate the data are not discussed in detail here, but we will guide you through downloading, viewing, and analyzing GOSIF data. We will also compare monthly-averaged OCO-2 and OCO-3 data with GOSIF granules and perform a regression analysis.

In [1]:
from http.server import HTTPServer, SimpleHTTPRequestHandler
from IPython.display import IFrame
import json
import matplotlib.pyplot as plt
import matplotlib.colors as colors
import numpy as np
import rasterio
from rasterio.plot import show
import socket
import threading

## II. Transforming a GOSIF granule into a format suitable for viewing
UNH provides GOSIF in GeoTIFF format, a common file format for geospatial data. Although GIS software such as QGIS can display GeoTIFF files with a colormap, the files render by default in greyscale, with no transparency for regions that have no data, such as oceans and waterways. We will therefore convert the granule you downloaded in the previous step into a PNG with the colormap "baked in", meaning the SIF grid values are quantized to 8 bits per color channel. This PNG will be much easier to display in the map viewer in the next step.
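To make the idea of 8-bit quantization with a transparency channel concrete, here is a minimal sketch on a toy array. It is not the conversion used in this notebook: the function name `quantize_to_rgba_gray` is hypothetical, it maps values to grey levels rather than to a colormap, and it assumes that values above the threshold mark no-data pixels and that at least one valid pixel exists.

```python
import numpy as np


def quantize_to_rgba_gray(data, threshold=32760):
    """Toy 8-bit quantization: valid values -> grey levels 0..255, fill values -> transparent."""
    data = np.asarray(data)
    valid = data <= threshold  # assumption: values above the threshold are no-data flags

    # Scale limits come from valid pixels only
    vmin = float(data[valid].min())
    vmax = float(data[valid].max())
    span = max(vmax - vmin, 1.0)  # avoid division by zero for constant input

    # Map valid values onto the 256 levels an 8-bit channel can hold
    scaled = (data.astype(np.float64) - vmin) / span
    levels = np.clip(np.rint(scaled * 255), 0, 255).astype(np.uint8)

    # Build an RGBA image: identical R, G, B (grey) plus an alpha channel
    rgba = np.zeros(data.shape + (4,), dtype=np.uint8)
    rgba[..., :3] = levels[..., np.newaxis]
    rgba[..., 3] = np.where(valid, 255, 0)  # opaque where valid, transparent elsewhere
    rgba[~valid, :3] = 0
    return rgba


# Hypothetical 2x2 granule with one fill value (32767)
sample = np.array([[0, 500], [1000, 32767]], dtype=np.int16)
rgba = quantize_to_rgba_gray(sample)
```

After quantization each pixel can take only one of 256 levels per channel, so the PNG is suitable for display but not for quantitative analysis; keep the original GeoTIFF for that.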

In [2]:
def convert_geotiff_to_png(geotiff_path, output_png_path, output_metadata_path, threshold=32760):
    with rasterio.open(geotiff_path) as src:
        data = src.read(1)  # Read the first band
        
        # Get metadata for georeferencing
        metadata = {
            "bounds": src.bounds._asdict(),
            "width": src.width,
            "height": src.height,
            "crs": src.crs.to_string()
        }
        
        # Values above the threshold flag no-data pixels; mask them out
        valid_data = np.ma.masked_greater(data, threshold)
        # Color scale limits are taken from the unmasked (valid) pixels only
        vmin = valid_data.min()
        vmax = valid_data.max()
        norm_data = colors.Normalize(vmin=vmin, vmax=vmax)

        dpi = 360
        width_inches = src.width / dpi
        height_inches = src.height / dpi

        # Copy the colormap so masked values (> threshold) can be drawn as transparent
        cmap = plt.cm.viridis.copy()
        cmap.set_bad(alpha=0)  # Masked values get zero opacity
        
        fig = plt.figure(figsize=(width_inches, height_inches), dpi=dpi)
        ax = plt.Axes(fig, [0, 0, 1, 1])  # No margins
        ax.set_axis_off()
        fig.add_axes(ax)
        ax.imshow(valid_data, cmap=cmap, norm=norm_data, interpolation="nearest", aspect="auto")
        plt.savefig(
            output_png_path, 
            dpi=dpi,
            bbox_inches="tight", 
            pad_inches=0, 
            transparent=True
        )
        plt.close(fig)
        
        # Save the metadata as JSON
        with open(output_metadata_path, "w") as f:
            json.dump(metadata, f)
        
        print(f"Converted {geotiff_path} to {output_png_path} with metadata at {output_metadata_path}")

# Example usage
convert_geotiff_to_png("data/GOSIF_2020.M06.tif", "data/GOSIF_2020.M06.png", "data/GOSIF_2020.M06_metadata.json")

Converted data/GOSIF_2020.M06.tif to data/GOSIF_2020.M06.png with metadata at data/GOSIF_2020.M06_metadata.json
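The metadata JSON written above is what allows a viewer to place the PNG on a map. As a rough sketch of how the bounds could be used, the hypothetical helper `lonlat_to_pixel` below converts a longitude/latitude pair into a pixel row and column. It assumes a north-up image on a regular geographic grid (for example EPSG:4326); the metadata values are illustrative and were not read from a real granule.

```python
import json


def lonlat_to_pixel(lon, lat, metadata):
    """Map a lon/lat pair to (row, col), assuming a north-up regular lon/lat grid."""
    b = metadata["bounds"]
    # Fractional position across the image, measured from the top-left corner
    x_frac = (lon - b["left"]) / (b["right"] - b["left"])
    y_frac = (b["top"] - lat) / (b["top"] - b["bottom"])
    col = int(x_frac * metadata["width"])
    row = int(y_frac * metadata["height"])
    # Clamp so that points on the right or bottom edge stay inside the image
    col = min(max(col, 0), metadata["width"] - 1)
    row = min(max(row, 0), metadata["height"] - 1)
    return row, col


# Illustrative metadata in the same layout that convert_geotiff_to_png writes
example_metadata = json.loads(
    '{"bounds": {"left": -180.0, "bottom": -90.0, "right": 180.0, "top": 90.0},'
    ' "width": 7200, "height": 3600, "crs": "EPSG:4326"}'
)
row, col = lonlat_to_pixel(0.0, 0.0, example_metadata)
```

For a projected CRS or a rotated raster this simple linear mapping would not hold, and the affine transform from the GeoTIFF itself should be used instead.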


Now we will open the converted image in the image-viewing web app. To view the visualization in a separate browser tab, open the link that is printed when you run this cell.

In [5]:
def is_port_in_use(port):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(('', port))
            return False
        except socket.error:
            return True


def run_server(port=8000):
    if is_port_in_use(port):
        return None
    server_address = ('', port)
    httpd = HTTPServer(server_address, SimpleHTTPRequestHandler)
    thread = threading.Thread(target=httpd.serve_forever)
    thread.daemon = True
    thread.start()
    return httpd

# Start the server
port = 5500
run_server(port)

# gosif.html is served by the local HTTP server and displayed in an iframe. You can
# also open the page directly in your browser to view it at full window size.
url = f"http://localhost:{port}/gosif.html"
print(f"You can also view this map by copying this address into a new tab: {url}")
IFrame(src=url, width=1200, height=800)

You can also view this map by copying this address into a new tab: http://localhost:5500/gosif.html
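If the chosen port is already taken, `run_server` returns `None` without starting a new server, so the iframe may show whatever is already listening on that port. One possible alternative, sketched below under the assumption that any free port is acceptable, is to bind to port 0 and let the operating system choose. The helper name `start_local_server` is hypothetical, and the sketch also shows how the server could be shut down cleanly.

```python
import http.client
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler


def start_local_server():
    """Start a background HTTP server on an OS-assigned free port."""
    # Port 0 asks the operating system for any available port
    httpd = HTTPServer(("127.0.0.1", 0), SimpleHTTPRequestHandler)
    thread = threading.Thread(target=httpd.serve_forever, daemon=True)
    thread.start()
    return httpd


httpd = start_local_server()
port = httpd.server_port  # the port that was actually assigned

# Quick check that the server answers requests for the current directory
conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("GET", "/")
status = conn.getresponse().status
conn.close()

# Stop the serving loop and release the socket when finished
httpd.shutdown()
httpd.server_close()
```

The trade-off is that the URL changes between runs, so it would need to be built from `httpd.server_port` rather than hard-coded.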
