## Run this notebook

You can launch this notebook in the US GHG Center JupyterHub by clicking the link below.

[Launch in the US GHG Center JupyterHub (requires access)](https://hub.ghg.center/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2FUS-GHG-Center%2Fghgc-docs&urlpath=lab%2Ftree%2Fghgc-docs%2Fuser_data_notebooks%2Fcasagfed-carbonflux-monthgrid-v3_User_Notebook.ipynb&branch=main)


## Approach

1. Identify available dates and temporal frequency of observations for a given collection using the GHGC API `/stac` endpoint. The collection processed in this notebook is the Land-Atmosphere Carbon Flux data product.
2. Pass the STAC item into the raster API `/collections/{collection_id}/items/{item_id}/tilejson.json` endpoint.
3. Using `folium.plugins.DualMap`, visualize two tiles (side-by-side), allowing time point comparison.
4. After the visualization, perform zonal statistics for a given polygon.


## About the Data

This dataset presents a variety of carbon flux parameters derived from the Carnegie-Ames-Stanford-Approach – Global Fire Emissions Database version 3 (CASA-GFED3) model. The model’s input data includes air temperature, precipitation, incident solar radiation, a soil classification map, and a number of satellite-derived products. All model calculations are driven by analyzed meteorological data from NASA’s Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2). The resulting product provides monthly, global data at 0.5 degree resolution from January 2003 through December 2017. It includes the following carbon flux variables, expressed in kilograms of carbon per square meter per month (kg Carbon/m²/mon): net primary production (NPP), net ecosystem exchange (NEE), heterotrophic respiration (Rh), wildfire emissions (FIRE), and fuel wood burning emissions (FUEL). This product and earlier versions of MERRA-driven CASA-GFED carbon fluxes have been used in a number of atmospheric CO₂ transport studies. Through the support of NASA’s Carbon Monitoring System (CMS), it helps characterize, quantify, understand, and predict the evolution of global carbon sources and sinks.


## Installing the Required Libraries

Required libraries are pre-installed on the GHG Center Hub. If you need to run this notebook elsewhere, please install them with this line in a code cell:

%pip install requests folium rasterstats pystac_client pandas matplotlib


## Querying the STAC API

Please run the next cell to import the required libraries.


In [None]:
import requests
import folium
import folium.plugins
from folium import Map, TileLayer 
from pystac_client import Client 
import branca 
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
# Provide STAC and RASTER API endpoints
STAC_API_URL = "https://earth.gov/ghgcenter/api/stac"
RASTER_API_URL = "https://earth.gov/ghgcenter/api/raster"

# The collection name must match the collection ID used in the STAC API.
# Name of the collection for CASA GFED Land-Atmosphere Carbon Flux monthly emissions.
collection_name = "casagfed-carbonflux-monthgrid-v3"

In [None]:
# Fetch the collection from STAC collections using the appropriate endpoint
# The 'requests' library makes it possible to send HTTP requests
collection = requests.get(f"{STAC_API_URL}/collections/{collection_name}").json()
collection

Examining the `temporal` extent of our `collection`, we see that the data is available from January 2003 to December 2017. The `dashboard:time_density` field shows that the observations have a monthly frequency.
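As a minimal sketch of reading these two fields programmatically, the helper below pulls the temporal interval and the time density out of a collection dictionary. The function name `summarize_temporal` and the sample dictionary are hypothetical. The `extent.temporal.interval` layout follows the STAC collection specification, and the `dashboard:time_density` key is assumed from the description above.

```python
def summarize_temporal(collection):
    """Return (start, end, density) from a STAC collection dictionary."""
    # STAC collections store the temporal extent as a list of [start, end] pairs.
    start, end = collection["extent"]["temporal"]["interval"][0]
    # The density key is an assumption; .get() returns None if it is absent.
    density = collection.get("dashboard:time_density")
    return start, end, density


# Hypothetical stand-in for the `collection` dictionary fetched above.
sample_collection = {
    "extent": {
        "temporal": {"interval": [["2003-01-01T00:00:00Z", "2017-12-31T00:00:00Z"]]}
    },
    "dashboard:time_density": "month",
}
print(summarize_temporal(sample_collection))
```

In the notebook, the same call could be made on the real `collection` object instead of the sample dictionary.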

In [4]:
# Create a function that counts the items in the above data collection by paging through the STAC API
def get_item_count(collection_id):
    count = 0
    items_url = f"{STAC_API_URL}/collections/{collection_id}/items"

    while True:
        response = requests.get(items_url)

        # Stop with an exception if the request failed
        response.raise_for_status()

        stac = response.json()
        count += int(stac["context"].get("returned", 0))
        # Look for a link to the next page of results
        next_links = [link for link in stac["links"] if link["rel"] == "next"]

        if not next_links:
            break
        items_url = next_links[0]["href"]

    return count
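The paging loop above can also gather the items themselves in a single pass. The sketch below is one possible variant, not the notebook's own method: `collect_items` and its `fetch_json` parameter are hypothetical names, and the two fake pages exist only so the loop can run without a network call.

```python
def collect_items(first_url, fetch_json):
    """Follow STAC 'next' links and return every feature across all pages."""
    features = []
    url = first_url
    while url:
        page = fetch_json(url)
        features.extend(page.get("features", []))
        # A page without a 'next' link is the last one.
        next_links = [link for link in page.get("links", []) if link.get("rel") == "next"]
        url = next_links[0]["href"] if next_links else None
    return features


# Hypothetical two-page response used to exercise the loop offline.
fake_pages = {
    "page-1": {
        "features": [{"id": "a"}, {"id": "b"}],
        "links": [{"rel": "next", "href": "page-2"}],
    },
    "page-2": {"features": [{"id": "c"}], "links": []},
}
all_items = collect_items("page-1", fake_pages.__getitem__)
print(len(all_items))
```

Against the live API, `fetch_json` could be something like `lambda url: requests.get(url).json()`.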

In [None]:
# Apply the above function and check the total number of items available within the collection
number_of_items = get_item_count(collection_name)
items = requests.get(f"{STAC_API_URL}/collections/{collection_name}/items?limit={number_of_items}").json()["features"]
print(f"Found {len(items)} items")

In [None]:
# Examine the first item in the collection
items[0]

## Exploring Changes in Carbon Flux Levels Using the Raster API

We will explore changes in heterotrophic respiration (Rh), one of the land-atmosphere carbon flux variables, and compare its values at two points in time. We'll then visualize the outputs on a map using `folium`.

In [7]:
# Re-key the items by year and month ("YYYY-MM") so that a specific month can be looked up directly (e.g., items["2017-12"])
items = {item["properties"]["start_datetime"][:7]: item for item in items} 
# rh = Heterotrophic Respiration
asset_name = "rh"
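The re-keying step relies on the first seven characters of an ISO 8601 timestamp being the year and month. The short sketch below illustrates this on two hypothetical items that carry only the fields the step uses; the item IDs are made up for illustration.

```python
# Hypothetical items that mimic only the fields used by the re-keying step.
sample_items = [
    {"id": "item-2017-12", "properties": {"start_datetime": "2017-12-01T00:00:00+00:00"}},
    {"id": "item-2003-12", "properties": {"start_datetime": "2003-12-01T00:00:00+00:00"}},
]

# The first seven characters of an ISO 8601 timestamp are "YYYY-MM".
items_by_month = {
    item["properties"]["start_datetime"][:7]: item for item in sample_items
}

print(sorted(items_by_month))
print(items_by_month["2003-12"]["id"])
```

If two items shared the same month, the later one in the list would overwrite the earlier one, so this approach assumes one item per month.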

Below, we read the minimum and maximum values from the histogram of the first item's `rh` asset and store them in `rescale_values` as the lower and upper bounds of the color scale.

In [8]:
# Read the histogram bounds of the selected asset from the first item
first_item = items[list(items.keys())[0]]
histogram = first_item["assets"][asset_name]["raster:bands"][0]["histogram"]
rescale_values = {"max": histogram["max"], "min": histogram["min"]}

In [None]:
collection_id = items['2003-12']['collection']
item_id = items['2003-12']['id']
print(collection_id,item_id)


Now, we will pass the item ID, collection name, and `rescale_values` to the `Raster API` endpoint. We will do this twice, once for December 2003 and again for December 2017, so that we can visualize each time point independently.

In [None]:
color_map = "purd" # please refer to matplotlib library if you'd prefer choosing a different color ramp.
# For more information on Colormaps in Matplotlib, please visit https://matplotlib.org/stable/users/explain/colors/colormaps.html

# To change the year and month of the observed parameter, you can modify the "items['YYYY-MM']" statement
# For example, you can change the current statement "items['2003-12']" to "items['2016-10']" 
december_2003_tile = requests.get(
    f"{RASTER_API_URL}/collections/{items['2003-12']['collection']}/items/{items['2003-12']['id']}/tilejson.json?"
    f"&assets={asset_name}"
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"
    f"&rescale={rescale_values['min']},{rescale_values['max']}",
).json()
december_2003_tile

In [None]:
# Now we apply the same process used in the previous step for the December 2017 tile
december_2017_tile = requests.get(
    f"{RASTER_API_URL}/collections/{items['2017-12']['collection']}/items/{items['2017-12']['id']}/tilejson.json?"
    f"&assets={asset_name}"
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"
    f"&rescale={rescale_values['min']},{rescale_values['max']}", 
).json()
december_2017_tile
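The two requests above differ only in the item they reference, so the URL could be assembled by a small helper. The following is a sketch under stated assumptions: `build_tilejson_url` is a hypothetical name, the item ID and rescale bounds are placeholders, and the query string is encoded with the standard library's `urlencode` instead of being written by hand.

```python
from urllib.parse import urlencode


def build_tilejson_url(raster_api_url, collection_id, item_id, asset, colormap, vmin, vmax):
    """Assemble a tilejson.json request URL for one item and asset."""
    query = urlencode(
        {
            "assets": asset,
            "color_formula": "gamma r 1.05",  # spaces are encoded as '+'
            "colormap_name": colormap,
            "rescale": f"{vmin},{vmax}",
        }
    )
    return f"{raster_api_url}/collections/{collection_id}/items/{item_id}/tilejson.json?{query}"


# Hypothetical item id and rescale bounds, for illustration only.
url = build_tilejson_url(
    "https://earth.gov/ghgcenter/api/raster",
    "casagfed-carbonflux-monthgrid-v3",
    "example-item-id",
    "rh",
    "purd",
    0,
    0.3,
)
print(url)
```

Note that `urlencode` percent-encodes the comma in `rescale` as `%2C`, which is equivalent to a literal comma once the query string is decoded.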

## Visualizing Land-Atmosphere Carbon Flux (Heterotrophic Respiration)

In [None]:
# For this study we are going to compare the RH level in 2003 and 2017 over the State of Texas 
# To change the location, you can simply insert the latitude and longitude of the area of your interest in the "location=(LAT, LONG)" statement
# For example, you can change the current statement "location=(31.9, -99.9)" to "location=(34, -118)" to monitor the RH level in California instead of Texas

# Set the initial zoom and center of the map for the Rh layer
# 'folium.plugins.DualMap' displays two maps side-by-side
map_ = folium.plugins.DualMap(location=(31.9, -99.9), zoom_start=6)

# The TileLayer class displays a raster tile layer on a map
# December 2003
map_layer_2003 = TileLayer(
    tiles=december_2003_tile["tiles"][0],
    attr="GHG",
    opacity=0.8,
    name="December 2003 RH Level",
    overlay= True,
    legendEnabled = True
)
map_layer_2003.add_to(map_.m1)


# December 2017
map_layer_2017 = TileLayer(
    tiles=december_2017_tile["tiles"][0],
    attr="GHG",
    opacity=0.8,
    name="December 2017 RH Level",
    overlay= True,
    legendEnabled = True
)
map_layer_2017.add_to(map_.m2)


# Display data markers (titles) on both maps
folium.Marker((40, 5.0), tooltip="both").add_to(map_)
folium.LayerControl(collapsed=False).add_to(map_)


# Add a legend to the dual map using the 'branca' library. 
# Note: the inserted legend is representing the minimum and maximum values for both tiles.
colormap = branca.colormap.linear.PuRd_09.scale(0, 0.3) # minimum value = 0, maximum value = 0.3 (kg Carbon/m2/month)
colormap = colormap.to_step(index=[0, 0.07, 0.15, 0.22, 0.3])
colormap.caption = 'Rh Values (kg Carbon/m2/month)'

colormap.add_to(map_.m1)


# Visualizing the map
map_

## Calculating Zonal Statistics

To perform zonal statistics, first we need to create a polygon. In this case we are creating a polygon in Texas (USA).

In [13]:
# The Area of Interest (AOI) is set to Dallas, Texas (USA)
texas_dallas_aoi = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "coordinates": [
            [
                # [longitude, latitude]
                [-96.1, 32.28],  # Southeast Bounding Coordinate
                [-96.1, 33.28],  # Northeast Bounding Coordinate
                [-97.58, 33.28], # Northwest Bounding Coordinate
                [-97.58, 32.28],  # Southwest Bounding Coordinate
                [-96.1, 32.28]   # Closing the polygon at the Southeast Bounding Coordinate
            ]
        ],
        "type": "Polygon",
    },
}
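A rectangular AOI like the one above can also be generated from its four bounds. The sketch below assumes a hypothetical helper named `bbox_to_feature`; it lists the corners in the same order as the polygon above (southeast, northeast, northwest, southwest) and repeats the first vertex to close the ring, as GeoJSON requires.

```python
def bbox_to_feature(west, south, east, north):
    """Build a GeoJSON Feature holding a rectangular Polygon."""
    ring = [
        [east, south],
        [east, north],
        [west, north],
        [west, south],
        [east, south],  # repeat the first vertex to close the ring
    ]
    return {
        "type": "Feature",
        "properties": {},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


# Same bounds as the Dallas AOI defined above.
dallas_feature = bbox_to_feature(west=-97.58, south=32.28, east=-96.1, north=33.28)
print(dallas_feature["geometry"]["coordinates"][0])
```

Changing the four bounds is then enough to move the AOI to a different region.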

In [None]:
# We will plug in the coordinates for a location inside the polygon and a zoom level
aoi_map = Map(
    tiles="OpenStreetMap",
    location=[
        32.81,-96.93, # coordinates for Dallas, Texas area
    ],
    zoom_start=9, # zoom in or out by increasing or decreasing the value here.
)

folium.GeoJson(texas_dallas_aoi, name="Texas, Dallas").add_to(aoi_map)
aoi_map

In [None]:
# Fetch the items in the collection again as a list and check how many are available
items = requests.get(
    f"{STAC_API_URL}/collections/{collection_name}/items?limit=600"
).json()["features"]
print(f"Found {len(items)} items")

In [None]:
# Explore the first item
items[0]

In [17]:
# The bounding box should be passed to the geojson param as a geojson Feature or FeatureCollection
def generate_stats(item, geojson):
    result = requests.post(
        f"{RASTER_API_URL}/cog/statistics",
        params={"url": item["assets"][asset_name]["href"]},
        json=geojson,
    ).json()
    print(result)
    return {
        **result["properties"],
        "start_datetime": item["properties"]["start_datetime"],
    }
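The function above assumes every request succeeds. As a hedged sketch of a more defensive variant, the version below checks the HTTP status before reading the result. The names `generate_stats_checked`, `FakeResponse`, and `fake_post` are hypothetical, and the fake response holds illustrative values only so the example runs without a network call.

```python
def generate_stats_checked(item, geojson, asset_name, raster_api_url, post):
    """Request zonal statistics for one item and fail loudly on an HTTP error."""
    response = post(
        f"{raster_api_url}/cog/statistics",
        params={"url": item["assets"][asset_name]["href"]},
        json=geojson,
    )
    # raise_for_status() is the requests.Response method that raises on 4xx/5xx.
    response.raise_for_status()
    result = response.json()
    return {
        **result["properties"],
        "start_datetime": item["properties"]["start_datetime"],
    }


class FakeResponse:
    """Hypothetical stand-in for requests.Response, used only for this demo."""

    def __init__(self, payload):
        self._payload = payload

    def raise_for_status(self):
        return None

    def json(self):
        return self._payload


def fake_post(url, params=None, json=None):
    # Illustrative values only; not real statistics from the dataset.
    return FakeResponse({"properties": {"statistics": {"b1": {"max": 1.0}}}})


sample_item = {
    "assets": {"rh": {"href": "s3://example-bucket/example.tif"}},
    "properties": {"start_datetime": "2017-12-01T00:00:00+00:00"},
}
row = generate_stats_checked(
    sample_item, {"type": "Feature"}, "rh", "https://example.invalid", fake_post
)
print(row["start_datetime"])
```

In the notebook, `post` could be `requests.post`, with `RASTER_API_URL` and `asset_name` passed in for the remaining arguments.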

In [None]:
# Identify the start Date Time of the first observation in the collection
for item in items:
    print(item["properties"]["start_datetime"])
    break

With the function above, we can generate the statistics for the area of interest. The `%%time` cell magic in the next cell reports the wall time (the elapsed real-world time) needed to compute the statistics for every item in the collection.

In [None]:
%%time
stats = [generate_stats(item, texas_dallas_aoi) for item in items]

In [None]:
# Display the statistics for the first item in the collection
stats[0]

In [None]:
# Create a function that goes through every single item in the collection and populates their properties - including the minimum, maximum, and sum of their values - in a table.
def clean_stats(stats_json) -> pd.DataFrame:
    df = pd.json_normalize(stats_json)
    df.columns = [col.replace("statistics.b1.", "") for col in df.columns]
    df["date"] = pd.to_datetime(df["start_datetime"])
    return df


df = clean_stats(stats)
df.head(5) # the number of granules displayed in the table can be changed by increasing or decreasing the value inserted here!

## Visualizing the Data as a Time Series

We can now explore the Heterotrophic Respiration time series (January 2003 to December 2017) available for the Dallas, Texas area. We can plot the data set using the code below:

In [None]:
fig = plt.figure(figsize=(20, 10)) #determine the width and height of the plot using the 'matplotlib' library

plt.plot(
    df["date"],
    df["max"],
    color="purple",
    linestyle="-",
    linewidth=0.5,
    label="Max monthly Carbon emissions",
)

plt.legend()
plt.xlabel("Years")
plt.ylabel("kg Carbon/m2/month")
plt.title("Heterotrophic Respiration Values for Dallas, Texas (2003-2017)")

In [None]:
# Now let's examine the Rh level for the 3rd item in the collection for Dallas, Texas area
# Keep in mind that a list starts from 0, 1, 2,... therefore items[2] is referring to the third item in the list/collection
print(items[2]["properties"]["start_datetime"]) #print the start Date Time of the third granule in the collection!

In [None]:
# Fetch the third granule in the collection and set the color scheme and rescale values. 
october_tile = requests.get(
    f"{RASTER_API_URL}/collections/{items[2]['collection']}/items/{items[2]['id']}/tilejson.json?"
    f"&assets={asset_name}"
    f"&color_formula=gamma+r+1.05&colormap_name={color_map}"
    f"&rescale={rescale_values['min']},{rescale_values['max']}",
).json()
october_tile

In [None]:
# Map the Rh level for the Dallas, Texas area for October 2017
aoi_map_bbox = Map(
    tiles="OpenStreetMap",
    location=[
        32.8, # latitude
        -96.79, # longitude
    ],
    zoom_start=9,
)

map_layer = TileLayer(
    tiles=october_tile["tiles"][0],
    attr="GHG", opacity = 0.7, name="October 2017 RH Level", overlay= True, legendEnabled = True
)

map_layer.add_to(aoi_map_bbox)

# Display data marker (title) on the map
folium.Marker((40, 5.9), tooltip="both").add_to(aoi_map_bbox)
folium.LayerControl(collapsed=False).add_to(aoi_map_bbox)

# Add a legend
colormap = branca.colormap.linear.PuRd_09.scale(0, 0.3) # minimum value = 0, maximum value = 0.3 (kg Carbon/m2/month)
colormap = colormap.to_step(index=[0, 0.07, 0.15, 0.22, 0.3])
colormap.caption = 'Rh Values (kg Carbon/m2/month)'

colormap.add_to(aoi_map_bbox)

aoi_map_bbox

## Summary

In this notebook we have successfully completed the following steps for the STAC collection for CASA GFED Land-Atmosphere Carbon Flux data:

1. Install and import the necessary libraries
2. Fetch the collection from STAC collections using the appropriate endpoints
3. Count the number of existing granules within the collection
4. Map and compare the Heterotrophic Respiration (Rh) levels over Texas for two distinct years
5. Create a table that displays the minimum, maximum, and sum of the Rh values for a specified region
6. Generate a time-series graph of the Rh values for a specified region

If you have any questions regarding this user notebook, please contact us using the [feedback form](https://docs.google.com/forms/d/e/1FAIpQLSeVWCrnca08Gt_qoWYjTo6gnj1BEGL4NCUC9VEiQnXA02gzVQ/viewform).