## Run this notebook

You can launch this notebook using mybinder by clicking the button below.

[Placeholder for Binder link]

## Approach

   1. Identify available dates and temporal frequency for a given collection
   2. Pass STAC item into raster API `/stac/tilejson.json` endpoint
   3. Get time series statistics over available time period to identify seasonal trends
   4. Visualize peak by displaying the tile in `folium`
   5. Visualize time series 
   

## About the Data

[Ocean Net Primary Production (NPP)](https://oceancolor.gsfc.nasa.gov/atbd/npp/) is the result of CO2 fixation, through photosynthesis, by marine phytoplankton which contain chlorophyll. It is the proportion of phytoplankton-sequestered carbon that enters the oceanic food web and supports a variety of marine life.  

## The Case Study - Walvis Bay, Namibia

Walvis Bay is home to Namibia's largest marine farming center and a well-established commercial fishing industry. Its location in the nutrient-rich Benguela upwelling system of the Atlantic Ocean means producers can rely on this area to cultivate an abundance of shellfish, including oysters, mussels, and scallops.

Occasionally the nutrient-rich waters of the Atlantic produce higher-than-normal NPP levels, resulting in short-lived harmful algal blooms. This is often the result of favorable temperatures combined with an abundance of nutrients. The resulting algal blooms can have severe consequences: massive fish kills, seafood contaminated with toxins, and an unsafe environment for humans and marine life. Toxins accumulated in shellfish organs can subsequently be transmitted to humans through consumption, posing serious health threats.

In this example we explore the Ocean NPP dataset over the year 2020 to identify spatial and temporal patterns in NPP in the Walvis Bay area. 

## Querying the STAC API

In [None]:
import requests
from folium import Map, TileLayer


In [None]:
# Provide STAC and RASTER API endpoints
STAC_API_URL = "https://staging-stac.delta-backend.com"
RASTER_API_URL = "https://staging-raster.delta-backend.com"

# Declare collection of interest - Ocean NPP 
collection_name = "MO_NPP_npp_vgpm"

In [None]:
# Fetch STAC collection
collection = requests.get(f"{STAC_API_URL}/collections/{collection_name}").json()
collection

In [None]:
# Verify frequency of data available
collection["dashboard:time_density"]

In [None]:
# Get collection summary
collection["summaries"]

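To confirm the temporal coverage programmatically rather than by eye, the summary can be parsed into dates. The sketch below runs on a hypothetical `sample_summaries` dictionary; the `datetime` key holding a start and an end timestamp is an assumption about how this collection's summaries are laid out, so check it against the actual output above before relying on it.

```python
from datetime import datetime

# Hypothetical stand-in for collection["summaries"]; the "datetime" key and
# its two-element [start, end] layout are assumptions, not confirmed output.
sample_summaries = {
    "datetime": ["2020-01-01T00:00:00Z", "2020-12-31T00:00:00Z"],
}


def temporal_range(summaries):
    """Return (start, end) as datetime objects from a summaries dict."""
    # Swap the trailing "Z" for an explicit offset so that
    # datetime.fromisoformat accepts it on Python versions before 3.11.
    start, end = (
        datetime.fromisoformat(ts.replace("Z", "+00:00"))
        for ts in summaries["datetime"]
    )
    return start, end


start, end = temporal_range(sample_summaries)
print(f"Coverage: {start:%Y-%m-%d} to {end:%Y-%m-%d}")
```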
Great, we can explore the year 2020 time series. Let's create a bounding box to explore the Walvis Bay area of interest (AOI) in Namibia.

In [None]:
# Walvis Bay, Namibia
walvis_bay_aoi = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "coordinates": [
          [
            [
              13.686159004559698,
              -21.700046934333145
            ],
            [
              13.686159004559698,
              -23.241974326585833
            ],
            [
              14.753560168039911,
              -23.241974326585833
            ],
            [
              14.753560168039911,
              -21.700046934333145
            ],
            [
              13.686159004559698,
              -21.700046934333145
            ]
          ]
        ],
        "type": "Polygon"
      }
    }

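A map center does not have to be typed in by hand; it can be derived from the AOI itself. The following is a minimal sketch, assuming a single-ring Polygon Feature. The helper name `bbox_and_center` is hypothetical, and `example_aoi` repeats the Walvis Bay corners rounded to four decimals so the block runs on its own.

```python
def bbox_and_center(feature):
    """Return ((min_lon, min_lat, max_lon, max_lat), (lat, lon)) for a Polygon Feature."""
    # Use the exterior ring (the first ring) of the polygon
    ring = feature["geometry"]["coordinates"][0]
    lons = [lon for lon, lat in ring]
    lats = [lat for lon, lat in ring]
    bbox = (min(lons), min(lats), max(lons), max(lats))
    # folium expects [lat, lon], the reverse of GeoJSON's [lon, lat]
    center = ((bbox[1] + bbox[3]) / 2, (bbox[0] + bbox[2]) / 2)
    return bbox, center


# Same corner coordinates as the Walvis Bay AOI, rounded for brevity
example_aoi = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [13.6862, -21.7000],
            [13.6862, -23.2420],
            [14.7536, -23.2420],
            [14.7536, -21.7000],
            [13.6862, -21.7000],
        ]],
    },
}

bbox, center = bbox_and_center(example_aoi)
print(bbox)
print(center)
```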
Let's visualize the AOI we have just created using `folium` 

In [None]:
# We'll plug in the coordinates for a location
# central to the study area and a reasonable zoom level

import folium
m = Map(
    tiles="OpenStreetMap", 
    location=[
         -22.421460,
         14.268801,
        ], zoom_start=8)

folium.GeoJson(walvis_bay_aoi, name="Walvis Bay").add_to(m)
m

Returning to our STAC API requests, let's check how many items are available in total.

In [None]:
# Check total number of items available
items = requests.get(f"{STAC_API_URL}/collections/{collection_name}/items?limit=100").json()["features"]
print(f"Found {len(items)} items")

This makes sense, as our collection is monthly, so we should have 12 total items.

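A single request with `limit=100` is enough for 12 items, but a collection with more items than the limit would be truncated. The sketch below shows one way to follow pagination, assuming the API advertises further pages through GET-style `next` links in the response's `links` array. The helper `fetch_all_items` and the two fake pages are hypothetical and exist only so the paging logic can run offline.

```python
def fetch_all_items(url, get_json=None):
    """Collect features from every page of a STAC items endpoint."""
    if get_json is None:
        # Imported lazily so the offline demonstration below needs no network library
        import requests

        def get_json(page_url):
            return requests.get(page_url, timeout=30).json()

    features = []
    while url:
        page = get_json(url)
        features.extend(page.get("features", []))
        # Follow the "next" link if the API advertises one, otherwise stop
        url = next(
            (link["href"] for link in page.get("links", []) if link.get("rel") == "next"),
            None,
        )
    return features


# Offline demonstration with two hypothetical pages keyed by a fake URL
fake_pages = {
    "page-1": {
        "features": [{"id": "a"}, {"id": "b"}],
        "links": [{"rel": "next", "href": "page-2"}],
    },
    "page-2": {"features": [{"id": "c"}], "links": []},
}
all_items = fetch_all_items("page-1", get_json=fake_pages.__getitem__)
print(f"Found {len(all_items)} items")
```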
In [None]:
# Explore one item to see what it contains
items[0]

Now that we have explored the collection metadata by querying the STAC API, we can use the RASTER API to access the data itself.

In [None]:
# the bounding box should be passed to the geojson param as a geojson Feature or FeatureCollection
def generate_stats(item, geojson):
    result = requests.post(
        f"{RASTER_API_URL}/cog/statistics", 
        params={"url": item["assets"]["cog_default"]["href"]},
        json=geojson
    ).json()    
    return {
        **result["properties"], "start_datetime": item["properties"]["start_datetime"]
    }

In [None]:
%%time 
stats = [generate_stats(item, walvis_bay_aoi) for item in items]

With the function provided above, we can generate statistics for our AOI. In the example below, we'll explore sample statistics available from one of the tiles. 

In [None]:
stats[0]

In [None]:
import pandas as pd

def clean_stats(stats_json) -> pd.DataFrame:
    df = pd.json_normalize(stats_json)
    df.columns = [col.replace("statistics.1.", "") for col in df.columns]
    df["date"] = pd.to_datetime(df["start_datetime"])
    return df

df = clean_stats(stats)

## Visualizing the Data as a Time Series

We can now explore the full time series available (January-December 2020) for the Walvis Bay area of Namibia. We can plot the data set using the code below: 

In [None]:
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(20,10))

plt.plot(df["date"], df["mean"], 'black', label="Mean monthly Ocean NPP values")

plt.fill_between(df["date"], df["mean"] + df["std"], df["mean"] - df["std"], facecolor="lightgray", interpolate=False, label="+/- one standard deviation")

plt.plot(df["date"], df["min"], color='blue', linestyle="-", linewidth=0.5, label="Min monthly NPP values")
plt.plot(df["date"], df["max"], color='red', linestyle="-", linewidth=0.5, label="Max monthly NPP values")

plt.legend()
plt.title("Ocean NPP Values for Walvis Bay, Namibia (2020)")

Here, we observe the seasonal variability in oceanic NPP for the Walvis Bay area. The larger peaks in the max values suggest the intensity of these events may vary spatially. Let's explore one of the time steps (e.g., October) where there are higher maximum monthly NPP values to see if this is the case.

**Important note**: Keep in mind that the size and extent of your AOI will influence the 'signal' of your time series. If the phenomenon you are investigating displays greater spatial variability, a larger AOI will introduce more 'noise', making the signal more difficult to detect.

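The time step to inspect can also be picked programmatically instead of being read off the plot. This is a minimal sketch, assuming each statistics record nests its values under `statistics` and band `"1"`, as the column renaming in `clean_stats` implies. The helper `peak_index` is hypothetical and the numbers in `sample_stats` are invented for illustration. Because `stats` is built from `items` in the same order, the returned index can be used to select the matching item.

```python
# Hypothetical records shaped like the entries of `stats`; all values are made up
sample_stats = [
    {"statistics": {"1": {"max": 3.0, "mean": 1.0}}, "start_datetime": "2020-12-01T00:00:00"},
    {"statistics": {"1": {"max": 4.5, "mean": 1.2}}, "start_datetime": "2020-11-01T00:00:00"},
    {"statistics": {"1": {"max": 9.1, "mean": 1.4}}, "start_datetime": "2020-10-01T00:00:00"},
]


def peak_index(stats_records, metric="max", band="1"):
    """Return the list index of the record with the highest value for `metric`."""
    return max(
        range(len(stats_records)),
        key=lambda i: stats_records[i]["statistics"][band][metric],
    )


idx = peak_index(sample_stats)
print(idx, sample_stats[idx]["start_datetime"])
```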
## Visualizing the Raster Imagery

Let's first explore a single tile from October, one of the relative peaks, where we observe a sustained increase in NPP values.

In [None]:
print(items[2]['properties']['start_datetime'])

In [None]:
# Looking at just a single image (one time step)
item = items[2]

In [None]:
rescale_values = collection["summaries"]["cog_default"]
rescale_values

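Rescaling to the collection-wide `min` and `max` can compress most pixels into a narrow band of colors when a few extreme values are present. One possible alternative, sketched here under assumptions, is to derive the rescale range from the 2nd and 98th percentiles of the per-item statistics computed earlier. The `percentile_2` and `percentile_98` keys are assumed to be present in the `/cog/statistics` response (inspect `stats[0]` to confirm), the helper `percentile_rescale` is hypothetical, and the values in `sample_stats` are invented.

```python
# Hypothetical per-item statistics; values are made up and the percentile
# keys are assumed to be part of the statistics response
sample_stats = [
    {"statistics": {"1": {"min": 0.0, "max": 50.0, "percentile_2": 1.0, "percentile_98": 9.0}}},
    {"statistics": {"1": {"min": 0.2, "max": 80.0, "percentile_2": 1.5, "percentile_98": 12.0}}},
]


def percentile_rescale(stats_records, band="1"):
    """Return a 'low,high' rescale string spanning the 2nd to 98th percentiles of all records."""
    bands = [record["statistics"][band] for record in stats_records]
    low = min(b["percentile_2"] for b in bands)
    high = max(b["percentile_98"] for b in bands)
    return f"{low},{high}"


print(percentile_rescale(sample_stats))
```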
In [None]:
tiles = requests.get(
    f"{RASTER_API_URL}/stac/tilejson.json?collection={item['collection']}&item={item['id']}"
    "&assets=cog_default"
    "&color_formula=gamma+r+1.05&colormap_name=viridis"
    f"&rescale={rescale_values['min']},{rescale_values['max']}", 
).json()
tiles

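The query string above is assembled by hand, which works but makes it easy to misplace an `&` or forget to encode a value. As a sketch of an alternative, the parameters can be kept in a dictionary and encoded with the standard library; passing the same dictionary to `requests.get(..., params=...)` would have the same effect. The helper `tilejson_url`, the item id, and the rescale range below are hypothetical.

```python
from urllib.parse import urlencode


def tilejson_url(raster_api_url, collection_id, item_id, rescale_min, rescale_max):
    """Build a /stac/tilejson.json URL with encoded query parameters."""
    params = {
        "collection": collection_id,
        "item": item_id,
        "assets": "cog_default",
        # A literal "+" in a query string stands for a space,
        # so the decoded value of "gamma+r+1.05" is "gamma r 1.05"
        "color_formula": "gamma r 1.05",
        "colormap_name": "viridis",
        "rescale": f"{rescale_min},{rescale_max}",
    }
    return f"{raster_api_url}/stac/tilejson.json?{urlencode(params)}"


# Hypothetical item id and rescale range, for illustration only
url = tilejson_url(
    "https://staging-raster.delta-backend.com", "MO_NPP_npp_vgpm", "example-item", 0, 1500
)
print(url)
```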
In [None]:
# Use bbox initial zoom and map
# Set up a map located w/in event bounds
import folium
m = Map(
    tiles="OpenStreetMap", 
    location=[
         -22.421460,
         14.268801,
        ], zoom_start=8)

map_layer = TileLayer(
    tiles=tiles["tiles"][0],
    attr="VEDA",
)

map_layer.add_to(m)

m

From the image above, we see higher NPP values (displayed in teal) located in and around Walvis Bay and the surrounding shorelines - highlighting areas of concern for the local shellfish industry. 

## Visualizing the Raster Time Series
Now we will look at each of the raster tiles that make up this time series to explore the spatial and temporal patterns of NPP observed in Walvis Bay throughout 2020.

In [None]:
import matplotlib.pyplot as plt

for item in items:
    tiles = requests.get(
        f"{RASTER_API_URL}/stac/tilejson.json?collection={item['collection']}&item={item['id']}"
        "&assets=cog_default"
        "&color_formula=gamma+r+1.05&colormap_name=viridis"
        f"&rescale={rescale_values['min']},{rescale_values['max']}", 
        ).json()
    print(tiles['tiles'])


We can use the GIF generation example in the documentation [here](https://nasa-impact.github.io/veda-docs/example-notebooks/gif-generation.html#the-cogcrop-endpoint) to help visualize the raster images as a time series over 2020.

In [None]:
# get PNG bytes from API
import tempfile
from IPython.display import display, Image

COG_DEFAULT = [
    x for x in requests.get(f"{STAC_API_URL}/collections").json()["collections"] if x["id"] == "MO_NPP_npp_vgpm"
][0]["summaries"]["cog_default"]

for item in items:
    image_bytes = requests.post(
        f"{RASTER_API_URL}/cog/crop",
        params={
            "format": "png",
            "height": 512,
            "width": 512,
            # Request the asset of the current item in the loop
            "url": item["assets"]["cog_default"]["href"],
            "rescale": f"{COG_DEFAULT['min']},{COG_DEFAULT['max']}",
            "colormap_name": "viridis"
        },
        json=walvis_bay_aoi
    ).content

    # Write to temporary file in order to display
    f = tempfile.NamedTemporaryFile(suffix=".png")
    f.write(image_bytes)
    f.flush()  # make sure the bytes are on disk before the file is read back by name

    # display PNG!
    display(Image(filename=f.name, height=512, width=512))
    
    #currently is overwriting each temp file, need to revise

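To keep every frame rather than a single throwaway temporary file, each PNG can be written to its own file inside one temporary directory. This is a minimal sketch under assumptions: the helper `save_frames` is hypothetical, frames are named after a date string (for example the date part of an item's `start_datetime`) so that sorting the paths also sorts them chronologically, and placeholder bytes stand in for the PNG content returned by the crop request.

```python
import os
import tempfile


def save_frames(frames, directory):
    """Write (name, png_bytes) pairs to `directory`, one file per frame; return the sorted paths."""
    paths = []
    for name, png_bytes in frames:
        path = os.path.join(directory, f"{name}.png")
        with open(path, "wb") as fh:  # the with-block flushes and closes the file
            fh.write(png_bytes)
        paths.append(path)
    return sorted(paths)


# Placeholder bytes stand in for real PNG content
fake_frames = [("2020-02-01", b"frame-2"), ("2020-01-01", b"frame-1")]

with tempfile.TemporaryDirectory() as tmpdir:
    saved = save_frames(fake_frames, tmpdir)
    names = [os.path.basename(p) for p in saved]
    print(names)
```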
In [None]:
import glob
import os
import tempfile
import time

from concurrent.futures import ThreadPoolExecutor
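# The cells below call display.Image(...) and Image.open(...), so `display`
# must be the IPython.display module and `Image` must be PIL's Image module
from IPython import display
from PIL import Image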
from gif_generation_dependencies.helper_functions import generate_frame

Approach 1: 

In [None]:
COG_DEFAULT = [
    x for x in requests.get(f"{STAC_API_URL}/collections").json()["collections"] if x["id"] == "MO_NPP_npp_vgpm"
][0]["summaries"]["cog_default"]

with tempfile.TemporaryDirectory() as tmpdirname:
    start = time.time()

    args = (
        (
            item, 
            walvis_bay_aoi, 
            tmpdirname, 
            "tif", 
            "folium",
            {
                "rescale":f"{COG_DEFAULT['min']},{COG_DEFAULT['max']}",
                "colormap_name":"viridis"
            }
        ) for item in items
    )
    
    with ThreadPoolExecutor(max_workers=10) as executor: 
        result = list(executor.map(lambda a: generate_frame(*a), args))
    
    end = time.time()

    print(f"Gather frames: {round((end-start), 2)} seconds")

    # Note: I'm searching for `*.png` files instead of *.tif files because the webdriver screenshot
    # of the folium map interface is exported in png format (this also helps reduce the size of
    # the final gif )
    imgs = [Image.open(f) for f in sorted(glob.glob(os.path.join(tmpdirname, "*.png")))]
    imgs[0].save(fp="./output_with_osm_basemap.gif", format='GIF', append_images=imgs[1:], save_all=True, duration=300, loop=0)

display.Image(filename="./output_with_osm_basemap.gif")





In [None]:
with tempfile.TemporaryDirectory() as tmpdirname:
    generate_frame(items[0], 
                walvis_bay_aoi, 
                tmpdirname, 
                "tif", 
                "folium",
                {
                    "rescale":f"{COG_DEFAULT['min']},{COG_DEFAULT['max']}",
                    "colormap_name":"viridis"
                }
    )


items[0]

Approach 2: 

In [None]:
COG_DEFAULT = [
    x for x in requests.get(f"{STAC_API_URL}/collections").json()["collections"] if x["id"] == "MO_NPP_npp_vgpm"
][0]["summaries"]["cog_default"]

# get PNG bytes from API
image_bytes = requests.post(
    f"{RASTER_API_URL}/cog/crop", 
    params={
        "format": "png",
        "height": 512, 
        "width": 512, 
        "url":items[0]["assets"]["cog_default"]["href"],
        "rescale": f"{COG_DEFAULT['min']},{COG_DEFAULT['max']}",
        "colormap_name": "viridis"
    },
    json=walvis_bay_aoi
).content

# Write to temporary file in order to display
f = tempfile.NamedTemporaryFile(suffix=".png")
f.write(image_bytes)
f.flush()  # make sure the bytes are on disk before the file is read back by name

# display PNG!
display.Image(filename=f.name, height=512, width=512)

In [None]:
# temporary directory to hold PNGs
with tempfile.TemporaryDirectory() as tmpdirname:
    start = time.time()

    args = ((
            item, # stac item
            walvis_bay_aoi, # aoi to crop
            tmpdirname, # tmpdir (optional)
            "tif", 
            "folium",
            {
                "rescale":f"{COG_DEFAULT['min']},{COG_DEFAULT['max']}",
                "colormap_name":"viridis"
            } # visualization parameters 
    ) for item in items )

    with ThreadPoolExecutor(max_workers=10) as executor: 
        result = list(executor.map(lambda a: generate_frame(*a), args))

    end = time.time()

    print(f"Gather frames: {round((end-start), 2)} seconds")

    imgs = (Image.open(f) for f in sorted(glob.glob(os.path.join(tmpdirname, "*.png"))))
    
    img = next(imgs)  # extract first image from iterator
    img.save(fp="./output.gif", format='GIF', append_images=imgs, save_all=True, duration=300, loop=0)

display.Image(filename="./output.gif")







In this case study we have successfully visualized the spatial and temporal variability of NPP values in the Benguela Current, which displays a seasonal pattern of peaking in the winter months when favorable temperatures and nutrient conditions are present.

To Do: 
* work on a revised rescale of raster values, perhaps using quantiles if possible. This can't be done with `cog_default`, which only has `min` and `max` values, so it would need to be revisited with steps in Python instead of the RASTER API
* create static grid of maps to display side-by-side all 12 time steps (zoomed in on Walvis Bay)