## Exploring PACE OCI vegetation indexes
In this tutorial, we will work with the Land data product suite from the PACE Ocean Color Instrument (OCI) to explore variations in hyperspectral vegetation indexes through time. Specifically, we’ll recreate and investigate the RGB composite map shown below, where three pigment-sensitive vegetation indices are mapped to color channels:

- Red: Modified Anthocyanin Reflectance Index (mARI)
- Green: Chlorophyll Index Red Edge (CIRE)
- Blue: Carotenoid Content Index (Car)


<figure>
    <img src="img/pace_global.PNG" alt="" width="500">
    <figcaption style="font-style: italic; margin-bottom: 40px;"> </figcaption>
</figure> 

#### Hyperspectral vegetation indexes 
The PACE OCI instrument is the first to provide global, high-temporal-resolution hyperspectral data over land. This capability enables time series of vegetation indices that go beyond traditional broadband metrics like NDVI and EVI, allowing us to target specific plant pigments such as anthocyanins, chlorophyll, and carotenoids.

Since such dense spectral data have never been available at this spatial and temporal scale, we now have a unique opportunity to assess the global applicability of hyperspectral vegetation indices and improve our understanding of the conditions, limitations, and caveats that affect where and when they should be used. 

The three indexes we will use in this tutorial are:

---

##### 1. Modified Anthocyanin Reflectance Index (mARI)

$$
\text{mARI} = \left( \frac{1}{\rho_{550}} - \frac{1}{\rho_{705}} \right) \cdot \rho_{800}
$$

---

##### 2. Chlorophyll Index Red Edge (CIRE)

$$
\text{CIRE} = \left( \frac{\rho_{800}}{\rho_{705}} \right) - 1
$$

---

##### 3. Carotenoid Content Index (Car)

$$
\text{Car} = \left( \frac{1}{\rho_{495}} - \frac{1}{\rho_{705}} \right) \cdot \rho_{800}
$$
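
The three formulas translate directly into array operations. The block below is a minimal sketch, not the algorithm used to generate the PACE product: the function name `pigment_indices` and the reflectance values are made up for illustration, and it assumes surface reflectance is already available at 495, 550, 705, and 800 nm.

```python
import numpy as np

def pigment_indices(rho_495, rho_550, rho_705, rho_800):
    """Return (mARI, CIRE, Car) from reflectance at the four wavelengths (nm)."""
    rho_495, rho_550, rho_705, rho_800 = (
        np.asarray(r, dtype=float) for r in (rho_495, rho_550, rho_705, rho_800)
    )
    mari = (1.0 / rho_550 - 1.0 / rho_705) * rho_800
    cire = rho_800 / rho_705 - 1.0
    car = (1.0 / rho_495 - 1.0 / rho_705) * rho_800
    return mari, cire, car

# Hypothetical reflectances for a single pixel (illustrative values only)
mari, cire, car = pigment_indices(0.05, 0.10, 0.20, 0.40)
```

Note that mARI and Car share the same $\rho_{705}$ and $\rho_{800}$ terms and differ only in the first band (550 nm versus 495 nm), so any surface that is dark in both of those bands relative to 705 nm will score high on both indexes.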



#### Australia Case Study

In the visualization above, the central desert regions of Australia appear magenta, indicating high values of both mARI (red) and Car (blue), even though these areas are sparsely vegetated. This raises important questions about what these indices are capturing in arid environments and how they should be used in a global context.

---

#### Objectives

This notebook will:

1. Demonstrate two useful visualization techniques for PACE land data:
   - An RGB animation of three vegetation indices over time.
   - An interactive viewer that lets users click on the map to explore temporal dynamics of a selected vegetation index.
2. Investigate the regions of Australia for spectral signals contributing to the magenta display. 



### Set up 

In [None]:
# import packages 
import earthaccess
import xarray as xr
import hvplot.xarray
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import math
import pandas as pd

### 1. Query land data products

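We will use `earthaccess` to find the PACE vegetation index Level-3 mapped products over Australia between March 2024 and March 2025. The short name for these products is `PACE_OCI_L3M_LANDVI`. The `granule_name` pattern in the query selects the temporal composite (daily, 8-day, or monthly) and the spatial resolution.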

In [2]:
# authenticate with earthaccess
auth = earthaccess.login(persist=True)

In [3]:
# select time span and study area
tspan = ("2024-03-01", "2025-03-31")
bbox = (123.983, -27.459, 127.156, -22.571)

In [4]:
results_land = earthaccess.search_data(
    short_name="PACE_OCI_L3M_LANDVI",
    temporal=tspan,
    bounding_box=bbox,
    granule_name="*.Day.*0p1deg*",  # Daily, 8-day or monthly: Day, 8D or MO | Resolution: 0p1deg or 0.4km
)

We will use the `earthaccess.open` function to stream the vegetation index products returned by our query, allowing us to read the data directly without downloading the full files.

In [5]:
paths = earthaccess.open(results_land)
#paths


Now we open the files for all dates with `xarray.open_mfdataset` and concatenate them along a new `date` dimension. This lets us plot information for several dates at once.

In [6]:
dataset_land = xr.open_mfdataset(
    paths,
    combine="nested",
    concat_dim="date",
)


### 2. Create animation of false color display

We will clip the three vegetation indexes to the value ranges used in the initial display. These ranges were manually selected to optimize the display and highlight meaningful patterns in the data.

In [7]:
dataset_land["mari"] = dataset_land["mari"].clip(
    min=1.3,
    max=2.0
)
dataset_land["cire"] = dataset_land["cire"].clip(
    min=0.5,
    max=2.0
)
dataset_land["car"] = dataset_land["car"].clip(
    min=1.3,
    max=6.2
)
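
Clipping acts as a display stretch: values outside the chosen range saturate at the limits, while missing values stay missing. The sketch below illustrates this on a handful of made-up mARI values, using `np.clip` as a stand-in for the xarray `.clip` call above.

```python
import numpy as np

# Made-up mARI values, including out-of-range pixels and a missing (NaN) pixel
mari = np.array([0.2, 1.3, 1.65, 2.0, 3.5, np.nan])

# Same limits as the mARI clip above: everything below 1.3 becomes 1.3,
# everything above 2.0 becomes 2.0, and NaN is left untouched
mari_clipped = np.clip(mari, 1.3, 2.0)
```

After clipping, the first two values are both 1.3 and the fourth and fifth are both 2.0, so the full color range of the channel is spent on the 1.3 to 2.0 interval.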

In [9]:
# the land products come with a whole suite of vegetation indices 
# we will remove the ones we no longer need
dataset_veg = dataset_land.drop_vars(
    ["palette", "ndvi", "evi", "ndwi", "ndii", "cci", "ndsi", "pri"]
)

In [10]:
# min-max normalize each index to [0, 1] over all dates and pixels
dataset_v_norm = (
    (dataset_veg - dataset_veg.min())
    / (dataset_veg.max() - dataset_veg.min())
)

# stack the variables into a single DataArray with a "variable" dimension
data_land = dataset_v_norm.to_dataarray()
data_land

| | Array | Chunk |
|---|---|---|
| Bytes | 27.37 GiB | 2.00 MiB |
| Shape | (3, 378, 1800, 3600) | (1, 1, 512, 1024) |
| Dask graph | 18144 chunks in 3484 graph layers | |
| Data type | float32 numpy.ndarray | |
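
To see how the normalized indexes turn into colors, here is a small sketch with made-up 2x2 maps of already-clipped values. The helper name `minmax` is hypothetical, and NumPy's NaN-aware reductions are used in place of the xarray `.min()` and `.max()` calls.

```python
import numpy as np

def minmax(band):
    """Scale one band to [0, 1], ignoring NaN when finding the limits."""
    lo, hi = np.nanmin(band), np.nanmax(band)
    return (band - lo) / (hi - lo)

# Made-up, already-clipped index maps (rows x columns)
mari = np.array([[1.3, 2.0], [2.0, np.nan]])
cire = np.array([[0.5, 2.0], [0.5, 1.25]])
car = np.array([[1.3, 6.2], [6.2, 3.75]])

# Stack as (channel, H, W), then move the channel axis last for image display
stack = np.stack([minmax(mari), minmax(cire), minmax(car)])
rgb = np.moveaxis(stack, 0, -1)  # (H, W, 3)
```

In this toy example the lower-left pixel has maximal mARI and Car but minimal CIRE, so it maps to (1, 0, 1), which is magenta. The upper-right pixel is maximal in all three indexes and maps to white.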


Now we will select the three vegetation indexes that are used in the original image.

In [11]:
plant_pigments = data_land.sel(
    variable=["mari", "cire", "car"]
)


#### Clip to Australia
Although we passed the Australia bounding box to the `earthaccess` query, the returned images are not cropped to that area because the L3 products are global. Here, we will clip the products to the bounding box.

In [12]:
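# bbox is ordered (min_lon, min_lat, max_lon, max_lat)
min_lon, min_lat, max_lon, max_lat = bbox
# latitude is assumed to be stored north to south, so the slice runs from max_lat down to min_lat
australia = plant_pigments.sel(lat=slice(max_lat, min_lat), lon=slice(min_lon, max_lon))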
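
A label-based `slice` in xarray has to follow the stored order of the coordinate, so the latitude bounds in the selection above only work if latitude runs from north to south in the files. As an order-independent alternative, the sketch below subsets a made-up 1-degree global grid with boolean masks in plain NumPy. The grid and the `data` array are hypothetical, not the PACE 0.1-degree grid.

```python
import numpy as np

# Hypothetical 1-degree global grid with latitude stored north to south
lat = np.arange(89.5, -90.0, -1.0)   # 89.5 ... -89.5
lon = np.arange(-179.5, 180.0, 1.0)  # -179.5 ... 179.5
data = np.zeros((lat.size, lon.size))

# Same bounding box as in the tutorial: (min_lon, min_lat, max_lon, max_lat)
min_lon, min_lat, max_lon, max_lat = (123.983, -27.459, 127.156, -22.571)

# Boolean masks do not depend on whether the coordinate ascends or descends
lat_mask = (lat >= min_lat) & (lat <= max_lat)
lon_mask = (lon >= min_lon) & (lon <= max_lon)
subset = data[np.ix_(lat_mask, lon_mask)]
```

On this coarse grid the box contains four latitude rows (-23.5 to -26.5) and three longitude columns (124.5 to 126.5).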

Next we will create a few functions to generate an animation of this data.

In [None]:
num_channels, num_dates, lat_dim, lon_dim = australia.shape

assert num_channels == 3, "Expected 3 variables for RGB"

# Normalize data to [0, 1] for imshow (if needed)
def normalize_rgb(rgb_array):
    """Normalize to [0, 1] per channel if values are not already in that range"""
    rgb_norm = np.empty_like(rgb_array, dtype=np.float32)
    for c in range(3):
        channel = rgb_array[c]
        min_val = np.nanmin(channel)
        max_val = np.nanmax(channel)
        rgb_norm[c] = (channel - min_val) / (max_val - min_val + 1e-8)
    return rgb_norm

# Create figure and axes
fig, ax = plt.subplots(figsize=(8, 6))

# Get first frame RGB image (load the dask-backed slice into memory first)
rgb0 = normalize_rgb(australia.isel(date=0).values)
rgb0 = np.moveaxis(rgb0, 0, -1)  # (3, H, W) → (H, W, 3)

im = ax.imshow(rgb0, animated=True)
title = ax.set_title("Time step 0")

# Animation update function
def update(frame):
    rgb = normalize_rgb(australia.isel(date=frame).values)
    rgb = np.moveaxis(rgb, 0, -1)  # (3, H, W) → (H, W, 3)
    im.set_data(rgb)
    title.set_text(f"Time step {frame}")
    return [im, title]

# blit=False so the title, which lies outside the axes, is redrawn on every frame
ani = animation.FuncAnimation(fig, update, frames=num_dates, blit=False, interval=500)
# save the animation 
#ani.save('australia_rgb_animation.gif', writer='pillow', fps=2)

plt.show()

### 3. Create an interactive plot to visualize index time series 

Combining spatial information and vegetation index time series in a single visualization can be a useful way to explore land patterns. Using the streams module from HoloViews, we can link a spatial map to plots of the vegetation indexes.

In [12]:
ds = dataset_v_norm

First we create a map of our RGB dataset to create a spatial plot for us to click on. 

In [None]:
# === RGB composite map ===
plant_pigments = ds.to_dataarray().sel(
    variable=["mari", "cire", "car"]
)
mymap = plant_pigments.hvplot.rgb(
    x='lon', y='lat', bands='variable', aspect='equal',
    frame_height=350, frame_width=550
)


This map will be an input to a function that returns values from the full dataset at the latitude and longitude selected by the user. To do so, we use the Point Draw tool from the HoloViews library.

Click in the RGB image to add a time series to the plots. You can also click and hold the mouse button, then drag previously placed points. To remove a point, click and hold the mouse button down, then press the backspace key.

In [None]:
import holoviews as hv  # not imported in the set-up cell; needed for hv.* below

hv.extension('bokeh')

# Set point limit and color cycle
POINT_LIMIT = 10
color_cycle = hv.Cycle('Category20')
colors = [str(color_cycle.values[i]) for i in range(POINT_LIMIT)]  # ensure strings

# Get center of dataset
xmid = ds.lon.values[int(len(ds.lon) / 2)]
ymid = ds.lat.values[int(len(ds.lat) / 2)]

# First default point and color
clicked_points = ([xmid], [ymid], [0], [colors[0]])
points_df = pd.DataFrame({
    'x': clicked_points[0],
    'y': clicked_points[1],
    'id': clicked_points[2],
    'color': clicked_points[3]
})

# Create Points element with color
points = hv.Points(points_df, vdims=['id', 'color']).opts(
    color='color', size=10, tools=['hover'], line_color='gray'
)

# Create PointDraw stream
points_stream = hv.streams.PointDraw(
    data=points_df.to_dict(orient='list'),
    source=points,
    drag=True,
    num_objects=POINT_LIMIT
)


# Pointer streams
posxy = hv.streams.PointerXY(source=mymap, x=xmid, y=ymid)

# === Click spectra functions for each variable ===
def make_click_spectra(varname):
    def plot(data):
        coordinates = []
        if data is None or not any(len(d) for d in data.values()):
            coordinates.append((clicked_points[0][0], clicked_points[1][0]))
        else:
            coordinates = list(zip(data['x'], data['y']))

        plots = []
        for i, coords in enumerate(coordinates):
            x, y = coords
            data_sel = ds.sel(lon=x, lat=y, method="nearest")
            color = str(colors[i % len(colors)])

            line = data_sel.hvplot.line(
                y=varname, x="date", label=f"Point {i}"
            ).opts(line_color=color, height=300, width=650)
            plots.append(line)

            points_stream.data["id"][i] = i
            points_stream.data["color"][i] = color

        return hv.Overlay(plots).opts(title=varname.capitalize())
    return plot

# Create three DynamicMaps
mari_dmap = hv.DynamicMap(make_click_spectra('mari'), streams=[points_stream])
car_dmap = hv.DynamicMap(make_click_spectra('car'), streams=[points_stream])
cire_dmap = hv.DynamicMap(make_click_spectra('cire'), streams=[points_stream])

# === Final layout: RGB map + points on top, then all plots in a column ===
layout = (
    (mymap.opts(
        title="RGB Composite Map",
        show_legend=True,
        fontscale=1.5
    ) * points).opts(
        hv.opts.Overlay(active_tools=['point_draw'])
    ) +
    mari_dmap +
    car_dmap +
    cire_dmap
).cols(1)

layout
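
The key step inside `make_click_spectra` is the nearest-neighbor lookup `ds.sel(lon=x, lat=y, method="nearest")`, which snaps each clicked location to the closest grid cell and returns its full time series. The sketch below reproduces that idea in plain NumPy on a tiny made-up cube. The function name `nearest_series` and all values are hypothetical.

```python
import numpy as np

def nearest_series(cube, lat, lon, click_lat, click_lon):
    """Return the time series at the grid cell closest to the clicked point.

    cube has shape (time, lat, lon); lat and lon are 1-D coordinate arrays.
    """
    i = int(np.abs(lat - click_lat).argmin())  # closest latitude row
    j = int(np.abs(lon - click_lon).argmin())  # closest longitude column
    return cube[:, i, j], (i, j)

# Tiny made-up grid: 2 time steps, 3 latitudes, 3 longitudes
lat = np.array([-22.0, -23.0, -24.0])
lon = np.array([124.0, 125.0, 126.0])
cube = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)

series, (i, j) = nearest_series(cube, lat, lon, click_lat=-23.2, click_lon=125.6)
```

Here the click at (-23.2, 125.6) snaps to the cell at latitude -23.0 and longitude 126.0, and `series` holds that cell's value at each time step.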

In [None]:
<video controls src="../vegetation_indices_animation