---
title: OSI - Oil Spill Index
subtitle: Learn how to use the Oil Spill Index (OSI) to detect oil spills using Sentinel-2 data.
authors:
  - name: Juraj Zvolenský
    orcid: 0009-0000-9185-7955
    github: jzvolensky
    affiliations:
      - id: Eurac Research
        institution: Eurac Research
        ror: 01xt1w755
  - name: Michele Claus
    orcid: 0000-0003-3680-381X
    github: clausmichele
    affiliations:
      - id: Eurac Research
        institution: Eurac Research
        ror: 01xt1w755
date: 2025-01-29
thumbnail: https://raw.githubusercontent.com/EOPF-Sample-Service/eopf-sample-notebooks/refs/heads/main/notebooks/static/ESA_logo_2020_Deep.png
keywords: ["earth observation", "remote sensing"]
tags: ["template"]
releaseDate: 2025-01-29
datePublished: 2025-01-29
dateModified: 2025-06-10
github: https://github.com/EOPF-Sample-Service/eopf-sample-notebooks
license: Apache-2.0
---

```{image} ../static/ESA_EOPF_logo_2025_COLOR_ESA_blue_reduced.png
:alt: ESA EOPF Zarr Logo
:width: 250px
:align: center
```

## Table of contents

- [Introduction](#Introduction)
- [Setup](#Setup)
- [Read EOPF-Zarr](#Read-EOPF-Zarr)

(Introduction)=
## Introduction

The OSI (Oil Spill Index) uses visible Sentinel-2 bands to highlight oil spills on water in coastal and marine environments. The numerator sums the bands on the shoulders of the absorption feature of oil (B03, green, and B04, red), and the denominator is the band nearest to that absorption feature (B02, blue):

OSI = (B03 + B04) / B02

The index was proposed by Sankaran Rajendran and is described in the [Sentinel Hub custom scripts documentation](https://custom-scripts.sentinel-hub.com/sentinel-2/oil-spill-index/).
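
As a minimal sketch of the formula above, the ratio can be computed element-wise with NumPy. The function name `oil_spill_index`, the masking of non-positive denominators, and the small synthetic reflectance arrays are assumptions made for illustration; they are not taken from the original Sentinel Hub script.

```python
import numpy as np


def oil_spill_index(b02, b03, b04):
    """Compute OSI = (B03 + B04) / B02 element-wise.

    Pixels where B02 is not strictly positive are returned as NaN
    (an illustrative choice, not part of the index definition).
    """
    b02 = np.asarray(b02, dtype="float64")
    b03 = np.asarray(b03, dtype="float64")
    b04 = np.asarray(b04, dtype="float64")
    # Replace invalid denominators with NaN so the division stays well defined
    denominator = np.where(b02 > 0, b02, np.nan)
    return (b03 + b04) / denominator


# Synthetic 2x2 reflectance values, for demonstration only
b02 = np.array([[0.10, 0.00], [0.05, 0.20]])
b03 = np.array([[0.08, 0.10], [0.05, 0.10]])
b04 = np.array([[0.02, 0.10], [0.05, 0.10]])

osi = oil_spill_index(b02, b03, b04)
print(osi)
```

Because the arithmetic is element-wise, the same expression should also work on the `xarray.DataArray` bands loaded later in this notebook, provided they share the same grid.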

:::{hint} Overview
**Questions**
- How to access Sentinel-2 Zarr data via STAC?
- How to calculate the OSI (Oil Spill Index) using Sentinel-2 data?
- Why is it useful?

**Objectives**
- Access Sentinel-2 Zarr data via STAC.
- Calculate the OSI (Oil Spill Index) using Sentinel-2 data.
- Understand the usefulness of the OSI for detecting oil spills.
- Learn how to visualize the OSI results.
:::

(Setup)=
## Setup
Start by importing the necessary libraries.

In [1]:
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np
import pystac_client
import xarray as xr
from pystac_client import CollectionSearch
from dask.distributed import Client, LocalCluster
from datetime import date

In [None]:
cluster = LocalCluster(processes=False)
client = cluster.get_client()
cluster

In [2]:
# Initialize the collection search
search = CollectionSearch(
    url="https://stac.core.eopf.eodc.eu/collections",  # STAC /collections endpoint
)

# Retrieve all matching collections (as dictionaries)
for collection_dict in search.collections_as_dicts():
    print(collection_dict["id"])

sentinel-2-l2a
sentinel-3-slstr-l1-rbt
sentinel-3-olci-l2-lfr
sentinel-2-l1c
sentinel-3-slstr-l2-lst
sentinel-1-l1-slc
sentinel-3-olci-l1-efr
sentinel-3-olci-l1-err
sentinel-1-l2-ocn
sentinel-1-l1-grd
sentinel-3-olci-l2-lrr




In [4]:
catalog = pystac_client.Client.open("https://stac.core.eopf.eodc.eu")
# Search the Sentinel-2 L2A collection by bounding box and date range
items = list(
    catalog.search(
        collections=["sentinel-2-l2a"],
        bbox=[56.9998077380456, -20.8876343730305, 58.0554474534573, -19.8927954124891],
        datetime=["2020-06-30", "2020-10-01"],
    ).items()
)
print(f"items found: {len(items)}")


items found: 19
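
The `bbox` passed to the search follows the STAC API order `[west, south, east, north]` in decimal degrees. Swapping longitude and latitude is an easy mistake, so a small sanity check can help before querying. The helper `bbox_center` below is hypothetical and not part of `pystac_client`; it only validates the ordering and returns the center of the box.

```python
def bbox_center(bbox):
    """Validate a [west, south, east, north] bbox and return its (lon, lat) center."""
    west, south, east, north = bbox
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    if west > east:
        # Boxes crossing the antimeridian are valid in STAC but out of scope here
        raise ValueError("this sketch does not handle antimeridian-crossing boxes")
    return ((west + east) / 2, (south + north) / 2)


# Same bounding box as in the search above
bbox = [56.9998077380456, -20.8876343730305, 58.0554474534573, -19.8927954124891]
lon, lat = bbox_center(bbox)
print(f"center: lon={lon:.3f}, lat={lat:.3f}")
```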


In [5]:
# Select items based on specific dates:
# a. before (17 July 2020), b. and c. during (1 and 6 August 2020), and d. after (5 September 2020) the oil spill
target_dates = {
    date(2020, 7, 17),
    date(2020, 8, 1),
    date(2020, 8, 6),
    date(2020, 9, 5),
}
selected_items = [item for item in items if item.datetime.date() in target_dates]  # type: ignore
assert len(selected_items) == 4, f"Expected 4 items, got {len(selected_items)}"

In [6]:
selected_items

[<Item id=S2B_MSIL2A_20200905T062449_N0500_R091_T40KEC_20230328T195847>,
 <Item id=S2B_MSIL2A_20200806T062449_N0500_R091_T40KEC_20230407T081024>,
 <Item id=S2A_MSIL2A_20200801T062451_N0500_R091_T40KEC_20230413T211517>,
 <Item id=S2B_MSIL2A_20200717T062449_N0500_R091_T40KEC_20230414T094728>]
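
The listing above is ordered from newest to oldest. For a before/during/after comparison it may be more convenient to sort the items chronologically. The sketch below uses `SimpleNamespace` stand-ins that mimic only the `id` and `datetime` attributes of a STAC item; the ids are hypothetical labels and the timestamps are taken from the product names above.

```python
from datetime import datetime, timezone
from types import SimpleNamespace

# Stand-ins for STAC items (illustrative only, not real pystac objects)
fake_items = [
    SimpleNamespace(id="after", datetime=datetime(2020, 9, 5, 6, 24, 49, tzinfo=timezone.utc)),
    SimpleNamespace(id="during-2", datetime=datetime(2020, 8, 6, 6, 24, 49, tzinfo=timezone.utc)),
    SimpleNamespace(id="during-1", datetime=datetime(2020, 8, 1, 6, 24, 51, tzinfo=timezone.utc)),
    SimpleNamespace(id="before", datetime=datetime(2020, 7, 17, 6, 24, 49, tzinfo=timezone.utc)),
]

# Sort by acquisition time, oldest first
ordered = sorted(fake_items, key=lambda item: item.datetime)
labels = [item.datetime.date().isoformat() for item in ordered]
print(labels)
```

With the real list, the equivalent call would be `sorted(selected_items, key=lambda item: item.datetime)`, assuming every item carries a non-null `datetime`.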

In [10]:
def open_bands_20m(item):
    # Open the Zarr product using the keyword arguments advertised by the STAC asset
    product = item.assets["product"]
    ds = xr.open_dataset(product.href, **product.extra_fields["xarray:open_datatree_kwargs"])
    return {
        "item_id": item.id,
        "date": item.datetime.date(),
        "B02": ds["measurements_r20m_b02"],
        "B03": ds["measurements_r20m_b03"],
        "B04": ds["measurements_r20m_b04"],
        "B05": ds["measurements_r20m_b05"],
        "B06": ds["measurements_r20m_b06"],
        "B07": ds["measurements_r20m_b07"],
        "B8A": ds["measurements_r20m_b8a"],
        "B11": ds["measurements_r20m_b11"],
        "B12": ds["measurements_r20m_b12"],
    }

# Load for your selected STAC Items
band_data = [open_bands_20m(item) for item in selected_items]


In [11]:
band_data

[{'item_id': 'S2B_MSIL2A_20200905T062449_N0500_R091_T40KEC_20230328T195847',
  'date': datetime.date(2020, 9, 5),
  'B02': <xarray.DataArray 'measurements_r20m_b02' (measurements_r20m_y_20m: 5490,
                                             measurements_r20m_x_20m: 5490)> Size: 241MB
  dask.array<open_dataset-measurements_r20m_b02, shape=(5490, 5490), dtype=float64, chunksize=(915, 915), chunktype=numpy.ndarray>
  Coordinates:
      measurements_r20m_x  (measurements_r20m_x_20m) float32 22kB dask.array<chunksize=(915,), meta=np.ndarray>
      measurements_r20m_y  (measurements_r20m_y_20m) float32 22kB dask.array<chunksize=(915,), meta=np.ndarray>
  Dimensions without coordinates: measurements_r20m_y_20m, measurements_r20m_x_20m
  Attributes: (4),
  'B03': <xarray.DataArray 'measurements_r20m_b03' (measurements_r20m_y_20m: 5490,
                                             measurements_r20m_x_20m: 5490)> Size: 241MB
  dask.array<open_dataset-measurements_r20m_b03, shape=(5490, 5490), d