# Cerulean API Documentation

For most users, we recommend using the [Cerulean web application](https://cerulean.skytruth.org/), which provides a visual interface for exploring the complete set of Cerulean data layers. For users who want to access and download oil slick detection data directly, we provide free programmatic access through an OGC-compliant API ([api.cerulean.skytruth.org](https://api.cerulean.skytruth.org)).

Currently, only oil slick detections can be downloaded. Data used for source identification, including AIS tracks, vessel identities, and offshore oil platform locations, cannot be downloaded and can only be accessed via the Cerulean web application.

API queries can be made programmatically (e.g. with curl or with the `requests` library in Python) for direct data access and download. You can also run a query in a browser by pasting the request URL into the address bar; the browser will then either show the results of your query, including a helpful paginated map, or download the data directly.

Below, we provide some working examples of common data queries from our API. This is only a small sample of the types of queries that are possible. To dig deeper, please see our full API docs and check out the current documentation for [tipg](https://developmentseed.org/tipg/) and [CQL-2](https://cran.r-project.org/web/packages/rstac/vignettes/rstac-02-cql2.html), both of which are used by our API.

In [None]:
from IPython.display import clear_output
!pip install contextily
clear_output()

In [None]:
import requests
import pandas as pd
from io import StringIO
import geopandas as gpd
import contextily as ctx
import matplotlib.pyplot as plt

In [None]:
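def query_to_gdf_vis(data):
    """Plot the features of a GeoJSON response over an OpenStreetMap basemap."""
    gdf = gpd.GeoDataFrame.from_features(data["features"])
    # GeoJSON coordinates are longitude/latitude (WGS 84).
    gdf.crs = "EPSG:4326"
    # Reproject to Web Mercator so the geometries line up with the basemap tiles.
    gdf = gdf.to_crs(epsg=3857)
    ax = gdf.plot(figsize=(10, 10), alpha=0.5, edgecolor="k")
    ctx.add_basemap(
        ax, source=ctx.providers.OpenStreetMap.Mapnik, crs=gdf.crs.to_string()
    )
    plt.show()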



# Example 1. Return slicks within a bounding box

For our first example, let's return slick detection data found within a specific geographic area. To do this, you can use the bounding box (bbox) pattern. For example, the below command will download model detections near the Strait of Hormuz, using this bbox parameter as input:

```
?bbox=53.6,23.6,59.9,28.1
```




**NOTE:** In our examples we use a limit parameter to limit the number of entries returned from a query. If unspecified, all requests have a default limit value of 10: `&limit=10`. To make use of pagination, you can also use the parameter `&offset=60` to return entries starting at any arbitrary row (shown here returning from row 61 onwards).
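
To make the `limit` and `offset` pattern concrete, here is a minimal sketch that builds one request URL per page. The helper name `page_urls` and its arguments are our own illustration, not part of the API; only the `bbox`, `limit`, and `offset` parameters come from the note above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.cerulean.skytruth.org/collections/public.slick_plus/items"


def page_urls(bbox, page_size=100, max_pages=3):
    """Yield one request URL per page, advancing `offset` by `page_size`.

    Hypothetical helper for illustration only.
    """
    for page in range(max_pages):
        params = {"bbox": bbox, "limit": page_size, "offset": page * page_size}
        # safe="," keeps the bbox commas readable instead of encoding them as %2C
        yield f"{BASE_URL}?{urlencode(params, safe=',')}"


urls = list(page_urls("53.6,23.6,59.9,28.1"))
for url in urls:
    print(url)
```

When fetching these URLs in a loop, a reasonable stopping rule is to end as soon as a page returns fewer features than `page_size`.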


In [None]:
example_1_url = (
    "https://api.cerulean.skytruth.org/collections/public.slick_plus/items"
    "?bbox=53.6,23.6,59.9,28.1"
    "&limit=100"
)

response = requests.get(example_1_url)
if response.status_code == 200:
    data = response.json()
else:
    print("Response failed with code", response.status_code)

In [None]:
query_to_gdf_vis(data)

# Example 2. Query by date and time range

For our next example, let’s add a datetime filter to return slick detection data from December 2023, sorted by `slick_timestamp`. To do this, we add a sort parameter (`&sortby=slick_timestamp`) and a `datetime` parameter holding a start and an end timestamp separated by a slash. The required timestamp format is YYYY-MM-DDTHH:MM:SSZ, where the time is in UTC (matching the timestamps used in Sentinel-1 scene names).
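
If your dates start out as Python `datetime` objects, a small helper can produce the interval string in the required format. This is a sketch; the function names are hypothetical, and it assumes that naive datetimes (those without a timezone) are already in UTC.

```python
from datetime import datetime, timedelta, timezone


def to_api_timestamp(dt):
    """Format a datetime as YYYY-MM-DDTHH:MM:SSZ, converting to UTC first."""
    if dt.tzinfo is None:
        # Assumption: a naive datetime is already expressed in UTC
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def datetime_interval(start, end):
    """Join two timestamps with a slash, as the datetime parameter expects."""
    return f"{to_api_timestamp(start)}/{to_api_timestamp(end)}"


interval = datetime_interval(datetime(2023, 12, 1), datetime(2023, 12, 30))
print("&datetime=" + interval)

# A timezone-aware value is shifted to UTC before formatting
local = datetime(2023, 12, 1, 2, 0, tzinfo=timezone(timedelta(hours=2)))
local_as_utc = to_api_timestamp(local)
print(local_as_utc)
```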


In [None]:
example_2_url = (
    "https://api.cerulean.skytruth.org/collections/public.slick_plus/items"
    "?bbox=53.6,23.6,59.9,28.1"
    "&sortby=slick_timestamp"
    "&datetime=2023-12-01T00:00:00Z/2023-12-30T00:00:00Z"
    "&limit=100"
)
response = requests.get(example_2_url)
if response.status_code == 200:
    data = response.json()
else:
    print("Response failed with code", response.status_code)

In [None]:
query_to_gdf_vis(data)

# Example 3. Other Basic filtering

Our API also allows you to filter results using various properties of the slick detection data. For example, let’s repeat the query from example 2, but limit results to detections with a `machine_confidence` greater-than-or-equal-to (GTE) 60% and an area greater than (GT) 20 square km. The query writes these thresholds as `0.6` and `20000000`, which implies that `area` is expressed in square meters:

In [None]:
example_3_url = (
    "https://api.cerulean.skytruth.org/collections/public.slick_plus/items"
    "?bbox=53.6,23.6,59.9,28.1"
    "&sortby=slick_timestamp"
    "&datetime=2023-12-01T00:00:00Z/2023-12-30T00:00:00Z"
    "&filter=machine_confidence GTE 0.6 AND area GT 20000000"
    "&limit=100"
)
response = requests.get(example_3_url)
if response.status_code == 200:
    data = response.json()
else:
    print("Response failed with code", response.status_code)

In [None]:
query_to_gdf_vis(data)

Note that these filter commands include spaces and abbreviated operators such as GTE (greater-than-or-equal-to), which are patterns enabled by CQL-2. There are a large number of fields available for filtering. We’ll cover a few more common examples below, but for full documentation, see our [standard API docs](https://api.cerulean.skytruth.org/).
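
Because filter expressions contain spaces, they need to be percent-encoded when they travel in a URL. The `requests` library typically handles this for you, but other clients may not, so the sketch below shows the encoding explicitly. The helper names are hypothetical, and the square-meter unit for `area` is inferred from the example above (20 square km written as 20000000) rather than confirmed by the API docs.

```python
from urllib.parse import quote


def km2_to_m2(area_km2):
    """Convert square kilometers to square meters (assumed unit of `area`)."""
    return int(round(area_km2 * 1_000_000))


def build_filter(min_confidence, min_area_km2):
    """Build the confidence-and-area filter used in example 3."""
    return (
        f"machine_confidence GTE {min_confidence} "
        f"AND area GT {km2_to_m2(min_area_km2)}"
    )


cql = build_filter(0.6, 20)
encoded = quote(cql)  # spaces become %20
print(cql)
print("&filter=" + encoded)
```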

# Example 4. Filtering by source

For higher-confidence slicks detected by Cerulean, we apply a second model that finds any vessels or offshore oil infrastructure recorded in the vicinity of those slicks. Let’s repeat our query from example 2, but limit the results to slicks with a possible vessel or infrastructure source identified nearby.

In [None]:
example_4_url = (
    "https://api.cerulean.skytruth.org/collections/public.slick_plus/items"
    "?bbox=53.6,23.6,59.9,28.1"
    "&sortby=slick_timestamp"
    "&datetime=2023-12-01T00:00:00Z/2023-12-30T00:00:00Z"
    "&filter=(NOT source_type_1_ids IS NULL OR NOT source_type_2_ids IS NULL) AND cls != 1"
    "&limit=100"
)
response = requests.get(example_4_url)
if response.status_code == 200:
    data = response.json()
else:
    print("Response failed with code", response.status_code)

In [None]:
query_to_gdf_vis(data)

This one is a little complicated. Let’s break it down piece by piece:

*   `&filter=(NOT source_type_1_ids IS NULL OR NOT source_type_2_ids IS NULL)`. This command returns slicks where Cerulean has identified at least one potential source of type 1 (vessel) or type 2 (infrastructure). The syntax is a little confusing because of the double negative, but the command `NOT source_type_1_ids IS NULL` tells the API to fetch all slicks where the `source_type_1_ids` field has at least one entry, and the command `NOT source_type_2_ids IS NULL` does the same thing for `source_type_2_ids`.
*   `AND cls != 1`. This is a class filter that excludes all slicks of Class 1. Class 1 is “background,” which includes detections over land and other regions where oil slicks won’t plausibly occur. We recommend including this filter in most API queries.
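
When a query needs several conditions, it can help to assemble them into a single `filter` value in code. The sketch below uses a hypothetical helper, `all_of`, that wraps each clause in parentheses before joining with `AND`, so that an `OR` group keeps its intended grouping. It assumes the API accepts parenthesized clauses, as the source filter above already does.

```python
def all_of(*clauses):
    """Join CQL clauses with AND, parenthesizing each so OR groups stay intact."""
    return " AND ".join(f"({clause})" for clause in clauses)


# Clauses taken from the breakdown above
has_source = "NOT source_type_1_ids IS NULL OR NOT source_type_2_ids IS NULL"
not_background = "cls != 1"

combined = all_of(has_source, not_background)
print("&filter=" + combined)
```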




# Example 5. Download data as a .csv or .geojson

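**NOTE:** To return the query results directly as a CSV for download, append `&f=csv` to the request URL. The cell below leaves that parameter commented out; it reads the default response with GeoPandas and writes it to a CSV file locally.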

In [None]:
example_5_url = (
    "https://api.cerulean.skytruth.org/collections/public.slick_plus/items"
    "?bbox=53.6,23.6,59.9,28.1"
    "&sortby=slick_timestamp&datetime=2023-12-01T00:00:00Z/2023-12-30T00:00:00Z"
    "&filter=(NOT source_type_1_ids IS NULL OR NOT source_type_2_ids IS NULL) AND cls != 1"
    "&limit=10"
    # "&f=csv"
)

response = requests.get(example_5_url)
if response.status_code == 200:
    gdf = gpd.read_file(StringIO(response.text))
    gdf.to_csv("cerulean_data.csv", index=False)

If you prefer GeoJSON, you can append `&f=geojson` to the query instead, like this:

In [None]:
example_5_url = (
    "https://api.cerulean.skytruth.org/collections/public.slick_plus/items"
    "?bbox=53.6,23.6,59.9,28.1"
    "&sortby=slick_timestamp&datetime=2023-12-01T00:00:00Z/2023-12-30T00:00:00Z"
    "&filter=(NOT source_type_1_ids IS NULL OR NOT source_type_2_ids IS NULL) AND cls != 1"
    "&limit=10"
    "&f=geojson"
)
response = requests.get(example_5_url)
if response.status_code == 200:
    gdf = gpd.read_file(StringIO(response.text))
    gdf.to_file("cerulean_data.geojson", driver="GeoJSON")

In [None]:
gdf

# Example 6. Return a specific slick by its ID

If you know which slick you want to pull from the API (say, slick 1074498 from above), you can fetch it by its ID using a query like this:

In [None]:
example_6_url = (
    "https://api.cerulean.skytruth.org/collections/public.slick_plus/items?id=1074498"
)
response = requests.get(example_6_url)
if response.status_code == 200:
    data = response.json()
else:
    print("Response failed with code", response.status_code)

In [None]:
query_to_gdf_vis(data)

# Example 7. Return all slicks detected in a specific Sentinel-1 scene

If you want to return all slick detections in a specific Sentinel-1 scene, use a query like this:

In [None]:
example_7_url = (
    "https://api.cerulean.skytruth.org/collections/public.slick_plus/items"
    "?s1_scene_id=S1A_IW_GRDH_1SDV_20231208T002246_20231208T002311_051556_063950_7812"
)
response = requests.get(example_7_url)
if response.status_code == 200:
    data = response.json()
else:
    print("Response failed with code", response.status_code)

In [None]:
query_to_gdf_vis(data)

Now, let’s limit our results to slick detections in that Sentinel-1 scene with an area greater than 1 square km (`1000000` in the query) and a machine confidence greater-than-or-equal-to 50%:

In [None]:
example_7_url = (
    "https://api.cerulean.skytruth.org/collections/public.slick_plus/items"
    "?s1_scene_id=S1A_IW_GRDH_1SDV_20231208T002246_20231208T002311_051556_063950_7812"
    "&filter=machine_confidence GTE 0.5 AND area GT 1000000"
)
response = requests.get(example_7_url)
if response.status_code == 200:
    data = response.json()
else:
    print("Response failed with code", response.status_code)

In [None]:
query_to_gdf_vis(data)

# Filtering by Exclusive Economic Zone (EEZ) and Marine Protected Area (MPA)

Cerulean keeps track of the world's EEZs and MPAs by assigning each one a unique AOI id. To filter slicks by one of these areas of interest, you first need to find its `aoi_id` by querying the `public.aoi_eez` or `public.aoi_mpa` table. Once you have an `aoi_id`, you can find slick detections using the queryable fields `aoi_type_1_ids` (for EEZs) or `aoi_type_2_ids` (for MPAs).
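
The two-step lookup can be sketched as a pair of URL builders. The collection names and the `mrgid`, `wdpaid`, and `aoi_id` parameters are taken from the queries in this section; the helper names and the `AOI_TABLES` mapping are hypothetical, and the `aoi_id` value `123` is a placeholder rather than a real identifier.

```python
API_ROOT = "https://api.cerulean.skytruth.org/collections"

# AOI kind -> (collection to query, name of its external identifier field)
AOI_TABLES = {
    "eez": ("public.aoi_eez", "mrgid"),
    "mpa": ("public.aoi_mpa", "wdpaid"),
}


def aoi_lookup_url(kind, external_id):
    """Step 1: URL that looks up an AOI record by its MRGID or WDPAID."""
    table, id_field = AOI_TABLES[kind]
    return f"{API_ROOT}/{table}/items?{id_field}={external_id}"


def slicks_by_aoi_url(aoi_id, limit=100):
    """Step 2: URL that requests slick detections for a known aoi_id."""
    return f"{API_ROOT}/public.get_slicks_by_aoi/items?aoi_id={aoi_id}&limit={limit}"


eez_url = aoi_lookup_url("eez", 5679)
slick_url = slicks_by_aoi_url(123)  # 123 is a placeholder aoi_id
print(eez_url)
print(slick_url)
```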

## Search for an EEZ based on MRGID (Marine Regions gazetteer id)
Let's query the `public.aoi_eez` table and explore the result to find an `aoi_id` associated with the Greek EEZ. Its MRGID is `5679`.

In [None]:
example_eez_url = (
    "https://api.cerulean.skytruth.org/collections/public.aoi_eez/items"
    "?mrgid=5679"
)
response = requests.get(example_eez_url)
if response.status_code == 200:
    data = response.json()

In [None]:
data["features"][0]["properties"]

Next, query slicks using the `aoi_id` associated with the Greek EEZ:

In [None]:
aoi_id = data["features"][0]["properties"]["aoi_id"]
example_eez_slick_url = (
    "https://api.cerulean.skytruth.org/collections/public.get_slicks_by_aoi/items"
    f"?aoi_id={aoi_id}"
    "&limit=100"
    "&filter=machine_confidence GTE 0.99 AND area GT 2000000"
)
response = requests.get(example_eez_slick_url)
if response.status_code == 200:
    data = response.json()
else:
    print("Response failed with code", response.status_code)

In [None]:
query_to_gdf_vis(data)

## Search for an MPA based on WDPAID (World Database on Protected Areas id)
Let's query the `public.aoi_mpa` table and explore the result to find an `aoi_id` associated with the Great Barrier Reef. Its WDPAID is `2571`.




In [None]:
example_mpa_url = (
    "https://api.cerulean.skytruth.org/collections/public.aoi_mpa/items"
    "?wdpaid=2571"
)
response = requests.get(example_mpa_url)
if response.status_code == 200:
    data = response.json()

In [None]:
data["features"][0]["properties"]

In [None]:
aoi_id = data["features"][0]["properties"]["aoi_id"]
example_mpa_slick_url = (
    "https://api.cerulean.skytruth.org/collections/public.get_slicks_by_aoi/items"
    f"?aoi_id={aoi_id}"
    "&limit=100"
    "&filter=machine_confidence GTE 0.99 AND area GT 2000000"
)
response = requests.get(example_mpa_slick_url)
if response.status_code == 200:
    data = response.json()
else:
    print("Response failed with code", response.status_code)

In [None]:
query_to_gdf_vis(data)

**NOTE:** Not all geometries that show up in an MPA will be true oil detections. It is important to verify the validity of the detections using the original Sentinel-1 imagery that the data was derived from. We recommend using the Cerulean UI to do this.

# Conclusion
We hope this summary helps you get started with Cerulean’s API. This is a small sample of the data queries that are currently possible with Cerulean’s API. For full documentation, please see our [standard API docs](https://api.cerulean.skytruth.org/).