## Access public bleaching data from MERMAID API

Run binder (ctrl + click to open in new tab): [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/data-mermaid/jupyter.git/main?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252Fdata-mermaid%252Fmermaid-api%26urlpath%3Dlab%252Ftree%252Fmermaid-api%252Fexamples%252Fsummarysites.ipynb%26branch%3Dmaster)

This is a basic demonstration of accessing the unauthenticated site-level aggregated data endpoint from the MERMAID API.

- "unauthenticated": Data retrieved from this endpoint depends on the per-protocol data sharing policies selected by the project's administrators. If a protocol's data policy is set to `private`, the number of sample units at the site is returned, but not the actual (average) data. Use authenticated endpoints for access to data in projects to which a user belongs.
- "site-level": % cover is collected per quadrat, then averaged for the sample unit (quadrat collection): `percent_hard_avg` etc. The site summary endpoint averages these averages (e.g. `percent_hard_avg_avg`) for all sample units collected at the site. A site in MERMAID is unique to a project, i.e. there can be multiple `site`s at the same location, across projects. It is also possible for the averaged data to span multiple dates (within a project). For date-specific averages, the authenticated `sampleevent` endpoint should be used.
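
The two-stage averaging described in the "site-level" bullet can be illustrated with a small sketch. The quadrat values and sample unit names below are invented for illustration and are not real MERMAID data; only the `percent_hard_avg` and `percent_hard_avg_avg` naming comes from the text above.

```python
from statistics import mean

# Hypothetical hard coral % cover per quadrat, grouped by sample unit
# (quadrat collection). These numbers are invented for illustration.
sample_units = {
    "quadrat_collection_1": [10.0, 20.0, 30.0],
    "quadrat_collection_2": [40.0, 60.0],
}

# Stage 1: average the quadrats within each sample unit (like percent_hard_avg)
percent_hard_avg = {name: mean(values) for name, values in sample_units.items()}

# Stage 2: average the sample unit averages for the site (like percent_hard_avg_avg)
percent_hard_avg_avg = mean(percent_hard_avg.values())

# For comparison: the pooled mean over all quadrats, which differs from the
# average of averages when sample units have different numbers of quadrats.
pooled = mean(v for values in sample_units.values() for v in values)

print(percent_hard_avg)
print(percent_hard_avg_avg, pooled)
```

Because the two sample units here contain different numbers of quadrats, the average of averages and the pooled mean do not agree, which is worth keeping in mind when comparing site-level values with quadrat-level data.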

For documentation of most other API endpoints, see https://mermaid-api.readthedocs.io/en/latest/. Also useful is the [Insomnia](https://insomnia.rest/) collection included with the [API repository](https://mermaid-api.readthedocs.io/mermaid_api.insomnia_collection.json).

### Set up

Import required libraries and set constants (change `api.datamermaid.org` to `dev-api.datamermaid.org` if desired).

Set filters as key/value pairs, if needed. Without them all sites in MERMAID are returned. A full list of filters is available [here](https://mermaid-api.readthedocs.io/en/latest/aggregated.html#site-summary-view).

In [None]:
import pandas as pd
import requests
from pprint import pprint

API_URL = "https://api.datamermaid.org/v1"
SUMMARY_SITES = "summarysites/"

filters = {
    # "project_name": "Kenya MACMON 2016-17",
}
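
As a rough sketch of how key/value filters end up in the request URL, the following uses only the standard library rather than `requests`. The `build_url` helper is hypothetical and exists only for this illustration; the filter value is the commented-out example from the cell above.

```python
from urllib.parse import urlencode

API_URL = "https://api.datamermaid.org/v1"
SUMMARY_SITES = "summarysites/"

# Same shape as the `filters` dict above, with the example filter enabled
example_filters = {"project_name": "Kenya MACMON 2016-17"}


def build_url(base, endpoint, params):
    """Hypothetical helper: append params to the endpoint URL as a query string."""
    url = f"{base}/{endpoint}"
    return f"{url}?{urlencode(params)}" if params else url


filtered_url = build_url(API_URL, SUMMARY_SITES, example_filters)
unfiltered_url = build_url(API_URL, SUMMARY_SITES, {})
print(filtered_url)
print(unfiltered_url)
```

With an empty dict no query string is added, which corresponds to requesting all sites.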

### Fetch data

In [None]:
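# Reuse one session so headers and connections are shared across requests
session = requests.Session()
session.headers.update({"content-type": "application/json"})
url = f"{API_URL}/{SUMMARY_SITES}"

# Prepare the request first so the final URL (including filters) can be printed
request = requests.Request("GET", url=url, params=filters)
prepped = session.prepare_request(request)
print(f"url: {prepped.url}")
raw_response = session.send(prepped)
raw_response.raise_for_status()  # fail early on HTTP errors
response = raw_response.json()
print(f"Got {response['count']} sites total")

# Results are paginated: follow the "next" links until none remain
sites = response["results"]
while response["next"]:
    request = requests.Request("GET", url=response["next"])
    prepped = session.prepare_request(request)
    print(f"next page url: {prepped.url}")
    raw_response = session.send(prepped)
    raw_response.raise_for_status()
    response = raw_response.json()
    sites.extend(response["results"])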
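
The pagination loop above could also be written as a small generator, which makes it possible to try out without network access. This is a minimal sketch: `iter_results`, `fetch_page`, and the fake pages are hypothetical names and data made up for illustration; only the `results`, `next`, and `count` keys come from the responses used above.

```python
def iter_results(first_page, fetch_page):
    """Yield every item from a paginated response, following "next" links.

    fetch_page is any callable that takes a URL and returns the parsed page.
    """
    page = first_page
    while True:
        yield from page["results"]
        if not page.get("next"):
            break
        page = fetch_page(page["next"])


# Fake pages standing in for API responses (invented for illustration)
fake_pages = {
    "page2": {"count": 3, "next": None, "results": [{"site_name": "C"}]},
}
first = {
    "count": 3,
    "next": "page2",
    "results": [{"site_name": "A"}, {"site_name": "B"}],
}

all_sites = list(iter_results(first, fake_pages.__getitem__))
site_names = [s["site_name"] for s in all_sites]
print(site_names)
```

In the notebook, `fetch_page` could be something like `lambda url: session.get(url).json()`, with the first parsed response passed in as `first_page`.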

### Filter data for sites containing bleaching data

The bleaching quadrat collection protocol contains two sets of observations, one that counts the number of colonies for specific taxa and their condition (`colonies_bleached`), and one that estimates percent cover per quadrat (`quadrat_benthic_percent`).

In [None]:
sites_with_bleaching_data = [
    site for site in sites
    if "colonies_bleached" in site["protocols"]
       and "quadrat_benthic_percent" in site["protocols"]
]
print(f"Sites with bleaching data: {len(sites_with_bleaching_data)}")

# uncomment to examine first two results in full
# pprint(sites_with_bleaching_data[:2])
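
Before filtering, it can help to see which protocols are present across the returned sites. The sketch below tallies protocol keys with `collections.Counter`. The `sample_sites` list is invented for illustration and assumes only that each site has a `protocols` dict keyed by protocol name, as in the filter above; the `site_name` field is an assumption.

```python
from collections import Counter

# Invented stand-ins for entries of `sites`; real entries contain more fields
sample_sites = [
    {"site_name": "A", "protocols": {"colonies_bleached": {}, "quadrat_benthic_percent": {}}},
    {"site_name": "B", "protocols": {"benthicpit": {}}},
    {"site_name": "C", "protocols": {"colonies_bleached": {}, "quadrat_benthic_percent": {}, "benthiclit": {}}},
]

# Count how many sites report each protocol
protocol_counts = Counter(
    name for site in sample_sites for name in site["protocols"]
)

for name, count in protocol_counts.most_common():
    print(f"{name}: {count}")
```

Replacing `sample_sites` with the `sites` list fetched earlier should give the same kind of tally for real data.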

### Create analysis-ready dataframe

This step demonstrates one common analysis pattern. Other site properties could be included if needed, or the `site` objects could be used in other ways that do not involve a dataframe.

In [None]:
indicators = []

for site in sites_with_bleaching_data:
    bleaching_benthic = site["protocols"]["quadrat_benthic_percent"]
    bleaching_colonies = site["protocols"]["colonies_bleached"]
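    # Benthic cover by category from the benthicpit or benthiclit protocol, if present.
    # Note: when a site has both, the benthiclit values overwrite the benthicpit values.
    benthic = {}
    if "benthicpit" in site["protocols"]:
        benthic = site["protocols"]["benthicpit"].get("percent_cover_benthic_category_avg", {})
    if "benthiclit" in site["protocols"]:
        benthic = site["protocols"]["benthiclit"].get("percent_cover_benthic_category_avg", {})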

    site_data = {
        "hard_coral_benthic": benthic.get("Hard coral"),
        "soft_coral_benthic": benthic.get("Soft coral"),
        "macroalgae_benthic": benthic.get("Macroalgae"),
        "hard_coral_bleaching": bleaching_benthic.get("percent_hard_avg_avg"),
        "soft_coral_bleaching": bleaching_benthic.get("percent_soft_avg_avg"),
        "macroalgae_bleaching": bleaching_benthic.get("percent_algae_avg_avg"),
        "percent_normal_avg": bleaching_colonies.get("percent_normal_avg"),
        "percent_pale_avg": bleaching_colonies.get("percent_pale_avg"),
        "percent_bleached_avg": bleaching_colonies.get("percent_bleached_avg"),
    }

    indicators.append(site_data)

df = pd.DataFrame(indicators)
print(df)
