# FathomNet Python API Tutorial
*So you want to use FathomNet data...*

<img src="https://raw.githubusercontent.com/fathomnet/fathomnet-logo/main/FathomNet_white_CenterText_400px.png" alt="FathomNet logo" width="200"/>

## Introduction

> `fathomnet-py` is a client-side API to help scientists, researchers, and developers interact with FathomNet data.

[![tests](https://github.com/fathomnet/fathomnet-py/actions/workflows/tests.yml/badge.svg)](https://github.com/fathomnet/fathomnet-py/actions/workflows/tests.yml)
[![Documentation Status](https://readthedocs.org/projects/fathomnet-py/badge/?version=latest)](https://fathomnet-py.readthedocs.io/en/latest/?badge=latest)

The [fathomnet-py](https://github.com/fathomnet/fathomnet-py) API offers native Python interaction with the FathomNet REST API, abstracting away the underlying HTTP requests.
This notebook is designed to walk you through some of the core functionality of the API. 

It's split into three parts:
1. [**API**](#api): API overview and data visualizations
2. [**Data**](#building-and-exploring-a-dataset-from-fathomnet): Building and exploring a dataset from FathomNet
3. [**Models**](#inference-with-a-pre-trained-model): Running images from FathomNet through a pre-trained model available on the [FathomNet Model Zoo](https://github.com/fathomnet/models)

This notebook is by no means exhaustive; it serves to show some common "recipes" for pulling down and handling FathomNet data in Python. **Full documentation for fathomnet-py is available at [fathomnet-py.readthedocs.io](https://fathomnet-py.readthedocs.io).**

[FathomNet GitHub](https://github.com/fathomnet)

### Installing `fathomnet-py`

To install fathomnet-py, you will need to have Python 3.7 or greater installed first (as of the time of writing, this notebook ships with Python 3.9). Then, from the command-line:

```bash
pip install fathomnet
```

This notebook installs fathomnet-py in the [Setup](#setup) section next, along with some relevant packages for data manipulation and visualization.

<a name="setup"></a>
## Setup

Note: this notebook assumes you are running in a Colab environment. If this is not the case, you may have to manually install a few packages that come pre-installed in Colab, such as numpy and pandas.

First, we'll install a few packages via pip:

In [None]:
!pip install -q -U fathomnet ipyleaflet

Next, import the auxiliary modules we need for part 1:

In [None]:
import ipywidgets as widgets  # Provides embedded widgets
import ipyleaflet  # Provides map widgets
import requests  # Manages HTTP requests
import plotly.express as px  # Generates nice plots
import random  # Generates pseudo-random numbers
from PIL import Image, ImageFont, ImageDraw  # Facilitates image operations
from io import BytesIO  # Interfaces byte data

<a name="api"></a>
## The API

Now that we have fathomnet-py installed, let's see what it can do!

This section will show some of the common calls to pull down FathomNet data, and then we'll render some visualizations of the results.

### Overview

The two main parts of fathomnet-py are the **modules** and the **data classes**.

#### Modules

fathomnet-py offers a variety of modules that encapsulate their relevant API operations. In brief:

- `boundingboxes` --- find & manage bounding boxes
- `darwincore` --- list owner institutions
- `images` --- find & manage images
- `geoimages` --- query for geo-images (geographic info only of images)
- `imagesetuploads` --- find & manage image set uploads
- `regions` --- list marine regions
- `stats` --- compute summary statistics
- `tags` --- find & manage custom image tags
- `taxa` --- get taxonomic information via a taxa provider
- `users` --- manage user accounts & list contributors
- `firebase` & `xapikey` --- authenticate for write-level operations

*Note: We will repeatedly import some of these modules in the notebook to highlight what's being used in each step. In your code, you only need to import a module once.*

Each operation (API call) is represented as a function in its given module. For example, to get an image by its universally-unique identifier (UUID), we can import the `fathomnet.api.images` module and call the `find_by_uuid` function.

In [None]:
from fathomnet.api import images

example_image = images.find_by_uuid("79958ac5-832a-488c-9b48-cce7db346497")

#### Data classes

To facilitate parsing and saving FathomNet data, native Python dataclasses are provided in the `fathomnet.dto` module.

For example, we can see that the returned image from the `find_by_uuid` call above is of type `AImageDTO`.

In [None]:
type(example_image)

These native data representations make it easier to write Python programs around FathomNet data. We'll print out some of the fields here.

In [None]:
print("Image URL:", example_image.url)

print("Captured at latitude/longitude", example_image.latitude, example_image.longitude)

print("There are", len(example_image.boundingBoxes), "bounding boxes:")
for box in example_image.boundingBoxes:
    print("-", box.concept, "has area", box.width * box.height)

We can convert (serialize/deserialize) any of the FathomNet dataclasses to/from JSON or Python dictionaries. Let's print out the contents of that example image as JSON.

In [None]:
print(example_image.to_json(indent=2))
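
The round trip can be pictured with a minimal standard-library sketch. `ExampleBox` below is a hypothetical stand-in, not a fathomnet-py class; it only illustrates the idea of a dataclass going to JSON and back.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ExampleBox:
    """Hypothetical stand-in for a bounding-box data class."""

    concept: str
    x: int
    y: int
    width: int
    height: int


box = ExampleBox(concept="Chionoecetes tanneri", x=10, y=20, width=30, height=40)

# Serialize: dataclass -> dict -> JSON string
box_json = json.dumps(asdict(box), indent=2)

# Deserialize: JSON string -> dict -> dataclass
restored = ExampleBox(**json.loads(box_json))

print(box_json)
print(restored == box)
```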

### Bar chart of concepts with the most bounding boxes

Here we will use a `boundingboxes` operation, called `count_total_by_concept`, to get a quick count of the total number of bounding boxes for every concept in FathomNet. To visualize, we'll make a bar chart of the top `N`.

⚙ Try changing the value of `N` on the right to show more concepts!

In [None]:
from fathomnet.api import boundingboxes

# Make a bar chart of the top N concepts by bounding boxes
N = 11  # @param {type:"slider", min:5, max:20, step:1}

# Get the number of bounding boxes for all concepts
concept_counts = boundingboxes.count_total_by_concept()

# Sort by number of bounding boxes
concept_counts.sort(key=lambda cc: cc.count, reverse=True)

# Get the top N concepts and their counts
concepts, counts = zip(*((cc.concept, cc.count) for cc in concept_counts[:N]))

# Make a bar chart
fig = px.bar(
    x=concepts,
    y=counts,
    labels={"x": "Concept", "y": "Bounding box count"},
    title=f"Top {N} concepts",
    text_auto=True,
)
fig.show()
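
If you only need the top `N` entries, sorting the whole list is not strictly required. A minimal sketch with made-up counts (the pairs below are illustrative, not real FathomNet numbers) uses `heapq.nlargest` from the standard library:

```python
import heapq

# Made-up (concept, count) pairs for illustration only
example_counts = [("concept A", 5), ("concept B", 42), ("concept C", 17), ("concept D", 8)]

# Take the 2 largest by count without sorting the full list
top_two = heapq.nlargest(2, example_counts, key=lambda pair: pair[1])

# Split into parallel tuples, ready to pass to a plotting function
top_concepts, top_values = zip(*top_two)
print(top_concepts, top_values)
```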

### Listing images for a concept

Let's say we want to list all of the available images in FathomNet for a given concept. Here, we'll
1. List all the available concepts (again, using the `boundingboxes` module)
2. Pick one
3. Get a list of images for that concept using the `images` module

First, let's list all the available concepts in a choosable box.

We'll call the `find_concepts` function and put the results in a combo box.

⚙ **Pick a concept after running this cell!**

In [None]:
from fathomnet.api import boundingboxes

# Get a list of all concepts that have at least 1 bounding box
all_concepts = boundingboxes.find_concepts()

# Print how many there are
print("FathomNet has", len(all_concepts), "localized concepts!")

# Pick one!
concept_combo = widgets.Combobox(
    options=all_concepts,
    description="Pick one:",
    placeholder="Double-click or type here",
    ensure_option=True,
    disabled=False,
)
concept_combo

With our concept selected (if you didn't put anything, it will default to *Chionoecetes tanneri*), we can call the `images` module `find_by_concept` function to get back a list of all images containing a bounding box for that concept.

In [None]:
from fathomnet.api import images

# Get the selected concept
selected_concept = concept_combo.value or "Chionoecetes tanneri"

# List the images in FathomNet for that concept
concept_images = images.find_by_concept(selected_concept)

# Print the total number
print("Found", len(concept_images), "images of", selected_concept)

This next cell will pick a random image, fetch it by its URL, and display it. 

⚙ If you want a different image, just re-run this cell.

In [None]:
# Pick a random image
random_image = concept_images[random.randrange(len(concept_images))]

# Fetch and show the image
image_data = requests.get(random_image.url).content
pil_image = Image.open(BytesIO(image_data))
display(pil_image)

Then, we'll loop over each bounding box listed and render it (drawing a box & label for it) on the image.

In [None]:
# Concept -> color mapping for bounding boxes
def color_for_concept(concept: str):
    hash = sum(map(ord, concept)) << 5
    return f"hsl({hash % 360}, 100%, 85%)"


# Draw the bounding boxes and labels on the image
draw_image = ImageDraw.Draw(pil_image)
font = ImageFont.truetype(
    "/usr/share/fonts/truetype/liberation/LiberationSans-Regular.ttf", size=18
)
for box in random_image.boundingBoxes:
    color = color_for_concept(box.concept)
    draw_image.rectangle(
        (box.x, box.y, box.x + box.width, box.y + box.height), width=3, outline=color
    )
    draw_image.text((box.x, box.y + box.height), box.concept, fill=color, font=font)

# Show the image with overlay
display(pil_image)
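
The rectangle call above converts each box's `x`, `y`, `width`, and `height` fields to corner coordinates. A small helper, sketched here as an assumption rather than as part of fathomnet-py, does the same conversion and also clamps the box to the image bounds. The name `box_to_corners` is hypothetical.

```python
def box_to_corners(x, y, width, height, image_width, image_height):
    """Convert (x, y, width, height) to (left, top, right, bottom), clamped to the image."""
    left = max(0, x)
    top = max(0, y)
    right = min(image_width, x + width)
    bottom = min(image_height, y + height)
    return left, top, right, bottom


# A box that runs past the right and bottom edges of a 1920x1080 image
corners = box_to_corners(1800, 1000, 200, 200, 1920, 1080)
print(corners)
```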

### Depth histogram

Let's generate a depth histogram; we'll extract the `depthMeters` field from each image (where present) and plot it.

In [None]:
# Extract the depth (in meters) from each image
depths = [
    image.depthMeters for image in concept_images if image.depthMeters is not None
]

# Make a horizontal histogram
fig = px.histogram(
    y=depths, title=f"{selected_concept} images by depth", labels={"y": "depth (m)"}
)
fig["layout"]["yaxis"]["autorange"] = "reversed"
fig.show()

### Geographic heatmap

We can use the `latitude` and `longitude` fields to georeference each image. Here, we're generating a heatmap of the images overlaid on the Esri ocean basemap.

⚙ Zoom and pan around -- although the map is centered on the Monterey Bay, see if you can find where other "hotspots" are for your concept.

In [None]:
# Extract the latitude/longitude from each image
locations = [
    (image.latitude, image.longitude)
    for image in concept_images
    if image.latitude is not None and image.longitude is not None
]

# Create a map from the Esri Ocean basemap
center = (36.807, -121.988)  # Monterey Bay
map = ipyleaflet.Map(
    basemap=ipyleaflet.basemaps.Esri.OceanBasemap, center=center, zoom=10
)
map.layout.height = "800px"

# Overlay the image locations as a heatmap
heatmap = ipyleaflet.Heatmap(locations=locations, radius=20, min_opacity=0.5)
map.add_layer(heatmap)

map
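
To start the map over your own data instead of Monterey Bay, one option is to center it on the mean position. This sketch uses made-up coordinates and a hypothetical helper, `mean_center`. A plain average is only reasonable when the points do not straddle the antimeridian.

```python
def mean_center(points):
    """Return the (latitude, longitude) average of a list of (lat, lon) pairs."""
    if not points:
        raise ValueError("no points to average")
    lats, lons = zip(*points)
    return sum(lats) / len(lats), sum(lons) / len(lons)


# Made-up coordinates for illustration only
example_locations = [(36.0, -122.0), (37.0, -121.0)]
example_center = mean_center(example_locations)
print(example_center)
```

The result could be passed as the `center` argument when creating the map.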

<a name="munch"></a>
## Building and exploring a dataset from FathomNet

FathomNet was built to support researchers seeking underwater imagery to train and test machine learning models. We'll demonstrate some of that functionality here. We'll build a VOC-formatted dataset using the FathomNet Python API. Then we'll download a COCO-formatted dataset using the `fathomnet-generate` command line tool and explore the contents programmatically.

There are loads of software tools out there to train and run deep learning models. Later in this tutorial we will use [Detectron2](https://github.com/facebookresearch/detectron2), Facebook Research's machine learning library. This is just one of many options. For example, our MBARI colleague [Danelle Cline](https://www.mbari.org/person/danelle-e-cline/) put together a great notebook demonstrating [how to set up FathomNet data to train a YOLOv5 model](https://docs.mbari.org/deepsea-ai/notebooks/fathomnet_train/).   

Before getting started, we'll need to install some packages to this workspace.

In [None]:
!pip install -q -U fathomnet pycocotools pandas plotly

Then import all the needed libraries into the workspace.

In [None]:
import matplotlib.pyplot as plt  # Plotting utilities
import requests  # Manages HTTP requests
import random  # Random number generator
import numpy as np  # Array manipulations
import pandas as pd  # More array manipulations
import plotly.express as px  # Plotting library

# Import coco dataset tools
from pycocotools.coco import COCO

# Import from pyplot and PIL for easy plotting
from PIL import Image
import skimage.io as io

<a name="voc-setup"></a>
### Manually create a VOC dataset
 
Let's say we want to train an object detector to detect *Gersemia juliepackardae* only. We don't want too much training data, so let's limit our query to just 10 images of *G. juliepackardae*.

We can find this data by specifying a *constraint object* in fathomnet-py, then calling a generalized image querying function. To do this, we need to grab `GeoImageConstraints` from the `fathomnet.dto` module:

In [None]:
from fathomnet.dto import GeoImageConstraints

Now, we can make a set of constraints for our query.

In [None]:
gersemia_constraints = GeoImageConstraints(concept="Gersemia juliepackardae", limit=10)

To query for image data according to these constraints, we'll call the `fathomnet.api.images.find` function.

In [None]:
from fathomnet.api import images

gersemia_images = images.find(gersemia_constraints)
print(f"Gersemia juliepackardae images: {len(gersemia_images)}")

In order to get this data ready for training, we still need to do three things:
1. **Download** the images themselves
2. **Format** the bounding boxes into something the model can understand
3. **Structure** the directory according to the [prescribed VOC format](https://detectron2.readthedocs.io/en/latest/tutorials/builtin_datasets.html#expected-dataset-structure-for-pascal-voc).

#### Download the images

No magic here, we just need to download the images (via HTTP) to somewhere the notebook can find them.

*Note: there are more efficient ways to do this.*

In [None]:
import requests
from pathlib import Path
from progressbar import progressbar
from io import BytesIO

# Create a directory for the images
data_dir = Path("/content/gersemia_voc")
image_dir = data_dir / "JPEGImages"
image_dir.mkdir(exist_ok=True, parents=True)

# Download each image, saving each new file path to a list
image_paths = []
for image in progressbar(gersemia_images, redirect_stdout=True):
    # Format our image file name as the image UUID + .jpg
    image_path = image_dir / f"{image.uuid}.jpg"
    image_paths.append(image_path)
    if image_path.exists():  # Skip re-downloading images
        continue

    # Download the image
    image_raw = requests.get(image.url, stream=True).raw
    pil_image = Image.open(image_raw)

    # Convert to RGB (ensures consistent colorspace)
    pil_image = pil_image.convert("RGB")

    # Save the image
    pil_image.save(image_path)
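
One of those more efficient ways is to download several images at once with a thread pool. The sketch below is an assumption, not part of fathomnet-py: `fetch_all` and its `fetch` argument are hypothetical names, and `fetch` stands for any function that takes a URL and returns bytes (for example, `lambda url: requests.get(url, timeout=30).content`). A fake fetch function is used here so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=8):
    """Call fetch(url) for every URL in parallel and return {url: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input URLs
        results = list(pool.map(fetch, urls))
    return dict(zip(urls, results))


def fake_fetch(url):
    """Offline stand-in for a real HTTP download."""
    return url.encode("utf-8")


example_urls = ["https://example.org/a.jpg", "https://example.org/b.jpg"]
downloaded = fetch_all(example_urls, fake_fetch)
print(sorted(downloaded))
```

Threads suit this job because the work is dominated by waiting on the network rather than by computation.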

#### Format the bounding boxes

We need to get the bounding boxes in a format the model can understand. Image data (`AImageDTO`) objects offer a convenience function to generate Pascal VOC annotations from their internal data. We can leverage this to quickly generate XML annotations of the form:

```xml
<annotation>
  <folder>images</folder>
  <filename>{image filename}</filename>
  <path>/content/drive/MyDrive/fathomnet-workshop-tests/images/{image filename}</path>
  <source>
    <database>FathomNet</database>
  </source>
  <size>
    <width>{image width}</width>
    <height>{image height}</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>{concept}[ ({altConcept})]</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <occluded>0</occluded>
    <bndbox>
      <xmin>{x}</xmin>
      <xmax>{x + width}</xmax>
      <ymin>{y}</ymin>
      <ymax>{y + height}</ymax>
    </bndbox>
  </object>
  ...
</annotation>
```
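
As a rough illustration of that structure, the sketch below builds a single `<object>` element by hand with the standard library. It is not how fathomnet-py generates its XML (the convenience function does that for you), and it leaves out fields such as `<pose>` and `<truncated>` for brevity. The helper name `voc_object` is hypothetical.

```python
import xml.etree.ElementTree as ET


def voc_object(concept, x, y, width, height):
    """Build one <object> element with the bounding-box layout shown above."""
    obj = ET.Element("object")
    ET.SubElement(obj, "name").text = concept
    bndbox = ET.SubElement(obj, "bndbox")
    ET.SubElement(bndbox, "xmin").text = str(x)
    ET.SubElement(bndbox, "xmax").text = str(x + width)
    ET.SubElement(bndbox, "ymin").text = str(y)
    ET.SubElement(bndbox, "ymax").text = str(y + height)
    return obj


element = voc_object("Gersemia juliepackardae", 10, 20, 30, 40)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```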

We will likewise write up all the annotations in the preferred VOC directory structure. 

At the same time, we'll filter out any bounding boxes besides *G. juliepackardae*.

In [None]:
xml_dir = data_dir / "Annotations"
xml_dir.mkdir(exist_ok=True, parents=True)

for image, image_path in zip(gersemia_images, image_paths):
    xml_path = xml_dir / image_path.with_suffix(".xml").name
    image.boundingBoxes = list(
        filter(  # filter only Gersemia juliepackardae
            lambda box: box.concept == "Gersemia juliepackardae", image.boundingBoxes
        )
    )
    pascal_voc = image.to_pascal_voc(path=str(image_path), pretty_print=True)
    xml_path.write_text(pascal_voc)

This is great to do explicitly since it illustrates all the API tools. But you can also use the `fathomnet-generate` command line tool to make the dataset with a single line of code. Do note, however, that the following command will download **all** available images and annotations of the coral.

In [None]:
# ! fathomnet-generate -c "Gersemia juliepackardae" --format coco --img-download '/content/gersemia_voc/JPEGImages' --output '/content/gersemia_voc/Annotations'

### Build a COCO dataset

Let's take a look at how we can leverage the Python API to download some images and bounding boxes from FathomNet. We're going to set up a training dataset that can be used for model training either in this notebook or in the Dockerized container we shared earlier. 

As we saw before, we can use the `fathomnet.api.images` module to search for images by concept. We can use that same functionality to set up a dataset for model training. For an example, see the [previous section](#voc-setup), where we demonstrate how to use it to set up a VOC dataset.

Here we'll use the handy `fathomnet-generate` command line tool to download FathomNet data and automatically organize it according to the [Microsoft Common Objects in COntext](https://cocodataset.org/#home) (COCO) standards. 

#### COCO Formatting


COCO is a large annotated image dataset containing bounding boxes and segmentation masks for 91 categories in hundreds of thousands of images. The bounding box annotations for object detection are organized into a standard format to make them easier to work with. Typically, the annotation files are distributed as JSON-serialized nested dictionaries.

```
{
  "info": info,
  "images": [image],
  "categories": [category],
  "annotations": [annotation]
}
```

Each field contains inter-related pieces of information for your model training code to use. `info` is a dictionary that contains some metadata about the entire dataset:

```
info = {
    "year": 2023,
    "version": "0",
    "description": "Generated by FathomNet",
    "contributor": "FathomNet",
    "url": "https://fathomnet.org",
    "date_created": "2023/02/23"
}
```

The `images` field consists of a list of `image` objects that specify the file name, image dimensions, permanent url, etc. 

```
image = {
    "id": 1,
    "width": 1920,
    "height": 1080,
    "file_name": "754e6a28-a8eb-4cb3-a0b9-3f2d5daacbae.png",
    "license": 0,
    "flickr_url": "https://fathomnet.org/static/m3/staging/Doc%20Ricketts/images/0861/00_12_12_05.png",
    "coco_url": "https://fathomnet.org/static/m3/staging/Doc%20Ricketts/images/0861/00_12_12_05.png",
    "date_captured": "2016-06-16 00:00:00"
}
```

The `categories` field is a list of `category` objects organized by numeric ids.

```
category = {
  "id": 2,
  "name": "Actinernus",
  "supercategory": ""
}
```

Finally, `annotations` is a list of `annotation` objects that bring it all together. 

```
annotation = {
      "id": 1,
      "image_id": 1,
      "category_id": 2,
      "segmentation": [],
      "area": 51200.0,
      "bbox": [
        200.0,
        433.0,
        256.0,
        200.0
      ],
      "iscrowd": 0
}
```

Note that the `id` fields are specific to the list of objects. The example `annotation` object tells us that in the image associated with `image_id` 1, there is a bounding box located at position `[200, 433]` with a width of `256` pixels and a height of `200` pixels. The label associated with that bounding box is `category_id` 2, which corresponds to the anemone "Actinernus".
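
To make that reading concrete, here is a minimal sketch that unpacks the example annotation by hand. It uses only the example values shown above and plain Python dictionaries, with no COCO tooling.

```python
annotation = {"image_id": 1, "category_id": 2, "bbox": [200.0, 433.0, 256.0, 200.0]}
categories = [{"id": 2, "name": "Actinernus", "supercategory": ""}]

# COCO boxes are [x, y, width, height], with (x, y) the top-left corner
x, y, width, height = annotation["bbox"]
corners = (x, y, x + width, y + height)
area = width * height

# Look up the label through the category id
names_by_id = {category["id"]: category["name"] for category in categories}
label = names_by_id[annotation["category_id"]]

print(label, corners, area)
```

The computed area matches the `area` field of the example annotation.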

Whew, that is a mouthful! Fortunately we can auto populate all those fields with the `fathomnet-generate` command line tool. First, print out the docs. 


In [None]:
! fathomnet-generate 

As you can see, there are lots of options built into the tool. Let's start with a simple query to see how it works. We'll execute the command below.

The `-c` flag accepts a comma-separated list of concepts to query for in FathomNet. The `--count` flag tells the tool to just report how many bounding boxes are associated with each concept rather than actually building the dataset.

In [None]:
! fathomnet-generate -c "Aegina rosea" --count

Great, now we know there are about 29 bounding boxes associated with this concept in FathomNet. This is a reasonable amount of data to download in real time for this workshop.

To download the data, we'll need to remove the `--count` flag, tell the script we want COCO format, and specify where we want the dataset and images. This should take about five minutes to download all the images and annotations.

In [None]:
! fathomnet-generate -c "Aegina rosea" --format coco --img-download 'demo_dataset/images' --output 'demo_dataset'

Here the `--format` flag tells the script to build a `coco` dataset. The `--img-download` and `--output` flags specify where all the output should live. When executed, the above line will make a new directory called `demo_dataset`. In the folder will be a COCO-formatted dataset called `dataset.json` and a directory of `images` with 29 images with UUID file names.

#### Explore the dataset

We can use the handy `pycocotools` provided by the COCO dataset maintainers to visualize some of the images and annotations. 

In [None]:
# Create a coco dataset object
dataset = COCO("demo_dataset/dataset.json")

In [None]:
# Show all the categories names
cats = dataset.loadCats(dataset.getCatIds())
nms = [cat["name"] for cat in cats]
print("Categories: \n{}\n".format(" \n".join(nms)))

We can also see the raw category objects by just printing the `cats` object.

In [None]:
cats

The format conforms to the COCO standards. As a sanity check, we can then use pycocotools to see how many annotations are associated with each category. 

In [None]:
cat_ids = dataset.getCatIds(catNms=nms)
for cid in cat_ids:
    anns = dataset.getAnnIds(catIds=cid)
    # Look the name up by id rather than assuming ids are contiguous from 1
    print(f"{dataset.loadCats([cid])[0]['name']} has {len(anns)} annotations")

This is what we expect based on the output of `fathomnet-generate` when we used the `--count` flag. 
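
The same sanity check can be run on the raw JSON without pycocotools. The sketch below uses a tiny made-up COCO-style dictionary (not real FathomNet data) and `collections.Counter` to tally annotations per category name.

```python
from collections import Counter

# Tiny made-up COCO-style dictionary for illustration only
example_coco = {
    "categories": [{"id": 1, "name": "cat A"}, {"id": 2, "name": "cat B"}],
    "annotations": [
        {"id": 1, "category_id": 1},
        {"id": 2, "category_id": 2},
        {"id": 3, "category_id": 2},
    ],
}

# Map category id -> name, then count annotations by name
names = {c["id"]: c["name"] for c in example_coco["categories"]}
counts_by_name = Counter(names[a["category_id"]] for a in example_coco["annotations"])
print(counts_by_name.most_common())
```

With a real file, `example_coco` would come from `json.load` on the generated `dataset.json`.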

Now let's grab an image and look at the bounding box.

In [None]:
# Grab all the image ids associated with the first category (Aegina rosea)
img_ids = dataset.getImgIds(catIds=cat_ids[0])

# Select one of them at random
img = dataset.loadImgs(img_ids[np.random.randint(0, len(img_ids))])[0]

# Get the annotations in the image
anns_ids = dataset.getAnnIds(imgIds=img["id"], catIds=cat_ids[0], iscrowd=None)
anns = dataset.loadAnns(anns_ids)

# ...and display it
im = io.imread(f"demo_dataset/images/{img['file_name']}")
plt.axis("off")
plt.imshow(im)
dataset.showAnns(anns, draw_bbox=True)

#### Data exploration 

Now we'll pull down a more complicated dataset to visualize what is in the whole dataset. We'll use `fathomnet-generate` to get all the data collected in Monterey Bay in calendar year 2018. Note that the following command will only produce the dataset JSON document; we won't download the actual images, to save time. If in the future you want to save the images, append the `--img-download` flag as in the earlier example.

In [None]:
! fathomnet-generate --output 'demo_coco' --format coco \
    --max-latitude 37.0538 --min-latitude 36.4458 \
    --max-longitude -121.7805 --min-longitude -122.5073 \
    --start 2018-01-01 --end 2018-12-31

The command will generate a new dataset object in `demo_coco/dataset.json`. Let's see what is in it. 

In [None]:
# Create a new coco dataset object
dataset = COCO("demo_coco/dataset.json")

In [None]:
# Look at some high level information.
print(f"Number of images = {len(dataset.getImgIds())}")
print(f"Number of annotations = {len(dataset.getAnnIds())}")
print(f"Number of categories = {len(dataset.getCatIds())}")

So now we have a more complicated dataset: 154 categories with 1554 annotations. We can check what the most abundant category is.

In [None]:
all_anns = dataset.loadAnns(dataset.getAnnIds())
# Make it into a pandas data frame for easy manipulation
all_anns = pd.DataFrame(all_anns)

# See what the most abundant category is
cid = all_anns["category_id"].value_counts().idxmax()
print(f"Most abundant category id = {cid}")

So that tells us what the most abundant category id is. But what organism is that? 

In [None]:
dataset.loadCats(ids=[cid])[0]["name"]

Very cool. But I'm a programmer not a marine biologist. What the heck is that thing?

In [None]:
# Get all images that contain at least one annotation with that category id
img_ids = dataset.getImgIds(catIds=cid)

# Select one of them at random
img = dataset.loadImgs(img_ids[np.random.randint(0, len(img_ids))])[0]

# Get the annotations in the image
anns_ids = dataset.getAnnIds(imgIds=img["id"], catIds=cid, iscrowd=None)
anns = dataset.loadAnns(anns_ids)

# ...and display it
im = io.imread(img["coco_url"])  # this downloads the image
plt.axis("off")
plt.imshow(im)
dataset.showAnns(anns, draw_bbox=True)

It looks like we got a big ol' coral for our most abundant critter. 

This tells us some useful stuff about the most abundant data, but how can we get a sense of the species distribution in this dataset?

In [None]:
# Add a column to our dataframe that has all the plain language names of the category ids
all_anns["name"] = all_anns["category_id"].apply(
    lambda xx: dataset.loadCats(ids=[xx])[0]["name"]
)

In [None]:
# Plot the distribution of the counts
fig = px.histogram(
    all_anns,
    x="name",
    title="Concept counts",
    labels={"name": "Concept", "count": "Number of annotations"},
)

fig.show()

That is pretty illegible. What happens if we just look at the 20 most abundant organisms?

In [None]:
# Plot the 20 most abundant
top_anns = all_anns["name"].value_counts().head(20)

fig = px.histogram(
    top_anns,
    x=top_anns.index,
    y=top_anns.values,
    title="Top 20 concepts",
    labels={"x": "Concept", "y": "Number of annotations"},
)
fig.show()

Much better. But together these two plots tell us something important: we are dealing with a long-tailed distribution. That might impact how we do our model training.

There are all sorts of other metadata manipulations we can do to explore the structure of our data. This type of exercise is important to ensure that you know what is going into your model before training. That will help you both produce a better model and more efficiently diagnose errors.
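
One simple way to put a number on a long tail is to ask what share of all annotations belongs to the `k` most frequent categories. The sketch below uses made-up counts and a hypothetical helper, `top_k_share`; with the data frame above, the counts could come from `all_anns["name"].value_counts()`.

```python
def top_k_share(counts, k):
    """Fraction of all annotations that belong to the k most frequent categories."""
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    return sum(ordered[:k]) / total if total else 0.0


# Made-up annotation counts per category, for illustration only
example_counts = [500, 300, 100, 50, 25, 15, 5, 3, 1, 1]
share = top_k_share(example_counts, 2)
print(share)
```

A share close to 1 for a small `k` means a handful of categories dominate, which is worth knowing before choosing a sampling or loss-weighting strategy.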

#### Model training

In the interest of time, we will not train a model in this workshop session. We have set up a [separate notebook](https://github.com/fathomnet/fathomnet-py/blob/detectron2-train-demo/train_demo.ipynb) that illustrates how to get a COCO-formatted dataset with `fathomnet-generate` and train a RetinaNet model using `detectron2`. We hope you'll find it helpful!

### Inference with a pre-trained model

A big feature of FathomNet is the *ModelZoo*, a repository for users to share their models with the community. For the moment, [we are advising users](https://medium.com/fathomnet/how-to-upload-your-ml-model-to-fathomnet-68b933dd55bd) to upload their models on Zenodo to generate a DOI and then share them on our GitHub page.  We have provided a number of our models as a starting point. 

First install some additional packages. You may see a red ERROR message in this cell -- you can disregard it. **This will take a couple minutes.** Grab a coffee! 

In [None]:
!pip install pyyaml==5.4.1 'git+https://github.com/facebookresearch/detectron2.git'

⚠ Detectron doesn't play nice with some installed package versions; you may see a message asking you to restart the runtime. Press that button, or run this cell:

In [None]:
try:
    import detectron2  # noqa: F401
except ImportError:
    print("Restarting runtime...")
    exit()

Now grab all the packages we need.

In [None]:
import torchvision  # Library of datasets, models, and image transforms
import matplotlib.pyplot as plt  # Plotting utilities
import torch  # Tensor library for manipulating large models and data
import requests  # Manages HTTP requests
import random  # Random number generator
import numpy as np  # Array manipulations

# Import key functions & modules from detectron2
from detectron2 import model_zoo
from detectron2.data import Metadata
from detectron2.utils.visualizer import Visualizer
from detectron2.config import get_cfg
from detectron2.utils.visualizer import ColorMode
from detectron2.modeling import build_model
from detectron2.checkpoint import DetectionCheckpointer
import detectron2.data.transforms as T

# Import from pyplot and PIL for easy plotting
from PIL import Image

#### Download a model from the FathomNet model zoo
For this section of the workshop, we'll download the [MBARI Benthic Supercategory Detector](https://zenodo.org/record/5571043). This RetinaNet model was fine-tuned with FathomNet data from a version originally trained on COCO images. To train this system, we grouped many of our fine-grained classes together into 20 'supercategories' that hopefully encode some general morphological information about the group. All the training data was drawn from MBARI imagery collected in Monterey Bay.

We will run `wget` to do the download. This command lets Colab download resources from a URL.

First, let's download the model weights from the repository on Zenodo.

In [None]:
!wget -nc https://zenodo.org/record/5571043/files/model_final.pth 

Now we'll grab the model file that declares the structure of the model.

In [None]:
!wget -nc https://zenodo.org/record/5571043/files/fathomnet_config_v2_1280.yaml

#### Run inference
We can actually run images through our network now that we have the model architecture and the weights from training. Before we run anything we will need to load the model into memory and set several parameters that will dictate what we see in the output. 

First set the paths so the `detectron2` toolbox will know where to look for your files.

In [None]:
CONFIG_FILE = "fathomnet_config_v2_1280.yaml"  # training configuration file
WEIGHT_FILE = "model_final.pth"  # fathomnet model weights

Now set the Non-Maximum Suppression (NMS) and score thresholds. These parameters dictate which of the proposed regions the algorithm displays.

In [None]:
NMS_THRESH = 0.45  # Set an NMS threshold to filter all the boxes proposed by the model
SCORE_THRESH = 0.3  # Set the model score threshold to suppress low confidence annotations

You have to explicitly tell the model what the names of the classes are. The system outputs a number, not a label. You can think of this as a look-up table.

In [None]:
fathomnet_metadata = Metadata(
    name="fathomnet_val",
    thing_classes=[
        "Anemone",
        "Fish",
        "Eel",
        "Gastropod",
        "Sea star",
        "Feather star",
        "Sea cucumber",
        "Urchin",
        "Glass sponge",
        "Sea fan",
        "Soft coral",
        "Sea pen",
        "Stony coral",
        "Ray",
        "Crab",
        "Shrimp",
        "Squat lobster",
        "Flatfish",
        "Sea spider",
        "Worm",
    ],
)

With all the parameters and file paths set up, you can now point Detectron to the configurations using the `get_cfg()` function.

In [None]:
cfg = get_cfg()
cfg.merge_from_file(
    model_zoo.get_config_file("COCO-Detection/retinanet_R_50_FPN_3x.yaml")
)
cfg.merge_from_file(CONFIG_FILE)
cfg.MODEL.RETINANET.SCORE_THRESH_TEST = SCORE_THRESH
cfg.MODEL.WEIGHTS = WEIGHT_FILE

Load in all the model weights and set the thresholds. This actually instantiates the model in your workspace. The `model` object is what will ingest the images and return outputs for us to look at. 

⚠ *If this cell returns a* `RuntimeError: No CUDA GPUs are available` *you will need to update your settings. Click the Runtime dropdown menu, select "Change runtime type" and select GPU in the "Hardware accelerator" box. You will then need to rerun the detectron2 install via pip.*  

In [None]:
model = build_model(cfg)  # returns a torch.nn.Module
checkpointer = DetectionCheckpointer(model)
# Set the weights to the pre-trained values downloaded from Zenodo
checkpointer.load(cfg.MODEL.WEIGHTS)
model.eval()  # Tell detectron that this model will only run inference

Before putting images through the network, you need to define some preprocessing steps. At training time, you might set up a series of random affine transformations to help guard against overfitting. Since this network is already trained, we just need to resize the images to a standard dimension.

In [None]:
aug = T.ResizeShortestEdge(
    short_edge_length=[cfg.INPUT.MIN_SIZE_TEST],
    max_size=cfg.INPUT.MAX_SIZE_TEST,
    sample_style="choice",
)
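
To get a feel for what a shortest-edge resize does to image dimensions, here is a minimal sketch of the arithmetic. It illustrates the general idea (scale so the shorter side hits a target, then shrink if the longer side exceeds a cap) and is not Detectron's own code. The function name and the example sizes are assumptions.

```python
def shortest_edge_size(height, width, short_edge, max_size):
    """Return (new_height, new_width) after an aspect-preserving shortest-edge resize."""
    # Scale so that the shorter side equals short_edge
    scale = short_edge / min(height, width)
    new_height, new_width = height * scale, width * scale

    # Shrink further if the longer side would exceed max_size
    longest = max(new_height, new_width)
    if longest > max_size:
        shrink = max_size / longest
        new_height, new_width = new_height * shrink, new_width * shrink

    # Round to the nearest whole pixel
    return int(new_height + 0.5), int(new_width + 0.5)


# Example values chosen for illustration only
new_size = shortest_edge_size(1080, 1920, short_edge=800, max_size=1333)
print(new_size)
```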

Finally, we need to set up an extra NMS layer since by default `detectron2` models only do intra-class comparisons between bounding boxes. We need to do another NMS run between classes.

In [None]:
post_process_nms = torchvision.ops.nms
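
For intuition about what this second NMS pass does, here is a minimal pure-Python sketch of class-agnostic NMS: boxes are visited from highest to lowest score, and a box is dropped if it overlaps an already-kept box by more than the IoU threshold, whatever its class. It is an illustration with made-up boxes, not the torchvision implementation, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def class_agnostic_nms(boxes, scores, iou_threshold):
    """Return indices of kept boxes, highest score first, ignoring class labels."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep the box only if it does not overlap any kept box too much
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one separate box (made-up values)
example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
example_scores = [0.9, 0.8, 0.7]
kept = class_agnostic_nms(example_boxes, example_scores, 0.45)
print(kept)
```

The second box is suppressed because it overlaps the higher-scoring first box, while the third box survives because it overlaps nothing.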

We'll need to grab a random (or not so random) image to run through the network.

In [None]:
from fathomnet.api import boundingboxes, images

# Get a list of all concepts
all_concepts = boundingboxes.find_concepts()

# Pick one at random, or set one yourself, e.g.:
# concept = 'Chionoecetes tanneri'
concept = all_concepts[random.randrange(len(all_concepts))]

# List the images of the concept in FathomNet
concept_images = images.find_by_concept(concept)

print(f"{len(concept_images)} images of {concept}")

Finally, you have everything loaded up to run the image through the model.

In [None]:
# Pick a random image
image = concept_images[random.randrange(len(concept_images))]

# Fetch the image (converted to RGB so the array always has 3 channels)
im = np.array(Image.open(requests.get(image.url, stream=True).raw).convert("RGB"))

im_height, im_width, _ = im.shape  # Grab the image dimensions

# Use detectron's visualization tool to plot the bounding boxes
v_inf = Visualizer(
    im, metadata=fathomnet_metadata, scale=1.0, instance_mode=ColorMode.IMAGE
)

# Transform the image in the desired input shape
im_transformed = aug.get_transform(im).apply_image(im)

# Actually crank it through the model
with torch.no_grad():
    im_tensor = torch.as_tensor(im_transformed.astype("float32").transpose(2, 0, 1))
    model_outputs = model(
        [{"image": im_tensor, "height": im_height, "width": im_width}]
    )[0]

# Run the second stage NMS to ensure limited interclass overlap
model_outputs["instances"] = model_outputs["instances"][
    post_process_nms(
        model_outputs["instances"].pred_boxes.tensor,
        model_outputs["instances"].scores,
        NMS_THRESH,
    )
    .to("cpu")
    .tolist()
]

# Use the visualization tool to plot the bounding boxes on top of the image
out_inf_raw = v_inf.draw_instance_predictions(model_outputs["instances"].to("cpu"))
out_pil = Image.fromarray(out_inf_raw.get_image())

# Show it
display(out_pil)

## That's all, folks!

At this point, you have
1. Used the FathomNet Python API to pull down and visualize concepts, images, and ancillary data
2. Downloaded images and bounding boxes locally (all that data is still in the notebook instance, in truth)
3. Explored the data and visualized the distribution of classes
4. Run a pre-trained model from the FathomNet model zoo

We hope this notebook has helped you understand the FathomNet Python API. Thanks for attending the workshop! 

If you have any feedback or suggestions, please open an issue on the [fathomnet-py issues page](https://github.com/fathomnet/fathomnet-py/issues). We very much appreciate your thoughts.