# OSU Small Animals 
[OSU Small Animals](https://lila.science/datasets/ohio-small-animals/) is a benchmark dataset (a static collection of images used to evaluate the capability of various ML approaches) developed from terrestrial camera trap images. This notebook contains pseudocode to give you an idea of what to do as you work toward training a classifier on this labeled data.

Unlike previous notebooks, this one is very much a skeleton with some extra information to frame your workflow. In other words, you will write most of the code yourself. You can write everything from scratch or treat this as a vibe coding session (i.e. asking your favorite LLM for help). Please ask the instructors lots of questions. 

## Common Objects in COntext format
[Common Objects in COntext (COCO)](https://link.springer.com/chapter/10.1007/978-3-319-10602-1_48) is one of the classic machine learning benchmark datasets (62,000 citations and counting!). The material below explaining the COCO format is adapted from the [FathomNet Python API Tutorial](https://github.com/fathomnet/fathomnet-py/blob/main/examples/tutorial.ipynb).

The original COCO dataset is a large annotated image dataset containing bounding boxes and segmentation masks for 91 categories in hundreds of thousands of images. The dataset led to many novel AI models, and its format has become a standard for many Python tools. Widely used frameworks like PyTorch (what we've been using in this workshop) and [YOLO](https://docs.ultralytics.com/tasks/classify/) ship with prebuilt support for loading COCO data, such as torchvision's [COCO dataset classes](https://docs.pytorch.org/vision/stable/_modules/torchvision/datasets/coco.html). The COCO format supports [multiple types of annotations](https://cocodataset.org/#overview), including classification, bounding boxes, segmentations, and keypoints. Regardless of type, annotations are organized into a standard format to make them easier to work with. Typically, the annotation files are distributed as JSON-serialized nested dictionaries. The highest level looks like this:

```
{
  "info": info,
  "images": [image],
  "categories": [category],
  "annotations": [annotation]
}
```
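
As a minimal sketch of reading such a file, the snippet below loads a COCO-style JSON file with Python's standard `json` module and counts the entries in each top-level list. The helper names `load_coco` and `summarize` and the toy file contents are assumptions made for illustration; they are not part of any COCO tooling.

```python
import json
import tempfile
from pathlib import Path


def load_coco(path):
    """Read a COCO-style JSON annotation file into a Python dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def summarize(coco):
    """Return the number of entries in each top-level list field."""
    return {
        key: len(value)
        for key, value in coco.items()
        if isinstance(value, list)
    }


# Toy file standing in for a real annotation file (made-up contents).
toy = {
    "info": {"version": "0"},
    "images": [{"id": 1, "file_name": "a.png"}],
    "categories": [{"id": 2, "name": "Actinernus"}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 2}],
}

with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "toy.json"
    toy_path.write_text(json.dumps(toy), encoding="utf-8")
    coco = load_coco(toy_path)

summary = summarize(coco)
print(sorted(coco.keys()))
print(summary)
```

The same `load_coco` call should work on any annotation file that follows this layout; only the path changes.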

Each field contains inter-related pieces of information for your model training code to use. `info` is a dictionary that contains some metadata about the entire dataset:

```
info = {
    "year": 2023,
    "version": "0",
    "description": "Generated by FathomNet",
    "contributor": "FathomNet",
    "url": "https://fathomnet.org",
    "date_created": "2023/02/23"
}
```

The `images` field consists of a list of `image` objects that specify the file name, image dimensions, permanent url, etc. 

```
image = {
    "id": 1,
    "width": 1920,
    "height": 1080,
    "file_name": "754e6a28-a8eb-4cb3-a0b9-3f2d5daacbae.png",
    "license": 0,
    "flickr_url": "https://fathomnet.org/static/m3/staging/Doc%20Ricketts/images/0861/00_12_12_05.png",
    "coco_url": "https://fathomnet.org/static/m3/staging/Doc%20Ricketts/images/0861/00_12_12_05.png",
    "date_captured": "2016-06-16 00:00:00"
}
```

The `categories` field is a list of `category` objects organized by numeric ids.

```
category = {
  "id": 2,
  "name": "Actinernus",
  "supercategory": ""
}
```

Finally, `annotations` is a list of `annotation` objects that bring it all together. 

```
annotation = {
      "id": 1,
      "image_id": 1,
      "category_id": 2,
      "segmentation": [],
      "area": 51200.0,
      "bbox": [
        200.0,
        433.0,
        256.0,
        200.0
      ],
      "iscrowd": 0
}
```

Note that each `id` is unique only within its own list, so image `id` 1 and annotation `id` 1 refer to different objects. The example `annotation` object tells us that in the image associated with `image_id` 1, there is a bounding box whose top-left corner is at `[200, 433]`, with a width of `256` pixels and a height of `200` pixels. The label associated with that bounding box is `category_id` 2, which corresponds to the anemone "Actinernus". If there is no bounding box associated with the annotation, the `bbox` and `area` fields are empty.
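
To make the cross-referencing concrete, here is a small sketch that resolves the example annotation to its category name and converts the `[x, y, width, height]` box into corner coordinates. The lookup dictionaries and the `bbox_to_corners` helper are illustrative names, and the objects are trimmed copies of the examples above.

```python
# Trimmed copies of the example objects shown above.
image = {"id": 1, "width": 1920, "height": 1080}
category = {"id": 2, "name": "Actinernus"}
annotation = {
    "id": 1,
    "image_id": 1,
    "category_id": 2,
    "area": 51200.0,
    "bbox": [200.0, 433.0, 256.0, 200.0],
}

# Index each list by its own "id" so annotations can be resolved quickly.
images_by_id = {image["id"]: image}
categories_by_id = {category["id"]: category}


def bbox_to_corners(bbox):
    """Convert COCO [x, y, width, height] to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


label = categories_by_id[annotation["category_id"]]["name"]
source_image = images_by_id[annotation["image_id"]]
corners = bbox_to_corners(annotation["bbox"])
box_area = annotation["bbox"][2] * annotation["bbox"][3]

print(label, corners, box_area)
```

In this example the `area` field equals the box width times its height, because there is no segmentation mask.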

## Explore OSU Small Animals
Start by loading the OSU Small Animals dataset, viewing some images, and exploring the metadata. The data can be found in `/groups/cv-workshop/ohio_small_animals` with the images in the `Images` subdirectory. The image and annotation metadata are in the COCO-formatted JSON file `osu-small-animals.json`.

This dataset is distributed in a modified version of COCO called [COCO Camera Traps](https://lila.science/coco-camera-traps) (CCT) format. CCT is a superset of COCO and is compatible with tools that expect COCO-formatted data. It includes additional metadata relevant to camera trap deployments, namely location. The `location` field is stored in each `image` object described above and can be useful for making training and validation datasets.

You may find the [MegaDetector utility functions](https://github.com/agentmorris/MegaDetector/tree/main/megadetector/utils) helpful for sorting images by their metadata. We have installed those utilities in this compute environment.

Some things you might consider doing: 
- Count the number of annotations per class
- Select categories that have more than 100 labeled examples
- Count the number of animals and the number of different categories detected at different locations 
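
One way to approach these three tasks with only the standard library is sketched below. The toy dictionary, its class names, and its site names are made up. The sketch assumes every image carries a `location` field, and it counts annotations, which is only a proxy for the number of animals.

```python
from collections import Counter, defaultdict

# Made-up toy data in CCT layout; the real file has many more entries.
coco = {
    "images": [
        {"id": "img1", "file_name": "a.jpg", "location": "site_A"},
        {"id": "img2", "file_name": "b.jpg", "location": "site_A"},
        {"id": "img3", "file_name": "c.jpg", "location": "site_B"},
    ],
    "categories": [
        {"id": 0, "name": "mouse"},
        {"id": 1, "name": "vole"},
    ],
    "annotations": [
        {"id": "a1", "image_id": "img1", "category_id": 0},
        {"id": "a2", "image_id": "img2", "category_id": 0},
        {"id": "a3", "image_id": "img3", "category_id": 1},
    ],
}

name_by_id = {c["id"]: c["name"] for c in coco["categories"]}
location_by_image = {img["id"]: img.get("location") for img in coco["images"]}

# 1. Number of annotations per class.
per_class = Counter(name_by_id[a["category_id"]] for a in coco["annotations"])

# 2. Classes with more than `min_count` labeled examples.
min_count = 1  # use 100 on the real dataset
frequent = sorted(name for name, n in per_class.items() if n > min_count)

# 3. Annotation count and distinct classes seen at each location.
ann_per_location = Counter()
classes_per_location = defaultdict(set)
for a in coco["annotations"]:
    loc = location_by_image[a["image_id"]]
    ann_per_location[loc] += 1
    classes_per_location[loc].add(name_by_id[a["category_id"]])

print(per_class)
print(frequent)
print(dict(ann_per_location), dict(classes_per_location))
```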

## Create training and validation sets
In earlier notebooks, you were given premade training and validation datasets. Recall that the training data is used for tuning your model and the validation data is held out for testing. Come up with a training and validation split for your model. This could be purely random or based on some relevant metadata. Up to you!
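
As one possible metadata-based split, the sketch below holds out entire camera locations for validation, so that no location contributes images to both sets. The function name `split_by_location`, its parameters, and the toy image list are assumptions for illustration. A purely random split over images would be an equally valid starting point.

```python
import random


def split_by_location(images, val_fraction=0.2, seed=0):
    """Assign whole locations to train or validation, never both."""
    locations = sorted({img["location"] for img in images})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(locations)

    # Hold out at least one location, even for tiny datasets.
    n_val = max(1, round(len(locations) * val_fraction))
    val_locations = set(locations[:n_val])

    train_ids = [i["id"] for i in images if i["location"] not in val_locations]
    val_ids = [i["id"] for i in images if i["location"] in val_locations]
    return train_ids, val_ids, val_locations


# Made-up toy images: two per site across four sites.
images = [
    {"id": f"img{n}", "location": f"site_{'ABCD'[n // 2]}"}
    for n in range(8)
]

train_ids, val_ids, val_locations = split_by_location(images, val_fraction=0.25)
print(len(train_ids), len(val_ids), val_locations)
```

Because the split is made over locations rather than images, the validation fraction in terms of image count will only approximate `val_fraction` when locations hold different numbers of images.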

## Train your model
Train a classification model based on the COCO-formatted data. You can try to write a custom DataLoader to read in COCO data for classification (as opposed to object detection). Or you could try a different architecture altogether. [Ultralytics](https://docs.ultralytics.com/tasks/classify/#models), for example, distributes classification models that can read in data from COCO-formatted metadata (NB: this is a different framework from PyTorch, the package we were using earlier).
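
Whichever framework you pick, a classifier needs a list of (image path, class index) pairs. The sketch below builds that list from a COCO-style dictionary using only the standard library. The function name `build_samples` and the toy data are hypothetical, and the sketch assumes one label per image, so it skips images with zero or several distinct categories. A custom PyTorch `Dataset` could then open `samples[i][0]` inside its `__getitem__` method.

```python
from pathlib import Path


def build_samples(coco, image_dir):
    """Return ([(image_path, class_index), ...], class_names)."""
    class_names = sorted(c["name"] for c in coco["categories"])
    index_by_name = {name: i for i, name in enumerate(class_names)}
    name_by_cat_id = {c["id"]: c["name"] for c in coco["categories"]}

    # Collect the distinct category ids attached to each image.
    cats_by_image = {}
    for ann in coco["annotations"]:
        cats_by_image.setdefault(ann["image_id"], set()).add(ann["category_id"])

    samples = []
    for img in coco["images"]:
        cats = cats_by_image.get(img["id"], set())
        if len(cats) != 1:
            continue  # assumption: keep only single-label images
        (cat_id,) = cats
        path = Path(image_dir) / img["file_name"]
        samples.append((path, index_by_name[name_by_cat_id[cat_id]]))
    return samples, class_names


# Made-up toy data: img3 has two labels and img4 has none.
coco = {
    "images": [
        {"id": "img1", "file_name": "a.jpg"},
        {"id": "img2", "file_name": "b.jpg"},
        {"id": "img3", "file_name": "c.jpg"},
        {"id": "img4", "file_name": "d.jpg"},
    ],
    "categories": [{"id": 7, "name": "vole"}, {"id": 3, "name": "mouse"}],
    "annotations": [
        {"id": "a1", "image_id": "img1", "category_id": 3},
        {"id": "a2", "image_id": "img2", "category_id": 7},
        {"id": "a3", "image_id": "img3", "category_id": 3},
        {"id": "a4", "image_id": "img3", "category_id": 7},
    ],
}

samples, class_names = build_samples(coco, "Images")
print(class_names)
print(samples)
```

Mapping category ids to contiguous indices (0, 1, 2, ...) matters because the ids in the JSON file need not be contiguous, while most classification losses expect labels in that form.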

## Evaluate your model
Use your validation data to measure performance. Think about how these measurements reflect how you chose to split your data. How might you expect this to work in the real world?
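
As a starting point, the sketch below computes overall accuracy and per-class recall from lists of true and predicted labels. The helper names and the toy labels are made up for illustration. The example is deliberately imbalanced to show how a model that always predicts the most common class can still score a high accuracy.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its examples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Made-up validation labels: 8 mice, 2 voles, and a model that always says "mouse".
y_true = ["mouse"] * 8 + ["vole"] * 2
y_pred = ["mouse"] * 10

overall = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
print(overall)
print(recall)
```

Here the accuracy is 0.8 even though no vole is ever identified, which is why per-class numbers are worth checking alongside the overall score.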