# Plastic Classifier: Getting started

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/surfriderfoundationeurope/IA_Pau/blob/master/Hackaton_Surfrider_Getting_Started.ipynb)


This helper notebook is designed for the Surfrider Hackathon. The goal is to build a plastic classifier: the core detector / tracker is already built, but it only handles generic plastic. The notebook is meant to help you get started quickly, but you can also follow the instructions directly from the [main GitHub repository](https://github.com/surfriderfoundationeurope/surfnet/tree/further_research).

Training is much faster on a GPU. To check that one is available, run the command `!nvidia-smi`.
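
The same check can be done from Python, which is handy if you want later cells to react to the result. This is a minimal sketch using only the standard library; `gpu_summary` is a hypothetical helper name, not part of surfnet.

```python
import shutil
import subprocess


def gpu_summary():
    """Return the names of visible NVIDIA GPUs, or an empty list if none are found."""
    # nvidia-smi is only on PATH when an NVIDIA driver is installed.
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    # A non-zero exit code usually means the driver is present but unusable.
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = gpu_summary()
print("GPUs found:", gpus if gpus else "none")
```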

In [None]:
!git clone https://github.com/surfriderfoundationeurope/surfnet.git -b further_research
%cd surfnet

In [None]:
%pip install -r requirements.txt

## Getting the data

To get the images, you need `azcopy` and a valid access token. The following cells set this up and download about 5 GB of data. The token used here grants access to the data until Monday, 31 January 2022. The annotations are then downloaded in JSON format.

If you plan to stay on Colab, it may be useful to mount a Google Drive so that the data persists between sessions.
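
Mounting a Drive can be sketched as follows. `drive.mount` is the standard Colab call; the wrapper `mount_drive` and the default mount point are assumptions made for illustration, and the import guard only exists so the cell does not fail outside Colab.

```python
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab; return True on success."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        print("Not running in Colab, skipping Drive mount.")
        return False
    # Colab asks for authorization the first time this runs in a session.
    drive.mount(mount_point)
    return True


mounted = mount_drive()
```

Once mounted, you could copy the downloaded `data` folder under the mount point to avoid downloading it again in a new session.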

In [None]:
!wget https://aka.ms/downloadazcopy-v10-linux
!tar -xvf downloadazcopy-v10-linux

In [None]:
# The extracted folder name includes the azcopy version, so match it with a wildcard
!azcopy_linux_amd64_*/azcopy copy --recursive 'https://dataplasticoprod.blob.core.windows.net/images2label?sp=rl&st=2022-01-24T10:34:35Z&se=2022-01-31T18:34:35Z&spr=https&sv=2020-08-04&sr=c&sig=%2FHn2D3IvAECUJ0QqPpf0Jewo7GuNaIVYf23BjVjAd3Q%3D' './'

In [None]:
!mkdir -p data/images
!mv images2label data/images/images

In [None]:
!mkdir -p data/images/annotations
!wget https://github.com/surfriderfoundationeurope/surfnet/releases/download/v01.2022/instances_train.json -P data/images/annotations/
!wget https://github.com/surfriderfoundationeurope/surfnet/releases/download/v01.2022/instances_val.json -P data/images/annotations/
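
Before going further, it can help to confirm that both annotation files were downloaded and parse correctly. The sketch below uses only the standard library and assumes the files follow the usual COCO layout with `images`, `annotations` and `categories` keys; `summarize_annotations` is a hypothetical helper.

```python
import json
import os


def summarize_annotations(path):
    """Return (n_images, n_annotations, n_categories) for a COCO-style JSON file."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    return (
        len(data.get("images", [])),
        len(data.get("annotations", [])),
        len(data.get("categories", [])),
    )


for split in ("train", "val"):
    ann_path = os.path.join("data", "images", "annotations", f"instances_{split}.json")
    if os.path.exists(ann_path):
        n_img, n_ann, n_cat = summarize_annotations(ann_path)
        print(f"{split}: {n_img} images, {n_ann} annotations, {n_cat} categories")
    else:
        print(f"{split}: {ann_path} not found, was the download cell run?")
```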

## Analyse the dataset

The following cells give some basic information about the dataset.

In [None]:
import sys
sys.path.insert(0,'/content/surfnet/src/')

In [None]:
import json
from pycocotools.coco import COCO
import matplotlib.pyplot as plt

# Load the training annotations (COCO format)
coco = COCO(annotation_file='./data/images/annotations/instances_train.json')

# Skip the first category entry
coco_categories = coco.dataset['categories'][1:]

# Count annotations per category, sorted from most to least frequent
nb_anns_per_cat = {cat['name']: len(coco.getAnnIds(catIds=[cat['id']])) for cat in coco_categories}
nb_anns_per_cat = dict(sorted(nb_anns_per_cat.items(), key=lambda x: x[1], reverse=True))
cat_names = list(nb_anns_per_cat.keys())
nb_anns = list(nb_anns_per_cat.values())

# Bar chart of the annotation counts, saved to disk
plt.bar(x=cat_names, height=nb_anns)
plt.xticks(range(len(cat_names)), cat_names, rotation='vertical')
plt.ylabel('Number of annotations')
plt.tight_layout()
plt.autoscale(True)
plt.savefig('dataset_analysis')
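
Raw counts can be hard to compare at a glance, so it may help to also look at each category's share of all annotations. The sketch below is one way to do this; `category_shares` is a hypothetical helper, and the example counts and category names are made up for illustration (in the notebook you would pass `nb_anns_per_cat`).

```python
def category_shares(counts):
    """Map each category to its fraction of all annotations, largest first."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return {name: n / total for name, n in ordered}


# Hypothetical counts for illustration only; in the notebook, pass nb_anns_per_cat.
example_counts = {"bottle": 60, "fragment": 30, "other": 10}
for name, share in category_shares(example_counts).items():
    print(f"{name:<10s} {share:6.1%}")
```

A strongly skewed distribution is worth keeping in mind when choosing a sampling strategy or evaluation metric for the classifier.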

In [None]:
import os
from detection.coco_utils import CocoDetectionWithExif, ConvertCocoPolysToBboxes

def get_dataset(root, image_set):
    """Build the COCO-style dataset for the given split ("train" or "val")."""
    # Image folder and annotation file of each split, relative to `root`
    PATHS = {
        "train": ("images", os.path.join("annotations", "instances_train.json")),
        "val": ("images", os.path.join("annotations", "instances_val.json")),
    }

    img_folder, ann_file = PATHS[image_set]
    img_folder = os.path.join(root, img_folder)
    ann_file = os.path.join(root, ann_file)

    # Convert polygon annotations to bounding boxes
    dataset = CocoDetectionWithExif(img_folder, ann_file, transforms=ConvertCocoPolysToBboxes())

    return dataset

In [None]:
from detection.coco_utils import get_surfrider
from detection import transforms

base_size = 540
crop_size = (544, 960)
downsampling_factor = 4
num_classes = 10
path = '/content/surfnet/data/images/'

# Build the training and validation datasets
train_dataset = get_dataset(path, "train")
val_dataset = get_dataset(path, "val")

Let us display a full-size picture along with its corresponding bounding boxes.

In [None]:
import matplotlib.pyplot as plt
x, y = next(iter(train_dataset))
print(x.shape, y)
plt.imshow(x)
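
To overlay the boxes on the picture rather than only printing them, something like the sketch below could be used. It assumes boxes in the COCO convention `[x, y, width, height]`; the exact structure of `y` returned by the dataset is not shown here, so check the printed output first and skip the conversion if the boxes are already given as corner coordinates. `coco_to_corners` and `draw_boxes` are hypothetical helpers.

```python
def coco_to_corners(box):
    """Convert a COCO box [x, y, width, height] to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def draw_boxes(image, boxes, color="red"):
    """Show `image` with one rectangle per COCO-style box."""
    # Imported lazily so the conversion helper works without matplotlib.
    import matplotlib.pyplot as plt
    from matplotlib.patches import Rectangle

    fig, ax = plt.subplots()
    ax.imshow(image)
    for box in boxes:
        x_min, y_min, x_max, y_max = coco_to_corners(box)
        ax.add_patch(
            Rectangle((x_min, y_min), x_max - x_min, y_max - y_min,
                      fill=False, edgecolor=color, linewidth=2)
        )
    ax.axis("off")
    return ax


print(coco_to_corners([10, 20, 30, 40]))  # -> (10, 20, 40, 60)
```

In the notebook, you would call `draw_boxes(x, boxes)`, where `boxes` is the list of boxes extracted from `y`.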