[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/kili-technology/automl/blob/main/notebooks/object_detection.ipynb)


# Object Detection Using AutoML

In this notebook, we create an object detection model with AutoML and use it to pre-annotate a dataset on the [Kili Platform](https://cloud.kili-technology.com/label/).

## Setup API key

We first set up the API key and the AutoML path.

In [None]:
from getpass import getpass

Set the URL of your Kili instance. The API endpoint is derived from it.

In [None]:
KILI_URL="https://cloud.kili-technology.com/"  # If you are not using Kili SaaS, change the url to your configuration

api_endpoint = f"{KILI_URL}api/label/v2/graphql"

You can get your API key from the [Kili platform](https://cloud.kili-technology.com/label/my-account/api-key) and set up your environment variables. If you are working locally, set your environment variables in a `.env` file. If the notebook is run on Colab, the Python path is redirected and you are prompted for the key instead.

In [None]:
%%capture
!pip install python-dotenv

In [None]:
%reload_ext dotenv
%dotenv

In [None]:
from IPython import get_ipython
import os

if "google.colab" in str(get_ipython()):
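    # Use .get() so the cell does not raise KeyError when PYTHONPATH is unset.
    os.environ["PYTHONPATH"] = os.environ.get("PYTHONPATH", "") + ":/content/automl/"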
    os.environ["HOME"] = "/content/drive/MyDrive/"
    api_key = getpass("Add your API Key here: ")
else:
    api_key = os.getenv("KILI_API_KEY")
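
When working locally, `os.getenv` returns `None` if `KILI_API_KEY` is missing, and the error only surfaces later when the client is created. A minimal sketch of an early check follows; `require_api_key` is a hypothetical helper written for this notebook, not part of the Kili SDK.

```python
import os


def require_api_key(env=os.environ, name="KILI_API_KEY"):
    """Return the key stored under `name`, or raise a clear error if it is missing or empty."""
    key = env.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file or environment.")
    return key


# Demonstration with an explicit mapping instead of the real environment.
print(require_api_key({"KILI_API_KEY": "example-key"}))
```

In the local branch above, `api_key = require_api_key()` could replace the plain `os.getenv` call.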

## Install

We follow the installation procedure described in the [README.md](https://github.com/kili-technology/automl/blob/main/README.md).

In [None]:
!git clone https://github.com/kili-technology/automl.git

In [None]:
%cd automl

Install the packages. This should take less than a minute. 

In [None]:
%%capture
!git submodule update --init
!pip install torch
!pip install -e .

## Imports

In [None]:
from tqdm.autonotebook import tqdm

from kili.client import Kili

## Setup a mock Kili project

Set up the Kili connection.

In [None]:
kili = Kili(api_key=api_key, api_endpoint=api_endpoint)

### Create the project

Our objective is to label plastic objects in rivers, following [Kili's Community Challenge](https://kili-technology.com/blog/kili-s-community-challenge-plastic-in-river-dataset).

First, we set up the project with the appropriate JSON interface settings. There are four classes of objects to detect: `PLASTIC_BAG`, `PLASTIC_BOTTLE`, `OTHER_PLASTIC_WASTE` and `NON_PLASTIC_WASTE`.

In [None]:
json_interface = {
    "jobs": {
        "OBJECT_DETECTION_JOB": {
            "mlTask": "OBJECT_DETECTION",
            "tools": [
                "rectangle"
            ],
            "instruction": "Can you find plastic in the river?",
            "required": 1,
            "isChild": False,
            "content": {
                "categories": {
                    "PLASTIC_BAG": {
                        "name": "Plastic bag",
                        "children": []
                    },
                    "PLASTIC_BOTTLE": {
                        "name": "Plastic bottle",
                        "children": []
                    },
                    "OTHER_PLASTIC_WASTE": {
                        "name": "Other plastic waste",
                        "children": []
                    },
                    "NON_PLASTIC_WASTE": {
                        "name": "Non plastic waste",
                        "children": []
                    },
                },
                "input": "radio"
            }
        }
    }
}
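
The labels added later in this notebook reference the category keys (such as `PLASTIC_BAG`), not the display names. As a quick sanity check, the keys can be listed from a dictionary shaped like the interface above. This is only a sketch: `category_keys` and `example_interface` are hypothetical names, and the example uses a trimmed-down interface so that it runs on its own.

```python
def category_keys(interface, job_name):
    """Return the category keys declared for `job_name`, in declaration order."""
    return list(interface["jobs"][job_name]["content"]["categories"])


# Trimmed-down interface with the same nesting as the one used in this notebook.
example_interface = {
    "jobs": {
        "OBJECT_DETECTION_JOB": {
            "content": {
                "categories": {
                    "PLASTIC_BAG": {"name": "Plastic bag"},
                    "PLASTIC_BOTTLE": {"name": "Plastic bottle"},
                }
            }
        }
    }
}

print(category_keys(example_interface, "OBJECT_DETECTION_JOB"))
# → ['PLASTIC_BAG', 'PLASTIC_BOTTLE']
```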

In [None]:
project = kili.create_project(
        title="Plastic Detection in Rivers",
        description="Detect plastic in rivers",
        input_type="IMAGE",
        json_interface=json_interface
)

In [None]:
project_id = project["id"]

### Add assets

Next, we add some images to our project that show rivers contaminated with plastic waste.

In [None]:
assets_to_import = [
    {
        "externalId": f"{i}",
        "content": f"https://storage.googleapis.com/kili-machine-learning-automl/notebooks/plastic_in_river/image_{i}.jpg",
        "metadata": {}
    }
    for i in range(3000)
]

In [None]:
print(assets_to_import[0]["content"])

Now we send the data to our Kili project.

In [None]:
external_id_array = [a.get("externalId") for a in assets_to_import]
content_array = [a.get("content") for a in assets_to_import]
json_metadata_array = [a.get("metadata") for a in assets_to_import]
kili.append_many_to_dataset(project_id=project_id, 
                            content_array=content_array,
                            external_id_array=external_id_array, 
                            json_metadata_array=json_metadata_array)

### Add labels to assets

We label half of the assets (1,500 of 3,000) to simulate a partially labeled project in which we want to predict labels for the remaining assets.

In [None]:
%%capture
!wget https://storage.googleapis.com/kili-machine-learning-automl/notebooks/plastic_in_river/annotations.zip
!unzip annotations.zip -d ./annotations

In [None]:
CATEGORIES = [
    "PLASTIC_BAG",
    "PLASTIC_BOTTLE",
    "OTHER_PLASTIC_WASTE",
    "NON_PLASTIC_WASTE"
]

def get_bboxes_from_lines(lines):
    # Each line is read as [class_index, x_center, y_center, width, height],
    # with coordinates normalized to the image size.
    bboxes = []
    for line in lines:
        half_width = line[3] / 2
        half_height = line[4] / 2
        bboxes.append(
            {
                "boundingPoly": [{
                    "normalizedVertices": [
                        {"x": line[1] - half_width, "y": line[2] + half_height},
                        {"x": line[1] - half_width, "y": line[2] - half_height},
                        {"x": line[1] + half_width, "y": line[2] - half_height},
                        {"x": line[1] + half_width, "y": line[2] + half_height},
                    ]}
                ],
                "categories": [{ "name": CATEGORIES[int(line[0])] }],
                "type": "rectangle",
            }
        )
    return bboxes
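
The conversion above reads each annotation line as a class index followed by a normalized box center and size. That layout is inferred from the indexing in the code, not from dataset documentation. Under that assumption, the geometry for a single line can be worked through as follows; `center_box_to_vertices` is a hypothetical helper that mirrors the vertex order used in `get_bboxes_from_lines`.

```python
def center_box_to_vertices(x_center, y_center, width, height):
    """Convert a center/size box into four corner vertices (normalized coordinates)."""
    half_w, half_h = width / 2, height / 2
    return [
        {"x": x_center - half_w, "y": y_center + half_h},
        {"x": x_center - half_w, "y": y_center - half_h},
        {"x": x_center + half_w, "y": y_center - half_h},
        {"x": x_center + half_w, "y": y_center + half_h},
    ]


# A made-up annotation line: class 1, centered box, a quarter wide and half as tall as the image.
line = "1 0.5 0.5 0.25 0.5".split()
class_index = int(line[0])
vertices = center_box_to_vertices(*(float(value) for value in line[1:]))

print(class_index, vertices[0], vertices[2])
# → 1 {'x': 0.375, 'y': 0.75} {'x': 0.625, 'y': 0.25}
```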

In [None]:
asset_ids = kili.assets(project_id=project_id, fields=["id", "externalId"], first=1500)

In [None]:
for asset_id in tqdm(asset_ids):
    external_id = int(asset_id["externalId"])
    with open(f"annotations/annotation_{external_id}.txt", "r") as f:
        lines = [line.split() for line in f.readlines()]
        lines = [[float(x) for x in line] for line in lines]

    kili_bounding_boxes = get_bboxes_from_lines(lines)
    json_response = {
        "OBJECT_DETECTION_JOB": {
            "annotations": kili_bounding_boxes
        }
    }
    kili.append_to_labels(label_asset_id=asset_id["id"],
                          json_response=json_response)
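
If an annotation file contained a blank line, `line.split()` would yield an empty list and the indexing in `get_bboxes_from_lines` would raise an `IndexError`. Whether the provided files contain such lines is not known, so the following is only a defensive sketch; `parse_annotation_text` is a hypothetical helper and the sample text is made up.

```python
def parse_annotation_text(text):
    """Parse whitespace-separated numeric rows, skipping blank lines."""
    rows = []
    for raw_line in text.splitlines():
        fields = raw_line.split()
        if fields:  # ignore empty or whitespace-only lines
            rows.append([float(value) for value in fields])
    return rows


sample = "0 0.5 0.5 0.2 0.1\n\n3 0.25 0.75 0.1 0.1\n"
print(parse_annotation_text(sample))
# → [[0.0, 0.5, 0.5, 0.2, 0.1], [3.0, 0.25, 0.75, 0.1, 0.1]]
```

The rows it returns have the same shape as the `lines` list built in the loop above, so they could be passed to `get_bboxes_from_lines` unchanged.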

You can now click on the following link to see the assets in your project:

In [None]:
print(f"{KILI_URL}label/projects/{project_id}/menu/queue?currentPage=1&pageSize=20")

## Training the object detection NN with Kiliautoml

The following command automatically downloads the labeled data from your Kili project. It then chooses a suitable model for your task, trains it on this data, and saves it locally.

In [None]:
!kiliautoml train \
    --api-key $api_key \
    --project-id $project_id \
    --epochs 30

The results are not excellent, so more labels would help train a better model. We can use this model's predictions to speed up annotation.

### Send predictions

Now we can use our locally trained model to detect objects in our image assets and send the predictions to the project on Kili. These pre-annotations can then be validated or corrected by annotators.

In [None]:
!kiliautoml predict \
    --api-key $api_key \
    --project-id $project_id

Now you can check that your assets have predictions on [Kili](https://cloud.kili-technology.com/)!

In [None]:
print(f"{KILI_URL}label/projects/{project_id}/menu/queue?currentPage=1&pageSize=20")