# Welcome to The Ocean Cleanup challenge!
## This is the model-based annotation notebook starter
This notebook helps you start using model predictions in the challenge. It will allow you to create your first predictions on your Kili project.

It uses the **Kili autoML repository**, whose commands train a classification model on labeled data, predict on unlabeled data, and push these predictions to your project.

It also paves the way for you to **create your own model** and to use the **Kili Python SDK** to interact with your project and create predictions.

## Discover your Kili project and annotate some images

Each team gets a personal project.

Your project ID can be found in the URL of the project: https://cloud.kili-technology.com/label/projects/PROJECT_ID/menu/queue


It looks like this: *clb551au002av0ky1em1w0opm*
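
If you prefer to copy the whole URL from your browser, the project ID can be pulled out programmatically. The sketch below is only an illustration: the helper name `extract_project_id` is hypothetical, and it assumes the ID is always the path segment that follows `projects`, as in the URL pattern above.

```python
from urllib.parse import urlparse


def extract_project_id(project_url: str) -> str:
    """Return the path segment that follows 'projects' in a Kili project URL."""
    # Split the URL path into its non-empty segments.
    segments = [s for s in urlparse(project_url).path.split("/") if s]
    try:
        return segments[segments.index("projects") + 1]
    except (ValueError, IndexError):
        raise ValueError(f"No project ID found in URL: {project_url!r}") from None


example_url = (
    "https://cloud.kili-technology.com/label/projects/"
    "clb551au002av0ky1em1w0opm/menu/queue"
)
print(extract_project_id(example_url))  # prints clb551au002av0ky1em1w0opm
```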
 
On your project, you have an **ADMIN role**. This allows you to explore settings, change quality metrics, add new images, and more. Most importantly, it allows you **to use the API!**

In [None]:
project_id = 'clbaro2x401kp0lxv1isf56z5'

In [None]:
KILI_URL="https://cloud.kili-technology.com/"
PROJECT_URL = f"{KILI_URL}label/projects/{project_id}"
print("Kili Project URL: ", PROJECT_URL)

Kili Project URL:  https://cloud.kili-technology.com/label/projects/clbaro2x401kp0lxv1isf56z5


You can explore your project on the Kili app and **begin to annotate some images**.

Annotate fast, because only the first 120 teams to reach 1000 annotated assets will be given access to an OVH image with one GPU. These labeled images will also be useful when training your classification model.

## Prerequisite step: Connect to your OVH image

**REMINDER: Only the first 120 teams to reach 1000 annotated assets will be given access to an OVH image with one GPU.** This will allow you to train models much faster!

This notebook uses the kiliautoml package, which is not installed on Google Colab but is available on the provided OVH images.

### 1- Connect to the image
To start, connect to your OVH image by clicking on the URL that you received by email. Then select "login with token" and enter the token that you received in the same email.

This will open JupyterLab in your browser.

### 2- Copy this notebook into your image
Once you are connected to the OVH image, you can simply download this notebook and copy-paste it into your image.

You are now ready to run the notebook with super computation power!


If all 120 OVH image accesses have already been granted, you can still run this notebook on Google Colab. In that case, you will have to install the kiliautoml package with the commands below:

In [None]:
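# Only if launching from Google Colab
import os

!git clone https://github.com/kili-technology/automl.git
%cd automl
!git submodule update --init
!pip install -e .
# Make the cloned repository importable; PYTHONPATH may be unset, so default to "".
os.environ["PYTHONPATH"] = os.environ.get("PYTHONPATH", "") + ":/content/automl/"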

Cloning into 'automl'...
remote: Enumerating objects: 4643, done.
remote: Counting objects: 100% (1590/1590), done.
remote: Compressing objects: 100% (575/575), done.
remote: Total 4643 (delta 1201), reused 1260 (delta 1004), pack-reused 3053
Receiving objects: 100% (4643/4643), 45.64 MiB | 26.17 MiB/s, done.
Resolving deltas: 100% (2700/2700), done.
/content/automl/automl
Submodule 'kiliautoml/utils/ultralytics/yolov5' (https://github.com/ultralytics/yolov5.git) registered for path 'kiliautoml/utils/ultralytics/yolov5'
Cloning into '/content/automl/automl/kiliautoml/utils/ultralytics/yolov5'...
Submodule path 'kiliautoml/utils/ultralytics/yolov5': checked out '3e858633b283767f038b4cab910a95e40fe8577b'
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Obtaining file:///content/automl/automl
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metad

In [None]:
# If you are on an OVH image, you may need to get the latest version of automl by uncommenting and running:
# !cd /workspace/.automllib && git pull
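
Since the two cells above apply to different environments (Google Colab versus an OVH image), it can help to check where the notebook is running before choosing which one to execute. This is a minimal sketch, not part of kiliautoml: the helper `running_on_colab` is hypothetical, and it assumes that Colab kernels have the `google.colab` module loaded at startup, which is a common heuristic rather than a guarantee.

```python
import sys


def running_on_colab() -> bool:
    """Best-effort check based on whether the google.colab module is loaded."""
    return "google.colab" in sys.modules


if running_on_colab():
    print("Colab detected: run the installation cell.")
else:
    print("Not on Colab: the installation cell can be skipped.")
```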

## Set up Kili and Weights & Biases

In [None]:
from getpass import getpass
from tqdm.autonotebook import tqdm
import os
from IPython import get_ipython

### Kili


You can add and retrieve your Kili API key [here](https://cloud.kili-technology.com/label/my-account/api-key).
Once you have it, you can export it into an environment variable called `KILI_API_KEY` or enter it each time with `getpass`.

In [None]:
print("You can add a Kili API Key here: https://cloud.kili-technology.com/label/my-account/api-key")
api_key = os.getenv("KILI_API_KEY")
if api_key is None:
  api_key = getpass("Kili API Key: ")

You can add a Kili API Key here: https://cloud.kili-technology.com/label/my-account/api-key
Kili API Key: ··········


### Weights & Biases

In [None]:
!pip install wandb

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting wandb
  Downloading wandb-0.13.5-py2.py3-none-any.whl (1.9 MB)
     |████████████████████████████████| 1.9 MB 8.4 MB/s 
Collecting setproctitle
  Downloading setproctitle-1.3.2-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (31 kB)
Collecting sentry-sdk>=1.0.0
  Downloading sentry_sdk-1.11.1-py2.py3-none-any.whl (168 kB)
     |████████████████████████████████| 168 kB 55.6 MB/s 
Collecting docker-pycreds>=0.4.0
  Downloading docker_pycreds-0.4.0-py2.py3-none-any.whl (9.0 kB)
Collecting GitPython>=1.0.0
  Downloading GitPython-3.1.29-py3-none-any.whl (182 kB)
     |████████████████████████████████| 182 kB 56.6 MB/s 
Collecting shortuuid>=0.5.0
  Downloading shortuuid-1.0.11-py3-none-any.whl (10 kB)
Collecting pathtools
  Downloading pathtools-0.1.2.tar.gz (11 kB)
Collecting gitdb<5,>=4.0.1
  Downloading gitdb-4

In [None]:
import wandb
wandb.login()

ERROR:wandb.jupyter:Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.



[34m[1mwandb[0m: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
[34m[1mwandb[0m: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit: 

··········


[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc


True

## Train a model

The following autoML `train` command will automatically:


1.   Download the labeled assets. **Make sure that you have already annotated some images on your project!** 
2.   Choose the right model for your task (here, classification)
3.   Train the model on this data
4.   Save the model locally

OVH images with a dedicated GPU will help you train your model much faster!

In [None]:
!kiliautoml train

Traceback (most recent call last):
  File "/usr/bin/kiliautoml", line 33, in <module>
    sys.exit(load_entry_point('kiliautoml', 'console_scripts', 'kiliautoml')())
  File "/usr/bin/kiliautoml", line 22, in importlib_load_entry_point
    for entry_point in distribution(dist_name).entry_points
  File "/usr/lib/python3.8/importlib/metadata.py", line 503, in distribution
    return Distribution.from_name(distribution_name)
  File "/usr/lib/python3.8/importlib/metadata.py", line 177, in from_name
    raise PackageNotFoundError(name)
importlib.metadata.PackageNotFoundError: kiliautoml


In [None]:
!PYTHONPATH=/workspace/.automllib kiliautoml train \
    --api-key $api_key \
    --project-id $project_id \
    --epochs 30



## Create predictions

The `predict` command is another all-in-one autoML command that will:


1.   Download all unlabeled data
2.   Predict a label for this data using the previously trained model
3.   Send the predictions to the Kili project

You can use the `--dry-run` option to run predictions on the images without sending them to your project.

In [None]:
!PYTHONPATH=/workspace/.automllib kiliautoml predict \
    --api-key $api_key \
    --project-id $project_id \
    #--dry-run



You are now ready to go to your project and visualize these predictions!

In [None]:
print("Kili Project URL: ", PROJECT_URL)

## Next steps

Kili autoML is an easy way to start creating predictions. You now have two ways to improve them:


1.   Annotate more images in your project. The predictions are now visible on your images in the app, so annotating will be even faster! Then train and predict again: each iteration gives you a better model and more accurate annotations.
2.   Train your own custom model, and use the [Kili Python SDK](https://python-sdk-docs.kili-technology.com/2.125/) package to interact with your project. The main functions that you need are:

    *   [assets](https://python-sdk-docs.kili-technology.com/2.125/sdk/asset/#kili.queries.asset.__init__.QueriesAsset.assets) to query your project's images and labels and download them locally
    *   [create_predictions](https://python-sdk-docs.kili-technology.com/2.125/sdk/label/#kili.mutations.label.__init__.MutationsLabel.create_predictions) to push your predictions to your Kili project





## Use the Kili Python SDK

In [None]:
from kili.client import Kili
kili = Kili(api_key=api_key)

### Download assets locally 

In [None]:
assets = kili.assets(
    project_id=project_id,
    status_in=['LABELED'],
    download_media=True,
    local_media_dir='./data/ocean_cleanup_challenge'
    )

In [None]:
import pandas as pd
pd.DataFrame(assets).head()
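
To train a custom model, you will typically need one category per downloaded asset. The sketch below shows one possible way to extract it. It assumes that each asset dictionary carries a `labels` list whose entries have `createdAt` and `jsonResponse` fields, and that the classification job is named `CLASSIFICATION_JOB`; check the structure of your own `assets` before relying on it. The helper names and the sample data are hypothetical.

```python
def latest_category(asset, job_name="CLASSIFICATION_JOB"):
    """Return the category name of the asset's most recent label, or None."""
    labels = asset.get("labels") or []
    if not labels:
        return None
    # Assumption: createdAt holds ISO 8601 strings, which sort chronologically.
    latest = max(labels, key=lambda label: label.get("createdAt", ""))
    categories = latest.get("jsonResponse", {}).get(job_name, {}).get("categories", [])
    return categories[0]["name"] if categories else None


def to_training_pairs(assets):
    """Build (externalId, category) pairs, skipping assets without a usable label."""
    pairs = []
    for asset in assets:
        category = latest_category(asset)
        if category is not None:
            pairs.append((asset["externalId"], category))
    return pairs


# Hypothetical data mimicking the assumed structure (not real project data).
sample_assets = [
    {
        "externalId": "1",
        "labels": [
            {
                "createdAt": "2022-12-01T10:00:00.000Z",
                "jsonResponse": {
                    "CLASSIFICATION_JOB": {"categories": [{"name": "HYPOTHETICAL_OLD_CATEGORY"}]}
                },
            },
            {
                "createdAt": "2022-12-02T10:00:00.000Z",
                "jsonResponse": {
                    "CLASSIFICATION_JOB": {"categories": [{"name": "MARINE_LIFE_OR_VEGETATION"}]}
                },
            },
        ],
    },
    {"externalId": "2", "labels": []},
]
print(to_training_pairs(sample_assets))
```

Only the most recent label is kept here, which is a design choice for this sketch; you may prefer another rule when an asset has several labels.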

### Create predictions
The `kiliautoml` package uses the `create_predictions` function to push predictions to Kili. If you have trained a custom model and have your own predictions for the previously downloaded assets, you can use this function to upload them!

In [None]:
def build_response(category, confidence):
    """Helper function to build the json_response in the Kili format."""
    return {
        'CLASSIFICATION_JOB': {
            'categories': [
                {
                    'confidence': confidence,
                    'name': category
                }
            ]
        }
    }

In [None]:
kili.create_predictions(
    project_id=project_id,
    external_id_array = ['1', '2', '3'],
    model_name_array = ['test_model']*3,
    json_response_array = [build_response('MARINE_LIFE_OR_VEGETATION', 100)]*3
    )
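
The call above pushes the same hard-coded response for three assets. With a real model, you would build the three arrays from its outputs. Here is a minimal sketch under stated assumptions: the helper `to_prediction_arrays` is hypothetical, the model is assumed to return one probability per category in the range 0 to 1, and the confidence is assumed to be expressed on a 0 to 100 scale, as the value `100` used above suggests. The category `HYPOTHETICAL_OTHER_CLASS` is a placeholder, not a category from the project.

```python
def to_prediction_arrays(probabilities, model_name):
    """Turn {external_id: {category: probability}} into create_predictions arguments."""
    external_ids, responses = [], []
    for external_id, class_probs in probabilities.items():
        # Keep the most likely category for each asset.
        category, prob = max(class_probs.items(), key=lambda item: item[1])
        external_ids.append(external_id)
        responses.append({
            "CLASSIFICATION_JOB": {
                # Assumption: confidence is a percentage between 0 and 100.
                "categories": [{"name": category, "confidence": round(prob * 100)}]
            }
        })
    return {
        "external_id_array": external_ids,
        "model_name_array": [model_name] * len(external_ids),
        "json_response_array": responses,
    }


# Hypothetical model outputs keyed by asset external ID.
model_outputs = {
    "1": {"MARINE_LIFE_OR_VEGETATION": 0.91, "HYPOTHETICAL_OTHER_CLASS": 0.09},
    "2": {"MARINE_LIFE_OR_VEGETATION": 0.34, "HYPOTHETICAL_OTHER_CLASS": 0.66},
}
arrays = to_prediction_arrays(model_outputs, model_name="my_custom_model")
print(arrays["external_id_array"])

# The result can then be passed to the SDK, for example:
# kili.create_predictions(project_id=project_id, **arrays)
```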

## Documentation


For more information on how to use the Kili Python SDK, please see our [documentation](https://python-sdk-docs.kili-technology.com/2.125/). The SDK allows you to do much more than create predictions: you can prioritize assets, query your labels, update your project, and more.


For a broader look at how to use Kili, you can visit the [general documentation](https://docs.kili-technology.com/docs).

# How to win the challenge?

The Ocean Cleanup challenge is about helping to clean the ocean, but it is also a competition! So how can you win prizes and Robin's gratitude?

The main prize is awarded based on a score that weights the number of annotated assets at 70% and label quality at 30%. 
There is also a bonus competition for the best predictions. At the end of the challenge, if your project has predictions, we will consider the last uploaded ones and compare them to honeypot labels or to other teams' manually labeled assets.
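
To get a feel for the 70/30 weighting of the main score, here is a small worked example. It is only an illustration: the function `main_score` is hypothetical, and it assumes that both components are first normalized to a value between 0 and 1, which the challenge rules above do not specify.

```python
def main_score(annotation_score: float, quality_score: float) -> float:
    """Combine two normalized components with the 70% / 30% weighting."""
    for name, value in (
        ("annotation_score", annotation_score),
        ("quality_score", quality_score),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return 0.7 * annotation_score + 0.3 * quality_score


# A team at 80% of the (assumed) annotation scale with 90% label quality:
print(round(main_score(0.8, 0.9), 2))
```

Under this assumption, annotation volume moves the score more than twice as much as label quality does.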

**The team with the best predictions will also get a prize!** We will also take your Weights & Biases dashboard into account: at the end of the challenge, you can clean it up and send it to us on Slack, which will earn you a big bonus!


