# Image Classification

In this tutorial, we show you how to get started with an image classification project.

You can classify any kind of asset: image, text, video, PDF, etc. For the sake of clarity, we will use images in this tutorial. Here are the main steps:

1. [What is classification](#classification)
2. [Connecting to Kili](#connect)
3. [Creating the project, setting up the interface](#project)
4. [Importing data](#data)
5. [Labeling](#labeling)
6. [Exporting labels](#export)
7. [Quality management](#quality)
8. [More advanced concepts](#concepts)

# What is classification<a id='classification'></a>

The first important concept to understand is classification. The idea of classification is to assign categories to assets from a fixed set of categories. To make things more concrete, let's examine an example. Suppose that we want to classify the car shown below:

![](../img/asset_viewer.png)


First we have the asset we want to classify—in this case an image of a car—then we have the possible classes. The interface shows two possible classes: *Object A* and *Object B*. This could be customized to be for example *Car* or *Cat*; the important thing about classification is to know that we have a fixed set of possible classes.
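
As a minimal sketch of the idea of a fixed set of classes, the snippet below accepts a label only if it belongs to a predefined set. The class names and the `assign_class` helper are hypothetical and only serve as an illustration; they are not part of Kili.

```python
# Hypothetical class names; replace them with your own fixed set.
CLASSES = ("Car", "Cat")


def assign_class(asset_name: str, chosen: str) -> dict:
    """Return a label record, rejecting any class outside the fixed set."""
    if chosen not in CLASSES:
        raise ValueError(f"{chosen!r} is not one of {CLASSES}")
    return {"asset": asset_name, "class": chosen}


label = assign_class("car_2.jpg", "Car")
print(label)
```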

A human can easily classify this image as a car; however, in many situations, it can be useful to automate this process. With this in mind, we could also train a machine learning model to classify our image. To train a powerful model, however, we would need large quantities of labeled data (in this case, images with classes already assigned). Kili provides an excellent way to classify your data. The next step of this tutorial will walk you through the creation of your first classification project in Kili.

# Connecting to Kili <a id='connect'></a>

The first step is to connect to the platform.

If you use the SaaS version of Kili (see [here](https://cloud.kili-technology.com/docs/hosting/saas/)), you will use the Auth0 login by default, or your company's authentication if it has been set up.

<img src="../img/auth0.png" width="400" />

If you use Kili on-premises (see [here](https://cloud.kili-technology.com/docs/hosting/on-premise-entreprise/)), you will probably use our own authentication:

<img src="../img/noauth0.png" width=400 />

You need your organization admin to create your profile. Depending on the authentication implementation, you can sign up and set your password, or use the temporary one provided to you by the admin.

If everything succeeds, you should arrive at the projects page shown at the beginning of the next section.

# Creating the project <a id='project'></a>

## List of projects

You'll arrive at the list of projects.

![](../img/project_list.png)

You can refer to this [document](https://cloud.kili-technology.com/docs/concepts/definitions/) to find the definitions of key concepts at Kili. One of them is a project, which is a combination of:
- a dataset (a list of assets)
- members (project users; each can have different roles)
- an interface (describing the annotation plan)

## Create the project 

You can either create a project [from the interface](https://cloud.kili-technology.com/docs/projects/new-project/#docsNav) or [from the API](https://github.com/kili-technology/kili-playground/blob/master/recipes/create_project.ipynb).


To create a project from the interface, select `Create New` from the list of projects. Type your project's name and a description, then select `Image Classification (single-class)`. Finally, select `Save` as shown below:

![](../img/getting_started/create_new_classification_project.gif)

<details>
<summary style="display: list-item;"> Follow these instructions to create a project from the API. </summary>

From the API, you can create a project with a single call, which allows you to store and share project interfaces:
- First, [connect to Kili](https://github.com/kili-technology/kili-playground/blob/master/README.md#get-started)


```python
# Authentication
import os

# !pip install kili # uncomment if you don't have kili installed already
from kili.client import Kili

api_key = os.getenv('KILI_USER_API_KEY')
api_endpoint = os.getenv('KILI_API_ENDPOINT') # If you use Kili SaaS, use the url 'https://cloud.kili-technology.com/api/label/v2/graphql'

kili = Kili(api_key=api_key, api_endpoint=api_endpoint)
```

- Then call the method `create_project`: <a id='command'></a>
```python
kili.create_project(
    title='Project Title',
    description='Project Description',
    input_type='IMAGE',
    json_interface=interface
)
```

with `interface` such as:


```python
interface = {
  "jobRendererWidth": 0.17,
  "jobs": {
    "JOB_0": {
      "mlTask": "CLASSIFICATION",
      "required": 1,
      "content": {
        "categories": {
          "OBJECT_A": {
            "name": "Object A"
          },
          "OBJECT_B": {
            "name": "Object B"
          }
        },
        "input": "radio"
      }
    }
  }
}
```


```python
result = kili.create_project(
    title='Project Title',
    description='Project Description',
    input_type='IMAGE',
    json_interface=interface
)
print(result)
```

```python
Out: {'id': 'ckm4pmqmk0000d49k6ewu2um5'}
```
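
The interface dictionary shown above can also be generated from a list of class names. The following is a minimal sketch: the helpers `category_key` and `build_classification_interface` are hypothetical, and the rule that turns `Object A` into `OBJECT_A` is an assumption inferred from the example, not a documented Kili requirement.

```python
import re


def category_key(display_name: str) -> str:
    """Turn 'Object A' into 'OBJECT_A' (convention assumed from the example)."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", display_name.strip())
    return cleaned.strip("_").upper()


def build_classification_interface(class_names, input_type="radio", job_name="JOB_0"):
    """Build a single-job classification interface for the given class names."""
    class_names = list(class_names)
    categories = {category_key(name): {"name": name} for name in class_names}
    if len(categories) != len(class_names):
        raise ValueError("Two class names map to the same category key")
    return {
        "jobRendererWidth": 0.17,
        "jobs": {
            job_name: {
                "mlTask": "CLASSIFICATION",
                "required": 1,
                "content": {"categories": categories, "input": input_type},
            }
        },
    }


generated = build_classification_interface(["Object A", "Object B"])
print(generated["jobs"]["JOB_0"]["content"]["categories"])
```

The resulting dictionary can be passed as `json_interface` to `create_project` in the same way as the hand-written one.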
</details>

## Access your project

This creates a project with a simple interface, a radio button, and two categories: `Object A` and `Object B`.
Once logged in, you can see your project in the list of projects:

![](../img/project_in_list.png)

Click on it to open the project overview:

![](../img/project_overview.png)

If you want to modify or view the interface, go to the Settings tab. First, click on the Settings button in the sidebar:

<img src="../img/sidebar_settings.png" width=100/>

You can find both the form and the JSON versions of the interface:

![](../img/project_settings.png)

[Find out how to modify the interface dynamically.](https://cloud.kili-technology.com/docs/projects/customize-interface/#docsNav)

If you want to go back to the list of projects, either click on `Kili Technology` in the top bar, or click on the list of projects in the sidebar:

<img src="../img/sidebar_listprojects.png" width=100>

<details class="mydetails">
<summary style="display: list-item;"> Follow these instructions to find the project identifier from the API. </summary>

When you run the [command](#command) to create a project, it outputs a unique identifier of the project. This identifier is used to recognize, access, and modify the project from the API.

```python
kili.create_project(
    title='Project Title',
    description='Project Description',
    input_type='IMAGE',
    json_interface=interface
)
```

Example of such an output:

```python
{'id': 'ckkpj7stx1bxc0jvk1gn9cu5v'}
```

Another way to get this project identifier is to look at the URL you're in:

![](../img/url_project.png)
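
If you copy the URL from your browser, the identifier can be extracted programmatically. This is only a sketch: it assumes that the URL path contains a `projects/<identifier>` segment, and both the example URL and the helper `project_id_from_url` are hypothetical.

```python
from urllib.parse import urlparse


def project_id_from_url(url: str) -> str:
    """Return the path segment that follows 'projects' in the URL."""
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    try:
        return segments[segments.index("projects") + 1]
    except (ValueError, IndexError):
        raise ValueError(f"No project identifier found in {url!r}") from None


# Hypothetical URL; only the '/projects/<id>' part matters for this sketch.
example_url = (
    "https://cloud.kili-technology.com/label/projects/"
    "ckkpj7stx1bxc0jvk1gn9cu5v/menu/overview"
)
print(project_id_from_url(example_url))
```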
</details>

# Importing data <a id='data'></a>

The next step is to import data.

You can import data either [from the interface](https://cloud.kili-technology.com/docs/data-ingestion/data-ingestion-made-easy/) or [from the API](https://cloud.kili-technology.com/docs/python-graphql-api/recipes/import_assets/#kili-tutorial-importing-assets). 


To import data from the interface, go to the `Dataset` tab in your project, then click on `Add New`. There you'll have two tabs. From the first tab, called `Upload Local Data`, you can select files from your local computer to upload. From the second tab, called `Connect Cloud Data`, you provide a `.csv` file containing the URLs to your data stored in the cloud. These steps are shown below:

![](../img/import_assets.gif)

<details>
<summary style="display: list-item;"> Follow these instructions to import data from the API </summary>

Next, simply call [this function](https://cloud.kili-technology.com/docs/python-graphql-api/python-api/#append_many_to_dataset):

```python
kili.append_many_to_dataset(
    project_id="ckkpj7stx1bxc0jvk1gn9cu5v", 
    content_array=["path-to-local-image OR url-to-image"],
    external_id_array=["your-identifier-of-the-image"]
)
```


```python
# Example

project_id = result['id']

kili.append_many_to_dataset(
    project_id=project_id, 
    content_array=["https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/car_2.jpg"],
    external_id_array=["car_2.jpg"]
)
```
    
```python
Out: {'id': 'ckm4pmqmk0000d49k6ewu2um5'}
```
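
When you import many images by URL, it can be convenient to derive the external identifiers from the file names instead of typing them. The sketch below assumes that the last path segment of each URL is a suitable identifier; `external_id_from_url` is a hypothetical helper, and the duplicate check is a precaution rather than a documented requirement.

```python
import posixpath
from urllib.parse import unquote, urlparse


def external_id_from_url(url: str) -> str:
    """Use the last path segment of the URL as the external identifier."""
    name = posixpath.basename(urlparse(url).path)
    if not name:
        raise ValueError(f"Cannot derive a file name from {url!r}")
    return unquote(name)


urls = [
    "https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/car_2.jpg",
]
content_array = list(urls)
external_id_array = [external_id_from_url(url) for url in urls]

# Precaution: identical file names would make assets hard to tell apart.
if len(set(external_id_array)) != len(external_id_array):
    raise ValueError("Duplicate external identifiers")

print(external_id_array)
# The two lists can then be passed to the call shown above:
# kili.append_many_to_dataset(
#     project_id=project_id,
#     content_array=content_array,
#     external_id_array=external_id_array,
# )
```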
</details>

# Labeling <a id='labeling'></a>

When you create a project, you automatically become an admin of the project. This lets you start labeling right away. If you want to add members to the project, follow [these instructions](https://cloud.kili-technology.com/docs/projects/settings/#manage-project-members).

## Label a specific asset

To annotate a specific asset, you can go to the Dataset tab (in the side panel):

<img src="../img/sidebar_dataset.png" width=100>

![](../img/project_dataset.png)

In the table of assets, simply click on the row of the asset (here, an image) you want to annotate.

## Label the first asset in the queue

Otherwise, you can start to annotate right away by selecting `Start Labeling`.

## How to label?

You arrive on the asset `car_2.jpg`: 

![](../img/asset_viewer.png)

Select the category you want by clicking on the corresponding radio button, or by pressing the key underlined in the class name: "o" for Object A and "b" for Object B.

Then, click on `Submit` to send the label.

If you want to select more than one class per asset (multiclass classification), use a checkbox as the input: create a new classification job, then select the checkbox as its input.

![](../img/input_choices.png)

If you have long lists of classes, we recommend using the dropdown inputs (either single choice or multiple choice) whether you use multiclass classification or not. For more information on classification, [click here](https://cloud.kili-technology.com/docs/interfaces-image/classification/).
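
The choice of input described above can be summarized in a small helper. This is a sketch under assumptions: the values `"checkbox"`, `"singleDropdown"` and `"multipleDropdown"` are assumed to be the interface names of the inputs mentioned in the text (only `"radio"` appears in this tutorial), and the threshold of 10 classes for a "long list" is arbitrary.

```python
def choose_input_type(n_classes: int, multiple: bool, dropdown_threshold: int = 10) -> str:
    """Pick an input widget from the number of classes and the selection mode."""
    if n_classes < 1:
        raise ValueError("At least one class is required")
    if n_classes >= dropdown_threshold:
        # Long lists of classes: prefer a dropdown (assumed value names).
        return "multipleDropdown" if multiple else "singleDropdown"
    # Short lists: radio for one class, checkbox for several.
    return "checkbox" if multiple else "radio"


print(choose_input_type(2, multiple=False))
print(choose_input_type(2, multiple=True))
print(choose_input_type(50, multiple=False))
```

The returned string would go in the `"input"` field of the job in the JSON interface.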

<details>
<summary style="display: list-item;"> Follow these instructions to add a label from the API </summary>

For this, you need to know the identifier of the asset (image) — either from the URL when you are on an asset:

![](https://raw.githubusercontent.com/kili-technology/kili-playground/master/recipes/img/asset_id_url.png)

or, from the API, retrieving the assets of the project:


```python
assets = kili.assets(
    project_id=project_id,
    fields=['id']
)
asset_id = assets[0]['id']
print(asset_id)
```

    100%|██████████| 1/1 [00:00<00:00, 27.40it/s]

    ckm4pmuy30006d49kh0q64i0g


    



```python
kili.append_to_labels(
    json_response={'JOB_0': {'categories': [{'name': 'OBJECT_A'}]}},
    label_asset_id=asset_id,
    project_id=project_id
)
```




    {'id': 'ckm4pmzlj0009d49k1avaeubv'}
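
A typo in a category name is easy to make when the `json_response` is written by hand. The sketch below builds the response from the interface and rejects unknown categories before anything is sent. The helper `make_classification_response` is hypothetical, and the rule that a radio job takes exactly one category is an assumption based on the single-class setup of this tutorial.

```python
def make_classification_response(interface: dict, job_name: str, category_keys: list) -> dict:
    """Build a json_response for one classification job after validating it."""
    job = interface["jobs"][job_name]
    known = job["content"]["categories"]
    unknown = [key for key in category_keys if key not in known]
    if unknown:
        raise ValueError(f"Unknown categories for {job_name}: {unknown}")
    if job["content"].get("input") == "radio" and len(category_keys) != 1:
        raise ValueError("A radio job expects exactly one category")
    return {job_name: {"categories": [{"name": key} for key in category_keys]}}


# Minimal copy of the interface used in this tutorial.
demo_interface = {
    "jobs": {
        "JOB_0": {
            "mlTask": "CLASSIFICATION",
            "content": {
                "categories": {"OBJECT_A": {"name": "Object A"}, "OBJECT_B": {"name": "Object B"}},
                "input": "radio",
            },
        }
    }
}

json_response = make_classification_response(demo_interface, "JOB_0", ["OBJECT_A"])
print(json_response)
```

The result has the same shape as the `json_response` argument passed to `append_to_labels` above.
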
</details>

# Exporting labels <a id='export'></a>

## Through the interface

In the Dataset tab, you can export your labels. 

![](../img/dataset_labeled.png)

1. Choose your format and click on download. An asynchronous job is triggered, preparing your data. 
2. Next you will see a notification. Click on it, and click on the download button to download your data.

Notification appears | Notification list
:--:|:--:
![](../img/notification_appears.png) | <img src="../img/notification_opened.png" width=400>

If you choose Kili's classic API format, you get this file:

```json
[
  {
    "content": "https://cloud.kili-technology.com/api/label/v2/files?id=f436f198-cede-4380-a119-f5d827f8a8fa",
    "externalId": "car_2.jpg",
    "id": "ckm0ligy900uuc49k1idydxsk",
    "jsonMetadata": {},
    "labels": [
      {
        "author": {
          "email": "email of the author of the label",
          "id": "id of the author of the label"
        },
        "createdAt": "2021-03-08T14:32:09.063Z",
        "isLatestLabelForUser": true,
        "jsonResponse": {
          "JOB_0": { "categories": [{ "confidence": 100, "name": "OBJECT_A" }] }
        },
        "labelType": "DEFAULT",
        "modelName": null,
        "skipped": false
      }
    ]
  }
]
```

[For details on the data export, click here](https://cloud.kili-technology.com/docs/data-export/data-export/#docsNav)
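
Once downloaded, the exported file can be reduced to a mapping from each asset to its assigned classes. The sketch below works on a shortened copy of the example above; `categories_by_asset` is a hypothetical helper, and keeping only labels that are not skipped and are flagged `isLatestLabelForUser` is a design choice of this sketch.

```python
import json

# Shortened copy of the exported example; in practice, load your own file,
# for instance with json.load(open("export.json")) (hypothetical file name).
export_text = """
[
  {
    "externalId": "car_2.jpg",
    "labels": [
      {
        "isLatestLabelForUser": true,
        "jsonResponse": {
          "JOB_0": {"categories": [{"confidence": 100, "name": "OBJECT_A"}]}
        },
        "labelType": "DEFAULT",
        "skipped": false
      }
    ]
  }
]
"""


def categories_by_asset(assets: list, job_name: str = "JOB_0") -> dict:
    """Map each asset's externalId to the category names of its kept labels."""
    result = {}
    for asset in assets:
        names = []
        for label in asset.get("labels", []):
            if label.get("skipped") or not label.get("isLatestLabelForUser", True):
                continue
            job = label.get("jsonResponse", {}).get(job_name, {})
            names.extend(category["name"] for category in job.get("categories", []))
        result[asset["externalId"]] = names
    return result


assets = json.loads(export_text)
print(categories_by_asset(assets))
```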

<details>
<summary style="display: list-item;"> Follow these instructions to export labels from the API </summary>

```python
labels = kili.labels(
    project_id=project_id
)

def hide_sensitive(label):
    label['author'] = {
        'email': 'email of the author of the label',
        'id': 'identifier of the author of the label',
    }
    return label

result_hidden = [hide_sensitive(label) for label in labels]
result_hidden
```




    [{'author': {'email': 'email of the author of the label',
       'id': 'identifier of the author of the label'},
      'id': 'ckm4pmzlj0009d49k1avaeubv',
      'jsonResponse': {'JOB_0': {'categories': [{'name': 'OBJECT_A'}]}},
      'labelType': 'DEFAULT',
      'secondsToLabel': 0,
      'skipped': False}]



Our API uses GraphQL, so you can choose the fields you want to fetch by passing a list:


```python
labels = kili.labels(
    project_id=project_id,
    fields=['id', 'createdAt', 'labelOf.externalId']
)
assert len(labels) > 0
labels
```




    [{'labelOf': {'externalId': 'car_2.jpg'},
      'id': 'ckm4pmzlj0009d49k1avaeubv',
      'createdAt': '2021-03-11T10:10:20.984Z'}]
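
As the output above shows, a dotted field such as `labelOf.externalId` comes back as a nested dictionary. The following sketch reads such fields back using the same dotted notation; `get_field` is a hypothetical helper, and the sample record is copied from the output above.

```python
from functools import reduce


def get_field(record: dict, dotted_path: str):
    """Follow a path such as 'labelOf.externalId' through nested dictionaries."""
    return reduce(lambda node, key: node[key], dotted_path.split("."), record)


fields = ["id", "createdAt", "labelOf.externalId"]
sample_labels = [
    {
        "labelOf": {"externalId": "car_2.jpg"},
        "id": "ckm4pmzlj0009d49k1avaeubv",
        "createdAt": "2021-03-11T10:10:20.984Z",
    }
]

# One flat row per label, with columns in the same order as `fields`.
rows = [[get_field(label, field) for field in fields] for label in sample_labels]
print(rows)
```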



Of course, many more options and filters are available:


```python
help(kili.labels)
```

    Help on method labels in module kili.queries.label:
    
    labels(asset_id: str = None, asset_status_in: List[str] = None, asset_external_id_in: List[str] = None, author_in: List[str] = None, created_at: str = None, created_at_gte: str = None, created_at_lte: str = None, fields: list = ['author.email', 'author.id', 'id', 'jsonResponse', 'labelType', 'secondsToLabel', 'skipped'], first: int = None, honeypot_mark_gte: float = None, honeypot_mark_lte: float = None, id_contains: List[str] = None, json_response_contains: List[str] = None, label_id: str = None, project_id: str = None, skip: int = 0, skipped: bool = None, type_in: List[str] = None, user_id: str = None) method of kili.playground.Playground instance
        Get an array of labels from a project given a set of criteria
        
        Parameters
        ----------
        - asset_id : str, optional (default = None)
            Identifier of the asset.
        - asset_status_in : list of str, optional (default = None)
            Returned labels should have a status that belongs to that list, if given.
            Possible choices : {'TODO', 'ONGOING', 'LABELED', 'REVIEWED'}
        - asset_external_id_in : list of str, optional (default = None)
            Returned labels should have an external id that belongs to that list, if given.
        - author_in : list of str, optional (default = None)
            Returned labels should have a label whose status belongs to that list, if given.
        - created_at : string, optional (default = None)
            Returned labels should have a label whose creation date is equal to this date.
            Formatted string should have format : "YYYY-MM-DD"
        - created_at_gt : string, optional (default = None)
            Returned labels should have a label whose creation date is greater than this date.
            Formatted string should have format : "YYYY-MM-DD"
        - created_at_lt : string, optional (default = None)
            Returned labels should have a label whose creation date is lower than this date.
            Formatted string should have format : "YYYY-MM-DD"
        - fields : list of string, optional (default = ['author.email', 'author.id', 'id', 'jsonResponse', 'labelType', 'secondsToLabel', 'skipped'])
            All the fields to request among the possible fields for the labels.
            See [the documentation](https://cloud.kili-technology.com/docs/python-graphql-api/graphql-api/#label) for all possible fields.
        - first : int, optional (default = None)
            Maximum number of labels to return.  Can only be between 0 and 100.
        - honeypot_mark_gt : float, optional (default = None)
            Returned labels should have a label whose honeypot is greater than this number.
        - honeypot_mark_lt : float, optional (default = None)
            Returned labels should have a label whose honeypot is lower than this number.
        - id_contains : list of str, optional (default = None)
            Filters out labels not belonging to that list. If empty, no filtering is applied.
        - json_response_contains : list of str, optional (default = None)
            Returned labels should have a substring of the jsonResponse that belongs to that list, if given.
        - label_id : str
            Identifier of the label.
        - project_id : str
            Identifier of the project.
        - skip : int, optional (default = None)
            Number of labels to skip (they are ordered by their date of creation, first to last).
        - skipped : bool, optional (default = None)
            Returned labels should have a label which is skipped
        - type_in : list of str, optional (default = None)
            Returned labels should have a label whose type belongs to that list, if given.
        - user_id : str
            Identifier of the user.
        
        
        Returns
        -------
        - a result object which contains the query if it was successful, or an error message else.
        
        Examples
        -------
        >>> # List all labels of a project and their assets external ID
        >>> playground.labels(project_id=project_id, fields=['jsonResponse', 'labelOf.externalId'])
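
The help text above states that `first` can only be between 0 and 100, so fetching a large project may require several calls that combine `first` and `skip`. Below is a minimal sketch of such a loop. `fetch_all` is a hypothetical helper, and the demonstration uses a fake in-memory fetcher so that it runs without a Kili connection.

```python
def fetch_all(fetch_page, page_size: int = 100) -> list:
    """Call fetch_page(first=..., skip=...) until a short page is returned."""
    items = []
    skip = 0
    while True:
        page = fetch_page(first=page_size, skip=skip)
        items.extend(page)
        if len(page) < page_size:
            return items
        skip += page_size


# Fake data source standing in for the API in this sketch.
fake_labels = [{"id": f"label-{index}"} for index in range(250)]


def fake_fetch(first: int, skip: int) -> list:
    return fake_labels[skip:skip + first]


all_labels = fetch_all(fake_fetch)
print(len(all_labels))
```

With the client, the fetcher could be `lambda first, skip: kili.labels(project_id=project_id, first=first, skip=skip)`, which uses only the parameters listed in the help text. Depending on the client version, pagination may already be handled internally, in which case this helper is unnecessary.
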
    
</details>

# Quality Management<a id='quality'></a>

To ensure your model performs well, it's essential that your annotations are of good quality. Kili offers two main ways to measure annotation quality: consensus and honeypot. Consensus measures the agreement between annotations from different annotators. Honeypot compares the annotations of your annotators to a gold standard that you provide beforehand.
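
To make the two notions concrete, here is a simplified sketch for single-class classification. It is not the formula Kili uses: the functions `pairwise_agreement` and `honeypot_accuracy`, the annotator names, and the sample data are all hypothetical.

```python
from itertools import combinations


def pairwise_agreement(category_by_annotator: dict) -> float:
    """Fraction of annotator pairs that chose the same category for one asset."""
    pairs = list(combinations(category_by_annotator.values(), 2))
    if not pairs:
        raise ValueError("At least two annotations are needed")
    return sum(first == second for first, second in pairs) / len(pairs)


def honeypot_accuracy(annotations: dict, gold: dict) -> float:
    """Share of gold-standard assets on which the annotator matches the gold label."""
    scored = [asset for asset in gold if asset in annotations]
    if not scored:
        raise ValueError("The annotator labeled no gold-standard asset")
    return sum(annotations[asset] == gold[asset] for asset in scored) / len(scored)


# Consensus: three annotators labeled the same asset.
consensus = pairwise_agreement({"alice": "OBJECT_A", "bob": "OBJECT_A", "carol": "OBJECT_B"})

# Honeypot: one annotator compared with two gold-standard assets.
gold = {"img_1.jpg": "OBJECT_A", "img_2.jpg": "OBJECT_B"}
annotations = {"img_1.jpg": "OBJECT_A", "img_2.jpg": "OBJECT_A", "img_3.jpg": "OBJECT_B"}
honeypot = honeypot_accuracy(annotations, gold)

print(round(consensus, 2), honeypot)
```

In this toy example, only one of the three annotator pairs agrees, and the annotator matches the gold standard on one of the two gold assets.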

To access the quality management tab, go to “Settings” (gear icon), then “Quality Management”, as shown below:

![](../img/access_quality_management.png)

See more detailed information on each area:
- [Quality management](https://cloud.kili-technology.com/docs/quality/quality-management/#docsNav)
- Setting up quality metrics: [Consensus](https://cloud.kili-technology.com/docs/quality/consensus/#docsNav) and [Honeypot](https://cloud.kili-technology.com/docs/quality/honeypot/)

# More advanced concepts <a id='concepts'></a>

The following are more advanced features:

- [Importing predictions](https://cloud.kili-technology.com/docs/python-graphql-api/recipes/import_predictions/#docsNav)
- [Reviewing the labels](https://cloud.kili-technology.com/docs/quality/review-process/#docsNav)
- [Issue/Question system](https://cloud.kili-technology.com/docs/quality/question-issue/#docsNav)

[The full API definition can be found here](https://cloud.kili-technology.com/docs/python-graphql-api/python-api/#docsNav)