<a href="https://colab.research.google.com/github/kili-technology/kili-python-sdk/blob/master/recipes/label_parsing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# How to use the label parser

In [None]:
%pip install kili

In [None]:
from kili.client import Kili
from kili.utils.labels.parsing import ParsedLabel

In [None]:
kili = Kili()

## `.jobs` attribute

The `.jobs` attribute of a `ParsedLabel` object is a dictionary-like object that contains the parsed labels. Its keys are the job names, and its values are the parsed job responses.

Let's create a simple Kili project to illustrate this.

We define a JSON interface for a single-class classification job named `CLASSIFICATION_JOB`, with three categories `A`, `B` and `C`:

In [None]:
json_interface = {
    "jobs": {
        "CLASSIFICATION_JOB": {
            "content": {
                "categories": {
                    "A": {"children": [], "name": "A"},
                    "B": {"children": [], "name": "B"},
                    "C": {"children": [], "name": "C"},
                },
                "input": "radio",
            },
            "instruction": "Class",
            "mlTask": "CLASSIFICATION",
            "required": 1,
            "isChild": False,
        }
    }
}
project_id = kili.create_project(
    input_type="TEXT", json_interface=json_interface, title="Label parsing tutorial"
)["id"]
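
Because the JSON interface is a plain Python dict, it can be inspected before the project is created. The following is a minimal sketch that lists the categories declared for a job. The helper `category_names` is hypothetical and is not part of the Kili SDK; it only assumes the dict layout shown above.

```python
# Hypothetical helper, not part of the Kili SDK: it reads the category
# names directly from a JSON interface dict shaped like the one above.
def category_names(json_interface: dict, job_name: str) -> list:
    job = json_interface["jobs"][job_name]
    return list(job["content"]["categories"].keys())


json_interface = {
    "jobs": {
        "CLASSIFICATION_JOB": {
            "content": {
                "categories": {
                    "A": {"children": [], "name": "A"},
                    "B": {"children": [], "name": "B"},
                    "C": {"children": [], "name": "C"},
                },
                "input": "radio",
            },
            "instruction": "Class",
            "mlTask": "CLASSIFICATION",
            "required": 1,
            "isChild": False,
        }
    }
}

print(category_names(json_interface, "CLASSIFICATION_JOB"))  # ['A', 'B', 'C']
```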

We also upload some assets to the project:

In [None]:
kili.append_many_to_dataset(
    project_id,
    content_array=["text1", "text2", "text3"],
    external_id_array=["asset1", "asset2", "asset3"],
)

100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:01<00:00,  1.59it/s]


{'id': 'clh0ghbb80xff0j668f3i15lp'}

Once the assets are uploaded, we could label them manually through the Kili UI.

For this tutorial, we will simply upload existing labels instead.

In [None]:
labels_to_upload = [
    {"CLASSIFICATION_JOB": {"categories": [{"confidence": 75, "name": "A"}]}},
    {"CLASSIFICATION_JOB": {"categories": [{"confidence": 50, "name": "B"}]}},
    {"CLASSIFICATION_JOB": {"categories": [{"confidence": 25, "name": "C"}]}},
]
kili.append_labels(
    json_response_array=labels_to_upload,
    project_id=project_id,
    asset_external_id_array=["asset1", "asset2", "asset3"],
)

100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00,  4.21it/s]


[{'id': 'clh0gheax0xgs0j66bj9y58lo'},
 {'id': 'clh0gheax0xgt0j66eohy17op'},
 {'id': 'clh0gheax0xgu0j66awoz8sfn'}]
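
When labels come from a model rather than being written by hand, the same list can be built programmatically. This is a minimal sketch: the helper `classification_response` and the `predictions` list are illustrative names, not part of the Kili SDK, and the output shape simply mirrors the `labels_to_upload` entries above.

```python
# Hypothetical helper: builds one JSON response with the same shape as
# the hand-written entries of labels_to_upload.
def classification_response(job_name: str, category: str, confidence: int) -> dict:
    return {job_name: {"categories": [{"confidence": confidence, "name": category}]}}


# Illustrative (category, confidence) pairs, one per asset.
predictions = [("A", 75), ("B", 50), ("C", 25)]

labels_to_upload = [
    classification_response("CLASSIFICATION_JOB", name, confidence)
    for name, confidence in predictions
]

print(labels_to_upload[0])
```

The resulting list could then be passed as `json_response_array` to `kili.append_labels`, as in the cell above.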

When querying labels using `kili.labels()`, it is possible to automatically parse the labels using the `output_format` argument:

In [None]:
labels = kili.labels(
    project_id, output_format="parsed_label"
)  # labels is a list of ParsedLabel objects

100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00,  4.60it/s]

In [None]:
print(len(labels))

3


In [None]:
print(type(labels[0]))

<class 'kili.utils.labels.parsing.ParsedLabel'>


Using the `.jobs` attribute with the job name, one can access the label's data:

In [None]:
print(labels[0].jobs["CLASSIFICATION_JOB"])

{'categories': [{'name': 'A', 'confidence': 75}]}


In [None]:
print(labels[0].jobs["CLASSIFICATION_JOB"].categories)

[{'name': 'A', 'confidence': 75}]


In [None]:
print(labels[0].jobs["CLASSIFICATION_JOB"].categories[0].name)

A
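
For comparison, reading the same value from an unparsed JSON response means chaining dictionary keys and list indices. The sketch below uses a hand-written dict with the same shape as the first response uploaded earlier; it does not call the Kili API.

```python
# Hand-written stand-in for the raw JSON response of the first label.
raw_json_response = {
    "CLASSIFICATION_JOB": {"categories": [{"confidence": 75, "name": "A"}]}
}

# Plain dict access: every level is a string key or a list index.
name = raw_json_response["CLASSIFICATION_JOB"]["categories"][0]["name"]
print(name)  # A
```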


Since `CLASSIFICATION_JOB` is a single-category classification job, the `.category` attribute is available, and is an alias for `.categories[0]`:

In [None]:
print(labels[0].jobs["CLASSIFICATION_JOB"].category.name)
print(labels[0].jobs["CLASSIFICATION_JOB"].category.confidence)

A
75
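
To make the alias idea concrete, here is a toy sketch of how a `.category` shortcut to `.categories[0]` could be written with a Python property. The classes `Category` and `SingleCategoryJob` are hypothetical and are not Kili's implementation; raising an error when there is not exactly one category is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Category:
    name: str
    confidence: int


class SingleCategoryJob:
    """Toy stand-in for a parsed single-category job (not Kili's class)."""

    def __init__(self, categories: List[Category]):
        self.categories = categories

    @property
    def category(self) -> Category:
        # The alias only makes sense when exactly one category is present.
        if len(self.categories) != 1:
            raise ValueError("Expected exactly one category.")
        return self.categories[0]


job = SingleCategoryJob([Category(name="A", confidence=75)])
print(job.category.name)  # A
print(job.category.confidence)  # 75
```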


## Convert to Python dict

A `ParsedLabel` is a custom class and is not serializable by default. However, it is possible to convert it to a Python dict using the `to_dict` method:

In [None]:
label = labels[0]  # any ParsedLabel returned by kili.labels()
label_as_dict = label.to_dict()
print(type(label_as_dict))
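
Once a label is a plain dict, it can be handed to the standard `json` module. The sketch below uses a hand-written dict as a stand-in for the output of `to_dict()`. The `jsonResponse` key is an assumption here; the keys actually returned depend on the fields queried from Kili.

```python
import json

# Hand-written stand-in for the dict returned by to_dict()
# (the "jsonResponse" key is an assumption of this sketch).
label_as_dict = {
    "jsonResponse": {
        "CLASSIFICATION_JOB": {"categories": [{"name": "A", "confidence": 75}]}
    }
}

serialized = json.dumps(label_as_dict)  # dict -> JSON string
restored = json.loads(serialized)  # JSON string -> dict

print(restored == label_as_dict)  # True
```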

## Classification job

## Object detection job

## Transcription job

## Video job

## Named entities recognition job

## Named entities recognition in PDF job

## Relation job

### Named entities relation job

### Object detection relation job

## Pose estimation job

## Children jobs