<td>
   <a target="_blank" href="https://labelbox.com" ><img src="https://labelbox.com/blog/content/images/2021/02/logo-v4.svg" width=256/></a>
</td>

<td>
<a href="https://colab.research.google.com/github/Labelbox/labelbox-python/blob/develop/examples/model_assisted_labeling/image_mal.ipynb" target="_blank"><img
src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
</td>

<td>
<a href="https://github.com/Labelbox/labelbox-python/tree/develop/examples/model_assisted_labeling/image_mal.ipynb" target="_blank"><img
src="https://img.shields.io/badge/GitHub-100000?logo=github&logoColor=white" alt="GitHub"></a>
</td>

# Text Annotation Import
* This notebook provides examples of each supported annotation type for text assets. It covers the following:
    * Model-assisted labeling: used to provide pre-annotated data for your labelers, which reduces the total time needed to label your assets. Model-assisted labeling does not submit labels automatically; a labeler must review the pre-annotations before submitting them.
    * Label import: used to provide ground truth labels. These can be compared against prediction labels or used as benchmarks to see how your labelers are doing.

* For information on what types of annotations are supported per data type, refer to this documentation:
    * https://docs.labelbox.com/docs/model-assisted-labeling#option-1-import-via-python-annotation-types-recommended

* Notes:
    * Wait until the import job is complete before opening the Editor to make sure all annotations are imported properly.

# Installs

In [None]:
!pip install -q 'labelbox[data]'

# Imports

In [2]:
from labelbox.schema.ontology import OntologyBuilder, Tool, Classification, Option
from labelbox import Client, LabelingFrontend, LabelImport, MALPredictionImport
from labelbox.data.annotation_types import (
    Label, TextData, Checklist, Radio, ObjectAnnotation, TextEntity,
    ClassificationAnnotation, ClassificationAnswer
)
from labelbox.data.serialization import NDJsonConverter
import uuid
import json
import numpy as np

# API Key and Client
Provide a valid API key below to connect to the Labelbox client.

In [3]:
# Add your api key
API_KEY = None
client = Client(api_key=API_KEY)

INFO:labelbox.client:Initializing Labelbox client at 'https://api.labelbox.com/graphql'


---- 
### Steps
1. Make sure the project is set up
2. Collect annotations
3. Upload

### Project setup

We will create two projects: one for model-assisted labeling and one for label imports.

In [4]:
ontology_builder = OntologyBuilder(
    tools=[
        Tool(tool=Tool.Type.NER, name="named_entity")
        ],
    classifications=[
        Classification(class_type=Classification.Type.CHECKLIST, instructions="checklist", options=[
            Option(value="first_checklist_answer"),
            Option(value="second_checklist_answer")            
        ]),
        Classification(class_type=Classification.Type.RADIO, instructions="radio", options=[
            Option(value="first_radio_answer"),
            Option(value="second_radio_answer")
        ])])

In [5]:
mal_project = client.create_project(name="text_mal_project")
li_project = client.create_project(name="text_label_import_project")


dataset = client.create_dataset(name="text_annotation_import_demo_dataset")
test_txt_url = "https://storage.googleapis.com/labelbox-sample-datasets/nlp/lorem-ipsum.txt"
data_row = dataset.create_data_row(row_data=test_txt_url)
editor = next(client.get_labeling_frontends(where=LabelingFrontend.name == "Editor"))

mal_project.setup(editor, ontology_builder.asdict())
mal_project.datasets.connect(dataset)

li_project.setup(editor, ontology_builder.asdict())
li_project.datasets.connect(dataset)

### Create Label using Annotation Type Objects
* It is recommended to use the Python SDK's annotation types for importing into Labelbox.

### Object Annotations

In [6]:
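def create_objects():
  # start and end are character offsets of the entity within the text asset
  named_entity = TextEntity(start=10, end=20)
  # name must match the NER tool name defined in the ontology
  named_entity_annotation = ObjectAnnotation(value=named_entity, name="named_entity")
  return named_entity_annotation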
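
To see which characters a `start`/`end` pair covers, a small standard-library helper can be run against the raw text before building the `TextEntity`. This is only a sketch: the helper name `entity_text` and the sample string are hypothetical, and treating `end` as inclusive is an assumption that should be checked against the Labelbox documentation (the `inclusive_end` flag lets you try both readings).

```python
def entity_text(text: str, start: int, end: int, inclusive_end: bool = True) -> str:
    """Return the substring covered by a character span.

    inclusive_end=True is an assumption about how the end offset is
    interpreted; set it to False to treat end as exclusive.
    """
    if start < 0 or end < start:
        raise ValueError(f"invalid span: start={start}, end={end}")
    stop = end + 1 if inclusive_end else end
    if stop > len(text):
        raise ValueError(f"span ends at {stop}, but text has {len(text)} characters")
    return text[start:stop]


# Stand-in text, not the contents of the sample file used in this notebook
sample = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
print(repr(entity_text(sample, 10, 20)))
print(repr(entity_text(sample, 10, 20, inclusive_end=False)))
```

Printing the span before upload makes off-by-one mistakes visible without opening the Editor.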

### Classification Annotations

In [7]:
def create_classifications():
  # Answer names must match the option values defined in the ontology
  checklist = Checklist(answer=[
      ClassificationAnswer(name="first_checklist_answer"),
      ClassificationAnswer(name="second_checklist_answer"),
  ])
  checklist_annotation = ClassificationAnnotation(value=checklist, name="checklist")
  radio = Radio(answer=ClassificationAnswer(name="second_radio_answer"))
  radio_annotation = ClassificationAnnotation(value=radio, name="radio")
  return checklist_annotation, radio_annotation

### Create a Label object with all of our annotations

In [8]:
image_data = TextData(uid=data_row.uid)

named_enity_annotation = create_objects()
checklist_annotation, radio_annotation = create_classifications()

label = Label(
    data=image_data,
    annotations = [
        named_enity_annotation, checklist_annotation, radio_annotation
    ]
)

label.__dict__

{'annotations': [ObjectAnnotation(name='named_entity', feature_schema_id=None, extra={}, value=TextEntity(start=10, end=20, extra={}), classifications=[]),
  ClassificationAnnotation(name='checklist', feature_schema_id=None, extra={}, value=Checklist(name='checklist', answer=[ClassificationAnswer(name='first_checklist_answer', feature_schema_id=None, extra={}, keyframe=None), ClassificationAnswer(name='second_checklist_answer', feature_schema_id=None, extra={}, keyframe=None)])),
  ClassificationAnnotation(name='radio', feature_schema_id=None, extra={}, value=Radio(answer=ClassificationAnswer(name='second_radio_answer', feature_schema_id=None, extra={}, keyframe=None)))],
 'data': TextData(file_path=None,text=None,url=None),
 'extra': {},
 'uid': None}

### Model Assisted Labeling 

To do model-assisted labeling, we need to convert a Label object into NDJSON.

This is easily done using the `NDJsonConverter` class.

We will create a Label called `mal_label`, which has the same structure as the label above.

Notes:
* Each annotation in the label requires a valid feature schema ID. We assign these with the built-in `assign_feature_schema_ids` method.
* `NDJsonConverter.serialize` takes a list of labels.
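
Conceptually, assigning feature schema IDs means looking up each annotation name in the project's ontology. The sketch below illustrates that idea on a plain dictionary shaped like a normalized ontology (`name`, `featureSchemaId`, and `options` with `value`). It is not the SDK's implementation: the function `build_schema_lookup`, the `classification/option` key format, and every ID are hypothetical placeholders.

```python
from typing import Any, Dict


def build_schema_lookup(normalized: Dict[str, Any]) -> Dict[str, str]:
    """Map tool names, classification names, and 'classification/option'
    paths to their feature schema IDs."""
    lookup: Dict[str, str] = {}
    for tool in normalized.get("tools", []):
        lookup[tool["name"]] = tool["featureSchemaId"]
    for classification in normalized.get("classifications", []):
        lookup[classification["name"]] = classification["featureSchemaId"]
        for option in classification.get("options", []):
            path = f'{classification["name"]}/{option["value"]}'
            lookup[path] = option["featureSchemaId"]
    return lookup


# Hypothetical ontology with placeholder IDs (real IDs come from the project)
example_ontology = {
    "tools": [{"name": "named_entity", "featureSchemaId": "TOOL_ID"}],
    "classifications": [
        {
            "name": "radio",
            "featureSchemaId": "RADIO_ID",
            "options": [
                {"value": "first_radio_answer", "featureSchemaId": "RADIO_OPT_1"},
                {"value": "second_radio_answer", "featureSchemaId": "RADIO_OPT_2"},
            ],
        }
    ],
}

schema_lookup = build_schema_lookup(example_ontology)
print(schema_lookup["radio/second_radio_answer"])
```

Because the IDs come from a specific project's ontology, a lookup built for one project does not apply to another.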

In [9]:
mal_label = Label(
    data=image_data,
    annotations = [
        named_enity_annotation, checklist_annotation, radio_annotation
    ]
)

mal_label.assign_feature_schema_ids(ontology_builder.from_project(mal_project))

ndjson_labels = list(NDJsonConverter.serialize([mal_label]))

ndjson_labels

[{'classifications': [],
  'dataRow': {'id': 'cl084bkqg6x46109uhkagffci'},
  'location': {'end': 20, 'start': 10},
  'schemaId': 'cl084bl2k6tlw10bb6crig15b',
  'uuid': 'bd6bbc3e-62c8-4d83-be10-f2d8b74dfb67'},
 {'answer': [{'schemaId': 'cl084bl2k6tlz10bb3o4y3x2t'},
   {'schemaId': 'cl084bl2k6tm110bb706v94qj'}],
  'dataRow': {'id': 'cl084bkqg6x46109uhkagffci'},
  'schemaId': 'cl084bl2k6tly10bb5zx90mil',
  'uuid': '921a710d-b497-47e2-adbe-a4eaa2d53771'},
 {'answer': {'schemaId': 'cl084bl2k6tm710bb5alkg5ss'},
  'dataRow': {'id': 'cl084bkqg6x46109uhkagffci'},
  'schemaId': 'cl084bl2k6tm410bb3y9kat3l',
  'uuid': 'b63d0c2c-5470-4245-8fe4-3f17d68c2f83'}]
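
The serialized output above is a list of plain dictionaries, one per annotation. As a rough illustration of that structure, the sketch below builds equivalent rows by hand and writes them as newline-delimited JSON using only the standard library. The helper names (`entity_row`, `checklist_row`, `radio_row`, `to_ndjson`) and all IDs are hypothetical; in the notebook itself, `NDJsonConverter` produces these rows for you.

```python
import json
import uuid
from typing import Any, Dict, List


def entity_row(data_row_id: str, schema_id: str, start: int, end: int) -> Dict[str, Any]:
    """One NER row, mirroring the first dict in the output above."""
    return {
        "uuid": str(uuid.uuid4()),
        "schemaId": schema_id,
        "dataRow": {"id": data_row_id},
        "location": {"start": start, "end": end},
        "classifications": [],
    }


def checklist_row(data_row_id: str, schema_id: str, answer_ids: List[str]) -> Dict[str, Any]:
    """A checklist row: the answer is a list of option schema IDs."""
    return {
        "uuid": str(uuid.uuid4()),
        "schemaId": schema_id,
        "dataRow": {"id": data_row_id},
        "answer": [{"schemaId": answer_id} for answer_id in answer_ids],
    }


def radio_row(data_row_id: str, schema_id: str, answer_id: str) -> Dict[str, Any]:
    """A radio row: the answer is a single option schema ID."""
    return {
        "uuid": str(uuid.uuid4()),
        "schemaId": schema_id,
        "dataRow": {"id": data_row_id},
        "answer": {"schemaId": answer_id},
    }


def to_ndjson(rows: List[Dict[str, Any]]) -> str:
    """Serialize rows as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(row) for row in rows)


# Placeholder IDs only; real values come from the data row and the ontology
rows = [
    entity_row("DATA_ROW_ID", "NER_TOOL_ID", 10, 20),
    checklist_row("DATA_ROW_ID", "CHECKLIST_ID", ["CHECKLIST_OPT_1", "CHECKLIST_OPT_2"]),
    radio_row("DATA_ROW_ID", "RADIO_ID", "RADIO_OPT_2"),
]
ndjson_text = to_ndjson(rows)
print(ndjson_text)
```

Note the difference between the two classification rows: a checklist carries a list of answers, while a radio carries exactly one.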

In [10]:
# Name the job after the import type so it is easy to tell apart from the label import job below
upload_job = MALPredictionImport.create_from_objects(
    client=client,
    project_id=mal_project.uid,
    name="upload_mal_import_job",
    predictions=ndjson_labels)

In [11]:
# Errors will appear for each annotation that failed to import.
# An empty list means there were no errors.
# The errors are reported only after upload_job has completed, so this cell does not need to be rerun.
print("Errors:", upload_job.errors)

INFO:labelbox.schema.annotation_import:Sleeping for 10 seconds...


Errors: []
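
If the error list is not empty, a quick local check of the payload can help narrow down the cause before re-uploading. The sketch below is a hypothetical helper, not part of the SDK. The keys it checks (`uuid`, `schemaId`, `dataRow.id`) are inferred from the serialized output shown earlier rather than from an official schema, and the requirement that each `uuid` be unique is an assumption.

```python
from typing import Any, Dict, List


def find_payload_problems(rows: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable problems found in a list of NDJSON rows."""
    problems: List[str] = []
    seen_uuids = set()
    for index, row in enumerate(rows):
        # Keys that every row in the output above carries
        for key in ("uuid", "schemaId", "dataRow"):
            if key not in row:
                problems.append(f"row {index}: missing '{key}'")
        data_row = row.get("dataRow")
        if isinstance(data_row, dict) and "id" not in data_row:
            problems.append(f"row {index}: 'dataRow' has no 'id'")
        row_uuid = row.get("uuid")
        if row_uuid is not None:
            if row_uuid in seen_uuids:
                problems.append(f"row {index}: duplicate uuid {row_uuid}")
            seen_uuids.add(row_uuid)
    return problems


# Placeholder rows for illustration
good_row = {"uuid": "a", "schemaId": "SCHEMA_ID", "dataRow": {"id": "DATA_ROW_ID"}}
bad_row = {"uuid": "a", "dataRow": {}}
for problem in find_payload_problems([good_row, bad_row]):
    print(problem)
```

In the notebook, the same function could be called on `ndjson_labels` before creating the import job.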


### Label Import

Label import is very similar to model-assisted labeling. Each project has its own feature schema IDs, so we need to re-assign them before continuing, but we can continue to use `NDJsonConverter`.

We will create a Label called `li_label`, which has the same structure as the label above.

In [12]:
# For the purpose of this notebook, we recreate the annotations to reset the schema IDs of our checklist and radio answers
image_data = TextData(uid=data_row.uid)

named_enity_annotation = create_objects()
checklist_annotation, radio_annotation = create_classifications()

li_label = Label(
    data=image_data,
    annotations = [
        named_enity_annotation, checklist_annotation, radio_annotation
    ]
)

li_label.assign_feature_schema_ids(ontology_builder.from_project(li_project))

ndjson_labels = list(NDJsonConverter.serialize([li_label]))

ndjson_labels, li_project.ontology().normalized

([{'classifications': [],
   'dataRow': {'id': 'cl084bkqg6x46109uhkagffci'},
   'location': {'end': 20, 'start': 10},
   'schemaId': 'cl084bljq4osi10851hcq060h',
   'uuid': '2386230e-9886-43da-8cf7-fde6eb8c9ede'},
  {'answer': [{'schemaId': 'cl084bljq4osl10851vprdw54'},
    {'schemaId': 'cl084bljq4osn1085a7628w67'}],
   'dataRow': {'id': 'cl084bkqg6x46109uhkagffci'},
   'schemaId': 'cl084bljq4osk1085fztf1ty1',
   'uuid': 'e6589d65-be72-4937-8835-99272ebacda0'},
  {'answer': {'schemaId': 'cl084bljq4ost1085dyy85pkj'},
   'dataRow': {'id': 'cl084bkqg6x46109uhkagffci'},
   'schemaId': 'cl084bljq4osq1085dln78tju',
   'uuid': 'f674a983-83f8-444c-bd7c-72b4f84b8c22'}],
 {'classifications': [{'archived': 0,
    'featureSchemaId': 'cl084bljq4osk1085fztf1ty1',
    'instructions': 'checklist',
    'name': 'checklist',
    'options': [{'featureSchemaId': 'cl084bljq4osl10851vprdw54',
      'label': 'first_checklist_answer',
      'schemaNodeId': 'cl084bljq4osm1085cygq839a',
      'value': 'first_che

In [13]:
upload_job = LabelImport.create_from_objects(
    client = client, 
    project_id = li_project.uid, 
    name="upload_label_import_job", 
    labels=ndjson_labels)

In [14]:
print("Errors:", upload_job.errors)

INFO:labelbox.schema.annotation_import:Sleeping for 10 seconds...


Errors: []
