In [None]:
import os
LABEL_STUDIO_URL = os.getenv('LABEL_STUDIO_URL', 'http://localhost:8080')
LABEL_STUDIO_API_KEY = os.getenv('LABEL_STUDIO_API_KEY')


# Import pre-annotations

When you have a data pipeline with models outputting predictions, you want to be able to import those predicted annotations, or pre-annotations, into Label Studio for review and correction.

In this example, use the [Label Studio SDK](https://labelstud.io/sdk/index.html) to write a Python script that transforms and imports predictions into a Label Studio project.

## Connect to Label Studio

Connect to the Label Studio API using the `LabelStudio` client class from the SDK:

In [None]:
from label_studio_sdk.client import LabelStudio

ls = LabelStudio(base_url=LABEL_STUDIO_URL, api_key=LABEL_STUDIO_API_KEY)

## Create a project

Create a project for your labels. In this example, create an [image classification](https://labelstud.io/templates/image_classification.html) project:

In [2]:
from label_studio_sdk.label_interface import LabelInterface
from label_studio_sdk.label_interface.create import choices

label_config = LabelInterface.create({
    'image': 'Image',
    'image_class': choices(['Cat', 'Dog'])
})
print(label_config)

project = ls.projects.create(
    title='Project Created from SDK: Image Preannotation',
    label_config=label_config
)

<View>
  <Image name="image" value="$image"/>
  <Choices name="image_class" toName="image">
    <Choice value="Cat"/>
    <Choice value="Dog"/>
  </Choices>
</View>


## Import tasks with pre-annotations

You can import tasks with pre-annotations in several different ways. Choose one of the following four methods for your script, based on how your model predictions are formatted and whether the tasks already exist in the project.

### 1. Import tasks in Label Studio JSON format

You can format each task in the basic [Label Studio JSON format](https://labelstud.io/guide/tasks.html#Basic-Label-Studio-JSON-format), with the task data under `data` and the model output under `predictions`, and import the pre-annotations that way.

In [5]:
ls.projects.import_tasks(
    project.id,
    request=[{
        'data': {'image': 'https://data.heartex.net/open-images/train_0/mini/0045dd96bf73936c.jpg'},
        'predictions': [{
            'result': [{
                'from_name': 'image_class',
                'to_name': 'image',
                'type': 'choices',
                'value': {
                    'choices': ['Dog']
                }
            }],
            'score': 0.87
        }]
    }, {
        'data': {'image': 'https://data.heartex.net/open-images/train_0/mini/0083d02f6ad18b38.jpg'},
        'predictions': [{
            'result': [{
                'from_name': 'image_class',
                'to_name': 'image',
                'type': 'choices',
                'value': {
                    'choices': ['Cat']
                }
            }],
            'score': 0.65
        }]
    }]
)

ProjectsImportTasksResponse(task_count=2, annotation_count=0, predictions_count=None, duration=0.031484127044677734, file_upload_ids=[], could_be_tasks_list=False, found_formats=[], data_columns=[], prediction_count=2)

> If you're not importing predictions for an image classification task, see the documentation for [importing pre-annotations](https://labelstud.io/guide/predictions.html) in Label Studio JSON format for more examples.
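
When predictions come from a model as plain tuples, a small helper can build the task dictionaries instead of writing them by hand. The following is a minimal sketch, not part of the SDK: `to_label_studio_task` and `model_outputs` are hypothetical names, and the `from_name` and `to_name` values assume the labeling config created earlier in this notebook.

```python
def to_label_studio_task(image_url, label, score, model_version=None):
    """Build one task dict with a single 'choices' prediction.

    Assumes the labeling config from this notebook: an Image tag named
    'image' and a Choices tag named 'image_class'.
    """
    prediction = {
        'result': [{
            'from_name': 'image_class',
            'to_name': 'image',
            'type': 'choices',
            'value': {'choices': [label]},
        }],
        'score': score,
    }
    # Only attach a model version when the caller provides one
    if model_version is not None:
        prediction['model_version'] = model_version
    return {'data': {'image': image_url}, 'predictions': [prediction]}


# Hypothetical model outputs: (image URL, predicted label, confidence)
model_outputs = [
    ('https://data.heartex.net/open-images/train_0/mini/0045dd96bf73936c.jpg', 'Dog', 0.87),
    ('https://data.heartex.net/open-images/train_0/mini/0083d02f6ad18b38.jpg', 'Cat', 0.65),
]

tasks = [to_label_studio_task(url, label, score) for url, label, score in model_outputs]
```

The resulting `tasks` list has the same shape as the `request` argument in the cell above, so it could be passed to `ls.projects.import_tasks(project.id, request=tasks)`.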

### 2. Import simple JSON predictions

With this simpler format, each task is a flat JSON object and the pre-annotation is read from a field that you name. In this case, import task data with predictions in the `pet` field, and use `preannotated_from_fields` to specify that the `pet` field contains the predicted classification:

In [6]:
ls.projects.import_tasks(
    project.id,
    request=[{'image': 'https://data.heartex.net/open-images/train_0/mini/0045dd96bf73936c.jpg', 'pet': 'Dog'},
             {'image': 'https://data.heartex.net/open-images/train_0/mini/0083d02f6ad18b38.jpg', 'pet': 'Cat'}],
    preannotated_from_fields=['pet']
)

ProjectsImportTasksResponse(task_count=2, annotation_count=0, predictions_count=None, duration=0.027251005172729492, file_upload_ids=[], could_be_tasks_list=False, found_formats=[], data_columns=[], prediction_count=2)
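
Because the value in the `pet` field is taken as the predicted label, it can help to check records before importing them. The sketch below is an optional pre-check, not an SDK feature: `find_invalid_records` and `ALLOWED_LABELS` are hypothetical names, and it assumes a value is only useful as a pre-annotation when it exactly matches one of the `<Choice>` values in the labeling config.

```python
# Must match the <Choice> values in the labeling config created earlier
ALLOWED_LABELS = {'Cat', 'Dog'}


def find_invalid_records(records, field='pet', allowed=ALLOWED_LABELS):
    """Return (index, record) pairs whose prediction field is missing
    or holds a value that is not one of the allowed labels."""
    return [
        (i, rec) for i, rec in enumerate(records)
        if rec.get(field) not in allowed
    ]


# Hypothetical records; the example.com URLs are placeholders
records = [
    {'image': 'https://example.com/a.jpg', 'pet': 'Dog'},
    {'image': 'https://example.com/b.jpg', 'pet': 'dog'},  # wrong case
    {'image': 'https://example.com/c.jpg'},                # field missing
]

invalid = find_invalid_records(records)
for index, record in invalid:
    print(f'Record {index} has no usable label: {record}')
```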

### 3. Import predictions from CSV

If your predictions are stored in CSV files, you can use pandas to read them into a dataframe:

In [1]:
!pip install pandas

In [1]:
import pandas as pd
df = pd.read_csv('data/images.csv')

Then convert the dataframe rows to a list of task records, and specify that the `pet` field contains the predicted classification:

In [1]:
ls.projects.import_tasks(
    project.id,
    request=df.to_dict(orient='records'),
    preannotated_from_fields=['pet']
)
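
If you would rather not add pandas as a dependency, the standard library `csv` module can produce the same list-of-dicts shape. This is a sketch under an assumption: the notebook does not show the contents of `data/images.csv`, so the sample below assumes it has an `image` column and a `pet` column.

```python
import csv
import io

# Hypothetical contents of data/images.csv: one column per task field
sample_csv = io.StringIO(
    'image,pet\n'
    'https://data.heartex.net/open-images/train_0/mini/0045dd96bf73936c.jpg,Dog\n'
    'https://data.heartex.net/open-images/train_0/mini/0083d02f6ad18b38.jpg,Cat\n'
)

# Each row becomes a dict keyed by the header row
records = list(csv.DictReader(sample_csv))
print(records[0]['pet'])
```

To read a real file, replace `sample_csv` with a file object opened using `open('data/images.csv', newline='')`, then pass `records` as the `request` argument with `preannotated_from_fields=['pet']`.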

### 4. Import predictions to existing tasks

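In some cases, you may want to apply predictions to tasks that are already imported. For example, you can retrieve tasks from Label Studio, then create a new prediction for each one. This example assigns the placeholder label `Dog` to every task; replace it with your model's output: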

In [1]:
from label_studio_sdk.label_interface.objects import PredictionValue

li = ls.projects.get(id=project.id).get_label_interface()

for task in ls.tasks.list(project=project.id, include=['id']):
    prediction = PredictionValue(
        # Tag predictions with specific model version
        model_version='my_model_v1',
        # Define your labels here
        result=[
            li.get_control('image_class').label(['Dog']),
        ]
    )
    ls.predictions.create(task=task.id, **prediction.model_dump())

For more complex annotation scenarios, see the [JSON format for expected predictions / pre-annotations](https://labelstud.io/guide/predictions.html).

## Annotate tasks in Label Studio

After importing pre-annotations using your preferred method, open Label Studio at http://localhost:8080 (or the URL set in `LABEL_STUDIO_URL`) to review and correct the predictions and finish annotating your tasks.

## Calculate prediction accuracy

If you're using Label Studio Enterprise, you can calculate the accuracy of your predictions compared to the corrected annotations created as a ground truth dataset.

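The [evalme package](https://github.com/heartexlabs/label-studio-evalme) can calculate an agreement score for each task by comparing the annotation to the prediction. This notebook skips installing it and instead requests the project's total agreement from the SDK, if that endpoint is available on your edition: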

In [None]:
print('Skipping evalme install; using SDK stats if available.')

In [None]:
from label_studio_sdk.core.api_error import ApiError
import json

print('\nAgreement (SDK):')
try:
    # total_agreement is typically Enterprise-only
    agreement_stats = ls.projects.stats.total_agreement(project.id)
    print(json.dumps(agreement_stats.model_dump(), indent=2))
except ApiError as e:
    print(f'Agreement stats unavailable on this edition: {e}')
    print('`ls.projects.stats.total_agreement` is typically an Enterprise-only feature.')
    print('For OSS, export annotations and predictions and calculate agreement manually.')
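
For the open source edition, the manual calculation mentioned above can be sketched as a simple exact-match accuracy. This is an illustration, not the Enterprise agreement metric: `first_choice` and `prediction_accuracy` are hypothetical helpers, and the code assumes each exported task carries `annotations` and `predictions` lists whose `result` entries use the format shown in method 1.

```python
def first_choice(results):
    """Return the first label from the first 'choices' result, or None."""
    for item in results:
        if item.get('type') == 'choices':
            choices = item.get('value', {}).get('choices', [])
            if choices:
                return choices[0]
    return None


def prediction_accuracy(exported_tasks):
    """Fraction of comparable tasks whose first prediction matches the
    first annotation. Tasks missing either one are skipped."""
    matches, compared = 0, 0
    for task in exported_tasks:
        annotations = task.get('annotations') or []
        predictions = task.get('predictions') or []
        if not annotations or not predictions:
            continue
        truth = first_choice(annotations[0].get('result', []))
        guess = first_choice(predictions[0].get('result', []))
        if truth is None or guess is None:
            continue
        compared += 1
        matches += int(truth == guess)
    return matches / compared if compared else None


def _choice(label):
    # Build a minimal 'choices' result entry for the sample data
    return {'type': 'choices', 'value': {'choices': [label]}}


# Hypothetical export: one match, one mismatch, one task not yet annotated
exported_tasks = [
    {'annotations': [{'result': [_choice('Dog')]}], 'predictions': [{'result': [_choice('Dog')]}]},
    {'annotations': [{'result': [_choice('Dog')]}], 'predictions': [{'result': [_choice('Cat')]}]},
    {'annotations': [], 'predictions': [{'result': [_choice('Cat')]}]},
]

accuracy = prediction_accuracy(exported_tasks)
print(f'Exact-match accuracy: {accuracy}')
```

Exact match suits single-label classification like this project. Other label types, such as bounding boxes, would need a different comparison.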

## Conclusion

With the Label Studio SDK, you can more easily import pre-annotated tasks into Label Studio so that you can create ground truth datasets, visually review the accuracy of predictions, and more.

The `preannotated_from_fields` option for the `import_tasks()` method makes it easier to add your predictions without worrying about the intricacies of the Label Studio JSON format. You can still use the full JSON format when you want to add valuable metadata, such as prediction scores and model versions, to your pre-annotated task data.