# Getting Started with Labellerr SDK

This notebook demonstrates how to use the Labellerr SDK for video processing and scene detection. The SDK provides tools for managing video datasets, downloading and processing videos, and detecting scene changes with several algorithms.

### Import the Required Classes from the Labellerr SDK
We'll start by importing the classes needed to work with the SDK:

In [1]:
from labellerr.client import LabellerrClient
from labellerr.core.datasets import LabellerrDataset
import os
from tqdm.notebook import tqdm


## 1. Authentication Setup

Before using the Labellerr SDK, you need to set up your authentication credentials. These credentials ensure secure access to the Labellerr platform and its services.

### Required Credentials:

1. **API Key & API Secret**
   - Log in to your Labellerr account
   - Navigate to the "Get API" tab
   - Copy your unique API key and secret

2. **Client ID**
   - This is a unique identifier for your application
   - Contact Labellerr support to obtain your client ID
   
⚠️ Important: Never share these credentials or commit them to version control.

In [2]:
from dotenv import dotenv_values
config = dotenv_values(".env")

api_key = config["API_KEY"]
api_secret = config["API_SECRET"]
client_id = config["CLIENT_ID"]
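If a key is missing from `.env`, the lookups above fail with a bare `KeyError`. The following is a minimal sketch of a more descriptive check. The helper name `require_credentials` is hypothetical and not part of the Labellerr SDK; it accepts any mapping, such as the one returned by `dotenv_values`.

```python
REQUIRED_KEYS = ("API_KEY", "API_SECRET", "CLIENT_ID")


def require_credentials(config):
    """Return (api_key, api_secret, client_id) or raise a descriptive error.

    `config` is any mapping of setting names to values, for example the
    result of dotenv_values(".env"). Hypothetical helper, not an SDK call.
    """
    # Treat absent keys and empty values the same way
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"Missing credentials in .env: {', '.join(missing)}")
    return tuple(config[key] for key in REQUIRED_KEYS)
```

With this in place, `api_key, api_secret, client_id = require_credentials(config)` could replace the three separate lookups.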

## 2. Project Configuration

### Dataset and Project IDs
To work with specific datasets and projects in Labellerr, you need their respective IDs. These IDs are unique identifiers that link your code to the correct resources on the platform.

How to obtain the IDs:
1. Go to the Labellerr platform
2. Create or select an existing dataset
3. Create or select an existing project
4. Copy the dataset_id and project_id from their respective pages

Note: The dataset_id is a UUID string, while the project_id is typically a human-readable string.

In [3]:
# Create a dataset and a project on the Labellerr platform, then copy their IDs here
dataset_id = "16257fd6-b91b-4d00-a680-9ece9f3f241c"
project_id = "gabrila_artificial_duck_74237"
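Because the dataset_id is a UUID string, a copy-paste mistake can be caught before any API call is made. The sketch below uses only the standard library; the function name `is_valid_uuid` is an illustrative choice, not an SDK function.

```python
import uuid


def is_valid_uuid(value):
    """Return True if `value` parses as a UUID string.

    Note: uuid.UUID is lenient and also accepts forms without hyphens.
    """
    try:
        uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    return True
```

For the values above, `is_valid_uuid(dataset_id)` holds, while the human-readable `project_id` would not pass the same check.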

## 3. Initializing the Labellerr SDK

### Create LabellerrClient Instance
Now we'll create instances of the main SDK classes:

1. **LabellerrClient**: The main client that handles communication with the Labellerr API
2. **LabellerrDataset**: A specialized class for working with datasets

These instances will be used for all subsequent operations with the platform.

In [4]:
client = LabellerrClient(api_key, api_secret, client_id) 
dataset = LabellerrDataset(client, dataset_id, project_id)

In [5]:
results = dataset.download()


######################################################################
# Starting batch video processing for dataset: 16257fd6-b91b-4d00-a680-9ece9f3f241c
######################################################################

Total file IDs extracted: 1

Creating LabellerrFile instances for 1 files...
Successfully created 1 LabellerrFile instances

Processing 1 video files...


Starting download of 1 files...

Processing file: c44f38f6-0186-436f-8c2d-ffb50a539c76

[1/4] Fetching frame data from API (0 to 1440)...
Retrieved 1440 frames

[2/4] Setting up output folders...

[3/4] Downloading frames...
Starting download of 1440 frames...
Frames downloaded: 1440/1440 (1440 successful, 0 failed)

[4/4] Creating video from frames...
Running command: ffmpeg -y -start_number 0 -framerate 30 -i ./Labellerr_datastets\16257fd6-b91b-4d00-a680-9ece9f3f241c\c44f38f6-0186-436f-8c2d-ffb50a539c76\%d.jpg -c:v libx264 -pix_fmt yuv420p ./Labellerr_datastets\16257fd6-b91b-4d00-a680-9ece9f3f241c\c44f38f6-0

### Download Videos
For each video file in the dataset, the `download()` method:
- Fetches the frame data from the API
- Sets up the output folders
- Downloads the frames
- Assembles the frames into a video with ffmpeg

It returns the results of this processing, stored here in `results`. This is typically the first step in video analysis, since it places a local copy of every video on disk for further processing.

## 4. Scene Change Detection

### Available Scene Detection Methods
Labellerr SDK provides multiple algorithms for scene detection in videos:

1. **PySceneDetect**: 
   - Python-based scene detection
   - Uses content-aware detection
   - Good for general-purpose scene detection

2. **SSIMSceneDetect**:
   - Uses Structural Similarity Index (SSIM)
   - Better for detecting subtle scene changes
   - More computationally intensive but more accurate

3. **FFMPEGSceneDetect**:
   - Uses FFMPEG for scene detection
   - Fastest method
   - Good for quick analysis of large video files

Choose the method that best suits your needs based on accuracy requirements and processing speed constraints.
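To give a sense of how an SSIM-based detector decides that a scene has changed, here is a simplified sketch in plain Python. It computes a single global SSIM value per frame pair, whereas the standard definition averages over local windows, and it is not the SDK's implementation. The function names and the 0.5 threshold are illustrative assumptions.

```python
# Stabilizing constants from the standard SSIM definition for 8-bit images
C1 = (0.01 * 255) ** 2
C2 = (0.03 * 255) ** 2


def global_ssim(frame_a, frame_b):
    """Global (single-window) SSIM of two equal-length grayscale pixel lists."""
    n = len(frame_a)
    mean_a = sum(frame_a) / n
    mean_b = sum(frame_b) / n
    var_a = sum((a - mean_a) * (a - mean_a) for a in frame_a) / n
    var_b = sum((b - mean_b) * (b - mean_b) for b in frame_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(frame_a, frame_b)) / n
    numerator = (2 * mean_a * mean_b + C1) * (2 * cov + C2)
    denominator = (mean_a * mean_a + mean_b * mean_b + C1) * (var_a + var_b + C2)
    return numerator / denominator


def find_scene_changes(frames, threshold=0.5):
    """Return indices of frames whose SSIM with the previous frame is below threshold."""
    return [
        i for i in range(1, len(frames))
        if global_ssim(frames[i - 1], frames[i]) < threshold
    ]


# Two dark frames followed by two bright frames: the cut is at index 2
dark = [10, 12, 11, 13]
bright = [240, 242, 241, 243]
cuts = find_scene_changes([dark, dark, bright, bright])  # [2]
```

Identical frames score 1.0, and the score falls toward 0 as brightness and structure diverge, so a lower threshold flags only more drastic changes.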

In [6]:
from labellerr.services.video_sampling.pyscene_detect import PySceneDetect
from labellerr.services.video_sampling.ssim import SSIMSceneDetect
from labellerr.services.video_sampling.ffmpeg import FFMPEGSceneDetect

## Scene Detection Implementation

### Setting up the Scene Detector
Now we'll set up the scene detection process:

1. First, we'll define the dataset directory where our videos are stored
2. Then we'll create an instance of our chosen detector
3. Finally, we'll process each video in the dataset

Note: Make sure you have sufficient disk space for storing the extracted keyframes, as this process can generate multiple files per video.
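A quick way to check free disk space before running detection is sketched below, using the standard library's `shutil.disk_usage`. The helper name and the 2 GiB figure in the usage note are illustrative assumptions; the space actually needed depends on your videos.

```python
import shutil


def has_free_space(path, required_bytes):
    """Return True if the filesystem containing `path` has at least `required_bytes` free."""
    return shutil.disk_usage(path).free >= required_bytes
```

For example, `has_free_space(".", 2 * 1024**3)` checks for roughly 2 GiB in the current working directory.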

In [None]:
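# Build the path with os.path.join to avoid invalid backslash escapes in the string.
# Adjust the folder name if it differs from the path printed by download().
dataset_dir = os.path.join(".", "Labellerr_datasets", dataset_id)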

In [10]:
detector = FFMPEGSceneDetect()

### Initialize the Scene Detector
The cell above creates an instance of the `FFMPEGSceneDetect` class, which uses FFMPEG to identify scene changes and is the fastest of the three methods described earlier.

In [11]:
for filename in os.listdir(dataset_dir):
    file_path = os.path.join(dataset_dir, filename)
    
    if os.path.isfile(file_path):
        detector.detect_and_extract(file_path)

Keyframes extracted to FFMPEG_detects\16257fd6-b91b-4d00-a680-9ece9f3f241c\c44f38f6-0186-436f-8c2d-ffb50a539c76\frames
JSON mapping saved to: FFMPEG_detects\16257fd6-b91b-4d00-a680-9ece9f3f241c\c44f38f6-0186-436f-8c2d-ffb50a539c76\c44f38f6-0186-436f-8c2d-ffb50a539c76_mapping.json


### Process Videos for Scene Detection

The following code block:
1. Iterates through all files in the dataset directory
2. Constructs the full file path for each video
3. Verifies that each path points to a file (not a directory)
4. Applies scene detection to each video using the `detect_and_extract` method

As the output above shows, the detector extracts keyframes as JPEG images into `FFMPEG_detects\<dataset_id>\<file_id>\frames` and saves a JSON mapping file named `<file_id>_mapping.json` in the `<file_id>` folder.
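The loop above hands every regular file in the directory to the detector, including any non-video files that happen to be there. A minimal sketch of a stricter listing follows. The extension list and the helper name `list_video_files` are assumptions for illustration; adjust them to the formats your dataset actually contains.

```python
import os

VIDEO_EXTENSIONS = (".mp4", ".mov", ".avi", ".mkv")  # illustrative, adjust as needed


def list_video_files(directory, extensions=VIDEO_EXTENSIONS):
    """Return sorted paths of regular files in `directory` with a video extension."""
    suffixes = tuple(ext.lower() for ext in extensions)
    paths = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip subdirectories and files with other extensions
        if os.path.isfile(path) and name.lower().endswith(suffixes):
            paths.append(path)
    return paths
```

The detection loop could then iterate over `list_video_files(dataset_dir)` instead of `os.listdir(dataset_dir)`.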

## 5. Project Creation

In this section, we'll explore how to create and manage projects in Labellerr. Projects are essential containers that organize your data and annotations. We'll cover:

1. Creating image datasets from video frames
2. Setting up annotation projects
3. Managing project configurations

### Image Dataset Creation from Sampled Frames


In [13]:
import os

# Collect the paths of all sampled frames
images_files = []

# Construct the base directory path for detected frames
base_dir = os.path.join("FFMPEG_detects", dataset_id)

# Walk through all subdirectories
for root, dirs, files in os.walk(base_dir):
    for file in files:
        if file.endswith('.jpg'):  # Only collect jpg files
            file_path = os.path.join(root, file)
            images_files.append(file_path)

print(f"Found {len(images_files)} image files")

Found 52 image files


In [14]:
images_files

['FFMPEG_detects\\16257fd6-b91b-4d00-a680-9ece9f3f241c\\c44f38f6-0186-436f-8c2d-ffb50a539c76\\frames\\0.jpg',
 'FFMPEG_detects\\16257fd6-b91b-4d00-a680-9ece9f3f241c\\c44f38f6-0186-436f-8c2d-ffb50a539c76\\frames\\1008.jpg',
 'FFMPEG_detects\\16257fd6-b91b-4d00-a680-9ece9f3f241c\\c44f38f6-0186-436f-8c2d-ffb50a539c76\\frames\\1016.jpg',
 'FFMPEG_detects\\16257fd6-b91b-4d00-a680-9ece9f3f241c\\c44f38f6-0186-436f-8c2d-ffb50a539c76\\frames\\1028.jpg',
 'FFMPEG_detects\\16257fd6-b91b-4d00-a680-9ece9f3f241c\\c44f38f6-0186-436f-8c2d-ffb50a539c76\\frames\\1060.jpg',
 'FFMPEG_detects\\16257fd6-b91b-4d00-a680-9ece9f3f241c\\c44f38f6-0186-436f-8c2d-ffb50a539c76\\frames\\1082.jpg',
 'FFMPEG_detects\\16257fd6-b91b-4d00-a680-9ece9f3f241c\\c44f38f6-0186-436f-8c2d-ffb50a539c76\\frames\\1106.jpg',
 'FFMPEG_detects\\16257fd6-b91b-4d00-a680-9ece9f3f241c\\c44f38f6-0186-436f-8c2d-ffb50a539c76\\frames\\1119.jpg',
 'FFMPEG_detects\\16257fd6-b91b-4d00-a680-9ece9f3f241c\\c44f38f6-0186-436f-8c2d-ffb50a539c76\\frame
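In the listing above, `0.jpg` is followed by `1008.jpg`, so the paths are not in numeric frame order. If frame order matters for later steps, the paths can be sorted by frame number. The sketch below assumes that each file name is an integer frame number, as the listing suggests; the helper names are illustrative.

```python
import os


def frame_number(path):
    """Extract the integer frame number from a path such as '.../frames/1008.jpg'."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return int(stem)


def sort_by_frame_number(paths):
    """Return `paths` ordered by frame number rather than by string comparison."""
    return sorted(paths, key=frame_number)
```

Calling `images_files = sort_by_frame_number(images_files)` before the upload would put the frames in playback order.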

In [15]:
# Create an image dataset from the sampled frames

def upload_images_from_files(images_files, client, client_id):
    """Upload specific image files to create a dataset"""
    
    client.enable_connection_pooling = True
    
    dataset_config = {
        "client_id": client_id,
        "dataset_name": "video_sampling_1",
        "dataset_description": "video sampling dataset from frames",
        "data_type": "image",  
    }
    
    try:
        response = client.create_dataset(
            dataset_config=dataset_config,
            files_to_upload=images_files
        )
        print("Dataset created successfully!")
        print(f"Dataset ID: {response['dataset_id']}")
        return response['dataset_id']
    except Exception as e:
        # Report the failure and return None so callers can check the result
        print(f"Error creating dataset: {e}")
        return None

In [16]:
upload_images_from_files(images_files, client, client_id)

Dataset created successfully!
Dataset ID: 6a680901-fe81-49f0-9120-bb754d63a341


'6a680901-fe81-49f0-9120-bb754d63a341'

In [19]:
new_dataset_id = '6a680901-fe81-49f0-9120-bb754d63a341'

### Image Annotation Project Creation

In [None]:
# Define the annotation questions for the image project; modify as needed

questions = [
    {
        "question_number": 1,
        "question": "Test",
        "question_id": "533bb0c8-fb2b-4394-a8e1-5042a944802f",
        "option_type": "polygon",
        "required": True,
        "options": [
            {"option_name": "#fe1236"}
        ],
        "question_metadata": []
    }
]
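When a project needs several questions, building each dictionary by hand is error-prone. The sketch below constructs a question with the same keys as the example above. The helper `make_question` is hypothetical, and generating the `question_id` on the client with `uuid.uuid4()` is an assumption; check the SDK documentation for how IDs should be assigned.

```python
import uuid


def make_question(number, text, option_type, option_names, required=True):
    """Build a question dict with the same keys as the example above."""
    return {
        "question_number": number,
        "question": text,
        "question_id": str(uuid.uuid4()),  # assumption: client-generated IDs are accepted
        "option_type": option_type,
        "required": required,
        "options": [{"option_name": name} for name in option_names],
        "question_metadata": [],
    }
```

For instance, `questions = [make_question(1, "Test", "polygon", ["#fe1236"])]` produces a list of the same shape as the one defined above.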


In [None]:
# Create the annotation guideline template

template_id = client.create_annotation_guideline(
    client_id=client_id,
    questions=questions,
    template_name="video_sampling_template_1",
    data_type="image",
)


In [None]:
# Create the image annotation project

response = client.create_project(
    project_name="Video_sampling_project_1",
    data_type="image",
    client_id=client_id,
    dataset_id=new_dataset_id,
    annotation_template_id=template_id,
    rotation_config={
        "annotation_rotation_count": 1,
        "review_rotation_count": 1,
        "client_review_rotation_count": 1,
    },
    created_by="yashsuman15@gmail.com"
)


In [31]:
if response['response']['project_id']:
    print("Project created successfully!")
    print(f"Project ID: {response['response']['project_id']}")
    image_project_id = response['response']['project_id']

Project created successfully!
Project ID: sherri_puny_rattlesnake_84247
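The check above raises a `KeyError` if the response lacks the `response` or `project_id` key. A more defensive lookup is sketched below. It assumes only the nested shape used in the cell above (`response['response']['project_id']`); the helper name is illustrative.

```python
def extract_project_id(response):
    """Return the project ID from a create_project-style response, or None."""
    if not isinstance(response, dict):
        return None
    inner = response.get("response")
    if not isinstance(inner, dict):
        return None
    # Treat an empty string the same as a missing ID
    return inner.get("project_id") or None
```

A caller can then test `if image_project_id is None` and report the failure instead of crashing.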


## 6. Performing Annotations of Image Project

In [None]:
# Annotate the image project on the Labellerr platform before continuing

### Exporting the Annotation Data

In [None]:
# Export the annotations from the image project

export_config = {
    "export_name": "Weekly Export",
    "export_description": "Export of annotations in review, client review, and accepted statuses",
    "export_format": "coco_json",
    "statuses": [
        "review",
        "r_assigned",
        "client_review",
        "cr_assigned",
        "accepted",
    ],
}


response = client.create_local_export(
    project_id=image_project_id,
    client_id=client_id,
    export_config=export_config
)
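Once the export file is available locally, a quick summary helps confirm that it contains what you expect. The sketch below assumes the export follows the standard COCO layout, with top-level `categories` and `annotations` lists, and that it has already been loaded into a dictionary, for example with `json.load`. What `create_local_export` returns is not shown in this notebook, so obtaining the file is left out.

```python
from collections import Counter


def count_annotations_per_category(coco):
    """Map each category name to the number of annotations that use it."""
    names = {cat["id"]: cat["name"] for cat in coco.get("categories", [])}
    counts = Counter(
        names.get(ann["category_id"], "unknown")
        for ann in coco.get("annotations", [])
    )
    return dict(counts)
```

An empty result, or a large `unknown` count, is a sign that the export or the status filter needs another look.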

## 7. Uploading annotations to Video Project

### Trigger SAM2 Tracking on the Video Annotation Project

Using the export, retrieve the prompts needed to run SAM2 tracking on the video.

In [12]:
# code to create video annotation project from image annotations export
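As a starting point for retrieving prompts from the export, the sketch below groups polygon vertices by frame number. It assumes a standard COCO dictionary and assumes that each image's `file_name` is still `<frame_number>.jpg`, as in the sampled frames; both should be verified against the actual export. It only reshapes the data and does not show how prompts are passed to SAM2.

```python
import os
from collections import defaultdict


def polygons_by_frame(coco):
    """Group polygon vertices by frame number.

    Returns {frame_number: [[(x, y), ...], ...]}.
    """
    # Assumption: each image's file_name is "<frame_number>.jpg"
    frame_of = {
        img["id"]: int(os.path.splitext(os.path.basename(img["file_name"]))[0])
        for img in coco.get("images", [])
    }
    grouped = defaultdict(list)
    for ann in coco.get("annotations", []):
        frame = frame_of.get(ann.get("image_id"))
        segmentation = ann.get("segmentation")
        if frame is None or not isinstance(segmentation, list):
            continue  # skip unknown images and RLE masks (stored as a dict)
        for flat in segmentation:
            # COCO polygons are flat lists: [x1, y1, x2, y2, ...]
            grouped[frame].append(list(zip(flat[0::2], flat[1::2])))
    return dict(grouped)
```

Each key of the result is a frame index in the original video, and each value holds the polygons annotated on that keyframe.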