# Getting Started with Labellerr SDK

This notebook demonstrates how to use the Labellerr SDK for video processing and scene detection. The SDK provides powerful tools for managing video datasets, processing videos, and detecting scene changes using various algorithms.

### Import the Required Classes from the Labellerr SDK
We'll start by importing the essential classes needed for working with the SDK:

In [None]:
from labellerr.client import LabellerrClient
from labellerr.core.datasets import LabellerrDataset
from labellerr.exceptions import LabellerrError
import os
from tqdm.notebook import tqdm


## 1. Authentication Setup

Before using the Labellerr SDK, you need to set up your authentication credentials. These credentials ensure secure access to the Labellerr platform and its services.

### Required Credentials:

1. **API Key & API Secret**
   - Log in to your Labellerr account
   - Navigate to the "Get API" tab
   - Copy your unique API key and secret

2. **Client ID**
   - This is a unique identifier for your application
   - Contact Labellerr support to obtain your client ID
   
⚠️ Important: Never share these credentials or commit them to version control.

In [None]:
from dotenv import dotenv_values
config = dotenv_values(".env")

api_key = config["API_KEY"]
api_secret = config["API_SECRET"]
client_id = config["CLIENT_ID"]
email = config["EMAIL"]
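
Indexing `config[...]` directly raises a bare `KeyError` when a key is missing from `.env`. The sketch below checks every setting up front and reports all gaps at once. `require_keys` is a hypothetical helper, not part of the Labellerr SDK or python-dotenv; it accepts any mapping, such as the one returned by `dotenv_values`.

```python
from typing import Mapping, Optional, Sequence


def require_keys(config: Mapping[str, Optional[str]], keys: Sequence[str]) -> dict:
    """Return the requested settings, or raise one error naming every missing key.

    Hypothetical helper: `config` is any mapping, for example the result of
    dotenv_values(".env"). Empty or None values count as missing.
    """
    missing = [key for key in keys if not config.get(key)]
    if missing:
        raise ValueError("Missing settings in .env: " + ", ".join(missing))
    return {key: config[key] for key in keys}


# Example with an in-memory mapping standing in for the parsed .env file.
example = {"API_KEY": "k", "API_SECRET": "s", "CLIENT_ID": "c", "EMAIL": "me@example.com"}
settings = require_keys(example, ["API_KEY", "API_SECRET", "CLIENT_ID", "EMAIL"])
```

In the notebook, the same call could take `config` in place of `example`.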

## Kaggle Dataset Download and Project Creation

Before downloading the dataset from Kaggle, you need to:

1. Install kagglehub package using pip
2. Authenticate with Kaggle
3. Download the CCTV footage dataset

The kagglehub package provides a simple interface to download datasets directly from Kaggle. Make sure you have a Kaggle account and API credentials set up before proceeding.

Note: If you haven't set up Kaggle authentication before, you'll need to:
1. Create a Kaggle account at https://www.kaggle.com
2. Go to "Account" settings
3. Scroll to API section and click "Create New API Token"
4. This will download a kaggle.json file with your credentials
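
Before calling the download, it can help to confirm that credentials are actually in place. The sketch below is a best-effort check under stated assumptions: it looks only for the `KAGGLE_USERNAME` / `KAGGLE_KEY` environment variables and for a `kaggle.json` file under `~/.kaggle/` with `username` and `key` fields. `kaggle_credentials_found` is a hypothetical helper, not a kagglehub function.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def kaggle_credentials_found(home: Optional[Path] = None,
                             env: Optional[Mapping[str, str]] = None) -> bool:
    """Best-effort check for Kaggle credentials in the two common locations.

    Assumption: credentials come either from the KAGGLE_USERNAME / KAGGLE_KEY
    environment variables or from ~/.kaggle/kaggle.json containing "username"
    and "key". Other mechanisms (such as an interactive login) are not covered.
    """
    env = os.environ if env is None else env
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return True

    token_file = (home or Path.home()) / ".kaggle" / "kaggle.json"
    if not token_file.is_file():
        return False
    try:
        data = json.loads(token_file.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return False
    return isinstance(data, dict) and bool(data.get("username")) and bool(data.get("key"))


print("Kaggle credentials found:", kaggle_credentials_found())
```

A `False` result does not necessarily mean the download will fail, since an interactive `kagglehub.login()` can still supply credentials.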

In [None]:
# !pip install kagglehub python-dotenv

In [None]:
import kagglehub

kagglehub.login()

In [None]:
# Download 1000 videos(~1 min) dataset
path_to_dataset = kagglehub.dataset_download("yashsuman/cctv-footage")


print("Path to dataset files:", path_to_dataset)

## 2. Project Configuration

### Create Project with Kaggle Dataset
Create a project from the downloaded Kaggle dataset, using a sample annotation template.

### project_payload structure (keys and purpose)
- client_id: str — client identifier.
- dataset_name: str — human-readable dataset name.
- dataset_description: str — short description of the dataset.
- data_type: str — "video" (or "image") indicating dataset type.
- created_by: str — email of the creator/owner.
- project_name: str — name for the new Labellerr project.
- annotation_guide: list — annotation questions; each question includes:
    - question_number (int), question (str), question_id (str), option_type (str),
    - required (bool), options (list of option objects with option_name, etc.)
- rotation_config: dict — rotation counts for annotation, review, and client review:
    - annotation_rotation_count, review_rotation_count, client_review_rotation_count
- autolabel: bool — whether to enable autolabeling.
- folder_to_upload: str — local folder path containing files to upload to the project.
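
The question fields listed above can be assembled with a small builder so that each question receives a fresh random UUID rather than a copied one. This is a minimal sketch: `make_question` is a hypothetical helper, and the field names are taken from the list above rather than from a verified SDK schema.

```python
import uuid
from typing import Sequence


def make_question(number: int, text: str, option_type: str,
                  option_names: Sequence[str], required: bool = True) -> dict:
    """Build one annotation_guide entry with the fields listed above.

    Hypothetical helper. The question_id is a freshly generated random UUID,
    so separate calls get distinct IDs.
    """
    if number < 1:
        raise ValueError("question_number starts at 1")
    return {
        "question_number": number,
        "question": text,
        "question_id": str(uuid.uuid4()),
        "option_type": option_type,
        "required": required,
        "options": [{"option_name": name} for name in option_names],
    }


annotation_guide = [make_question(1, "Test", "polygon", ["#fe1236"])]
```

The resulting list can be used as the `annotation_guide` value in `project_payload`.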

### Typical usage steps
1. Ensure .env has API_KEY, API_SECRET, CLIENT_ID, EMAIL and `config` is loaded.
2. Ensure `path_to_dataset` points to the correct local dataset folder.
3. Instantiate client if not already done:
   client = LabellerrClient(api_key, api_secret, client_id)
4. Review or adjust `project_payload` (annotation guide, rotation, folder_to_upload).
5. Create the project:
   try:
       result = client.initiate_create_project(project_payload)
       project_id = result['project_id']['response']['project_id']
   except LabellerrError as e:
       print(f"Project creation failed: {e}")

## Notes & best practices
- Do not commit API credentials to source control.
- Verify `folder_to_upload` contains the expected video files before initiating the create-project call.
- Customize `annotation_guide` to match your annotation schema and color/options.
- Use rotation_config to control annotation/review distribution and workload.
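
One way to verify the upload folder before creating the project is to list the video files it contains. The sketch below uses only the standard library. `list_video_files` is a hypothetical helper, and the extension list is an assumption to adjust for the dataset at hand.

```python
import os
from typing import List

# Assumed set of video extensions; adjust to match the dataset.
VIDEO_EXTENSIONS = (".mp4", ".avi", ".mov", ".mkv")


def list_video_files(folder: str) -> List[str]:
    """Return sorted paths of all video files found under `folder`, recursively."""
    found = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.lower().endswith(VIDEO_EXTENSIONS):
                found.append(os.path.join(root, name))
    return sorted(found)


print(len(list_video_files(".")), "video files under the current directory")
```

In this notebook the call would be `list_video_files(path_to_dataset)`; an empty result suggests the path or the extension list needs a second look.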

In [None]:
client = LabellerrClient(api_key, api_secret, client_id)

project_payload = {
        "client_id": client_id,
        "dataset_name": "CCTV Footage Dataset",
        "dataset_description": "A sample dataset for video annotation",
        "data_type": "video",
        "created_by": email,
        "project_name": "SDK workflow",
        "annotation_guide": [
            {
                "question_number": 1,  # incremental series starting from 1
                "question": "Test",  # question name
                "question_id": "533bb0c8-fb2b-4394-a8e1-5042a944802f",  # random uuid
                "option_type": "polygon",
                "required": True,
                "options": [
                    {
                        "option_name": "#fe1236"
                    },  # give the hex code of some random color
                ],
            }
        ],
        "rotation_config": {
            "annotation_rotation_count": 1,
            "review_rotation_count": 1,
            "client_review_rotation_count": 1,
        },
        "autolabel": False,
        "folder_to_upload": path_to_dataset,
    }

try:
    result = client.initiate_create_project(project_payload)
    print(
        f"Project ID: {result['project_id']['response']['project_id']}"
    )
except LabellerrError as e:
    print(f"Project creation failed: {e}")

In [None]:
project_id = result['project_id']['response']['project_id']
dataset_id = client.datasets.get_all_datasets(project_id=project_id)[0]['dataset_id']
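
The chained lookups above fail with a terse `KeyError` or `IndexError` if the response has a different shape, for example when no dataset is returned. A minimal sketch of a more descriptive lookup follows. `dig` is a hypothetical helper, and the sample dictionary only mirrors the shape used in this notebook.

```python
from typing import Any


def dig(data: Any, *keys: Any) -> Any:
    """Follow `keys` through nested mappings/sequences; raise a readable error on failure."""
    current = data
    for depth, key in enumerate(keys):
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError) as exc:
            path = " -> ".join(repr(k) for k in keys[: depth + 1])
            raise LookupError(f"Response has no value at {path}") from exc
    return current


# Stand-in for the dictionary shape used in the cell above.
sample_result = {"project_id": {"response": {"project_id": "demo-project"}}}
project_id_value = dig(sample_result, "project_id", "response", "project_id")
```

The dataset lookup could be written the same way, for example `dig(datasets, 0, "dataset_id")` on the list returned by the SDK.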

## 3. Initializing the Labellerr SDK

### Create LabellerrClient Instance
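The workflow relies on two main SDK classes. The client was already created in the project-creation cell, so the next cell only instantiates the dataset: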

1. **LabellerrClient**: The main client that handles communication with the Labellerr API
2. **LabellerrDataset**: A specialized class for working with datasets

These instances will be used for all subsequent operations with the platform.

In [None]:
# client = LabellerrClient(api_key, api_secret, client_id)
dataset = LabellerrDataset(client, dataset_id, project_id)

In [None]:
results = dataset.download()

### Download Videos
The `download()` method will:
- Fetch all videos in the dataset
- Process them according to the configured settings
- Return the results of the processing

This is typically used as the first step in video analysis to ensure all videos are properly prepared for further processing.

## 4. Scene Change Detection

### Available Scene Detection Methods
Labellerr SDK provides multiple algorithms for scene detection in videos:

1. **PySceneDetect**: 
   - Python-based scene detection
   - Uses content-aware detection
   - Good for general-purpose scene detection

2. **SSIMSceneDetect**:
   - Uses Structural Similarity Index (SSIM)
   - Better for detecting subtle scene changes
   - More computationally intensive but more accurate

3. **FFMPEGSceneDetect**:
   - Uses FFMPEG for scene detection
   - Fastest method
   - Good for quick analysis of large video files

Choose the method that best suits your needs based on accuracy requirements and processing speed constraints.
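
To make the SSIM idea concrete, the toy sketch below computes a single global SSIM score between consecutive frames and flags a scene change when the score drops below a threshold. It is an illustration only, not the SDK's implementation: frames are flat lists of grayscale pixel values, the threshold of 0.5 is arbitrary, and production implementations average SSIM over local windows instead of the whole frame.

```python
from typing import List, Sequence

C1 = (0.01 * 255) ** 2  # stabilising constants for 8-bit pixel values
C2 = (0.03 * 255) ** 2


def global_ssim(a: Sequence[float], b: Sequence[float]) -> float:
    """SSIM over two whole frames given as flat lists of grayscale pixels.

    Simplification: real implementations average SSIM over local windows.
    """
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    return ((2 * mean_a * mean_b + C1) * (2 * cov + C2)) / (
        (mean_a ** 2 + mean_b ** 2 + C1) * (var_a + var_b + C2)
    )


def find_scene_changes(frames: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """Return indices i where frame i differs sharply from frame i - 1."""
    return [
        i for i in range(1, len(frames))
        if global_ssim(frames[i - 1], frames[i]) < threshold
    ]


# Three dark frames followed by two bright frames: one cut, at index 3.
dark = [10, 12, 11, 13]
bright = [200, 220, 210, 230]
cuts = find_scene_changes([dark, dark, dark, bright, bright])
```

Raising the threshold makes the detector more sensitive to subtle changes, at the cost of more false positives.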

In [None]:
from labellerr.services.video_sampling.pyscene_detect import PySceneDetect
from labellerr.services.video_sampling.ssim import SSIMSceneDetect
from labellerr.services.video_sampling.ffmpeg import FFMPEGSceneDetect

## Scene Detection Implementation

### Setting up the Scene Detector
Now we'll set up the scene detection process:

1. First, we'll define the dataset directory where our videos are stored
2. Then we'll create an instance of our chosen detector
3. Finally, we'll process each video in the dataset

Note: Make sure you have sufficient disk space for storing the extracted scenes, as this process can generate multiple files per video.
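
The disk-space check can be automated with the standard library. In the sketch below, `has_free_space` is a hypothetical helper and the 5 GiB figure is an arbitrary placeholder, since the space actually needed depends on the videos and the detector.

```python
import shutil


def has_free_space(path: str, required_bytes: int) -> bool:
    """True if the filesystem containing `path` has at least `required_bytes` free."""
    return shutil.disk_usage(path).free >= required_bytes


# Example: require 5 GiB free in the current working directory (an arbitrary figure).
REQUIRED = 5 * 1024 ** 3
if not has_free_space(".", REQUIRED):
    print("Warning: less than 5 GiB free; extracted scenes may not fit.")
```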

In [None]:
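# Build the path portably; backslashes in the original f-string were invalid escape sequences
dataset_dir = os.path.join("Labellerr_datasets", dataset_id)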

In [None]:
detector = FFMPEGSceneDetect()

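### Initialize the Scene Detector
Here we create an instance of the `FFMPEGSceneDetect` class, which uses FFmpeg to identify scene changes. It is the fastest of the three methods and suits quick analysis of large video files. To trade speed for sensitivity to subtle changes, swap in `SSIMSceneDetect` or `PySceneDetect`; note that later cells read frames from the `FFMPEG_detects` directory, so that path would need to change as well.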

In [None]:
for filename in os.listdir(dataset_dir):
    file_path = os.path.join(dataset_dir, filename)
    
    if os.path.isfile(file_path):
        detector.detect_and_extract(file_path)

### Process Videos for Scene Detection

The following code block:
1. Iterates through all files in the dataset directory
2. Constructs the full file path for each video
3. Verifies that each path points to a file (not a directory)
4. Applies scene detection to each video using the `detect_and_extract` method

The detector writes its output to a subdirectory named after the input video file. The later cells in this notebook collect the extracted `.jpg` frames from `FFMPEG_detects/<dataset_id>`.
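
The loop described above stops at the first file that raises and also passes non-video files to the detector. A more defensive variant might look like the sketch below. `run_detector` and the `SceneDetector` protocol are hypothetical, the extension list is an assumption, and the only SDK behaviour relied on is that the detector exposes `detect_and_extract(path)`.

```python
import os
from typing import Dict, List, Protocol

VIDEO_EXTENSIONS = (".mp4", ".avi", ".mov", ".mkv")  # assumed; adjust as needed


class SceneDetector(Protocol):
    """Anything with a detect_and_extract(path) method, as used in the loop above."""
    def detect_and_extract(self, video_path: str): ...


def run_detector(detector: SceneDetector, directory: str) -> Dict[str, List[str]]:
    """Run the detector on each video file in `directory`, continuing past failures."""
    report: Dict[str, List[str]] = {"processed": [], "skipped": [], "failed": []}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path) or not name.lower().endswith(VIDEO_EXTENSIONS):
            report["skipped"].append(name)
            continue
        try:
            detector.detect_and_extract(path)
            report["processed"].append(name)
        except Exception as exc:  # keep going; one bad file should not stop the batch
            print(f"Scene detection failed for {name}: {exc}")
            report["failed"].append(name)
    return report
```

Calling `run_detector(detector, dataset_dir)` returns a report that shows at a glance which files need a retry.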

## 5. Project Creation

In this section, we'll explore how to create and manage projects in Labellerr. Projects are essential containers that organize your data and annotations. We'll cover:

1. Creating image datasets from video frames
2. Setting up annotation projects
3. Managing project configurations

### Image Dataset Creation from Sampled Frames


In [None]:
# Collect the paths of the frames extracted by the scene detector
images_files = []

# Construct the base directory path for detected frames
base_dir = os.path.join("FFMPEG_detects", dataset_id)

# Walk through all subdirectories
for root, dirs, files in os.walk(base_dir):
    for file in files:
        if file.endswith('.jpg'):  # Only collect jpg files
            file_path = os.path.join(root, file)
            images_files.append(file_path)

print(f"Found {len(images_files)} image files")

In [None]:
images_files

In [None]:
# code to create dataset from sampled frames

def upload_images_from_files(images_files, client, client_id):
    """Upload specific image files to create a dataset"""
    
    client.enable_connection_pooling = True
    
    dataset_config = {
        "client_id": client_id,
        "dataset_name": "video_sampling_1",
        "dataset_description": "video sampling dataset from frames",
        "data_type": "image",  
    }
    
    try:
        response = client.create_dataset(
            dataset_config=dataset_config,
            files_to_upload=images_files  
        )
        print("Dataset created successfully!")
        print(f"Dataset ID: {response['dataset_id']}")
        return response['dataset_id']
    except Exception as e:
        print(f"Error creating dataset: {e}")
        return None  # signal failure explicitly to the caller

In [None]:
new_dataset_id = upload_images_from_files(images_files, client, client_id)

In [None]:
# To resume from an earlier run, set the ID of an existing dataset instead:
# new_dataset_id = '6a680901-fe81-49f0-9120-bb754d63a341'

### Image Annotation Project Creation

In [None]:
# modify to add questions to image project

questions = [
  {
    "question_number": 1,
    "question": "Test",
    "question_id": "533bb0c8-fb2b-4394-a8e1-5042a944802f",
    "option_type": "polygon",
    "required": True,
    "options": [
      { "option_name": "#fe1236" }
    ],
    "question_metadata": []
  }
]


In [None]:
# create the annotation guideline template

template_id = client.create_annotation_guideline(
    client_id=client_id,
    questions=questions,
    template_name="video_sampling_template_1",
    data_type="image",
)


In [None]:
# create the image annotation project

response = client.create_project(
    project_name="Video_sampling_project_1",
    data_type="image",
    client_id=client_id,
    dataset_id=new_dataset_id,
    annotation_template_id=template_id,
    rotation_config={
        "annotation_rotation_count": 1,
        "review_rotation_count": 1,
        "client_review_rotation_count": 1,
    },
    created_by=email,  # creator email loaded from .env earlier
)


In [None]:
if response['response']['project_id']:
    print("Project created successfully!")
    print(f"Project ID: {response['response']['project_id']}")
    image_project_id = response['response']['project_id']

## 6. Performing Annotations of Image Project

In [None]:
# annotations of image project on labellerr platform

### Exporting the Annotation Data

In [None]:
# code to export the annotations from image project

export_config = {
        "export_name": "Weekly Export",
        "export_description": "Export of all accepted annotations",
        "export_format": "coco_json",
        "statuses": [
            "review",
            "r_assigned",
            "client_review",
            "cr_assigned",
            "accepted",
        ],
    }


response = client.create_local_export(
    project_id=image_project_id,
    client_id=client_id,
    export_config=export_config
)

## 7. Uploading annotations to Video Project

### Trigger SAM2 tracking on Video annotation project

Using the export, retrieve the prompts needed to run SAM2 tracking on the video.
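
As a starting point for retrieving prompts, the sketch below groups polygon annotations by image file name. It assumes the `coco_json` export follows the standard COCO layout (`images` with `id` and `file_name`, `annotations` with `image_id` and a `segmentation` list of flat polygons); the actual Labellerr export schema and the way prompts are passed to SAM2 are not confirmed here. `polygons_by_image` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List


def polygons_by_image(coco: dict) -> Dict[str, List[List[float]]]:
    """Map each image file name to the polygon segmentations annotated on it.

    Assumes the standard COCO layout: "images" entries carry "id" and
    "file_name"; "annotations" entries carry "image_id" and a "segmentation"
    list of flat [x1, y1, x2, y2, ...] polygons.
    """
    names = {img["id"]: img["file_name"] for img in coco.get("images", [])}
    result: Dict[str, List[List[float]]] = defaultdict(list)
    for ann in coco.get("annotations", []):
        name = names.get(ann.get("image_id"))
        segmentation = ann.get("segmentation")
        if name is None or not isinstance(segmentation, list):
            continue  # skip orphaned annotations and RLE-encoded masks
        result[name].extend(segmentation)
    return dict(result)


# Tiny in-memory example standing in for a loaded export file.
sample = {
    "images": [{"id": 1, "file_name": "frame_0001.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1,
         "segmentation": [[0, 0, 10, 0, 10, 10]]},
    ],
}
prompts = polygons_by_image(sample)
```

With a real export, the dictionary would come from `json.load` on the downloaded file, and each frame name would still need to be mapped back to its source video and timestamp.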

In [None]:
# code to create video annotation project from image annotations export