# ⚡ Accelerate Your Workflow with Pre-Annotations

> Upload existing labels and reduce annotation time by up to 70%

## 📚 What You'll Learn

- ✅ Upload pre-annotations from model predictions
- ✅ Use synchronous upload for small datasets
- ✅ Use asynchronous upload for large datasets
- ✅ Validate file name matching requirements
- ✅ Monitor upload status and handle errors
- ✅ Understand COCO JSON format requirements

---

## 🎯 Why Pre-Annotations Matter

### Benefits:
- ⚡ **Up to 70% Time Reduction**: Reviewers correct existing labels instead of creating them from scratch
- 🤖 **ML Integration**: Import model predictions directly
- 🔄 **Platform Migration**: Move annotations from other tools
- 📊 **Consistency**: Maintain annotation quality across projects

---

## 📋 Prerequisites

- **Google Colab** or Jupyter environment
- **Labellerr API credentials**
- **Existing project** with uploaded images
- **COCO JSON annotations** file

### 🔑 Colab Secrets Setup:
Add `LABELLERR_API_KEY`, `LABELLERR_API_SECRET`, and `LABELLERR_CLIENT_ID` as secrets in the Colab Secrets panel, and enable notebook access for each one.

---

## 🛠️ Installation

In [None]:
!pip install git+https://github.com/tensormatics/SDKPython.git

---

## 🔐 Authentication

In [None]:
from labellerr.client import LabellerrClient
from labellerr.exceptions import LabellerrError
import json

try:
    from google.colab import userdata
    api_key = userdata.get('LABELLERR_API_KEY')
    api_secret = userdata.get('LABELLERR_API_SECRET')
    client_id = userdata.get('LABELLERR_CLIENT_ID')
    print("✅ Credentials loaded from Colab Secrets")
except Exception:  # Not running in Colab, or a secret is missing: fall back to manual input
    api_key = input("API Key: ")
    api_secret = input("API Secret: ")
    client_id = input("Client ID: ")

client = LabellerrClient(api_key, api_secret)
print("✅ Client initialized!")

---

## ⚠️ CRITICAL: File Name Matching Requirement

> **File names in your annotations MUST exactly match the image file names in your project**

### Validation Checklist:
- ✅ File names match exactly (case-sensitive)
- ✅ File extensions match (`.jpg` vs `.jpeg` vs `.png`)
- ✅ No extra spaces or special characters

### Examples:

**✅ CORRECT:**
```text
{"file_name": "burger.jpeg"}  → Uploaded as: burger.jpeg
```

**❌ INCORRECT:**
```text
{"file_name": "burger.jpeg"}  → Uploaded as: burger.jpg  (extension mismatch)
{"file_name": "burger.jpeg"}  → Uploaded as: Burger.jpeg (case mismatch)
```
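
The checklist above can be verified locally before uploading. The following is a minimal sketch, not part of the Labellerr SDK: `find_name_mismatches` is a hypothetical helper, and `project_file_names` is assumed to be a list of image names you already have (for example, from the folder you uploaded). It reports every COCO `file_name` with no exact match and suggests project files that differ only by case, surrounding spaces, or extension.

```python
def find_name_mismatches(coco, project_file_names):
    """Map each unmatched COCO file_name to a list of likely intended project files."""
    project_names = set(project_file_names)

    # Index project files by lowercase, whitespace-trimmed stem (name without extension)
    by_stem = {}
    for name in project_names:
        stem = name.rsplit(".", 1)[0].strip().lower()
        by_stem.setdefault(stem, []).append(name)

    mismatches = {}
    for image in coco.get("images", []):
        file_name = image["file_name"]
        if file_name in project_names:
            continue  # exact, case-sensitive match
        stem = file_name.rsplit(".", 1)[0].strip().lower()
        mismatches[file_name] = sorted(by_stem.get(stem, []))
    return mismatches


# Hypothetical data mirroring the examples above
coco = {"images": [{"file_name": "burger.jpeg"}, {"file_name": "soda.jpg"}]}
print(find_name_mismatches(coco, ["burger.jpg", "soda.jpg"]))
```

An empty result means every annotated file name has an exact counterpart in the list you supplied.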

---

## 📁 Prepare Your Data

### COCO JSON Format Overview

Your annotation file should contain:
- `images`: List of images with file names and IDs
- `annotations`: List of annotations with bounding boxes/polygons
- `categories`: List of category definitions

### Sample COCO JSON Structure:

In [None]:
# Example COCO JSON structure
sample_coco = {
    "images": [
        {
            "id": 1,
            "file_name": "image001.jpg",
            "width": 1920,
            "height": 1080
        }
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [100, 100, 200, 150],  # [x, y, width, height]
            "area": 30000,
            "iscrowd": 0
        }
    ],
    "categories": [
        {
            "id": 1,
            "name": "car",
            "supercategory": "vehicle"
        }
    ]
}

print("📋 Sample COCO JSON structure:")
print(json.dumps(sample_coco, indent=2))

---

## 🔄 Method 1: Synchronous Upload

### When to use:
- Small annotation files (< 5 MB)
- Sequential workflows
- Simple batch scripts

### Characteristics:
- Blocks execution until complete
- Simple error handling
- Direct result return

In [None]:
# Configuration
project_id = 'your_project_id_here'  # Replace with your project ID
annotation_format = 'coco_json'  # COCO JSON is the format used throughout this guide
annotation_file = '/content/annotations.json'  # Path to your COCO JSON file

# Upload pre-annotations synchronously
try:
    print("📤 Uploading pre-annotations (synchronous)...")
    print(f"   Project: {project_id}")
    print(f"   Format: {annotation_format}")
    print(f"   File: {annotation_file}")
    print("\n⏳ Please wait...\n")
    
    result = client.upload_preannotation_by_project_id(
        project_id=project_id,
        client_id=client_id,
        annotation_format=annotation_format,
        annotation_file=annotation_file
    )
    
    # Check status
    if result['response']['status'] == 'completed':
        print("✅ Pre-annotations uploaded successfully!")
        print("\n📊 Upload Details:")
        
        # Extract metadata
        metadata = result['response'].get('metadata', {})
        print(f"   Status: {result['response']['status']}")
        print(f"   Activity ID: {result['response'].get('activity_id', 'N/A')}")
        
        if metadata:
            print("\n   Metadata:")
            for key, value in metadata.items():
                print(f"      {key}: {value}")
    else:
        print(f"⚠️ Upload status: {result['response']['status']}")
        
except LabellerrError as e:
    print(f"❌ Upload failed: {str(e)}")
except FileNotFoundError:
    print(f"❌ Annotation file not found: {annotation_file}")
except Exception as e:
    print(f"❌ Unexpected error: {str(e)}")

---

## 🚀 Method 2: Asynchronous Upload

### When to use:
- Large annotation files (> 5 MB)
- Production applications
- Non-blocking workflows

### Characteristics:
- Returns immediately with Future object
- Enables concurrent operations
- Configurable timeout handling

In [None]:
# Configuration
project_id = 'your_project_id_here'  # Replace with your project ID
annotation_format = 'coco_json'
annotation_file = '/content/annotations.json'

# Upload pre-annotations asynchronously
try:
    print("📤 Initiating async upload...")
    print(f"   Project: {project_id}")
    print(f"   Format: {annotation_format}")
    
    # Initiate async upload
    future = client.upload_preannotation_by_project_id_async(
        project_id=project_id,
        client_id=client_id,
        annotation_format=annotation_format,
        annotation_file=annotation_file
    )
    
    print("\n✅ Upload initiated successfully!")
    print("⏳ Processing in background...")
    print("\nYou can continue with other tasks while this processes.")
    
    # Wait for result with timeout
    try:
        print("\n⏱️ Waiting for completion (timeout: 5 minutes)...")
        result = future.result(timeout=300)  # 5 minute timeout
        
        if result['response']['status'] == 'completed':
            print("\n✅ Pre-annotations processed successfully!")
            
            metadata = result['response'].get('metadata', {})
            print("\n📊 Upload Details:")
            print(f"   Activity ID: {result['response'].get('activity_id')}")
            print(f"   Status: {result['response']['status']}")
            
            if metadata:
                print("\n   Metadata:")
                for key, value in metadata.items():
                    print(f"      {key}: {value}")
        else:
            print(f"⚠️ Upload status: {result['response']['status']}")
            
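    except TimeoutError:  # On Python < 3.11, concurrent.futures.TimeoutError is a separate class; catch that instead if needed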
        print("\n⏰ Upload is taking longer than expected")
        print("   The upload is still processing in the background.")
        print("   Check your project dashboard for status updates.")
    except Exception as e:
        print(f"\n❌ Processing error: {str(e)}")
        
except LabellerrError as e:
    print(f"❌ Upload initialization failed: {str(e)}")
except Exception as e:
    print(f"❌ Unexpected error: {str(e)}")
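
The timeout behaviour described above can be tried without touching the API. This is a minimal sketch that assumes the SDK's async method returns a standard `concurrent.futures.Future` (the guide only says "Future object"); `slow_upload` is a hypothetical stand-in that sleeps instead of uploading. It shows that a timeout in `result()` only stops the waiting, not the background task.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError


def slow_upload():
    """Stand-in for a long-running upload (hypothetical, no network calls)."""
    time.sleep(0.3)
    return {"response": {"status": "completed"}}


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_upload)
    try:
        # Give up waiting after 50 ms; this does not cancel the task itself
        future.result(timeout=0.05)
        timed_out = False
    except FutureTimeoutError:
        timed_out = True

    # A later call with no timeout blocks until the task finishes
    final = future.result()

print(timed_out, final["response"]["status"])
```

Importing `TimeoutError` from `concurrent.futures` works on every supported Python version, which avoids the mismatch with the built-in `TimeoutError` on versions before 3.11.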

---

## 📊 Understanding the Response

### Status Values:

| Status | Meaning | Action |
|--------|---------|--------|
| `completed` | Annotations successfully applied | Ready to review |
| `processing` | Upload in progress | Wait or check later |
| `failed` | Processing encountered errors | Review error details |

### Response Structure:

In [None]:
# Example response structure
example_response = {
    "response": {
        "status": "completed",
        "activity_id": "9c696f10-7e76-40e0-9852-34b99be2db52",
        "metadata": {
            "total_annotations": 42,
            "matched_files": 10,
            "unmatched_files": 0,
            "processing_time_seconds": 15.3
        }
    }
}

print("📋 Example Response Structure:")
print(json.dumps(example_response, indent=2))
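
To act on a response programmatically, the status table and example structure above can be folded into one small helper. This is a sketch only: `summarize_upload_response` is a hypothetical function, and the metadata keys are taken from the example response, so a real response may carry different ones.

```python
# Suggested actions copied from the status table above
STATUS_ACTIONS = {
    "completed": "Ready to review",
    "processing": "Wait or check later",
    "failed": "Review error details",
}


def summarize_upload_response(result):
    """Extract the fields most useful for follow-up, tolerating missing keys."""
    response = result.get("response", {})
    metadata = response.get("metadata") or {}
    status = response.get("status", "unknown")
    return {
        "status": status,
        "activity_id": response.get("activity_id"),
        "unmatched_files": metadata.get("unmatched_files"),
        "action": STATUS_ACTIONS.get(status, "Unrecognized status; inspect the raw response"),
    }


# Hypothetical response shaped like the example above
print(summarize_upload_response(
    {"response": {"status": "processing", "activity_id": "example-id"}}
))
```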

---

## 🎯 Complete Workflow Example

Let's create a complete pre-annotation workflow with validation:

In [None]:
import os

def validate_and_upload_preannotations(project_id, annotation_file, method='sync'):
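    """
    Validate a COCO JSON file locally, then upload it as pre-annotations.

    method: 'sync' blocks until the upload finishes; any other value uses the async call.
    Returns the SDK response dict, or None if validation or the upload fails.
    """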
    print("="*70)
    print("PRE-ANNOTATION UPLOAD WORKFLOW")
    print("="*70)
    
    # Step 1: Validate file exists
    print("\n📋 Step 1: Validating annotation file...")
    if not os.path.exists(annotation_file):
        print(f"❌ File not found: {annotation_file}")
        return None
    
    file_size = os.path.getsize(annotation_file) / (1024 * 1024)  # MB
    print(f"✅ File found: {annotation_file}")
    print(f"   Size: {file_size:.2f} MB")
    
    # Step 2: Validate JSON structure
    print("\n📋 Step 2: Validating JSON structure...")
    try:
        with open(annotation_file, 'r') as f:
            annotations = json.load(f)
        
        # Check required fields
        if 'images' in annotations and 'annotations' in annotations:
            print("✅ Valid COCO JSON format")
            print(f"   Images: {len(annotations['images'])}")
            print(f"   Annotations: {len(annotations['annotations'])}")
            print(f"   Categories: {len(annotations.get('categories', []))}")
        else:
            print("⚠️ Missing required fields (images, annotations)")
            return None
    except json.JSONDecodeError as e:
        print(f"❌ Invalid JSON: {str(e)}")
        return None
    
    # Step 3: Upload
    print(f"\n📋 Step 3: Uploading annotations ({method})...")
    try:
        if method == 'sync':
            result = client.upload_preannotation_by_project_id(
                project_id=project_id,
                client_id=client_id,
                annotation_format='coco_json',
                annotation_file=annotation_file
            )
        else:  # async
            future = client.upload_preannotation_by_project_id_async(
                project_id=project_id,
                client_id=client_id,
                annotation_format='coco_json',
                annotation_file=annotation_file
            )
            result = future.result(timeout=300)
        
        # Step 4: Verify results
        print("\n📋 Step 4: Verifying upload...")
        if result['response']['status'] == 'completed':
            print("\n" + "="*70)
            print("✅ UPLOAD SUCCESSFUL!")
            print("="*70)
            
            metadata = result['response'].get('metadata', {})
            print("\n📊 Summary:")
            print(f"   Activity ID: {result['response'].get('activity_id')}")
            if metadata:
                for key, value in metadata.items():
                    print(f"   {key}: {value}")
            
            return result
        else:
            print(f"⚠️ Upload status: {result['response']['status']}")
            return result
            
    except Exception as e:
        print(f"❌ Upload error: {str(e)}")
        return None

# Example usage
# result = validate_and_upload_preannotations(
#     project_id='your_project_id',
#     annotation_file='/content/annotations.json',
#     method='sync'  # or 'async'
# )

print("✅ Workflow function defined!")
print("\nUsage:")
print("  result = validate_and_upload_preannotations(")
print("      project_id='your_project_id',")
print("      annotation_file='/path/to/annotations.json',")
print("      method='sync'  # or 'async'")
print("  )")

---

## 📊 Comparison: Sync vs Async

| Feature | Synchronous | Asynchronous |
|---------|-------------|---------------|
| **Best For** | Small files (< 5 MB) | Large files (> 5 MB) |
| **Execution** | Blocks until complete | Returns immediately |
| **Complexity** | Simple | Moderate |
| **Use Cases** | Batch scripts, testing | Production apps, large datasets |
| **Timeout Handling** | Automatic | Configurable |
| **Concurrent Tasks** | No | Yes |
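
The size rule in the table can be applied automatically. The sketch below is an assumption-laden convenience, not SDK behaviour: `choose_upload_method` is a hypothetical helper, and the 5 MB cut-off is simply the figure quoted in this guide. Its return value matches the `method` argument of `validate_and_upload_preannotations`.

```python
import os

SYNC_LIMIT_MB = 5  # threshold quoted in the comparison table


def choose_upload_method(annotation_file, limit_mb=SYNC_LIMIT_MB):
    """Return 'sync' for files under the limit, 'async' otherwise."""
    size_mb = os.path.getsize(annotation_file) / (1024 * 1024)
    return "sync" if size_mb < limit_mb else "async"
```

A file of exactly 5 MB is routed to the async path here, since the table leaves that boundary unspecified.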

---

## 💼 Common Use Cases

### 1. Model Prediction Integration
Import predictions from your ML models for human validation:
```python
# Run model predictions
predictions = model.predict(images)
# Convert to COCO JSON
coco_annotations = convert_predictions_to_coco(predictions)
# Upload to Labellerr
result = client.upload_preannotation_by_project_id(...)
```
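
The `convert_predictions_to_coco` call above is left undefined, so here is one possible sketch. The input layout is an assumption made for illustration: each prediction is a dict with `file_name`, `width`, `height`, and a `detections` list whose boxes are `[x_min, y_min, x_max, y_max]` in pixels. Adapt the field names to whatever your model actually returns.

```python
def convert_predictions_to_coco(predictions):
    """Build a COCO-style dict from predictions in the assumed layout described above."""
    # Assign category IDs (starting at 1) in alphabetical label order
    labels = sorted({det["label"] for pred in predictions for det in pred["detections"]})
    categories = [{"id": i, "name": label} for i, label in enumerate(labels, start=1)]
    category_ids = {cat["name"]: cat["id"] for cat in categories}

    images, annotations = [], []
    for image_id, pred in enumerate(predictions, start=1):
        images.append({
            "id": image_id,
            "file_name": pred["file_name"],  # must match the name in the project exactly
            "width": pred["width"],
            "height": pred["height"],
        })
        for det in pred["detections"]:
            x_min, y_min, x_max, y_max = det["box"]
            width, height = x_max - x_min, y_max - y_min
            annotations.append({
                "id": len(annotations) + 1,
                "image_id": image_id,
                "category_id": category_ids[det["label"]],
                "bbox": [x_min, y_min, width, height],  # COCO uses [x, y, width, height]
                "area": width * height,
                "iscrowd": 0,
            })
    return {"images": images, "annotations": annotations, "categories": categories}
```

Writing the result with `json.dump` produces a file that can be passed as `annotation_file`.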

### 2. Platform Migration
Transfer annotations from other annotation tools:
- Export annotations from old platform
- Convert to COCO JSON format
- Upload to Labellerr project
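
The conversion step depends on the source tool's export format. As one illustration, assuming the old platform exports YOLO-style label lines (`class cx cy w h`, with the box center and size normalized to the image dimensions), a sketch of the box conversion could look like this. `yolo_line_to_coco_bbox` is a hypothetical helper, and YOLO is used here only as an example format.

```python
def yolo_line_to_coco_bbox(line, image_width, image_height):
    """Convert one YOLO label line to (class_index, [x, y, width, height]) in pixels."""
    class_index, cx, cy, w, h = line.split()
    box_width = float(w) * image_width
    box_height = float(h) * image_height
    # YOLO stores the box center; COCO stores the top-left corner
    x = float(cx) * image_width - box_width / 2
    y = float(cy) * image_height - box_height / 2
    return int(class_index), [x, y, box_width, box_height]


print(yolo_line_to_coco_bbox("0 0.5 0.5 0.5 0.5", 200, 100))
```

YOLO class indices start at 0, so they still need to be mapped onto the `id` values in your `categories` list.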

### 3. Project Replication
Reuse annotations from completed projects:
- Export annotations from source project
- Upload to new project with similar structure

---

## 🔧 Troubleshooting Guide

### Issue 1: Annotations Not Applied to Images
**Symptoms:** Upload succeeds but annotations don't appear

**Solutions:**
- ✅ Verify exact file name matching (case-sensitive)
- ✅ Check file extension consistency (`.jpg` vs `.jpeg`)
- ✅ Ensure images were uploaded before pre-annotations
- ✅ Validate COCO JSON format

### Issue 2: Upload Timeout
**Symptoms:** Operation times out or fails

**Solutions:**
- ✅ Use asynchronous method for large files
- ✅ Split large files into smaller batches (< 25 MB each)
- ✅ Increase timeout parameter: `future.result(timeout=600)`
- ✅ Check network stability
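
Splitting a large file can be done by grouping images and carrying each image's annotations along with it. The following is a sketch with a hypothetical helper, `split_coco`; it batches by image count rather than by megabytes, so `images_per_batch` needs tuning until each output file is under the size you are targeting. It also assumes that uploading several files to the same project is acceptable, which this guide does not confirm.

```python
def split_coco(coco, images_per_batch):
    """Split a COCO dict into smaller COCO dicts of at most images_per_batch images."""
    if images_per_batch < 1:
        raise ValueError("images_per_batch must be at least 1")

    # Group annotations by the image they belong to
    annotations_by_image = {}
    for ann in coco.get("annotations", []):
        annotations_by_image.setdefault(ann["image_id"], []).append(ann)

    images = coco.get("images", [])
    batches = []
    for start in range(0, len(images), images_per_batch):
        chunk = images[start:start + images_per_batch]
        batches.append({
            "images": chunk,
            "annotations": [
                ann for img in chunk for ann in annotations_by_image.get(img["id"], [])
            ],
            "categories": coco.get("categories", []),  # every batch keeps the full list
        })
    return batches
```

Each batch can then be written to its own JSON file and uploaded separately.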

### Issue 3: Format Validation Error
**Symptoms:** API returns format errors

**Solutions:**
- ✅ Validate JSON structure against COCO specification
- ✅ Ensure all required fields are present
- ✅ Check annotation coordinates are within image bounds
- ✅ Verify category IDs match definitions
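
Several of these checks can run locally before the API ever sees the file. The sketch below uses a hypothetical helper, `find_coco_problems`, and covers only bounding boxes (not polygons). It flags annotations that reference an unknown image or category, or whose `bbox` falls outside the image.

```python
def find_coco_problems(coco):
    """Return a list of human-readable problems found in a COCO dict."""
    problems = []
    images = {img["id"]: img for img in coco.get("images", [])}
    category_ids = {cat["id"] for cat in coco.get("categories", [])}

    for ann in coco.get("annotations", []):
        label = f"annotation {ann.get('id')}"
        image = images.get(ann.get("image_id"))
        if image is None:
            problems.append(f"{label}: unknown image_id {ann.get('image_id')}")
        if ann.get("category_id") not in category_ids:
            problems.append(f"{label}: unknown category_id {ann.get('category_id')}")

        bbox = ann.get("bbox")
        if image is not None and bbox is not None:
            x, y, w, h = bbox  # COCO order: [x, y, width, height]
            inside = (
                x >= 0 and y >= 0 and w > 0 and h > 0
                and x + w <= image["width"] and y + h <= image["height"]
            )
            if not inside:
                problems.append(f"{label}: bbox {bbox} is outside the image bounds")
    return problems
```

An empty list means none of these particular checks failed; it does not guarantee the file passes every server-side validation.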

### Issue 4: Authentication Errors
**Symptoms:** Upload fails with auth errors

**Solutions:**
- ✅ Verify API credentials in Colab Secrets
- ✅ Check project ID is correct
- ✅ Ensure you have write permissions for the project
- ✅ Confirm client ID matches your organization

---

## 💡 Best Practices

### Pre-Upload:
1. **Validate locally** - Test JSON format before uploading
2. **Start small** - Test with 5-10 images first
3. **Match names** - Double-check file name consistency
4. **Document mapping** - Keep a record of annotation IDs

### During Upload:
1. **Choose the right method** - Sync for small files, async for large ones
2. **Monitor progress** - Track activity IDs
3. **Handle errors** - Implement robust error handling
4. **Set timeouts** - Configure appropriate timeout values

### Post-Upload:
1. **Verify results** - Check annotations in UI
2. **Review metadata** - Analyze upload statistics
3. **Track unmatched** - Investigate unmatched files
4. **Update logs** - Maintain upload records
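
One lightweight way to maintain upload records is an append-only JSON Lines file. This is a sketch under assumptions: `log_upload` is a hypothetical helper, the log path is up to you, and the response fields follow the example structure shown earlier in this guide.

```python
import json
from datetime import datetime, timezone


def log_upload(log_path, project_id, annotation_file, result):
    """Append one upload record to a JSON Lines log and return the record."""
    response = result.get("response", {})
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "project_id": project_id,
        "annotation_file": annotation_file,
        "status": response.get("status"),
        "activity_id": response.get("activity_id"),
        "metadata": response.get("metadata", {}),
    }
    # One JSON object per line keeps the log easy to append to and to parse
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record) + "\n")
    return record
```

Because each line is a standalone JSON object, the log can later be filtered for activity IDs or unmatched files with a few lines of code.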

---

## 🎯 Next Steps

Congratulations! You're now a pre-annotation expert! 🎉

### Recommended Next Steps:

1. **Magic Wand Confidence Filtering** - Filter annotations by confidence
   - 📓 [05_magic_wand_confidence_filtering.ipynb](./05_magic_wand_confidence_filtering.ipynb)

2. **Create Projects** - Set up new annotation projects
   - 📓 [02_create_project.ipynb](./02_create_project.ipynb)

3. **Annotation Questions** - Design custom annotation schemas
   - 📓 [03_annotation_questions.ipynb](./03_annotation_questions.ipynb)

### Additional Resources:

- 📖 [COCO Format Specification](http://cocodataset.org/#format-data)
- 📖 [Labellerr Documentation](https://docs.labellerr.com)
- 🌐 [SDK GitHub](https://github.com/tensormatics/SDKPython)
- 📧 **Support**: support@tensormatics.com

---

### 💡 Pro Tips:

- Save activity IDs for tracking
- Use version control for annotation files
- Implement retry logic for production systems
- Monitor upload metrics over time
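
Retry logic can be as simple as a wrapper with exponential backoff. The sketch below is generic Python rather than an SDK feature: `with_retries` is a hypothetical helper, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import time


def with_retries(operation, attempts=3, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call operation(); on a listed exception, wait and retry with doubled delays."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # 1x, 2x, 4x, ... the base delay
```

For example, `with_retries(lambda: client.upload_preannotation_by_project_id(...), retry_on=(LabellerrError,))` would retry only SDK errors. Whether re-sending the same file is safe on the server side is not covered by this guide, so confirm that before enabling retries in production.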

---

**Happy Annotating! 🚀**