# 📁 Datamint Data Upload Tutorial

This tutorial walks through uploading data to Datamint with the Python API: single files, mixed-type batches, segmentations, and JSON metadata, followed by project management and downloading what you uploaded.

## 📋 Table of Contents

| Section | Description | Key Features |
|---------|-------------|--------------|
| **[Setup & Connection](#setup--connection)** | Initialize API connection | Authentication, configuration |
| **[1. Single Resource Upload](#1-single-resource-upload)** | Basic file upload with metadata | Tags, channels, anonymization |
| **[2. Batch Upload with Different File Types](#2-batch-upload-with-different-file-types)** | Multiple files in one operation | Mixed formats, error handling |
| **[Data Organization Guide](#data-organization)** | Best practices for organizing data | Tags, channels, projects |
| **[3. Upload Segmentation](#3-upload-segmentation)** | Adding segmentation masks | Multi-class support, resource linking |
| **[4. Upload Resource with Segmentations](#4-upload-resource-with-segmentations-in-one-request)** | Combined upload workflow | Efficiency, automatic linking |
| **[5. Upload with JSON Metadata](#5-upload-with-json-metadata)** | Structured metadata inclusion | Custom fields, DICOM metadata |
| **[6. Project Management](#6-project-management)** | Team collaboration features | Project creation, resource organization |
| **[7. Downloading and Accessing Data](#7-downloading-and-accessing-data)** | Retrieve uploaded resources | Format conversion, annotations |
| **[Next Steps](#next-steps)** | Additional resources and tutorials | Documentation, advanced features |

---

# Setup & Connection

Initialize the Datamint API connection. Make sure you've run `datamint-config` in your terminal first.

In [None]:
from datamint import APIHandler
import json
from pathlib import Path

# Create a connection to the server.
# Run `datamint-config` in a terminal first if you haven't already,
# or pass your key through the `api_key` parameter of APIHandler.
api = APIHandler()
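
If you would rather not depend on the saved configuration, the key can be read from the environment and handed to the `api_key` parameter mentioned above. The following is a minimal sketch: the helper `resolve_api_key` is hypothetical, and the variable name `DATAMINT_API_KEY` is an assumption, so check the Datamint documentation for the name your installation expects.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(env: Mapping[str, str] = os.environ,
                    var_name: str = "DATAMINT_API_KEY") -> Optional[str]:
    """Return a stripped API key from `env`, or None if it is missing or blank.

    `var_name` is an assumed variable name, used here only for illustration.
    """
    key = env.get(var_name, "").strip()
    return key or None


api_key = resolve_api_key()
if api_key is None:
    print("No key in the environment; relying on the `datamint-config` setup.")
else:
    print("Found a key in the environment; pass it as APIHandler(api_key=api_key).")
```

Keeping the lookup in a small function makes it easy to test without touching real credentials.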

# 1. Single Resource Upload

Upload a single file with basic metadata and organization options.

In [None]:
# Single file upload with comprehensive options
dicom_file = '../data/Case14.dcm'
new_resource_id = api.upload_resource(
    dicom_file,
    channel='tutorial_channel',  # arbitrary channel name for organization
    tags=['tutorial', 'case14'],  # tags for easy searching later
    publish=False,  # set to True to bypass inbox and directly publish
    anonymize=True,  # anonymize DICOM data (default for DICOM files)
)

print(f"Uploaded resource ID: {new_resource_id}")

In [None]:
# Get all the resources with specific tags
all_resources = list(api.get_resources(
    status='inbox',
    tags=['tutorial']
))

print(f"Found {len(all_resources)} resources with 'tutorial' tag")

# 2. Batch Upload with Different File Types

Upload multiple files of different types in a single operation.

## Data Organization

**Channels and Tags** are powerful tools for organizing your data:

### 🏷️ **Tags**
- **Purpose**: Searchable labels for filtering and categorization
- **Examples**: `['mri', 'brain', 'segmented']`, `['study_2024', 'patient_cohort_a']`
- **Best Practice**: Use consistent naming conventions

### 📁 **Channels** 
- **Purpose**: Logical groupings for related resources
- **Examples**: `'cardiac_study'`, `'preprocessing_pipeline'`, `'quality_control'`
- **Best Practice**: One channel per study or workflow

### 🎯 **Projects**
- **Purpose**: Collaborative workspaces with access control
- **Features**: Resource collections, annotation workflows, team management
- **Best Practice**: Create projects for specific research objectives
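
One way to keep tag names consistent is to normalize them before every upload. The sketch below is only an illustration: `normalize_tags` is a hypothetical helper, and the lowercase-with-underscores convention is an assumption, not something Datamint requires.

```python
import re
from typing import Iterable, List


def normalize_tags(tags: Iterable[str]) -> List[str]:
    """Lowercase tags, collapse non-alphanumeric runs to '_', and drop
    blanks and duplicates while preserving the original order."""
    seen = set()
    result = []
    for tag in tags:
        cleaned = re.sub(r"[^a-z0-9]+", "_", tag.strip().lower()).strip("_")
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


print(normalize_tags(["MRI", "Study 2024", "mri ", "patient-cohort-A"]))
# ['mri', 'study_2024', 'patient_cohort_a']
```

The returned list can be passed as the `tags` argument in the upload calls shown in this tutorial.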

In [None]:
# Upload multiple files at once
files_to_upload = [
    '../data/Case14.dcm',
    '../data/sample_image.png',  # Replace with actual image file
    '../data/sample_video.mp4'   # Replace with actual video file
]

resource_ids = api.upload_resources(
    files_to_upload,
    channel='batch_upload_demo',
    tags=['batch', 'mixed_types'],
    on_error='skip',  # Skip files that fail to upload
    mung_filename='all'  # Include full path in filename
)

n_uploaded = sum(not isinstance(r, Exception) for r in resource_ids)
print(f"Uploaded {n_uploaded} files successfully")
for file, result in zip(files_to_upload, resource_ids):
    if isinstance(result, Exception):
        print(f"Failed to upload {file}: {result}")
    else:
        print(f"✓ {Path(file).name} -> {result}")
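
When a batch is large, it can help to separate successes from failures so the failed files can be retried later. The sketch below assumes, as the loop above does, that the result list holds either a resource ID or an `Exception` at each position. `split_upload_results` is a hypothetical helper, and the demo values are stand-ins rather than real API output.

```python
from typing import Dict, Sequence, Tuple, Union


def split_upload_results(
    files: Sequence[str],
    results: Sequence[Union[str, Exception]],
) -> Tuple[Dict[str, str], Dict[str, Exception]]:
    """Pair each file with its result and separate successes from failures."""
    if len(files) != len(results):
        raise ValueError("files and results must have the same length")
    succeeded: Dict[str, str] = {}
    failed: Dict[str, Exception] = {}
    for file, result in zip(files, results):
        if isinstance(result, Exception):
            failed[file] = result
        else:
            succeeded[file] = result
    return succeeded, failed


# Stand-in values shaped like the list handled in the cell above.
demo_files = ["a.dcm", "b.png", "c.mp4"]
demo_results = ["id-001", FileNotFoundError("b.png not found"), "id-003"]

succeeded, failed = split_upload_results(demo_files, demo_results)
print(f"{len(succeeded)} uploaded, {len(failed)} to retry: {list(failed)}")
# 2 uploaded, 1 to retry: ['b.png']
```

The keys of `failed` can be fed back into another upload call once the underlying problem is fixed.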

# 3. Upload Segmentation

**Objective**: Add segmentation masks to existing resources for machine learning and analysis.

## Segmentation Features:
- 🎨 **Multi-class Support**: Handle multiple anatomical regions
- 🔗 **Resource Linking**: Associate segmentations with source images  
- 📏 **Format Flexibility**: Support for NIfTI, PNG, and numpy arrays
- 🏷️ **Class Naming**: Map pixel values to meaningful labels

In [None]:
# Upload segmentation with comprehensive class mapping
seg_file = '../data/Case14_Bones.nii.gz'
resource_id = new_resource_id

# Define pixel value to anatomical region mapping
class_names = {
    1: "Femur",      # Pixel value 1 represents femur
    2: "Tibia"       # Pixel value 2 represents tibia
}

segmentation_ids = api.upload_segmentations(
    resource_id=resource_id,
    file_path=seg_file,
    name=class_names,
    imported_from='manual_annotation'  # Track the source of annotations
)
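
Before uploading, it may be worth checking that every label value in the mask has an entry in `class_names`. The sketch below is a plain-Python illustration: `unmapped_labels` is a hypothetical helper, treating `0` as background is an assumption, and the value list stands in for the voxel values of a real mask (with NumPy you could pass the result of `np.unique(mask)` instead).

```python
from typing import Dict, Iterable, Set


def unmapped_labels(pixel_values: Iterable[int],
                    class_names: Dict[int, str],
                    background: int = 0) -> Set[int]:
    """Return label values present in the mask that have no class name."""
    present = {int(v) for v in pixel_values}
    return present - set(class_names) - {background}


class_names = {1: "Femur", 2: "Tibia"}

# A tiny stand-in for the flattened voxel values of a segmentation mask.
mask_values = [0, 0, 1, 1, 2, 3, 3]

missing = unmapped_labels(mask_values, class_names)
if missing:
    print(f"Labels without a class name: {sorted(missing)}")
# Labels without a class name: [3]
```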

# 4. Upload Resource with Segmentations in One Request

**Objective**: Optimize workflow by uploading resources and segmentations simultaneously.

## Advantages of Combined Upload:
- ⚡ **Efficiency**: Single API call for related data
- 🔗 **Automatic Linking**: Resources and segmentations are pre-associated
- 🛡️ **Atomicity**: Either both succeed or both fail
- 📊 **Progress Tracking**: Unified upload monitoring

In [None]:
dicom_file = '../data/Case14.dcm'
class_names = {
    1: "Femur",
    2: "Tibia"
}

# These are the segmentation files that will be uploaded to the server
segfiles = {
    'files': ['../data/Case14_Bones.nii.gz'], # same number of frames as the image file
    'names': class_names  # mapping pixel values to class names
}

new_resource_id = api.upload_resource(
    dicom_file,
    segmentation_files=segfiles,
    channel='with_segmentation',
    tags=['tutorial', 'with_seg'],
    publish=False
)

print(f"Uploaded resource with segmentation: {new_resource_id}")
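
The segmentation dictionary is easy to get subtly wrong, for example with a string key or an empty class name. The sketch below builds the same `'files'` and `'names'` structure used above after a few sanity checks. `build_segmentation_spec` is a hypothetical helper, and the rule that pixel values are positive integers is an assumption made for this example.

```python
from typing import Dict, List


def build_segmentation_spec(files: List[str], names: Dict[int, str]) -> dict:
    """Validate the inputs and return a dict with 'files' and 'names' keys."""
    if not files:
        raise ValueError("at least one segmentation file is required")
    for value, label in names.items():
        # Assumption: 0 is reserved for background, so class values start at 1.
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"pixel value must be a positive integer, got {value!r}")
        if not isinstance(label, str) or not label.strip():
            raise ValueError(f"class name for pixel value {value} must be a non-empty string")
    return {"files": list(files), "names": dict(names)}


segfiles = build_segmentation_spec(
    ["../data/Case14_Bones.nii.gz"],
    {1: "Femur", 2: "Tibia"},
)
print(segfiles["names"])
# {1: 'Femur', 2: 'Tibia'}
```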

# 5. Upload with JSON Metadata

Include structured metadata with your uploads. This is particularly useful for research data with custom fields.

* **Supported Metadata Fields:** Common DICOM and research fields are automatically recognized and indexed.

In [None]:
nifti_file = '../data/my_niftidata.nii.gz'

# Create and upload resources with structured JSON metadata
metadata_example = {
    # Core identifiers
    "SeriesInstanceUID": "1.2.3.4.5.6.7.8.9.10.TUTORIAL.001",
    "StudyInstanceUID": "1.2.3.4.5.6.7.8.9.STUDY.001",

    # Clinical information
    "patient_age": 45,
    "acquisition_date": "2024-01-15",
    "scanner_model": "Example Scanner 3T",
    "modality": "CT"
}
# Upload with metadata
resource_with_metadata = api.upload_resource(
    nifti_file,
    channel='with_metadata',
    tags=['tutorial', 'metadata_example'],
    metadata=metadata_example  # metadata dictionary defined above
)

print(f"Uploaded resource with metadata: {resource_with_metadata}")

# Verify the metadata was included
resource_info = api.get_resources_by_ids(resource_with_metadata)
print("Resource modality:", resource_info.get('modality', 'Not specified'))
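
Metadata often lives in a JSON file next to the data. The sketch below round-trips a metadata dictionary through a file with the standard `json` module, which also confirms that every value is JSON-serializable before any upload is attempted. The file name is hypothetical, and the dictionary is a shortened copy of the example above.

```python
import json
import tempfile
from pathlib import Path

metadata_example = {
    "SeriesInstanceUID": "1.2.3.4.5.6.7.8.9.10.TUTORIAL.001",
    "patient_age": 45,
    "acquisition_date": "2024-01-15",
    "modality": "CT",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    metadata_path = Path(tmp_dir) / "my_niftidata.json"  # hypothetical file name

    # json.dumps raises TypeError if a value is not JSON-compatible.
    metadata_path.write_text(json.dumps(metadata_example, indent=2), encoding="utf-8")

    # Reading the file back yields a plain dict again.
    loaded_metadata = json.loads(metadata_path.read_text(encoding="utf-8"))

print(loaded_metadata == metadata_example)  # True
```

The loaded dictionary has the same shape as `metadata_example`, so it can be used wherever that dictionary is used in this section.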


# 6. Project Management

**Objective**: Organize resources into collaborative projects for team-based workflows.

## Project Features:
- 👥 **Team Collaboration**: Shared access to resources and annotation work
- 📁 **Resource Organization**: Logical grouping of related data
- 🔄 **Workflow Management**: Annotation tasks and review processes
- 📊 **Progress Tracking**: Monitor project completion status

In [None]:
# Get some resources to add to a project
tutorial_resources = list(api.get_resources(
    tags=['tutorial'],
    status='inbox'
))

if tutorial_resources:
    resource_ids_for_project = [r['id'] for r in tutorial_resources[:3]]  # Take first 3 resources

    # Create a new project
    try:
        project = api.create_project(
            name="Tutorial Project",
            description="A project created for demonstration purposes",
            resources_ids=resource_ids_for_project
        )

        print(f"Created project: {project['name']} (ID: {project['id']})")

        # List all projects
        all_projects = api.get_projects()
        print(f"\nAll projects ({len(all_projects)}):")
        for proj in all_projects:
            print(f"  - {proj['name']} (ID: {proj['id']})")

    except Exception as e:
        print(f"Error creating project (may already exist): {e}")

        # Try to find existing project
        existing_projects = [p for p in api.get_projects() if p['name'] == "Tutorial Project"]
        if existing_projects:
            print(f"Found existing project: {existing_projects[0]['name']}")
else:
    print("No tutorial resources found to add to project")
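
An alternative to catching the creation error is to look for the project first and create it only when it is missing. The sketch below shows the lookup on stand-in records with the `'id'` and `'name'` keys used in the cell above. `find_project_by_name` is a hypothetical helper, and the demo list is not real API output.

```python
from typing import Iterable, Mapping, Optional


def find_project_by_name(projects: Iterable[Mapping], name: str) -> Optional[Mapping]:
    """Return the first project whose 'name' matches exactly, or None."""
    return next((p for p in projects if p.get("name") == name), None)


# Stand-in records; in the tutorial these would come from api.get_projects().
demo_projects = [
    {"id": "p-1", "name": "Cardiac Study"},
    {"id": "p-2", "name": "Tutorial Project"},
]

existing = find_project_by_name(demo_projects, "Tutorial Project")
if existing is None:
    print("No project with that name; safe to create it")
else:
    print(f"Reusing project {existing['id']}")
# Reusing project p-2
```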

# 7. Downloading and Accessing Data

**Objective**: Retrieve and work with uploaded resources, including format conversion and annotation access.

## Download Features:
- 📥 **Format Flexibility**: Raw bytes, auto-converted objects, or saved files
- 🔄 **Type Conversion**: Automatic conversion to appropriate data types
- 📊 **Metadata Access**: Retrieve associated annotations and metadata

In [None]:
# Download a resource file
api.download_resource_file(
    new_resource_id,
    auto_convert=False,
    save_path='downloaded_resource.dcm'  # Save to a specific file
)

# Download and auto-convert (for DICOM files, returns pydicom Dataset)
resource_object = api.download_resource_file(
    new_resource_id,
    auto_convert=True
)
print(f"Auto-converted to: {type(resource_object)}")  # `pydicom.Dataset` object

# Get annotations for this resource
annotations = list(api.get_annotations(resource_id=new_resource_id))
for ann in annotations:
    print(f"  - {ann.get('identifier', 'Unknown')}: {ann.get('type', 'Unknown type')}")
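
For resources with many annotations, a per-type summary is often easier to read than a full listing. The sketch below counts annotations by their `'type'` field, using the same fallback as the loop above. `count_annotation_types` is a hypothetical helper, and the records and type names are stand-ins rather than real API output.

```python
from collections import Counter
from typing import Iterable, Mapping


def count_annotation_types(annotations: Iterable[Mapping]) -> Counter:
    """Count annotations per 'type', using a fallback label when it is absent."""
    return Counter(ann.get("type", "Unknown type") for ann in annotations)


# Stand-in records with the 'identifier' and 'type' keys read in the cell above.
demo_annotations = [
    {"identifier": "Femur", "type": "segmentation"},
    {"identifier": "Tibia", "type": "segmentation"},
    {"identifier": "quality", "type": "category"},
    {"identifier": "note"},
]

type_counts = count_annotation_types(demo_annotations)
for ann_type, count in type_counts.most_common():
    print(f"{ann_type}: {count}")
```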

# Next Steps

This tutorial covered the main features of the Datamint Python API. For more advanced usage:

1. **Check the full documentation**: https://sonanceai.github.io/datamint-python-api/
2. **Explore other notebooks**:
   - `upload_model_segmentations.ipynb` - For AI model predictions
   - `upload_annotations.ipynb` - For simple annotations, such as image or frame categories
   - `geometry_annotations.ipynb` - For lines, boxes, and other geometric annotations

Happy coding! 🚀