# CellRake: Complete Cell Analysis Pipeline Tutorial

**CellRake** is a Python package for automated cell detection and analysis in fluorescence microscopy images. This notebook demonstrates all features with practical examples.

## What CellRake Does:
1. **Segments** fluorescence images to identify candidate cells (called blobs)
2. **Trains** machine learning models using semi-supervised learning to identify cells among blobs
3. **Analyzes** new images to count and characterize cells based on previously trained models
4. **Exports** results and visualizations

## Prerequisites:
- Install CellRake: `pip install cellrake`
- Have some histological TIFF images ready for analysis
- Run this notebook in a Python environment with CellRake installed
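
Before starting, it can help to confirm that your image folder actually contains TIFF files. The sketch below uses only the standard library. The helper name `list_tiff_images` and the accepted extensions (`.tif`, `.tiff`) are assumptions for illustration and are not part of the CellRake API.

```python
from pathlib import Path


def list_tiff_images(folder):
    """Return the TIFF files found directly inside `folder`, sorted by path.

    Hypothetical helper for illustration; not part of CellRake.
    """
    folder = Path(folder)
    if not folder.is_dir():
        return []
    # Match extensions case-insensitively (.tif, .TIF, .tiff, ...)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in {".tif", ".tiff"}
    )


images = list_tiff_images("./sample_images")
print(f"Found {len(images)} TIFF image(s)")
```

An empty result usually means the path is wrong or the images use a different extension.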

## 1. Setup and Import

In [None]:
from pathlib import Path
from cellrake.main import CellRake

# Define sample paths (replace these with your actual paths)
image_dir = Path("./sample_images")
results_dir = Path("./cellrake_results")

## 2. Initialize CellRake

The `CellRake` class takes `project_dir` as its first parameter; this is where results are saved. You can set the folder containing your TIFF images either when creating the object or afterwards with the `set_image_folder` method:

In [None]:
# Option 1: Initialize the class with project_dir and image_folder
sample_project = CellRake(
    project_dir=results_dir,
    image_folder=image_dir
)

# Option 2: Initialize the class with only project_dir and add image folder later
sample_project = CellRake(project_dir=results_dir)
sample_project.set_image_folder(image_dir)

If you have used CellRake before and already have a file of segmented data, you can load it with the `load_segmentation` method. Copy the file into `project_dir` and pass its name to the method:

In [None]:
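# Example: previous_segmentation.pkl stored in results_dir
# Pass the file name as a string (written without the extension here, matching the save/load example in section 5)
sample_project.load_segmentation("previous_segmentation")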

## 3. Image Segmentation

First, let's explore the segmentation capabilities:

In [None]:
# Basic segmentation: set only threshold_rel and leave the other parameters at their defaults
segmentation_result = sample_project.segment_images(threshold_rel=0.1)

# Check results
print(f"Found {len(segmentation_result['layers'])} images")
print(f"Total blobs across all images: {sum(len(rois) for rois in segmentation_result['rois'].values())}")

# Print the current status of the CellRake object
print(sample_project)
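
The total above hides how blobs are distributed across images. A minimal sketch of a per-image summary follows, assuming (as the `len(rois)` expression above suggests) that `segmentation_result["rois"]` maps each image to a sized collection of blobs. The helper `blobs_per_image` and the mock data are hypothetical and only stand in for real output.

```python
def blobs_per_image(rois_by_image):
    """Map each image key to its number of blobs."""
    return {name: len(rois) for name, rois in rois_by_image.items()}


# Mock data standing in for segmentation_result["rois"] (assumed layout)
mock_rois = {
    "image_01": {"roi_1": {}, "roi_2": {}, "roi_3": {}},
    "image_02": {"roi_1": {}},
    "image_03": {},
}

counts = blobs_per_image(mock_rois)
for name, n in sorted(counts.items()):
    print(f"{name}: {n} blob(s)")

# Images without any blob may point to a threshold or image-quality problem
empty = [name for name, n in counts.items() if n == 0]
print("Images without blobs:", empty)
```

In the notebook, you would pass `segmentation_result["rois"]` instead of `mock_rois`.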

In [None]:
# Advanced segmentation with custom parameters
segmentation_params = {
    "threshold_rel": 0.15,      # More sensitive detection
    "max_sigma": 20,            # Larger blob detection range
    "num_sigma": 12,            # More steps in detection
    "min_area": 50,             # Smaller minimum cell size
    "max_area": 2500,           # Larger maximum cell size
    "hole_fill_ratio": 0.7,     # Fill smaller holes
}

# Clear previous segmentation to try new parameters
sample_project.segmented_data = None

# Try advanced segmentation
advanced_segmentation = sample_project.segment_images(**segmentation_params)

# Check new results
print(f"Found {len(advanced_segmentation['layers'])} images")
print(f"Total blobs across all images: {sum(len(rois) for rois in advanced_segmentation['rois'].values())}")
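
To judge whether a parameter change helped, it is useful to compare blob counts image by image rather than only in total. The sketch below does this for two `rois` dictionaries; `compare_blob_counts` is a hypothetical helper and the two mock runs are placeholder data, not real CellRake output.

```python
def compare_blob_counts(rois_a, rois_b):
    """Return {image: (count_a, count_b, count_b - count_a)} for images in either run."""
    report = {}
    for name in sorted(set(rois_a) | set(rois_b)):
        n_a = len(rois_a.get(name, ()))
        n_b = len(rois_b.get(name, ()))
        report[name] = (n_a, n_b, n_b - n_a)
    return report


# Mock data standing in for the "rois" entries of two segmentation runs
run_a = {"image_01": ["r1", "r2", "r3"], "image_02": ["r1"]}
run_b = {"image_01": ["r1", "r2", "r3", "r4", "r5"], "image_02": []}

report = compare_blob_counts(run_a, run_b)
for name, (n_a, n_b, diff) in report.items():
    print(f"{name}: {n_a} -> {n_b} ({diff:+d})")
```

With real data, the inputs would be `segmentation_result["rois"]` and `advanced_segmentation["rois"]`. A large jump in one image is a cue to inspect that image visually before trusting the new parameters.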

## 4. Model Training

CellRake uses semi-supervised learning to train cell classifiers. You'll manually label a small number of blobs, and the algorithm will propagate labels to similar blobs.
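
To make the idea of label propagation concrete, here is a toy self-training loop in pure Python. It is not CellRake's implementation: the nearest-centroid classifier, the `margin` confidence rule, the two-dimensional features, and all names are assumptions chosen to keep the sketch short.

```python
import math


def centroid(points):
    """Mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def self_train(labeled, unlabeled, margin=2.0, max_rounds=10):
    """Toy self-training with a nearest-centroid classifier.

    labeled   : dict mapping label -> list of feature tuples (manual labels)
    unlabeled : list of feature tuples
    margin    : a point gets a pseudo-label only if the second-nearest
                centroid is at least `margin` times farther than the nearest
    Returns the grown labeled dict and the points left unlabeled.
    """
    labeled = {k: list(v) for k, v in labeled.items()}
    remaining = list(unlabeled)
    for _ in range(max_rounds):
        # Centroids are fixed for the round, then refreshed with new labels
        centers = {k: centroid(v) for k, v in labeled.items()}
        still_unlabeled = []
        added = 0
        for x in remaining:
            ranked = sorted(centers, key=lambda k: math.dist(x, centers[k]))
            d_near = math.dist(x, centers[ranked[0]])
            d_next = math.dist(x, centers[ranked[1]])
            if d_next >= margin * d_near:
                labeled[ranked[0]].append(x)  # confident: accept pseudo-label
                added += 1
            else:
                still_unlabeled.append(x)     # ambiguous: leave unlabeled
        remaining = still_unlabeled
        if added == 0:
            break
    return labeled, remaining


# One manually labeled example per class, plus five unlabeled blobs
seed = {"cell": [(1.0, 1.0)], "non_cell": [(5.0, 5.0)]}
unlabeled = [(1.2, 0.9), (0.8, 1.1), (4.8, 5.2), (5.1, 4.9), (3.0, 3.0)]

grown, leftover = self_train(seed, unlabeled)
print({label: len(points) for label, points in grown.items()})
print("Still unlabeled:", leftover)
```

Points close to a labeled example inherit its label, while the point halfway between the two classes stays unlabeled. This is the behavior a confidence threshold is meant to produce.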

In [None]:
# Basic training with default parameters
# train() runs segmentation first if it has not been done yet
sample_project.train()

# Check the metrics of the training
print(sample_project.metrics)

# Check the current status of the CellRake object
print(sample_project)

### 4.1 Different Model Types and Parameters

CellRake supports multiple classifiers and extensive parameter customization.

In [None]:
# Training with a different model and modified parameters
training_params = {
    "model_type": "svm", # SVM model
    "samples": 15, # Draw 15 samples from each cluster for manual labeling
    "entropy_threshold": 0.01, # Confidence threshold for pseudo-label selection
    "default_train_ratio": 0.75 # Balance train/test split
}

# Train the model with custom parameters
sample_project.train(**training_params)

# Check updated training results
print(sample_project.metrics)
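
An entropy threshold is a common way to decide which pseudo-labels are confident enough to keep. The sketch below shows the general idea, assuming Shannon entropy in nats and a strict "below threshold" rule. How CellRake computes and applies `entropy_threshold` internally is not shown in this notebook, so treat the functions and data here as hypothetical.

```python
import math


def prediction_entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_confident(predictions, entropy_threshold):
    """Keep (blob_id, probs) pairs whose entropy is below the threshold."""
    return [
        (blob_id, probs)
        for blob_id, probs in predictions
        if prediction_entropy(probs) < entropy_threshold
    ]


# Hypothetical (blob id, [P(non-cell), P(cell)]) pairs
predictions = [
    ("blob_a", [0.999, 0.001]),  # nearly certain
    ("blob_b", [0.90, 0.10]),    # fairly confident
    ("blob_c", [0.50, 0.50]),    # maximally uncertain
]

for blob_id, probs in predictions:
    print(f"{blob_id}: entropy = {prediction_entropy(probs):.4f}")

strict = [blob_id for blob_id, _ in select_confident(predictions, 0.01)]
loose = [blob_id for blob_id, _ in select_confident(predictions, 0.5)]
print("Kept at 0.01:", strict)
print("Kept at 0.5:", loose)
```

Under these assumptions, a threshold as low as 0.01 keeps only near-certain predictions. Raising it admits more pseudo-labels at the cost of more label noise.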

## 5. Saving and Loading Models

CellRake makes it easy to save your work and reuse trained models.

In [None]:
# Save your trained model (specify a name)
sample_project.save_model("my_cell_classifier")

# Save segmentation
sample_project.save_segmentation("my_segmentation")

# Load model
sample_project.load_model("my_cell_classifier")

# Load segmentation
sample_project.load_segmentation("my_segmentation")
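
The `.pkl` extension mentioned in section 2 suggests that saved files use Python's pickle format. Assuming that is the case, the round trip below illustrates what saving and loading amounts to. The dictionary is placeholder data and CellRake's actual file layout may differ.

```python
import pickle
import tempfile
from pathlib import Path

# Placeholder standing in for segmented data (assumed layout)
mock_segmentation = {
    "layers": {"image_01": [[0, 1], [1, 0]]},
    "rois": {"image_01": {"roi_1": {"x": [3, 4], "y": [7, 8]}}},
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "my_segmentation.pkl"

    # Save: serialize the object to disk in binary mode
    with path.open("wb") as fh:
        pickle.dump(mock_segmentation, fh)

    # Load: read it back; only unpickle files you trust
    with path.open("rb") as fh:
        restored = pickle.load(fh)

print("Round trip preserved the data:", restored == mock_segmentation)
```

Because unpickling can execute arbitrary code, load only model and segmentation files from sources you trust.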

## 6. Analyze Other Experiments

In [None]:
# Basic analysis with default settings
analysis_results = sample_project.analyze()