## Training the CELTIC Model

This notebook demonstrates how to train the CELTIC model. Using preprocessed images and context data, we initialize the experiment, configure the model, and run training. The trained model is saved to a local folder for later use in prediction (see `predict.ipynb`).

**NOTE:** This notebook cannot be run until the single-cell dataset is released, which we are actively working on. In the meantime, all other example notebooks are fully executable with their provided sample images.


In [None]:
# package installation (e.g., for Colab users)
!git clone https://github.com/zaritskylab/CELTIC
%cd CELTIC
!pip install .

In [2]:
from celtic.utils.functions import initialize_experiment, download_resources
from celtic.train import train
import os

# Presets
organelle = 'microtubules'
resources_dir = '../resources'

# Note: This path is not currently available to users.
# A method for accessing the single-cell data externally will be released soon;
# until then, users cannot access the single-cell data.
path_single_cells = f'/sise/assafzar-group/assafzar/Nitsan/hipsc_single_cell_image_dataset/{organelle}/fov_processed/cells/source'
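
Because the single-cell path above is not yet publicly accessible, it can help to fail fast with a clear message instead of hitting an error deep inside training. The following is a minimal sketch; `require_dir` is a hypothetical helper introduced here for illustration and is not part of the CELTIC package.

```python
import os


def require_dir(path, description):
    """Return `path` if it is an existing directory, else raise a clear error.

    Hypothetical helper for this notebook; not part of the CELTIC API.
    """
    if not os.path.isdir(path):
        raise FileNotFoundError(
            f"{description} not found at '{path}'. "
            "Training cannot start until this directory is accessible."
        )
    return path


# Usage in this notebook (uncomment once the dataset is available):
# require_dir(path_single_cells, "Single-cell image directory")
```

Calling the check before `initialize_experiment` avoids creating an empty experiment folder for a run that cannot proceed.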


In [3]:
# download resources - sample images, metadata and models (2-3 min)
if not os.path.exists(resources_dir):
    shared_folder_link = 'https://drive.google.com/drive/folders/1KTzb3fzwjH5ffSLtLNHuYiLiPg2p2VUf?usp=sharing'
    download_resources(shared_folder_link, os.path.dirname(resources_dir))

### Initialize the Experiment

This step initializes the experiment by creating a local folder to store the training files. It also sets up the CSV files that list the image paths and, when contexts are used, the CSV files that hold the context data. In this example, we provide the microtubules context files. The process of context creation is explained in the `context_creation.ipynb` notebook.


In [4]:
path_run_dir, context_model_config = initialize_experiment(organelle, 'train', resources_dir)
print("the experiment will be saved in:", path_run_dir)

path_images_csv = [f'{resources_dir}/{organelle}/metadata/{item}_images.csv' for item in ['train', 'valid']]
path_context_csv = [f'{resources_dir}/{organelle}/metadata/{item}_context.csv' for item in ['train', 'valid']]

the experiment will be saved in: ./experiments/train/microtubules/2025-01-11-21-42-40
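
Before launching a long training run, it may be worth confirming that the image and context CSV files exist and are non-empty. The sketch below uses only the Python standard library and makes no assumption about column names; `summarize_csv` and `check_csvs` are hypothetical helpers, not CELTIC functions.

```python
import csv
import os


def summarize_csv(path):
    """Return (column_names, data_row_count) for a CSV file with a header row."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])         # first row is treated as the header
        n_rows = sum(1 for _ in reader)   # remaining rows are data rows
    return header, n_rows


def check_csvs(paths):
    """Print a one-line summary per CSV and return the list of missing paths."""
    missing = []
    for p in paths:
        if not os.path.isfile(p):
            missing.append(p)
            print(f"MISSING  {p}")
            continue
        header, n_rows = summarize_csv(p)
        print(f"OK       {p}: {n_rows} rows, columns={header}")
    return missing


# Usage in this notebook:
# missing = check_csvs(path_images_csv + path_context_csv)
# assert not missing, f"missing metadata files: {missing}"
```

Printing the header of each file is also a quick way to see which context features are present without opening the files by hand.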


### Run Training

This step starts the training process using the specified parameters, including image paths, context data, and model configuration. The results are saved in the local folder of the experiment.


In [None]:
train.run_training(path_run_dir,
                   path_images_csv,
                   path_context_csv,
                   path_single_cells,
                   masked=True,
                   transforms=context_model_config['transforms'],
                   patch_size=context_model_config['train_patch_size'],
                   iterations=60_000,
                   batch_size=24,
                   learning_rate=0.001,
                   context_features=context_model_config['context_features'],
                   daft_embedding_factor=context_model_config['daft_embedding_factor'],
                   daft_scale_activation=context_model_config['daft_scale_activation'])
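
Once training finishes, the results live in the experiment folder. The exact file names that `run_training` writes are not listed in this notebook, so the following sketch simply enumerates whatever is there; `list_run_outputs` is a hypothetical helper and not part of CELTIC.

```python
from pathlib import Path


def list_run_outputs(run_dir):
    """Return (relative_path, size_in_bytes) for every file under run_dir, sorted by path."""
    root = Path(run_dir)
    files = [p for p in root.rglob("*") if p.is_file()]
    return sorted((p.relative_to(root).as_posix(), p.stat().st_size) for p in files)


# Usage in this notebook, after training finishes:
# for rel_path, size in list_run_outputs(path_run_dir):
#     print(f"{size:>12,d}  {rel_path}")
```

The listing makes it easy to locate the saved model files before passing the run folder to `predict.ipynb`.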