## Training the CELTIC Model

In this notebook, we demonstrate the process of training the CELTIC model. 

Using the single cell images and the context data, we initialize the experiment, configure the model, and run the training process. The trained model is saved in a local folder for later use in predictions (see `predict.ipynb`).

This running example uses the microtubules dataset from the BioImage Archive (BIA). Please download it before continuing.

**Important:** Unlike the other two example notebooks (prediction and context creation), this training example requires a large amount of single-cell data that cannot be downloaded inline. Download the data with an FTP client (e.g., FileZilla) from our BioImage Archive dataset [FTP server](ftp://ftp.ebi.ac.uk/pub/databases/biostudies/S-BIAD/156/S-BIAD2156/Files). To run this notebook in Google Colab, you must first make the downloaded data available through your Google Drive; otherwise, do not run this example in Colab.

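For Colab users, linking the data through Google Drive might look like the sketch below. The helper `resolve_data_root`, the Drive subfolder `CELTIC_data`, and the local fallback folder are hypothetical names chosen for illustration and are not part of CELTIC. The only external call is Colab's own `drive.mount`.

```python
import os


def resolve_data_root(drive_subdir="CELTIC_data", local_root="./data"):
    """Return a data folder: Google Drive inside Colab, a local folder otherwise.

    `drive_subdir` and `local_root` are hypothetical defaults; adjust them
    to wherever you stored the files downloaded over FTP.
    """
    try:
        # google.colab is only importable inside a Colab runtime
        from google.colab import drive
    except ImportError:
        return os.path.abspath(local_root)
    # Mounting prompts for authorization the first time it runs
    drive.mount('/content/drive')
    return os.path.join('/content/drive/MyDrive', drive_subdir)
```

The returned folder can then serve as the base of `path_single_cells` in the initialization cell, provided the single-cell images were uploaded to that Drive folder beforehand.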


### CELTIC installation

In [None]:
# Package installation (e.g., for Colab users)
!git clone https://github.com/zaritskylab/CELTIC
%cd CELTIC
!pip install .

### Initializations

In [None]:
from celtic.utils.functions import initialize_experiment, download_resources
from celtic.train import train
import os

# Presets
organelle = 'microtubules'
abs_path_resources_dir = f'/content/CELTIC/resources/{organelle}' # location of the samples to be downloaded

# Local path to the training images (signal, target, mask).
# Download them from our BioImage Archive dataset (path: microtubules/cell_images) and point this variable at your copy. For more details, see the notebook header section above.
path_single_cells = f'/sise/assafzar-group/assafzar/Nitsan/hipsc_single_cell_image_dataset/{organelle}/fov_processed/cells/source'


### Download resources

Download the training and validation metadata (the image-path and context CSV files).

In [None]:
# Install the aria2 download utility
!apt-get install -y aria2

# !mkdir -p $abs_path_resources_dir
bia_ftp_dir = "ftp://ftp.ebi.ac.uk/pub/databases/biostudies/S-BIAD/156/S-BIAD2156/Files/microtubules/data_for_git_examples/resources/microtubules/metadata"
!aria2c -x 2 -s 2 -c -d  {abs_path_resources_dir} {bia_ftp_dir}/train_images.csv
!aria2c -x 2 -s 2 -c -d  {abs_path_resources_dir} {bia_ftp_dir}/train_context.csv
!aria2c -x 2 -s 2 -c -d  {abs_path_resources_dir} {bia_ftp_dir}/valid_images.csv
!aria2c -x 2 -s 2 -c -d  {abs_path_resources_dir} {bia_ftp_dir}/valid_context.csv

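Before moving on, it can be worth confirming that all four CSV files arrived. The check below is a minimal sketch using only the standard library; the helper name `find_missing` is an assumption made for illustration, and the file names are the ones downloaded above.

```python
import os

# The four metadata files fetched by the aria2c commands above
EXPECTED_FILES = [
    "train_images.csv",
    "train_context.csv",
    "valid_images.csv",
    "valid_context.csv",
]


def find_missing(directory, filenames=EXPECTED_FILES):
    """Return the names in `filenames` that are absent or empty in `directory`."""
    missing = []
    for name in filenames:
        path = os.path.join(directory, name)
        # A zero-byte file usually indicates an interrupted download
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing
```

In this notebook, `find_missing(abs_path_resources_dir)` should return an empty list once the downloads have completed. Any names it returns can be fetched again with the same `aria2c` command.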

### Initialize the Experiment

This step initializes the experiment by creating a local folder for the training files. It also sets up the CSV files that contain the image paths and, when contexts are used, the CSV files with the context data. In this example, we provide the microtubules context files. Context creation is explained in the `context_creation.ipynb` notebook.


In [None]:
path_run_dir, context_model_config = initialize_experiment(organelle, 
                                                           'train', 
                                                           models_dir=f'{abs_path_resources_dir}/models')
print("the experiment will be saved in:", path_run_dir)

path_images_csv = [f'{abs_path_resources_dir}/{item}_images.csv' for item in ['train', 'valid']]
path_context_csv = [f'{abs_path_resources_dir}/{item}_context.csv' for item in ['train', 'valid']]

the experiment will be saved in: ./experiments/train/microtubules/2025-01-11-21-42-40

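The training call in the next section reads several entries from `context_model_config`. As a hedged sanity check, the sketch below lists any of those entries that are missing. It assumes the configuration behaves like a dictionary, which this notebook only confirms for subscripting, and the helper `missing_config_keys` is a hypothetical name rather than a CELTIC function.

```python
# Keys that the training cell of this notebook reads from the configuration
REQUIRED_KEYS = (
    "transforms",
    "train_patch_size",
    "context_features",
    "daft_embedding_factor",
    "daft_scale_activation",
)


def missing_config_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that `config` does not contain."""
    return [key for key in required if key not in config]
```

Calling `missing_config_keys(context_model_config)` before training should return an empty list. A non-empty result suggests that the model configuration was not loaded as expected.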

### Run Training

This step starts the training process using the specified parameters, including image paths, context data, and model configuration. The results are saved in the local folder of the experiment.


In [None]:
train.run_training(path_run_dir,
                    path_images_csv, 
                    path_context_csv,
                    path_single_cells, 
                    masked = True,
                    transforms = context_model_config['transforms'],
                    patch_size = context_model_config['train_patch_size'],
                    iterations = 60_000,
                    batch_size = 24,
                    learning_rate = 0.001,
                    context_features = context_model_config['context_features'], 
                    daft_embedding_factor = context_model_config['daft_embedding_factor'], 
                    daft_scale_activation = context_model_config['daft_scale_activation'])
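
To get a feel for the size of this run, the arithmetic below relates `iterations` and `batch_size` to the number of samples processed. It is a rough sketch that assumes each iteration consumes exactly one batch of `batch_size` patches, which this notebook does not confirm for CELTIC's training loop. The training-set size used in the example is a hypothetical figure, not the size of the microtubules dataset.

```python
def samples_drawn(iterations, batch_size):
    """Total patches processed, assuming one batch per iteration."""
    return iterations * batch_size


def approx_epochs(iterations, batch_size, n_train):
    """Approximate passes over a training set of `n_train` samples."""
    return samples_drawn(iterations, batch_size) / n_train


# Values taken from the training call above
total = samples_drawn(60_000, 24)
print(f"{total:,} patches processed")  # 1,440,000 patches processed

# Hypothetical training-set size, for illustration only
print(approx_epochs(60_000, 24, n_train=10_000))  # 144.0
```

Lowering `iterations` for a quick trial run scales both figures down proportionally.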