# Image Compensation
## This notebook is an example: create a copy before running it or you will get merge conflicts!

Rosetta is the compensation process for images produced by the MIBI. Compensating the images reduces several forms of signal contamination that may show up in them.

For example, the images below show the CD11c channel before and after Rosetta processing.

<table><tr>
    <td> <img src="./img/CD11c_pre_rosetta_cropped.png" style="width:100%"/> </td>
    <td> <img src="./img/CD11c_post_rosetta_cropped.png" style="width:100%"/> </td>
</tr></table>
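
To make the idea concrete, here is a minimal NumPy sketch of matrix-based compensation on toy data. It is an illustration only and not toffy's implementation: the function name `compensate_toy`, the channel layout, and the coefficient values are all assumptions, and clipping negative values to zero is just one simple choice.

```python
import numpy as np


def compensate_toy(images, comp_matrix):
    """Subtract estimated cross-channel contamination from a channel stack.

    images: array of shape (n_channels, height, width).
    comp_matrix: array of shape (n_channels, n_channels); entry [s, t] is the
        assumed fraction of source channel s that leaks into target channel t.
    """
    n_chan, height, width = images.shape
    flat = images.reshape(n_chan, -1).astype(float)
    # contamination received by each target channel, summed over all sources
    contamination = comp_matrix.T @ flat
    # remove the contamination and clip so no pixel goes below zero
    compensated = np.clip(flat - contamination, 0, None)
    return compensated.reshape(n_chan, height, width)


# toy data (hypothetical): channel 0 is the source, channel 1 is the target
raw = np.stack([np.full((2, 2), 100.0), np.full((2, 2), 30.0)])
coeffs = np.array([[0.0, 0.1],
                   [0.0, 0.0]])
result = compensate_toy(raw, coeffs)
print(result[1])
```

In this toy case, 10% of the source signal is subtracted from the target channel, while the source channel itself is left unchanged.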


In [None]:
import sys
sys.path.append('../')

import os
import shutil

import skimage.io as io
import pandas as pd
from toffy import rosetta
from ark.utils.io_utils import list_folders, list_files

## 1. Setup

Below, you will set up the necessary structure for testing rosetta on all of your runs.
- `cohort_name` is a descriptive name for the data comprised of all of your related runs
- `run_names` is a list of all the runs you would like to retrieve FOV images from for testing
- `panel_path` should point to a panel csv specifying the targets on your panel (see [panel format](https://github.com/angelolab/toffy#panel-format) for more information)
- `output_folder` will be the name of the folder containing the rosetta compensated images

In [None]:
# run specifications
cohort_name = '20220101_new_cohort'
run_names = ['20220101_TMA1', '20220102_TMA2']
panel_path = 'C:\\Users\\Customer.ION\\Documents\\panel_files\\my_cool_panel.csv'

# pick an informative name
output_folder = 'rosetta_output'

By default, the `commercial_rosetta_matrix_v1.csv` from the `files` directory of toffy will be used for rosetta. If you would like to use a different matrix, specify the path below. 

**UPDATE LATER TO BE COMPATIBLE WITH THE NEW AUTOMATED PANEL ADJUSTMENT SCRIPT**

A new directory based on the provided `cohort_name` above will be created within `C:\\Users\\Customer.ION\\Documents\\rosetta_testing`; this folder will contain all the files needed for and produced in **Section 2** of the notebook.

In [None]:
# default rosetta matrix provided in toffy
rosetta_mat_path = '..\\files\\commercial_rosetta_matrix_v1.csv'

rosetta_testing_dir = 'C:\\Users\\Customer.ION\\Documents\\rosetta_testing'
extracted_imgs_dir = 'D:\\Extracted_Images'
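
Before copying any files, it can help to confirm that the configured paths exist. The sketch below is one possible check, not part of toffy: `find_missing_paths` is a hypothetical helper, and the demo uses a temporary directory in place of `extracted_imgs_dir`, assuming each run is a sub-folder of that directory.

```python
import os
import tempfile


def find_missing_paths(paths):
    """Return the entries of `paths` that do not exist on disk."""
    return [path for path in paths if not os.path.exists(path)]


# demo in a temporary directory standing in for the extracted images folder
with tempfile.TemporaryDirectory() as demo_extracted_dir:
    demo_runs = ['20220101_TMA1', '20220102_TMA2']
    # only create the first run so the second is reported as missing
    os.makedirs(os.path.join(demo_extracted_dir, demo_runs[0]))
    run_dirs = [os.path.join(demo_extracted_dir, run) for run in demo_runs]
    missing = find_missing_paths(run_dirs)
    missing_names = [os.path.basename(path) for path in missing]

print(missing_names)
```

The same helper could be pointed at `panel_path` and `rosetta_mat_path` to catch a mistyped file name early.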

From the provided runs, we will randomly choose 10 FOVs, normalize them, and then test rosetta on them.

In [None]:
# copy random fovs from each run
rosetta.copy_image_files(cohort_name, run_names, rosetta_testing_dir, 
                         extracted_imgs_dir, fov_number=10)

# normalize images to allow direct comparison with rosetta
img_out_dir = os.path.join(rosetta_testing_dir, cohort_name, 'extracted_images')
fovs = list_folders(img_out_dir)
for fov in fovs:
    fov_dir = os.path.join(img_out_dir, fov)
    sub_dir = os.path.join(fov_dir, 'normalized')
    os.makedirs(sub_dir)
    chans = list_files(fov_dir)
    for chan in chans:
        img = io.imread(os.path.join(fov_dir, chan))
        img = img / 100
        io.imsave(os.path.join(sub_dir, chan), img, check_contrast=False)
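
The random selection above is handled by `rosetta.copy_image_files`. As a conceptual sketch only, and not toffy's implementation, choosing FOVs without replacement from a list of names could look like the following; `sample_fovs`, the `seed` argument, and the FOV names are hypothetical.

```python
import random


def sample_fovs(fov_names, fov_number, seed=None):
    """Pick up to `fov_number` FOV names at random, without replacement."""
    rng = random.Random(seed)
    # never ask for more FOVs than are available
    count = min(fov_number, len(fov_names))
    return sorted(rng.sample(fov_names, count))


# hypothetical FOV folder names
all_fovs = ['fov-{}-scan-1'.format(i) for i in range(1, 26)]
chosen = sample_fovs(all_fovs, fov_number=10, seed=0)
print(len(chosen))
```

Fixing the seed makes the selection repeatable, which is useful when comparing different compensation matrices on the same FOVs.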

## 2. Rosetta - Remove Signal Contamination
We'll now process the images with rosetta to remove signal contamination. This will give us a new set of compensated images.

In [None]:
# create sub-folder to hold images and files from this set of multipliers
output_folder_path = os.path.join(rosetta_testing_dir, cohort_name, output_folder)
os.makedirs(output_folder_path)

# Read in toffy panel file
panel = pd.read_csv(panel_path)

# compensate the data
rosetta.compensate_image_data(raw_data_dir=img_out_dir, comp_data_dir=output_folder_path, comp_mat_path=rosetta_mat_path, 
                              raw_data_sub_folder='normalized', panel_info=panel, batch_size=1, norm_const=1)

Now that we've generated the compensated data, we'll generate stitched images to visualize what signal was removed.

In [None]:
# stitch images together to enable easy visualization of outputs
stitched_dir = os.path.join(rosetta_testing_dir, cohort_name, 'stitched_images')
os.makedirs(stitched_dir)

rosetta.create_tiled_comparison(input_dir_list=[img_out_dir, output_folder_path], output_dir=stitched_dir)

# add the source channel for Noodle
for channel in ['Noodle']:
    # save next to the stitched images, inside the cohort folder
    output_dir = os.path.join(rosetta_testing_dir, cohort_name, 'stitched_with_' + channel)
    os.makedirs(output_dir)
    rosetta.add_source_channel_to_tiled_image(raw_img_dir=img_out_dir, tiled_img_dir=stitched_dir,
                                              output_dir=output_dir, source_channel=channel)

There will now be a folder named `stitched_with_Noodle` within the cohort folder inside `rosetta_testing_dir`. You can look through these stitched images to visualize what signal is being removed from this source channel. To inspect another source channel, such as Au, add its name to the list in the loop above.
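
As a rough illustration of what a side-by-side comparison conveys, the sketch below tiles a raw and a compensated image and computes the removed signal. It makes no claim about how `rosetta.create_tiled_comparison` lays out its output; `side_by_side` and the pixel values are hypothetical.

```python
import numpy as np


def side_by_side(raw_img, comp_img):
    """Place two equally sized images next to each other, left to right."""
    if raw_img.shape != comp_img.shape:
        raise ValueError('images must have the same shape')
    return np.concatenate([raw_img, comp_img], axis=1)


# toy pixel values (hypothetical)
raw_img = np.array([[5.0, 8.0],
                    [2.0, 0.0]])
comp_img = np.array([[4.0, 8.0],
                     [0.0, 0.0]])

tiled = side_by_side(raw_img, comp_img)
# signal removed by compensation at each pixel
removed = raw_img - comp_img
print(tiled.shape)
```

Looking at the difference image alongside the source channel is one way to judge whether the removed signal follows the pattern of the contaminating channel.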

## 3. Rosetta - Compensate the Whole Run

Once you're satisfied that Rosetta is working appropriately, you can use it to process your run. First select the run you want to process, and define the relevant top-level folders.

**You will need to run the cells below for each run in your cohort, changing the `run_name` variable each time.**

In [None]:
# Put the name of your run here
run_name = '20220101_my_run'

In [None]:
# The path to the folder containing raw run data
bin_file_dir = 'D:\\Data'

# This folder is where all of the extracted images will get saved
extracted_image_dir = 'D:\\Extracted_Images'

# This folder will hold the post-rosetta images
rosetta_image_dir = 'D:\\Rosetta_Compensated_Images'

Then, you can compensate the data using rosetta.

In [None]:
# Perform rosetta on extracted images
run_extracted_dir = os.path.join(extracted_image_dir, run_name)
run_rosetta_dir = os.path.join(rosetta_image_dir, run_name)
if not os.path.exists(run_rosetta_dir):
    os.makedirs(run_rosetta_dir)

rosetta.compensate_image_data(raw_data_dir=run_extracted_dir, comp_data_dir=run_rosetta_dir,
                              comp_mat_path=rosetta_mat_path, panel_info=panel, batch_size=1)
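
If you prefer not to edit `run_name` by hand for every run, the per-run folders can be prepared in a loop. The sketch below only builds and creates the directories; `prepare_run_dirs` is a hypothetical helper, and the demo uses a temporary directory rather than the `D:\\` paths above.

```python
import os
import tempfile


def prepare_run_dirs(run_names, extracted_root, rosetta_root):
    """Map each run name to its (input_dir, output_dir) pair, creating outputs."""
    run_dirs = {}
    for run in run_names:
        in_dir = os.path.join(extracted_root, run)
        out_dir = os.path.join(rosetta_root, run)
        # exist_ok avoids an error when the folder was created on an earlier pass
        os.makedirs(out_dir, exist_ok=True)
        run_dirs[run] = (in_dir, out_dir)
    return run_dirs


with tempfile.TemporaryDirectory() as tmp_root:
    demo_rosetta_root = os.path.join(tmp_root, 'Rosetta_Compensated_Images')
    demo = prepare_run_dirs(['20220101_TMA1', '20220102_TMA2'],
                            os.path.join(tmp_root, 'Extracted_Images'),
                            demo_rosetta_root)
    created = sorted(os.listdir(demo_rosetta_root))

print(created)
```

Inside such a loop, each pair could then be passed to `rosetta.compensate_image_data` as `raw_data_dir` and `comp_data_dir`, with the remaining arguments as in the cell above.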