# Image Normalization

This notebook walks you through normalizing your image data.

Changes in detector sensitivity over a run can result in different image intensities, even when there is no actual difference in biological signal. To correct for this, we use the median pulse height (MPH) to measure detector sensitivity. We then combine this estimate of sensitivity with the tuning curve generated by `1_set_up_toffy.ipynb` to determine the normalization coefficient for each FOV.
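
To make the idea concrete, here is a minimal sketch of how an MPH value and a tuning curve could be combined into a per-FOV normalization coefficient. It is an illustration under stated assumptions, not toffy's implementation: the polynomial form, the `EXAMPLE_WEIGHTS` values, and the helper names `predicted_sensitivity` and `normalize_counts` are all hypothetical, and the real curve is the one fitted in `1_set_up_toffy.ipynb`.

```python
# Hypothetical tuning-curve weights, highest power first.
# Here the curve is simply: sensitivity = 0.01 * mph.
# The real weights come from the curve fitted in 1_set_up_toffy.ipynb.
EXAMPLE_WEIGHTS = [0.0, 0.01, 0.0]


def predicted_sensitivity(mph, weights=EXAMPLE_WEIGHTS):
    """Evaluate the (assumed polynomial) tuning curve at a given MPH."""
    result = 0.0
    for weight in weights:  # Horner's method
        result = result * mph + weight
    return result


def normalize_counts(counts, mph, weights=EXAMPLE_WEIGHTS):
    """Divide raw counts by the normalization coefficient predicted from MPH."""
    coefficient = predicted_sensitivity(mph, weights)
    if coefficient <= 0:
        raise ValueError(f"non-positive normalization coefficient: {coefficient}")
    return [count / coefficient for count in counts]


# Two FOVs with the same underlying signal, acquired at different sensitivities
fov_early = normalize_counts([100], mph=100)  # detector at full sensitivity
fov_late = normalize_counts([50], mph=50)     # detector sensitivity has dropped
```

In this toy example the later FOV records half the counts because the detector is half as sensitive, and dividing by the coefficient brings both FOVs back to the same value.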

Before running this notebook, make sure you've completed section 3 of `1_set_up_toffy.ipynb`, which creates the necessary normalization curve. In addition, you should have already compensated your data with Rosetta using `4_compensate_image_data.ipynb`.

In [4]:
import sys
sys.path.append('../')

import os

from toffy import normalize
from toffy.panel_utils import load_panel
from ark.utils.io_utils import list_folders

### You'll first need to specify the location of the relevant files to enable image normalization
 - `run_name` should contain the exact name of the MIBI run to normalize
 - `panel_path` should point to a panel csv specifying the targets on your panel (see [panel format](https://github.com/angelolab/toffy#panel-format) for more information)

In [None]:
# The name of the run
run_name = '20220101_run_to_be_processed'

# Path to user panel
panel_path = 'C:\\Users\\Customer.ION\\Documents\\panel_files\\my_cool_panel.csv'

All inputs and outputs of this notebook are stored in the directories set up automatically by `1_set_up_toffy.ipynb`. For details on what each toffy directory is used for and where it lives, see the [README](https://github.com/angelolab/toffy#directory-structure).

In [None]:
panel = load_panel(panel_path)

# These paths should point to the folders containing each step of the processing pipeline
bin_base_dir = 'D:\\Data'
rosetta_base_dir = 'D:\\Rosetta_Compensated_Images'
normalized_base_dir = 'D:\\Normalized_Images'
mph_run_dir = os.path.join('C:\\Users\\Customer.ION\\Documents\\run_metrics', run_name, 'fov_data')

# check the run for changes in detector voltage
normalize.check_detector_voltage(os.path.join(bin_base_dir, run_name))

### Within the defined directories, we'll specify the relevant run_dir based on the run_name provided above

In [None]:
# specify sub-folder for rosetta images
img_sub_folder = 'rescaled'

# create directory to hold normalized images
normalized_run_dir = os.path.join(normalized_base_dir, run_name)
os.makedirs(normalized_run_dir, exist_ok=True)

# create directory to hold associated processing files
os.makedirs(mph_run_dir, exist_ok=True)

### Then, we'll loop over each FOV, generating the necessary normalization files if they weren't already created

In [None]:
# get all FOVs
fovs = list_folders(os.path.join(rosetta_base_dir, run_name))

# loop over each FOV
for fov in fovs:
    # generate mph values
    mph_file_path = os.path.join(mph_run_dir, fov + '_pulse_heights.csv')
    if not os.path.exists(mph_file_path):
        normalize.write_mph_per_mass(base_dir=os.path.join(bin_base_dir, run_name), output_dir=mph_run_dir, 
                                     fov=fov, masses=panel['Mass'].values, start_offset=0.3, stop_offset=0)
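
If you want to see which FOVs still lack a pulse height file before or after running the loop above, a small helper along these lines may be useful. It is a sketch, not part of toffy: the name `missing_mph_files` is hypothetical, and it relies only on the `<fov>_pulse_heights.csv` naming used in the loop.

```python
import os


def missing_mph_files(fovs, mph_dir):
    """Return the FOV names that have no '<fov>_pulse_heights.csv' in mph_dir."""
    missing = []
    for fov in fovs:
        mph_file_path = os.path.join(mph_dir, fov + '_pulse_heights.csv')
        if not os.path.exists(mph_file_path):
            missing.append(fov)
    return missing
```

Calling `missing_mph_files(fovs, mph_run_dir)` after the loop should return an empty list if every FOV was processed.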

### Finally, we'll normalize the images and save them to the output folder

In [None]:
normalize.normalize_image_data(img_dir=os.path.join(rosetta_base_dir, run_name), norm_dir=normalized_run_dir, pulse_height_dir=mph_run_dir,
                               panel_info=panel, img_sub_folder=img_sub_folder)
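
As an optional sanity check once normalization finishes, you can compare the FOV folders in the input and output directories. The sketch below assumes the output contains one sub-folder per FOV, mirroring the Rosetta input layout; that layout and the helper name `fovs_without_output` are assumptions, so adjust them if your output is organized differently.

```python
import os


def fovs_without_output(input_run_dir, output_run_dir):
    """Return, sorted, the FOV folders present in the input but not in the output."""

    def subfolders(path):
        # Only directories count as FOVs; loose files are ignored
        return {
            name for name in os.listdir(path)
            if os.path.isdir(os.path.join(path, name))
        }

    return sorted(subfolders(input_run_dir) - subfolders(output_run_dir))
```

For example, `fovs_without_output(os.path.join(rosetta_base_dir, run_name), normalized_run_dir)` would list any FOVs that were compensated but have no normalized counterpart.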