## Centrosome dynamics analysis
This notebook walks through several APIs for analyzing centrosome dynamics and generating the associated graphs. It is a very abridged guide, so please contact isaacwongsiushing@gmail.com if you have any questions.

#### 1. Pre-processing raw spinning disk microscopy videos
**Step 1** performs an exponential decay correction for photobleaching, an uneven-illumination background correction, and a maximum intensity projection for all videos in a directory. The videos must be in **.tif** format.
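
To make two of these operations concrete, here is a minimal NumPy sketch of a single-exponential bleach correction followed by a maximum intensity projection. It is not the centropy implementation: the function names, the `(T, Z, Y, X)` axis order, and the log-linear fit are assumptions made for illustration, and the background correction is left out.

```python
import numpy as np

def correct_photobleaching(stack):
    """Divide each frame by a fitted single-exponential decay of its mean intensity.

    `stack` has shape (T, Z, Y, X). Illustrative helper, not centropy code.
    """
    stack = np.asarray(stack, dtype=float)
    frame_means = stack.mean(axis=(1, 2, 3))      # one mean per time point
    t = np.arange(stack.shape[0])
    # Fit log(mean) = intercept + slope * t by linear least squares.
    slope, _intercept = np.polyfit(t, np.log(frame_means), 1)
    decay = np.exp(slope * t)                     # fitted decay, equal to 1 at t = 0
    return stack / decay[:, None, None, None]

def max_intensity_projection(stack):
    """Collapse the z axis of a (T, Z, Y, X) stack by taking the per-pixel maximum."""
    return np.asarray(stack).max(axis=1)

# Synthetic video: a fixed volume whose intensity decays with rate 0.05 per frame.
rng = np.random.default_rng(0)
base = rng.uniform(100, 200, size=(3, 8, 8))      # (Z, Y, X)
bleached = np.stack([base * np.exp(-0.05 * t) for t in range(10)])

corrected = correct_photobleaching(bleached)
projected = max_intensity_projection(corrected)
print(projected.shape)
```

After correction every frame of the synthetic video has the same intensity again, and the projection has one 2D image per time point.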


In [None]:
from centropy.io import video


In [None]:
input_dir = "/Users/isaacwong/Desktop/xxxx"
output_dir = "/Users/isaacwong/Desktop/yyyy"


In [None]:
video.batch_process_videos(input_dir, output_dir, processed=False)  # processed indicates whether the movies have already been processed


**Step 1** produces a folder structure that contains the processed videos and a spreadsheet called **videos.csv**, which records the details of the image preprocessing.

Then, users need to track centrosomes using the TrackMate plugin in Fiji and save the results under the default filename **Spots in tracks statistics.csv** in the same folder as the corresponding processed video. For proteins other than Cnn and clients, users may consider using TrackPy (see **Step 2**).

A TrackMate tutorial can be found at https://youtu.be/SvdfWLIsCQk.
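
Before moving on to Step 2 it can help to check that no tracking file has been forgotten. The sketch below is a hypothetical helper, not part of centropy, and it assumes that each processed `.tif` sits in its own folder; adapt it to the folder structure that Step 1 actually produced.

```python
from pathlib import Path
import tempfile

TRACKING_FILENAME = "Spots in tracks statistics.csv"

def find_untracked_videos(root):
    """Return the .tif files under `root` whose folder lacks the TrackMate export."""
    missing = []
    for tif in sorted(Path(root).rglob("*.tif")):
        if not (tif.parent / TRACKING_FILENAME).exists():
            missing.append(tif)
    return missing

# Demonstration on a throwaway directory with one tracked and one untracked video.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("embryo_1", "embryo_2"):         # hypothetical folder names
        folder = Path(tmp) / name
        folder.mkdir()
        (folder / "video.tif").touch()
    (Path(tmp) / "embryo_1" / TRACKING_FILENAME).touch()

    untracked = [p.parent.name for p in find_untracked_videos(tmp)]
    print(untracked)
```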


#### 2. Extraction of features from centrosomes
Users are expected to have run **Step 1** so that **videos.csv** can be read by the programme. In **videos.csv**, the column named **analyze** must contain **yes** for every video to be analyzed. The tracking programme is specified with **tracking_program**, which can be either **TrackMate** or **TrackPy**. The programme can also apply different quantifications to centrosomes; for example, it measures the total intensity within the segmented area if **get_patch** is set to **True** and **patch_method** is set to **otsu**. If **get_pedigree** is set to **True**, the programme will also try to determine whether each centrosome is an old mother or a new mother. The distance (in pixels) between an old mother and a new mother must be within **max_pairing_distance**.
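
As an illustration of what "total intensity within the segmented area" can mean, the sketch below thresholds a small patch with Otsu's method and sums the pixels above the threshold. This is a NumPy-only approximation written for this guide; the function names are hypothetical and centropy may segment or subtract background differently. The 14 × 14 patch size only echoes the `window_size` value in the example call further down.

```python
import numpy as np

def otsu_threshold(image, nbins=256):
    """Return the threshold that maximizes the between-class variance (Otsu's method)."""
    counts, edges = np.histogram(image, bins=nbins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(counts)                        # pixel count up to and including each bin
    w1 = np.cumsum(counts[::-1])[::-1]            # pixel count from each bin upwards
    m0 = np.cumsum(counts * centers) / np.maximum(w0, 1)
    m1 = (np.cumsum((counts * centers)[::-1]) / np.maximum(w1[::-1], 1))[::-1]
    # Between-class variance for a split between bin i and bin i + 1.
    variance = w0[:-1] * w1[1:] * (m0[:-1] - m1[1:]) ** 2
    return centers[np.argmax(variance)]

def total_intensity_in_patch(patch):
    """Sum the pixel values that lie above the Otsu threshold of the patch."""
    mask = patch > otsu_threshold(patch)
    return patch[mask].sum(), mask

# Synthetic 14 x 14 window: dim background with a bright 4 x 4 spot.
patch = np.full((14, 14), 10.0)
patch[5:9, 5:9] = 100.0
total, mask = total_intensity_in_patch(patch)
print(total, mask.sum())
```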

If the tracking was done in physical units, e.g. µm, **convert_dim** must be set to **True** and **pix_scale** must be given a value that maps physical units to pixel units.
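
The conversion itself is a single division or multiplication, depending on how the scale is expressed. The sketch below assumes the scale is given in micrometres per pixel, which would be consistent with the value 0.11 in the example call; the text does not state the direction, so check the centropy manual before relying on this. The function and variable names are hypothetical.

```python
def to_pixels(coordinate_um, um_per_pixel):
    """Convert a coordinate in micrometres to pixels (assumes the scale is µm per pixel)."""
    return coordinate_um / um_per_pixel

um_per_pixel = 0.11                # assumed meaning of pix_scale
x_um, y_um = 5.5, 11.0             # a tracked position in micrometres
x_px = to_pixels(x_um, um_per_pixel)
y_px = to_pixels(y_um, um_per_pixel)
print(round(x_px), round(y_px))
```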

In [None]:
from centropy.analysis import dynamics


In [None]:
# The input_dir should contain videos.csv after step 1
input_dir = "/Users/isaacwong/Desktop/xxxx"


In [None]:
# The keywords and their accepted inputs are described in the centropy package and manual
dynamics.batch_centrosomes(input_dir, None, tracking_program='TrackMate', main_channel=0, window_size=[14, 14], 
                           frame_rate=0.5, get_patch=True, patch_method='otsu', get_pedigree=True, max_pairing_distance=50, 
                           convert_dim=False, pix_scale=0.11)


**Step 2** reads **Spots in tracks statistics.csv** and maps the coordinates of the centrosomes onto the videos. The quantifications are saved as **centrosomes.csv**, which is located in the same folder as **videos.csv**. The relationships between centrosomes are saved in **pedigree.csv**, in the same folder as **centrosomes.csv**.
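
To give an idea of how a distance limit such as **max_pairing_distance** constrains the pairing of old and new mothers, here is a minimal sketch that greedily accepts the closest remaining pair until the limit is exceeded. The greedy strategy, the function name, and the coordinates are assumptions for illustration; the algorithm centropy uses is not described in this guide.

```python
import numpy as np

def pair_within_distance(old_xy, new_xy, max_pairing_distance):
    """Greedily accept the closest remaining (old, new) pair within the distance limit.

    Returns a list of (old_index, new_index) tuples; distances are in pixels.
    """
    old_xy = np.asarray(old_xy, dtype=float)
    new_xy = np.asarray(new_xy, dtype=float)
    # Pairwise Euclidean distances, shape (n_old, n_new).
    dist = np.linalg.norm(old_xy[:, None, :] - new_xy[None, :, :], axis=2)
    pairs, used_old, used_new = [], set(), set()
    # Visit candidate pairs from closest to farthest.
    for flat in np.argsort(dist, axis=None):
        i, j = (int(k) for k in np.unravel_index(flat, dist.shape))
        if dist[i, j] > max_pairing_distance:
            break                                 # every remaining pair is too far apart
        if i not in used_old and j not in used_new:
            pairs.append((i, j))
            used_old.add(i)
            used_new.add(j)
    return pairs

old_mothers = [(10, 10), (100, 100)]
new_mothers = [(12, 14), (300, 300), (105, 100)]
pairs = pair_within_distance(old_mothers, new_mothers, max_pairing_distance=50)
print(pairs)
```

The new mother at (300, 300) stays unpaired because no old mother lies within 50 pixels of it.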

#### 3. Parameterization of centrosome dynamics by curve fitting
Users are expected to have run **Step 1** and **Step 2** so that **videos.csv** and **centrosomes.csv** can be read by the programme. They also need to ensure that, in **videos.csv**, the column named **analyze** has **yes** for the videos they would like to analyze.
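
A quick way to confirm which videos are flagged is to read the spreadsheet and list the rows whose **analyze** value is **yes**. The sketch below uses an in-memory stand-in for **videos.csv**; only the `analyze` column is named in this guide, so the `video` column and its values are hypothetical.

```python
import csv
import io

# Stand-in for videos.csv (the real file has more columns).
csv_text = (
    "video,analyze\n"
    "embryo_1.tif,yes\n"
    "embryo_2.tif,no\n"
    "embryo_3.tif,yes\n"
)

rows = list(csv.DictReader(io.StringIO(csv_text)))
# Keep only the videos flagged for analysis, tolerating stray spaces and capitals.
selected = [row["video"] for row in rows if row["analyze"].strip().lower() == "yes"]
print(selected)
```

With a real file, replace `io.StringIO(csv_text)` with an open handle to **videos.csv**.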

The column named **model_intensity_all** selects the curve that is fitted to the dynamics, for example linear, linear_plateau, linear_piecewise, or single_oscillation. The column named **model_intensity_init** can be used to adjust the initial guess, which may improve the fit. The default value of **model_intensity_init** is **_**, which means that the default initial condition is used. Users may also add a new column named **start_end** to limit the region of fitting.
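
To show why the initial guess matters, the sketch below fits a linear-then-plateau curve to synthetic data with `scipy.optimize.curve_fit`, passing the guess through `p0`. The functional form, the parameter names, and the synthetic data are assumptions made for illustration; the model definitions that centropy uses under the name linear_plateau may differ.

```python
import numpy as np
from scipy.optimize import curve_fit

def linear_plateau(t, intercept, slope, t_plateau):
    """Rise linearly until `t_plateau`, then stay constant (assumed model form)."""
    return intercept + slope * np.minimum(t, t_plateau)

# Synthetic normalized intensity sampled every 0.5 time units with a little noise.
rng = np.random.default_rng(1)
t = np.arange(0, 20, 0.5)
observed = linear_plateau(t, 1.0, 0.2, 12.0) + rng.normal(0, 0.02, t.size)

# p0 is the initial guess; starting near the true breakpoint helps convergence.
p0 = [observed[0], 0.1, 10.0]
params, _covariance = curve_fit(linear_plateau, t, observed, p0=p0)
print(np.round(params, 2))
```

If the breakpoint guess is placed far outside the data range, the optimizer can stall because moving the breakpoint no longer changes the residuals, which is the kind of situation where adjusting the initial guess helps.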

To analyze old mother and new mother dynamics, users need to create two new columns in the spreadsheet, named **model_intensity_om** and **model_intensity_nm**, for the intensity of old mothers and new mothers respectively.

In [None]:
from centropy.regression import single_cycle


In [None]:
input_dir = "/Users/isaacwong/Desktop/xxxx"


In [None]:
single_cycle.batch_fitting(input_dir, None, frame_rate=0.5)


This step generates **models.csv** in the same folder as **videos.csv**.

#### 4. Previewing data by exploratory visualization
Users are expected to have run **Step 1** to **Step 3** so that **videos.csv**, **centrosomes.csv**, **models.csv**, and **simulations.csv** can be read by the programme to generate graphs. They also need to ensure that, in **videos.csv**, the column named **analyze** has **yes** for the videos to be included in the graphs. This step generates standard graphs of the dynamics, for example line plots of a given attribute against time, categorical plots across experimental conditions, and correlations between different parameters.

In [None]:
from centropy.visualization import line, categorical, correlation


In [None]:
input_dir = "/Users/isaacwong/Desktop/xxxx"


In [None]:
# Common parameters across the different plotting functions
comparison_type='cycle'
frame_rate=0.5
residue_threshold = 1

# define the attribute and parameters to plot
attribute_dict = { 'intensity': ['initial_m', 'peak_m', 'added_m','peak_time_m', 'increase_rate_m', 
                                 's-phase_duration'],}


In [None]:
# Overlay all embryos from the same conditions into the same graphs. Users can choose to align to centrosome separation
# or NEB
line.all_embryos(input_dir, output_dir=None, comparison_type=comparison_type,
                 attribute='total_intensity_norm', label_age=False, alignment='neb')
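
Aligning embryos to an event such as nuclear envelope breakdown (NEB) amounts to shifting each time axis so that the event falls at zero. The sketch below illustrates the idea with hypothetical embryo names and event times; it does not reproduce how `line.all_embryos` stores or looks up these times.

```python
import numpy as np

def align_to_event(time, event_time):
    """Shift a time axis so that the chosen event occurs at t = 0."""
    return np.asarray(time, dtype=float) - event_time

# Two hypothetical embryos imaged on the same clock but reaching NEB at different times.
embryos = {
    "embryo_1": {"time": np.arange(0, 10, 0.5), "neb": 6.0},
    "embryo_2": {"time": np.arange(0, 10, 0.5), "neb": 7.5},
}
aligned = {name: align_to_event(e["time"], e["neb"]) for name, e in embryos.items()}

# After alignment, both curves can be overlaid on a shared "time relative to NEB" axis.
print(aligned["embryo_1"][0], aligned["embryo_2"][0])
```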


In [None]:
# Plot all embryos separately to check the quality of fitting and also select the region of model fitting
line.batch_individual_embryo(input_dir, output_dir=None, attribute='total_intensity_norm', label_age=True)


In [None]:
# Generate graphs of different parameters of an attribute across different conditions
categorical.batch_box(input_dir, output_dir=None, comparison_type=comparison_type,
                      attribute_dict=attribute_dict, between_type='age_type', frame_rate=frame_rate,
                      residue_threshold=residue_threshold)


In [None]:
# Generate pairwise comparison between different parameters of an attribute
correlation.pairwise_scatter(input_dir, output_dir=None, comparison_type=comparison_type, 
                             attribute_dict=None, between_type=None, frame_rate=frame_rate, 
                             residue_threshold=residue_threshold)


#### 5. Statistical testing
Users are expected to have run **Step 1** and **Step 3** so that **videos.csv** and **models.csv** can be read by the programme. They also need to ensure that, in **videos.csv**, the column named **analyze** has **yes** for the videos they would like to analyze and that the column named **s-phase_duration** is filled in. They must also provide an attribute_dict (see below).


In [None]:
from centropy.io import helper
from centropy.statistics import summary, hypothesis_testing, correlation


In [None]:
input_dir = "/Users/isaacwong/Desktop/xxxx"


In [None]:
# parameters
comparison_type='manipulated_protein'
control='Control'
frame_rate=0.5
label_age=False

# define attribute and parameters for statistical testing
attribute_dict = { 'intensity': ['initial_r', 'peak_r', 'added_r','peak_time_r', 'increase_rate_r', 
                                 's-phase_duration'],}


In [None]:
# Obtain a summary of the data e.g. mean, standard deviation, and a relative mean
summary.batch_descriptive_stat(input_dir, output_dir=None, 
                               attribute_dict=attribute_dict, comparison_type=comparison_type, control=control, 
                               res_threshold=1, frame_rate=frame_rate, label_age=label_age)
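
For readers unsure what a "relative mean" is, the sketch below computes the mean, the sample standard deviation, and the mean divided by the control mean for each condition. Defining the relative mean as a ratio to the control is an assumption, as are the function name, the `Knockdown` label, and the numbers; the columns centropy writes may be defined differently.

```python
import statistics

def describe(groups, control):
    """Return mean, sample standard deviation and mean relative to the control group."""
    control_mean = statistics.mean(groups[control])
    summary_table = {}
    for name, values in groups.items():
        mean = statistics.mean(values)
        summary_table[name] = {
            "mean": mean,
            "std": statistics.stdev(values),
            "relative_mean": mean / control_mean,   # assumed definition: ratio to control
        }
    return summary_table

groups = {"Control": [1.0, 1.2, 0.8], "Knockdown": [0.5, 0.6, 0.4]}  # synthetic values
stats_by_group = describe(groups, control="Control")
for name, row in stats_by_group.items():
    print(name, {key: round(value, 3) for key, value in row.items()})
```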


In [None]:
# Hypothesis testing of attribute and parameters in attribute_dict
hypothesis_testing.batch_hypothesis_testing(input_dir, output_dir=None, 
                                            attribute_dict=attribute_dict, comparison_type=comparison_type, 
                                            between_type=None, res_threshold=1, frame_rate=frame_rate)
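
This guide does not say which statistical test `batch_hypothesis_testing` applies. As a generic illustration of comparing one parameter between a control and a manipulated condition, the sketch below runs Welch's t-test from SciPy on synthetic values; the choice of test and the numbers are assumptions, not a description of centropy.

```python
from scipy import stats

# Synthetic values of one fitted parameter in two conditions.
control = [1.02, 0.95, 1.10, 0.98, 1.05, 0.99]
treated = [0.70, 0.82, 0.75, 0.68, 0.79, 0.73]

# Welch's t-test does not assume equal variances in the two groups.
result = stats.ttest_ind(control, treated, equal_var=False)
significant = result.pvalue < 0.05
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.2g}, significant: {significant}")
```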

In [None]:
# Testing the strength of correlation
# pairwise_dict has a format of {attribute: [parameter1, parameter2, ...]}
# Then the program will run correlation on any combination of parameters of a particular attribute.
correlation.batch_correlation(input_dir, output_dir=None, 
                              pairwise_dict=attribute_dict, comparison_type=comparison_type, res_threshold=1, 
                              frame_rate=frame_rate, label_age=label_age)


The output is **correlations.csv**, which is saved in a folder called **Statistics** inside a folder called **Figures**.
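
The idea of running a correlation on every combination of parameters within an attribute can be sketched with `itertools.combinations` and a Pearson coefficient from NumPy. The parameter names come from the `attribute_dict` used above, while the function name, the values, and the choice of Pearson's coefficient are assumptions made for illustration.

```python
from itertools import combinations

import numpy as np

def pairwise_pearson(table, pairwise_dict):
    """Compute Pearson's r for every pair of parameters listed under each attribute."""
    results = {}
    for attribute, parameters in pairwise_dict.items():
        for a, b in combinations(parameters, 2):
            r = np.corrcoef(table[a], table[b])[0, 1]
            results[(attribute, a, b)] = float(r)
    return results

# Synthetic fitted parameters, one value per embryo.
table = {
    "initial_r": [1.0, 2.0, 3.0, 4.0],
    "peak_r": [2.1, 3.9, 6.2, 8.0],
    "peak_time_r": [9.0, 7.5, 6.1, 4.4],
}
pairwise_dict = {"intensity": ["initial_r", "peak_r", "peak_time_r"]}

results = pairwise_pearson(table, pairwise_dict)
for key, r in results.items():
    print(key, round(r, 3))
```

Three parameters give three pairs, so the number of comparisons grows quickly as parameters are added to the dictionary.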