# Notebook summaries

Cyna Shirazinejad, 7/7/21

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
from IPython.display import Image, display
import pandas as pd
import numpy as np

# Outlines

this notebook is part of a series of notebooks which aim to quantify the spatial 
arrangement of branched actin (marked by ARPC3 and N-WASP) in clathrin-mediated endocytosis

the principal aspects of this analysis pipeline include:
* decomposing tracked events to a set of features
* clustering events based on feature similarities with an unsupervised learning model
* removing false-positive DNM2-recruiting CME events 
* isolating single-peaked DNM2 events (CCPs) from hot-spots or insufficiently DNM2-rich events
* predicting the identity of CCP events from new data sets not seen by the model
* comparing the effects of additional tags on the dynamics of CCPs
* measuring the effects of ARPC3 recruitment to CCPs
* measuring the effects of N-WASP recruitment to CCPs

the following is a summary of each notebook's contents (with further details at the header of each notebook):

__Notebook 1: loading data for model generation__
* load all data, including:
    * movies from AP2-tagRFP-T, tagGFP2-DNM2 cell lines
* filter for 'valid' tracks
    * 'valid' tracks are tracks that appear and disappear 
      within the bounds of the movie with no more than 2 consecutive gaps
    * this is characterized when using AP2 as the primary channel for tracking
* creating dataframes of features from tracked events from fitted amplitude and position space to target feature space
    * each track will be decomposed into 30 features, described in the notebook
    * the number of cell line tags will be included as a label (2 or 3)
    * the experiment number will be included as a label (1-8)
    * the date of the experiment
    * the cmeAnalysis classification as "DNM2-positive" (cmeAnalysisDNM2+) 
      or "DNM2-negative" will be included as a label (1 or 0)
* save dataframes and tracks for future notebooks

__Notebook 2: visualize distributions of model features__
* visualize raw features
* compare raw features between:
* * imaging dates
* * experiments/fields-of-view
* * cmeAnalysis DNM2 recruitment status (cmeAnalysis DNM2-positive events will be called cmeDNM2+)

__Notebook 3: sort events into clusters__
* rescale raw features
* apply dimensionality reduction to scaled features
* visualize features' contributions to the projection axes
* apply clustering of tracks in projection space

__Notebook 4: compare clustering models__
* generate clustering models using alternative combinations of training sets
* check if DNM2+ events are uniformly selected by alternative models

__Notebook 5: visualize clustering results__
* visualize lifetime cohorts of cmeDNM2+ events
* visualize the lifetime distribution of cmeAnalysisDNM2+ events
* compare the features of events between different model clusters
* * repeat for events within clusters that are cmeDNM2+ events 
* visualize lifetime cohorts of clustered events 
* visualize examples of events within each cluster
* * repeat for examples of events within each cluster that are cmeAnalysisDNM2+
* attempt to predict the identity of events with supervised classifiers for:
* * events that are within the DNM2+ cluster vs. other clusters
* * events within their respective 5 clusters

__Notebook 6: detect single DNM2 peaks__
* visualize the lifetime distribution of model's DNM2 positive events (DNM2+)
* visualize the frequency decomposition over DNM2 intensity through time measurements
* find the optimal peak-characteristic parameters for a single DNM2 burst
* confirm the model's selection with alternative statistics for goodness-of-fit
* visualize the effects of alternative peak-constraints in the parameter sweep
* visualize the lifetime distribution of single-peaked DNM2+ events or clathrin-coated pits (CCPs)
* visualize examples of CCPs, hotspots, or non-peaking DNM2+ events
* determine the boundaries of clusters and the overlap of cmeAnalysisDNM2+ and members of clusters
* plot AP2 lifetime cohorts of CCPs aligned to DNM2 peaks

__Notebook 7: use trained model for integrating new data__
* load data from cell lines:
* * AP2-tagRFP-T, tagGFP2-DNM2, ARPC3-HaloTag 
* * AP2-tagRFP-T, tagGFP2-DNM2, N-WASP-HaloTag 
* extract features from tracks
* * use existing feature scaler, decomposition axes, and mixture model to predict 
    the identity of each new event
* merge the new data with existing tracks, features, and model cluster identities

__Notebook 8: compare cell lines__
* compare the following attributes:
* * principal component distributions
* * maximum intensities
* * lifetimes
* * initiation rates of events

__Notebook 9: create ARPC3 KDTrees__
* load independently-tracked ARPC3 
* create a KDTree of track (x, y) positions for every frame of the movie

__Notebook 10: parameter sweep for merging CCPs with ARPC3 tracks__
* calculate the fraction of ARPC3+ CCPs as a function of KDTree search radius and minimum number of overlapping AP2 and ARPC3 frames

__Notebook 11a: merge AP2 with ARPC3, 'nan' padding__
* find ARPC3+/- events
* measure the effect of CCP motility with ARPC3 recruitment

__Notebook 11b: merge AP2 with ARPC3, 'zero' padding__
* find ARPC3+/- events
* measure the effect of CCP motility with ARPC3 recruitment

__Notebook 12: measure random ARPC3+ "recruitment" and test alternative CCP-selection models__
* shuffle AP2 and ARPC3 channel pairs between movies and measure ARPC3+ percentages
* test ARPC3+ percentages for alternative CCP-selection models with variable DNM2+ peak requirements

In [None]:
# set a path to the prefix of the pooled working directory with all of the data 
# the folder that contains all data for this analysis is 'ap2dynm2arcp3_project'
# (this folder, containing all raw and tracking data, is available upon request)
unique_user_saved_outputs = '/Volumes/GoogleDrive/My Drive/Drubin Lab/ap2dynm2arcp3_project/arpc3_notebook_outputs_2colormodelalldata'

In [None]:
df_merged_features = pd.read_csv(unique_user_saved_outputs+'/dataframes/df_merged_features.zip')
cohort_groups = np.load(unique_user_saved_outputs+"/dataframes/cohort_groups.npy", allow_pickle=True)
number_of_clusters = np.load(unique_user_saved_outputs+"/dataframes/number_of_clusters.npy", allow_pickle=True)
scaling_distribution_options = np.load(unique_user_saved_outputs+'/dataframes/scaling_distribution_options.npy', allow_pickle=True)
df_pcs_normal_scaled_with_gmm_cluster = pd.read_csv(unique_user_saved_outputs+'/dataframes/df_pcs_normal_scaled_with_gmm_cluster.zip')

Cell lines used
 
Both gene-edited cell lines originate from the parental WTC-11 human induced pluripotent stem cell (hiPSC) line. Both cell lines have bi-allelic knock-ins of TagRFP-T and TagGFP2 at the endogenous loci of AP2’s mu subunit and DNM2, respectively. The first cell line is marked only for the coat (AP2) and the vesicle scission complex (DNM2), while the second cell line has a third diploid knock-in of a HaloTag chimerically fused to ARPC3. ARPC3, a subunit of the ARP2/3 complex, is used to mark branched-actin assembly. The AP2/DNM2 and AP2/DNM2/ARPC3 cell lines will also be denoted as the 2 and 3 color cell lines, respectively. AP2, DNM2, and ARPC3 together mark three endocytic modules: coat initiation, vesicle scission, and cytoskeletal force generation at the plasma membrane.
 
TIRF (2-dimensional + time) Imaging conditions
 
Three days of imaging generated 20 fields of view (FOVs) between two cell lines. All imaging was done with a TIRF objective with a numerical aperture of 1.49, at 60x magnification, at 37 degrees Celsius, with CO2 buffer, and imaged onto a 512x512 pixel camera with an isotropic pixel size of 108 nanometers. There are 8 FOVs apiece for the 2 and 3 color cell lines. Two imaging dates included back-to-back imaging of both cell lines, where the TIRF angle was set and left unchanged between consecutive coverslips. The same microscope settings were used for the third imaging date.
 
Tracking experiments
 
All tracking experiments were carried out in cmeAnalysis. The default setting for the point-spread-function (PSF) was changed from data-fit to model-determined using the theoretical PSF. These parameters were determined by the numerical aperture, magnification, and pixel size of the imaging system. Tracking parameters were left at their defaults except for DNM2-only tracking, where “TrackingGapLength” was set to 0. All imaging data, detection, tracking, and cmeAnalysis inputs are placed in one folder [link here]. Shown below is the organization structure of all live-cell imaging data. Starting from the top in the parental folder “TIRF movies revised”, three folders (shaded in green) contain raw .ND2 files from the microscope. “tracking_data_test_cyna” (shaded in magenta) contains four folders split by imaging date and cell line. Each cell line/date combination (“200819_ADA3”, shaded in yellow) contains raw data (“raw_data”) and TIFF-splits for each channel from the .ND2 files under “split_channel_data” (shaded in blue). Each FOV for the cell line/date combination has one or more folders to indicate which combination of channels was tracked. For instance, “200819_ADA3_001” (the first FOV in the cell line/date combination) contains tracking performed on only DNM2 as well as tracking on a combination of AP2, DNM2, and ARPC3 with AP2 as the primary channel. A settings file (shaded in red) has all entries for cmeAnalysis in the loadConditionData() entry run prior to cmeAnalysis(). The final output used for all downstream analysis is “ProcessedTracks.mat” (shaded in orange). 
 
cmeAnalysis entries
 
All cmeAnalysis loadConditionData() entries are as follows, where ‘root’ marks the path leading up to the provided “ap2dnm2arpc3_project” folder. All of the entries shown in appendix XXXXXX are provided in a “settings” text file on a per-FOV basis in the folder structure described above.
 
Classifying events as true clathrin-coated pits with only AP2 and DNM2 as imaging markers
 
AP2 was used as the fiducial marker for clathrin-mediated endocytosis (CME) in all of our tracking experiments in cmeAnalysis. DNM2 was used as a secondary channel to mark vesicle scission and the termination of vesicle formation. cmeAnalysis classifies the secondary tracking channel, DNM2, as significant or insignificant [citation]. This classification marks whether DNM2 would have been registered as an independent event, that is, without its detection depending on the primary marker, AP2, being detected and tracked through time. The model used for determining DNM2’s significance does not require that DNM2 detections be consecutive or be present in specific stages of AP2’s lifetime. However, previous studies have indicated that DNM2 is recruited at low levels during the early development of the endocytic site and then in a rapid burst prior to scission. 
 
Several previous attempts have been made to distinguish visitor vesicles entering the TIRF field from authentic clathrin-coated pits (CCPs) originating at the plasma membrane [citation]. These studies used a combination of hard percentile thresholds on measured intensities, specific hand-engineered features, or supervised machine learning techniques. Our methods are motivated by extending the scope of features that can be used to select authentic CCPs without relying on single features or arbitrary thresholds in lifetimes and intensities. Phenotypic measurements of CME events typically include the lifetime and the relative brightness of tracked events over time.
 
Automated particle tracking allows for further characterization of tracked events. Our aim was to define observables that can be extracted from the fitted positions and amplitudes of tracked events and that summarize the amplitudes and positions as functions of time, A(t) and (x(t), y(t)), respectively. The observables, or features, are separated into seven modules. The ‘brightness’ module describes each track’s lifetime (marked by AP2 detections in the primary channel) and the maximum intensities of AP2 and DNM2. Subsequent modules describe the motion of events, position of peak intensities, rates of intensity changes through time, relative peak characteristics between channels, signal moments or shapes, and frame-by-frame detection significance of the secondary channel (DNM2). These parameters allow a more careful understanding of how changes to the dynamics of CME can arise from perturbations such as drug treatments, plasmid expression, and knock-downs. Table XXXXX summarizes the thirty features used for the remainder of this analysis. The naming conventions follow those used in the accompanying Jupyter Notebooks (NBs).
 
Here, to accompany the corresponding NBs, I outline the steps used to generate the results, conclusions, and figures of this manuscript. All analyses are done in Python after tracked events are generated in cmeAnalysis using MATLAB. Both Jupyter Notebooks (NBs) and Python scripts are used in this analysis. The folder “cmeAnalysisPostProcessingPythonScripts” contains all custom-written Python scripts necessary for this analysis. Occasionally, non-routine calculations are done in the NBs themselves; otherwise, the Python scripts carry out much of the automation needed to streamline this work.
 
The following descriptions outline the work carried out in each notebook. This text document is meant to accompany the notebooks and provide information without the need to read code. The comments in notebooks will serve as a reference for coding-specifics, while this document will expand on the thinking during and between each step of the analysis. Some comments on coding styles and decision making for routine functions can be found in this document. The intention of this approach is to lay out all details, big and small, in order to encourage reproducibility, allow for easy tweaking, and explore further directions.


# Notebook 1: loading data for model generation

outline:
* load all data, including:
    * movies from AP2-tagRFP-T, tagGFP2-DNM2 cell lines
* filter for 'valid' tracks
    * 'valid' tracks are tracks that appear and disappear 
      within the bounds of the movie with no more than 2 consecutive gaps
    * this is characterized when using AP2 as the primary channel for tracking
* creating dataframes of features from tracked events from fitted amplitude and position space to target feature space
    * each track will be decomposed into 30 features, described in the notebook
    * the number of cell line tags will be included as a label (2 or 3)
    * the experiment number will be included as a label (1-8)
    * the date of the experiment
    * the cmeAnalysis classification as "DNM2-positive" (cmeAnalysisDNM2+) 
      or "DNM2-negative" will be included as a label (1 or 0)
* save dataframes and tracks for future notebooks

All AP2/DNM2 cell line tracks are loaded from “ProcessedTracks.mat”. cmeAnalysis categorizes tracks into 8 groups. The first four consist of single events and the last four consist of splitting/merging events. Category 1, or ‘valid’, tracks are those whose gaps do not exceed the designated ceiling of 2 consecutive gaps. Valid tracks also appear after the movie begins and disappear before the movie ends. Only valid tracks are considered for the remainder of the AP2/DNM2 analysis.
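
The validity rule described above can be sketched in a few lines. This is a minimal illustration and not the cmeAnalysis implementation: the function names, the boolean gap list, and the zero-based frame indices are assumptions made for this example.

```python
def max_consecutive_gaps(gap_mask):
    """Return the length of the longest run of True values in gap_mask."""
    longest = current = 0
    for is_gap in gap_mask:
        current = current + 1 if is_gap else 0
        longest = max(longest, current)
    return longest


def is_valid_track(start_frame, end_frame, n_frames, gap_mask, max_gap=2):
    """Hypothetical check: the track starts after the first frame, ends before
    the last frame, and contains no run of gaps longer than max_gap."""
    inside_movie = start_frame > 0 and end_frame < n_frames - 1
    return inside_movie and max_consecutive_gaps(gap_mask) <= max_gap


# toy track: frames 5..14 of a 100-frame movie with one run of two gap frames
toy_gap_mask = [False, False, True, True, False, False, False, False, False, False]
print(is_valid_track(5, 14, 100, toy_gap_mask))
```

In the actual pipeline the category is read from the cmeAnalysis output, so a check like this is only useful for reasoning about what the label means.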
 
Each valid track is decomposed into the 30 features previously described, using an object-oriented approach to select the desired features for each track. In addition to these features, each track is labeled with the number of channels in the cell line (2), the imaging date, and the prediction of DNM2-positive or DNM2-negative made by cmeAnalysis. From here on, cmeAnalysisDNM2+ and cmeAnalysisDNM2- will be used to reference tracks that cmeAnalysis identified as recruiting sufficient DNM2 and as not recruiting DNM2, respectively.
 
All events and their corresponding labels are then merged into a Pandas dataframe that is saved for future use. The Python-converted track objects from MATLAB are also saved for subsequent use. Tracks are split into a fixed number of chunks so that they can be saved through NumPy, and the same number of splits is reused throughout all notebooks.
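
As a rough sketch of the split-and-save step, the snippet below writes a list of track objects to several pickled `.npy` files and reloads them in order. The helper names, the file-name pattern, and the dictionary stand-ins for tracks are hypothetical; only `np.array_split`, `np.save`, and `np.load` are taken from NumPy itself.

```python
import os
import tempfile

import numpy as np


def save_tracks_in_splits(tracks, out_dir, n_splits):
    """Save a list of track objects as n_splits pickled .npy files."""
    # an explicit object array keeps NumPy from trying to stack the tracks
    arr = np.empty(len(tracks), dtype=object)
    for i, track in enumerate(tracks):
        arr[i] = track
    for i, chunk in enumerate(np.array_split(arr, n_splits)):
        np.save(os.path.join(out_dir, f"tracks_split_{i}.npy"), chunk, allow_pickle=True)


def load_tracks_from_splits(out_dir, n_splits):
    """Reload the splits in order and return one flat list of tracks."""
    tracks = []
    for i in range(n_splits):
        chunk = np.load(os.path.join(out_dir, f"tracks_split_{i}.npy"), allow_pickle=True)
        tracks.extend(chunk.tolist())
    return tracks


# toy "tracks": dictionaries standing in for the converted MATLAB track objects
toy_tracks = [{"track_id": i, "amplitude": [float(i)] * 3} for i in range(10)]
with tempfile.TemporaryDirectory() as tmp_dir:
    save_tracks_in_splits(toy_tracks, tmp_dir, n_splits=4)
    reloaded = load_tracks_from_splits(tmp_dir, n_splits=4)
print(len(reloaded), reloaded == toy_tracks)
```

Reusing one value of `n_splits` everywhere matters because the loader has to know how many files to expect.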


In [None]:
df_merged_features

# Notebook 2: visualize distributions of model features

outline:
* visualize raw features
* compare raw features between:
* * imaging dates
* * experiments/fields-of-view
* * cmeAnalysis DNM2 recruitment status (cmeAnalysis DNM2-positive events will be called cmeDNM2+)

The feature dataframe generated in Notebook 1 is reloaded. Then, raw distributions of each feature are shown. Some features appear to be unimodal (“variation_dnm2”) while most have broad and multimodal distributions (“lifetime”, “md_ap2”, “fraction_significant_dnm2”). Following the merged feature display, features are compared on an experiment-to-experiment basis. These plots show considerable variability in tracks’ brightness on a field-of-view basis; however, these effects are more subtle when comparing features pooled across imaging dates. Using the classification of cmeAnalysisDNM2+/-, features are pooled from all experiments and compared. These results show the overall expected trend that true DNM2 positive events have longer AP2 lifetimes, are brighter, are less motile, and recruit DNM2 for longer durations of the tracked AP2 event.
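
The group comparisons in this notebook are drawn as cumulative distributions. A minimal sketch of an empirical CDF split by a boolean label is given below; the lifetimes and labels are made-up toy values, and `empirical_cdf` is a name introduced only for this illustration.

```python
import numpy as np


def empirical_cdf(values):
    """Return sorted values and the fraction of observations <= each value."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y


# toy lifetimes (seconds) for two hypothetical groups of tracks
toy_lifetimes = np.array([5.0, 8.0, 12.0, 40.0, 55.0, 70.0])
toy_is_positive = np.array([False, False, False, True, True, True])

x_neg, y_neg = empirical_cdf(toy_lifetimes[~toy_is_positive])
x_pos, y_pos = empirical_cdf(toy_lifetimes[toy_is_positive])
print(x_pos, y_pos)
```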

### raw data features of all merged valid tracks

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_features_merged_tracks_histograms.png', height=500, width=500)    

### overlay of features separated by imaging experiment 

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_features_cdf_split_by_experiments.png',height=500, width=500)    

### overlay of features separated by date

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_features_cdf_split_by_dates.png',height=500, width=500)    

### overlay of features separated by cmeAnalysis DNM2 selection

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_features_cdf_split_by_cmeAnalysis_prediction.png',height=500, width=500)    

# Notebook 3: sort events into clusters

outline:
    
* rescale raw features
* apply dimensionality reduction to scaled features
* visualize features' contributions to the projection axes
* apply clustering of tracks in projection space
 
We next sought to discover natural patterns of similarly behaved events within the feature space. The goal of this is to extract events that fall within distinct phenotypic categories that may reveal functional differences. Most importantly, we are now seeking tracks that have a characteristic DNM2 peak that corresponds to vesicle scission. To do so, we target events that have the most DNM2 recruitment relative to the entire population of tracks.
 
While the raw feature distributions showed that tracks could fall within two or more modes, there were no clear tails between populations of events. Furthermore, with 30 dimensions in the feature space, pair-wise feature comparisons and boundary drawing become tedious and can introduce arbitrary decision making.
 
To make the visualization of similarly behaved events possible, we turned to linear dimensionality reduction with Principal Component Analysis. The features differ widely in scale, so it is imperative to rescale them to a common distribution before comparison. We attempted a suite of scaling options and found, when viewing the first two principal components, that scaling each feature to normal and uniform distributions revealed distinct modes. The first two principal components of the normal-scaled data revealed 5 clear modes when viewed in log scale: clear peaks in event densities were surrounded by events with decreasing densities.
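
A minimal sketch of the scale-then-project step is shown below on synthetic data. It assumes a quantile-based mapping to a normal distribution via scikit-learn's `QuantileTransformer`; the notebooks may use a different scaler, and the four toy features are invented for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import QuantileTransformer

rng = np.random.default_rng(0)
# toy feature matrix: 500 events x 4 features on very different scales
toy_features = np.column_stack([
    rng.exponential(scale=30.0, size=500),
    rng.lognormal(mean=5.0, sigma=1.0, size=500),
    rng.uniform(0.0, 1.0, size=500),
    rng.normal(0.0, 1e-3, size=500),
])

# map every feature onto a standard normal distribution via its quantiles
scaler = QuantileTransformer(output_distribution="normal",
                             n_quantiles=min(1000, toy_features.shape[0]),
                             random_state=0)
scaled = scaler.fit_transform(toy_features)

# keep the two directions of largest variance in the scaled feature space
pca = PCA(n_components=2)
projection = pca.fit_transform(scaled)
print(projection.shape, pca.explained_variance_ratio_.sum())
```

Keeping the fitted `scaler` and `pca` objects around is what allows new events to be projected onto the same axes later.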
 
The principal components of the normal-scaled data could be related to the feature space from which the maximal variance directions are computed. We found that the directions of maximal variance in the data corresponded to lifetimes, maximum intensities, motilities, and relative DNM2 recruitment. Features such as lifetimes, intensities, and motilities have been used for generating discriminating boundaries between tracked events to filter ‘authentic’ CME events from putative ‘visitors’. Importantly, since two-color tracking was performed on the AP2/DNM2 cell line, the DNM2 events that are tightly co-localized with AP2 are picked up. Here, without making any assumptions about the timing of DNM2 recruitment, we find that DNM2 recruitment is highly variable with regards to the number of DNM2 detections, the maximum number of consecutive DNM2 detections, and the fraction of the AP2 events’ lifetime that recruits DNM2.
 
The variances of each cluster of events were non-symmetric along the first and second principal components; thus, we turned to clustering with a Gaussian Mixture Model (GMM). While GMMs are traditionally used for density estimation, they provide an advantage here over k-means clustering because they produce soft boundaries and allow non-diagonal covariance matrices to be fitted.
 
With the normal-scaling option, the first two principal components retain 57% of the variance in the scaled feature space. For the purposes of visual validation and reducing complexity, we elected to retain only the first two principal components of this data set. The first two principal components were then clustered with a GMM with variable numbers of mixture components. To verify the visual result of 5 distinct clusters, we calculated the Bayesian Information Criterion (BIC) for each GMM with an increasing number of components. The BIC describes the trade-off between the goodness-of-fit of a model and the number of model parameters; it guards against both underfitting and overfitting. A minimum BIC is desirable; however, we found that the BIC decreases asymptotically. We therefore picked the number of GMM components at the point of diminishing returns, beyond which additional components yield little improvement. When plotting the BIC versus increasing number of GMM components, a clear elbow at 5 components is visible.
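
The BIC sweep can be sketched as follows with scikit-learn's `GaussianMixture`. The data are three synthetic blobs rather than the real principal components, and the range of component counts is an arbitrary choice for the illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# toy 2D "principal components": three well-separated blobs
toy_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
toy_points = np.vstack([c + rng.normal(size=(200, 2)) for c in toy_centers])

component_counts = list(range(1, 7))
bics = []
for n_components in component_counts:
    gmm = GaussianMixture(n_components=n_components, covariance_type="full",
                          random_state=0).fit(toy_points)
    bics.append(gmm.bic(toy_points))

# drop in BIC gained by each additional component; the elbow is where
# these gains collapse toward zero
gains = -np.diff(bics)
for n_components, gain in zip(component_counts[1:], gains):
    print(n_components, round(float(gain), 1))
```

Printing the gains rather than only the minimum mirrors the elbow reasoning above: the component count is chosen where further gains become small, not necessarily where the BIC is lowest.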
 
The highest probability fit of each event to each cluster was used to determine the initial assignment of each event. Then, the cluster with the highest average of maximum DNM2 intensities was chosen to find the preliminary DNM2 positive candidates (DNM2+ from hereon).
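
The assignment and cluster-selection rule can be written compactly with NumPy. The posterior probabilities and intensities below are toy values, and three clusters are used instead of five to keep the example short.

```python
import numpy as np

# toy posterior probabilities (rows: events, columns: 3 hypothetical clusters)
toy_probabilities = np.array([
    [0.90, 0.05, 0.05],
    [0.20, 0.70, 0.10],
    [0.10, 0.10, 0.80],
    [0.05, 0.15, 0.80],
    [0.60, 0.30, 0.10],
])
toy_max_dnm2 = np.array([20.0, 35.0, 150.0, 210.0, 25.0])

# hard assignment: each event goes to its most probable cluster
labels = toy_probabilities.argmax(axis=1)

# mean of the per-event maximum DNM2 intensity within each cluster
cluster_ids = np.unique(labels)
cluster_means = np.array([toy_max_dnm2[labels == k].mean() for k in cluster_ids])

# the cluster with the brightest average DNM2 maximum holds the DNM2+ candidates
dnm2_positive_cluster = cluster_ids[cluster_means.argmax()]
is_dnm2_positive = labels == dnm2_positive_cluster
print(labels, dnm2_positive_cluster, is_dnm2_positive)
```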


### cumulative explained variance across various feature scaling options

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/explained_variance_pca_various_Scalings.png', height=500, width=500)

### relationships between paired PCA components across various feature scaling options

In [None]:
for dist in scaling_distribution_options:
    display(Image(filename=unique_user_saved_outputs+'/plots/pca_2_comp_dist_' + str(dist) + '.png', height=500, width=500))

### cumulative explained variance across select feature scaling options

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/explained_variance_pca_select_scalings.png', height=500, width=500)

### visualize the weights of features in PC-space

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/PC_heatmap.png', height=500, width=500)

### visualize the absolute values of weights of features in PC-space

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/PC_heatmap_abs.png', height=500, width=500)

### visualize the absolute values of weights of features in PC-space, sorted

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/PC_heatmap_abs_sorted.png', height=500, width=500)

### Bayesian Information Criterion for models with varying numbers of mixture components

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/bic_gmm_gs.png', height=500, width=500)

### visualize overlay of cluster means on principal components, as a heatmap

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/first_two_principal_components.png', height=500, width=500)

### visualize overlay of cluster means on principal components, as a scatter plot

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/compenents_overlaid_clusters.png', height=500, width=500)

### events in PC-space, labeled by cmeAnalysis DNM2+ prediction

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/PC_overlay_with_cmeAnalysis_DNM2_predictions.png', height=500, width=500)

### magnitudes of principal components

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/magnitudes_of_pcs.png', height=500, width=500)

### events in PC-space, labeled by values in features

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_features_overlaid_pc_individual.png', height=500, width=500)

# Notebook 4: compare clustering models

outline:
    
* generate clustering models using alternative combinations of training sets
* check if DNM2+ events are uniformly selected by alternative models

### look for effects of different training data combinations on results on DNM2+ predictions

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/stats_dnm2pos_variousmodels.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/boostrapped_dnm2_pos_fractions_averages_across_training_models_ndatasets_mean_and_individual_pred_labeled.png', height=500, width=500)

### comparison of distributions of the fraction of DNM2+ events from each dataset

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/tukey_matrix_dnm2_comparison.png', height=500, width=500)

### boundaries between clusters

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/cluster_boundaries.png', height=500, width=500)

### events in PC-space, labeled by the percentage of all possible clustering models that choose event as DNM2+

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/fraction_models_consider_event_dnm2pos_overlaid_pcs_temp.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/fraction_models_consider_event_dnm2pos_overlaid_pcs_temp_zoomed.png', height=500, width=500)

# Notebook 5: visualize clustering results

outline:
    
* visualize lifetime cohorts of cmeDNM2+ events
* visualize the lifetime distribution of cmeAnalysisDNM2+ events
* compare the features of events between different model clusters
* * repeat for events within clusters that are cmeDNM2+ events 
* visualize lifetime cohorts of clustered events 
* visualize examples of events within each cluster
* * repeat for examples of events within each cluster that are cmeAnalysisDNM2+
* attempt to predict the identity of events with supervised classifiers for:
* * events that are within the DNM2+ cluster vs. other clusters
* * events within their respective 5 clusters

To aid the interpretation of events in different GMM components, tracks were aligned within cohorts to visualize their average behavior. Here, the alignment procedure (Python code available) closely follows cmeAnalysis’ algorithm. First, tracks are binned within lifetime cohorts: <40, 40-60, 60-80, and >80 seconds with inclusive lower bounds and exclusive upper bounds. Then, one of two alignment options is available: cmeAnalysis-style interpolation and maximum intensity alignment.
 
In the first, no modifications are made from cmeAnalysis besides interpolating to the ceiling of the cohort bounds rather than the mean of the upper and lower bounds. Briefly, intensity traces for each track are interpolated to the ceiling of the cohort and averaged across all time domain points in the interpolated space. For all instances of cohort plotting, 0.25 standard deviations above and below the mean are displayed with transparent hue.
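
The binning and interpolation steps can be sketched as below. This is an illustrative reimplementation under simplifying assumptions, not the code used in the notebooks: the helper names are hypothetical, linear interpolation is assumed, and the number of resampled points is passed in directly rather than derived from a cohort ceiling and frame interval.

```python
import numpy as np

COHORT_EDGES = [40, 60, 80]  # seconds; lower bounds inclusive, upper bounds exclusive


def cohort_index(lifetime_s):
    """Return 0 for <40 s, 1 for 40-60 s, 2 for 60-80 s, 3 for the longest cohort."""
    return int(np.digitize(lifetime_s, COHORT_EDGES))


def interpolate_to_length(trace, n_points):
    """Linearly resample a trace onto n_points evenly spaced time points."""
    trace = np.asarray(trace, dtype=float)
    old_t = np.linspace(0.0, 1.0, trace.size)
    new_t = np.linspace(0.0, 1.0, n_points)
    return np.interp(new_t, old_t, trace)


def cohort_average(traces, n_points, band=0.25):
    """Return the mean trace and the mean -/+ band * standard deviation envelope."""
    stacked = np.vstack([interpolate_to_length(t, n_points) for t in traces])
    mean = stacked.mean(axis=0)
    std = stacked.std(axis=0)
    return mean, mean - band * std, mean + band * std


# two toy traces of different lengths that describe the same linear ramp
toy_traces = [[0.0, 2.0, 4.0], [0.0, 1.0, 2.0, 3.0, 4.0]]
mean, lower, upper = cohort_average(toy_traces, n_points=5)
print(cohort_index(40), mean)
```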
 
For the maximum intensity alignment, a similar approach is used. First, the index of the peak intensity (DNM2 in our case) is found. Then, each intensity trace is padded with zeros before its first value and after its last value. Zeros were chosen to mute the effects of outlier events that span far past the DNM2 peak. For each trace, the number of zeros on each side is the difference between the largest number of measured values on that side of the peak across all tracks and the number on that side in the present trace. This way, all the intensity traces within a cohort are of the same length and have their peaks line up. The intensity traces are then interpolated to span a time domain equal to the ceiling of the cohort. Then, intensities are averaged and plotted.
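
A small sketch of the zero-padding step is given below. The function name and toy traces are hypothetical, and the subsequent resampling to the cohort ceiling is left out to keep the example focused on the alignment itself.

```python
import numpy as np


def align_to_peak(traces, peak_traces):
    """Zero-pad traces so that the maxima of peak_traces share one column."""
    peak_idx = [int(np.argmax(p)) for p in peak_traces]
    # largest number of measured values before and after the peak over all tracks
    n_before = max(peak_idx)
    n_after = max(len(t) - 1 - i for t, i in zip(traces, peak_idx))
    aligned = np.zeros((len(traces), n_before + 1 + n_after))
    for row, (trace, i) in enumerate(zip(traces, peak_idx)):
        start = n_before - i  # number of leading zeros for this trace
        aligned[row, start:start + len(trace)] = trace
    return aligned, n_before


# toy DNM2 traces with peaks at different positions
toy_dnm2 = [[1.0, 5.0, 2.0], [0.0, 1.0, 2.0, 9.0, 3.0, 1.0]]
aligned_dnm2, peak_column = align_to_peak(toy_dnm2, toy_dnm2)
print(aligned_dnm2)
print(aligned_dnm2.mean(axis=0))
```

Passing the AP2 traces as `traces` and the DNM2 traces as `peak_traces` would align the AP2 signal to the DNM2 peak in the same way.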
 
For the purposes of notebook 5, where many events do not have distinct DNM2 peaks, especially in non-DNM2+ clusters, the cmeAnalysis approach for plotting was used. Subsequent notebooks use the maximum intensity method and add visualization aids such as centering time around the DNM2 peak (vesicle scission).
 
We first demonstrate that cmeAnalysisDNM2+ events show a gradual DNM2 recruitment that, on average, peaks near the maximum AP2 intensity as expected. Cohort plots of cmeAnalysisDNM2- events show low levels of AP2 recruitment with little to no DNM2 signal above background.
 
CME is a multi-step process that requires the continual recruitment of several protein modules that contribute to the formation of a vesicle. Modeling the kinetics of a multi-step maturation process has indicated that the expected lifetimes of clathrin-coated pits (CCPs) should follow a Rayleigh distribution [citation]. Rayleigh processes describe events with an increasing rate of failure as lifetimes increase. After merging all DNM2-positive events marked by cmeAnalysis, the resulting lifetime distribution was exponential-like. Furthermore, an attempted fit to a Rayleigh distribution was poor, which was confirmed by a Kolmogorov-Smirnov goodness-of-fit test with a p-value of approximately 0.
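
The goodness-of-fit check can be sketched with SciPy as follows. The lifetimes are synthetic draws from an exponential distribution, and fixing the Rayleigh location at zero is an assumption made for this example rather than a documented choice of the analysis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# toy lifetimes (seconds) drawn from an exponential distribution
toy_lifetimes = rng.exponential(scale=20.0, size=2000)

# fit a Rayleigh distribution with its origin fixed at zero
loc, scale = stats.rayleigh.fit(toy_lifetimes, floc=0)

# Kolmogorov-Smirnov test of the sample against the fitted distribution
result = stats.kstest(toy_lifetimes, "rayleigh", args=(loc, scale))
print(result.statistic, result.pvalue)
```

Because the Rayleigh parameters are estimated from the same sample that is tested, the p-value is only approximate; for a mismatch as large as exponential versus Rayleigh it is still vanishingly small.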
 
After manual verification of many representative cmeAnalysisDNM2+ events, a minority of tracks had a rapid recruitment of DNM2 in the late stages of the event where the signal was significantly above background levels. The TIRF imaging only captured events taking place at the basal plasma membrane at the coverslip, so the potential origin of these typically short-lived and dim events is not known. While some cmeAnalysisDNM2+ events had the expected phenotype of gradual then burst-like recruitment of DNM2, a majority of events appeared to be visitors: short-lived, rapidly appearing, and rapidly disappearing.
 
When comparing the cmeAnalysisDNM2+/- predictions against the projected feature space, there was high overlap between DNM2+ events and cmeAnalysisDNM2+ events. However, a majority of the cmeAnalysisDNM2+ events were found to be within the other four GMM components. This refinement criterion through feature analysis yields a more select group of DNM2+ events that can be subject to further filtering for events with true DNM2 peaks.
 
We then compare the features of tracks within different clusters. This is done in two ways: first, all events within a cluster are compared, and second, only the events in a cluster that are also cmeAnalysisDNM2+ are compared. Both cases yielded similar observations: DNM2+ events are brighter, less motile, take longer to assemble, and recruit DNM2 for longer. These results are in agreement with previous methods used to find authentic CME events, in which the tracking of endocytic coat proteins (e.g. AP2) is accompanied by DNM2 recruitment.
 
Then, by comparing aligned cohorts of the GMM component events, the DNM2+ cohort exhibited the same overall behavior as cmeAnalysisDNM2+ events. This averaged phenotype of AP2 recruitment followed by DNM2 recruitment can be seen even though only 29% of cmeAnalysisDNM2+ events are in the GMM DNM2+ cluster. However, using the shortest cohort as a comparison (<40 seconds), the DNM2+ events are 2-3 times brighter on average than the cmeAnalysisDNM2+ events. This result highlights the misleading nature of alignment artifacts that show expected recruitment phenotypes but have contributions from many events that are possible false positives. The other 71% of cmeAnalysisDNM2+ events that fell into the four DNM2- GMM clusters showed similar average phenotypes: short-lived AP2 recruitment with little to no DNM2 recruitment. These results indicate the detection of many short-lived ‘visitor’ vesicles to the TIRF field that are within the detection sensitivity of cmeAnalysis but must be removed for downstream analysis of authentic CME events.
 
Manual verification of events in DNM2- clusters shows the expected phenotype of ‘visitor’ vesicles: rapid AP2 appearance that quickly peaks and quickly disappears on time scales shorter than the minimal expected lifetime of CME events (<15 seconds). These intensity characteristics are the expected result of non-plasma-membrane-bound events that diffuse more freely than their authentic CME counterparts, which nucleate, grow, and pinch off the plasma membrane.
 
The average DNM2 recruitment levels across DNM2+ cohorts provide initial insight into the minimal amount of DNM2 recruitment necessary for successful vesicle scission. Across the two shortest cohorts, <40 and 40-60 seconds, the DNM2 levels reach ~100 a.u. The two longer cohorts, 60-80 and >80 seconds, recruit an average of ~200 a.u., possibly indicating longer helices of DNM2 at the neck of a budding pit [citation].
 
The four clusters described as DNM2- are not used in downstream analysis, as they do not exhibit the phenotypes previously characterized as unambiguous and authentic CME events. While they are broadly described as DNM2-, they do exhibit distinct phenotypes within each cluster. All clusters are short-lived and highly motile; however, they have variable lifetimes and DNM2 characteristics. Some cohorts reveal AP2 and DNM2 dynamics that are highly correlated, indicating that DNM2 is still associated with the freed vesicle at weakly detected levels. On the other hand, there are cohorts that exhibit no DNM2, indicating DNM2’s departure from the detected AP2-coated vesicle.
 
We developed a generic Support Vector Classifier using sklearn’s default parameters (C=1.0, a radial basis function kernel, and a scaled gamma) for supervised classification of the five clusters. This model can be used to predict the identity of DNM2+ events for experiments conducted in an identical manner to those used to generate the clustering model. We found that this simple model could predict labels with 99.80% accuracy. The false positives and false negatives associated with DNM2+ events were found to lie on the boundary between DNM2+ events and an adjacent cluster in principal component space. This adjacent cluster contains events with DNM2 associated for long durations relative to the whole AP2 event; however, a majority of this cluster’s events take place in less than 10 seconds.
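
As a minimal sketch of the classifier settings named above, the following cell trains an SVC with `C=1.0`, an RBF kernel, and `gamma='scale'`. The data here are synthetic blobs standing in for the five GMM-labeled clusters; the blob centers, sample sizes, and the train/test split are assumptions for illustration and are not the notebook's actual inputs.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for event features labeled by five GMM clusters (hypothetical data).
centers = [[0, 0], [5, 5], [-5, 5], [5, -5], [-5, -5]]
X, y = make_blobs(n_samples=1000, centers=centers, cluster_std=1.0, random_state=0)

# Hold out a quarter of the events to estimate accuracy on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# sklearn defaults written out explicitly: C=1.0, RBF kernel, scaled gamma.
clf = SVC(C=1.0, kernel="rbf", gamma="scale")
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.4f}")
```

Misclassified events in a setup like this tend to sit where two clusters meet, which is consistent with the boundary errors described above.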

The result of notebook 4 is the isolation of future CCP candidates using a minimally biased unsupervised machine learning approach with simple, albeit user-selected, features that aid in the interpretability of the model’s separation. These DNM2+ events are not yet designated as CCPs, since they have variable numbers of visible DNM2 peaks. Also, some DNM2+ events weakly recruit DNM2 for long periods of time without forming a sharp burst. Therefore, it is imperative to further refine events into “authentic CCPs” that are marked by single DNM2 bursts. Multi-burst events, or “hot-spots”, are known to be phenotypically distinct from de novo nucleation of AP2 followed by vesicle scission. Also, due to the temporal resolution of our imaging (1 second), we are unable to capture and identify events taking place on time scales of 2 seconds or less, so some events may not exhibit a clear DNM2 peak, making them difficult to classify as DNM2-mediated scission events. 


### cohort groups: [[0, 40], [40, 60], [60, 80], [80, 222]] seconds

### cohort plots of cmeAnalysis +/- DNM2, binned in cohorts defined above

In [None]:
for cohorts in cohort_groups:

    display(Image(filename=unique_user_saved_outputs+'/plots/cmeAnalysis_dynamin2_significance_[['+str(cohorts[0][0])+', '+str(cohorts[0][1])+']].png',height=500, width=500) )

### fit of cmeAnalysis DNM2+ events to Rayleigh distribution

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/cmeAnalyis_dnm2_positive_events_fit_rayleigh.png', height=500, width=500)

### feature comparison between model clusters

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_features_compared_between_classes.png',height=500, width=500)  

### feature comparison between DNM2+ events between experiments

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_features_comparing_dnm2_pos_across_experiments.png',height=500, width=500)

### feature comparison between model clusters for members that overlap with cmeAnalysis' DNM2 positive prediction

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_features_compared_between_classes_overlap_cmeAnalysis_dnm2_positive.png',height=500, width=500)

### feature comparison between model clusters, separating DNM2+ cluster and rest

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_features_compared_between_classes_highlighting_dnm2positive.png',height=500, width=500)

### cohort plots of GMM class, binned in cohorts defined above

In [None]:
for cohorts in cohort_groups:

    display(Image(filename=unique_user_saved_outputs+'/plots/gmm_'+str(number_of_clusters)+'_clusters_cohorts_[['+str(cohorts[0][0])+', '+str(cohorts[0][1])+']].png', height=500, width=500))

### cohort plots of GMM class, fix axes for each cohort to compare clusters

In [None]:
for cohorts in cohort_groups:

    display(Image(filename=unique_user_saved_outputs+'/plots/gmm_fixed_axis_cohort_'+str(number_of_clusters)+'_clusters_cohorts_[['+str(cohorts[0][0])+', '+str(cohorts[0][1])+']].png', height=500, width=500))

In [None]:
gmm_class_indices = []

for i in range(number_of_clusters):
    print('gmm cluster: ' + str(i))
    gmm_class_indices.append(df_pcs_normal_scaled_with_gmm_cluster[df_pcs_normal_scaled_with_gmm_cluster['gmm_predictions']==i].index.values)    

### plot examples of random samples from each GMM class

In [None]:
for i in range(len(gmm_class_indices)):
    print('gmm cluster: ' + str(i))
  
    display(Image(filename=unique_user_saved_outputs+'/plots/gmm_'+str(number_of_clusters)+'_clusters_class_'+str(i)+'.png',height=500, width=500))

### include background significance thresholds

In [None]:
for i in range(len(gmm_class_indices)):
    print('gmm cluster: ' + str(i))
    display(Image(filename=unique_user_saved_outputs+'/plots/gmm_'+str(number_of_clusters)+'_clusters_class_'+str(i)+'_including_background.png', height=500, width=500))

### plot examples of random samples from each GMM class that also overlaps with DNM2 positive from cmeAnalysis

In [None]:
for i in range(len(gmm_class_indices)):
    print('gmm cluster: ' + str(i))
    display(Image(filename=unique_user_saved_outputs+'/plots/gmm_'+str(number_of_clusters)+'_clusters_overlap_cmeDNM2positive_class_'+str(i)+'.png', height=500, width=500))

# Notebook 6: detect single DNM2 peaks

outline:

* visualize the lifetime distribution of model's DNM2 positive events (DNM2+)
* visualize the frequency decomposition over DNM2 intensity through time measurements
* find the optimal peak-characteristic parameters for a single DNM2 burst
* confirm the model's selection with alternative statistics for goodness-of-fit
* visualize the effects of alternative peak-constraints in the parameter sweep
* visualize the lifetime distribution of single-peaked DNM2+ events or clathrin-coated pits (CCPs)
* visualize examples of CCPs, hotspots, or non-peaking DNM2+ events
* determine the boundaries of clusters and the overlap of cmeAnalysisDNM2+ and members of clusters
* plots AP2 lifetime cohorts of CCPs aligned to DNM2 peaks

This notebook makes, to our knowledge, the first attempt to classify tracked CME events by the number of DNM2 bursts they contain. Our strategy was to identify the characteristics of a single DNM2 peak and then count how many DNM2 peaks, if any, show up in an event. Our approach to finding DNM2 peak characteristics was inspired by the assertion that authentic CCP events have a Rayleigh distribution of lifetimes. Therefore, we turned to a parameter sweep where we tested which minimum peak-describing requirements yielded single-DNM2-peak events with lifetimes closely resembling a Rayleigh process.

We first show that the lifetimes of our DNM2+ events appear by eye to more closely resemble a Rayleigh distribution, but are still fit poorly according to three standard goodness-of-fit metrics. In particular, there is a large contribution of very short-lived, DNM2-rich events that creates a distinct peak in the lifetime distribution near 10-15 seconds. The first distribution comparison was made with a two-way Kolmogorov-Smirnov (KS) test, but it is important to note that since the Rayleigh distribution is curve-fit to the raw distribution itself, this test’s results can be misleading. So, we supplemented the KS result with a Chi-squared goodness-of-fit test and a sum-of-squared errors (SSE) comparison. The Chi-squared test was executed as follows: the cumulative observed frequency was estimated by taking the lifetime percentiles at 2% intervals and binning a histogram of lifetimes at these lifetime percentiles; then, an expected cumulative frequency distribution was measured by using Rayleigh-fit parameters at the lifetime percentile thresholds; then, for each percentile bin, the fraction of all events in each bin was calculated and used to generate a discretized cumulative frequency array; finally, these cumulative frequency distributions were compared with a one-way Chi-squared test. The SSE was measured in the following way: observed frequencies were computed from a normalized-to-one histogram sampled at 2% percentile intervals; then, a probability density function was estimated from the Rayleigh fit parameters and sampled at the bin edges of the histogram; finally, the differences between the observed and estimated frequencies were squared and summed.
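
The three metrics can be sketched as follows on synthetic lifetimes. This is an illustration under stated assumptions, not the notebook's implementation: the Rayleigh location is fixed at zero, the Chi-squared test compares per-bin counts rather than cumulative frequencies, and the density is sampled at the left bin edges.

```python
import numpy as np
from scipy import stats

# Hypothetical lifetimes (seconds) drawn from a Rayleigh process for illustration.
rng = np.random.default_rng(0)
lifetimes = rng.rayleigh(scale=30.0, size=2000)

# Maximum likelihood Rayleigh fit; fixing loc at 0 is an assumption of this sketch.
loc, scale = stats.rayleigh.fit(lifetimes, floc=0)

# KS test against the fitted distribution. Because the parameters were
# estimated from the same sample, the p-value is optimistic.
ks_stat, ks_p = stats.kstest(lifetimes, "rayleigh", args=(loc, scale))

# Chi-squared test with bin edges at 2% lifetime percentiles (50 bins).
edges = np.percentile(lifetimes, np.arange(0, 102, 2))
observed, _ = np.histogram(lifetimes, bins=edges)
expected_frac = np.diff(stats.rayleigh.cdf(edges, loc=loc, scale=scale))
# Rescale so expected and observed counts share the same total.
expected = expected_frac / expected_frac.sum() * observed.sum()
chi2_stat, chi2_p = stats.chisquare(observed, expected)

# SSE between the normalized histogram and the fitted density at left bin edges.
density, _ = np.histogram(lifetimes, bins=edges, density=True)
pdf_at_edges = stats.rayleigh.pdf(edges[:-1], loc=loc, scale=scale)
sse = float(np.sum((density - pdf_at_edges) ** 2))

print(f"KS p={ks_p:.4f}, chi-squared p={chi2_p:.4f}, SSE={sse:.3e}")
```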

To help isolate the slow intensity oscillations that represent DNM2 bursts, a signal filter was applied to DNM2 intensities to remove high-frequency noise. We applied a Fast Fourier Transform (FFT) to visualize the frequency contributions of all DNM2 intensity traces from DNM2+ events stitched together. The available frequencies are hard-capped by the Nyquist frequency, which is half the sampling frequency: 0.5 cycles per second, or a period of 2 seconds. The Nyquist frequency describes the fastest oscillation that can be used in a Fourier reconstruction of our DNM2 waveforms, since we are imaging at twice its rate (1 second per frame). The two-dimensional histogram of FFT intensities versus sample frequencies shows two features. First, there is a near-homogeneous contribution of high-frequency signals, faster than 0.2 cycles per second, in the DNM2 intensity traces. Second, below this frequency, there is a broad distribution of slower oscillations that are up to an order of magnitude more intense than the high-frequency signals. This result indicates a non-uniform range of slow frequencies that correspond to possible DNM2 peaks. The 0.2 cycles per second elbow served as the cut-off frequency applied to all DNM2 intensities via a low-pass, fourth-order Butterworth filter.
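
A minimal sketch of the spectrum and filter described above, applied to one synthetic trace. The Gaussian burst, the noise level, and the use of zero-phase filtering (`filtfilt`) are assumptions for illustration; the notebook may apply the filter differently.

```python
import numpy as np
from scipy import signal

fs = 1.0       # sampling frequency in Hz (1 frame per second)
cutoff = 0.2   # cut-off frequency in cycles per second

# Hypothetical DNM2 trace: one slow burst plus high-frequency noise.
rng = np.random.default_rng(1)
t = np.arange(120) / fs
burst = 200.0 * np.exp(-0.5 * ((t - 60.0) / 6.0) ** 2)
trace = burst + rng.normal(0.0, 20.0, size=t.size)

# One-sided spectrum; the largest available frequency is Nyquist = fs / 2.
freqs = np.fft.rfftfreq(trace.size, d=1.0 / fs)
spectrum = np.abs(np.fft.rfft(trace))

# Fourth-order low-pass Butterworth filter. filtfilt runs it forward and
# backward so the burst is not shifted in time.
b, a = signal.butter(4, cutoff, btype="low", fs=fs)
filtered = signal.filtfilt(b, a, trace)

print(f"Nyquist frequency: {freqs.max()} Hz")
```

Zero-phase filtering keeps the peak position aligned with the raw trace, which matters if peak times are later compared with AP2 dynamics.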

Previous work using molecule counting has established a basis for the minimum number of DNM2 molecules necessary for vesicle scission. However, since our imaging was not calibrated for single-molecule sensitivity, we needed to discover a proxy for sufficiently bright peaks that correspond to the minimum DNM2 requirement. Additionally, quickly appearing and bright signals can appear in intensity traces as a result of unbound vesicles appearing near a tracked event. Therefore, it is important to establish the requirements for minimum peak widths that correspond to de facto DNM2 recruitment and disassembly. Finally, since multiple peaks can appear in a single event, it is important to establish the minimum time requirement between peaks. This serves to eliminate false-positive peaks that appear too close to prominent peaks, which can result in an event being misclassified as multi-peaked.

Our algorithm’s outline is as follows: select a combination of three peak-requirement parameters (minimum height, minimum width, and minimum distance between peaks), find which events have filtered DNM2 intensities with exactly one peak, and make a statistical comparison between the lifetimes of single-peaked events and a Rayleigh distribution fit by Maximum Likelihood Estimation. The combinations of peak requirements span a meshed grid housing all combinations of the three parameters across selected ranges. The minimum height was chosen to be 50 to 300 a.u. in 25 a.u. intervals, since the DNM2+ cohort plot peaks were within this range. The minimum peak width was chosen as 1 to 10 seconds to span time scales from the sampling interval (at which frequency modes beyond the Nyquist limit are inaccessible) to near the minimum recorded CME lifetimes (15 seconds). The minimum distance between peaks was chosen as 1 to 20 seconds to cover the length of lifetimes around the minimum recorded CME lifetimes. We used the maximum KS p-value to select the best-fit parameters and validated this goodness-of-fit with the Chi-squared p-value and SSE measurement. The KS test implementation (SciPy) operates under the null hypothesis that the two distributions are identical. 
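
The sweep can be sketched with `scipy.signal.find_peaks`, whose `height`, `width`, and `distance` arguments map onto the three peak requirements (at 1 second per frame, samples equal seconds). The helper names `count_peaks` and `sweep`, the synthetic traces, the coarse grid, and the zero-location Rayleigh fit are all assumptions for illustration.

```python
import itertools
import numpy as np
from scipy import signal, stats

def count_peaks(trace, height, width, distance):
    """Count peaks meeting the minimum height, width, and spacing."""
    peaks, _ = signal.find_peaks(trace, height=height, width=width, distance=distance)
    return len(peaks)

def sweep(traces, lifetimes, heights, widths, distances, min_events=5):
    """Return (KS p-value, height, width, distance, n_single) with the largest p-value."""
    results = []
    for h, w, d in itertools.product(heights, widths, distances):
        single = np.array([count_peaks(tr, h, w, d) == 1 for tr in traces])
        if single.sum() < min_events:
            continue  # too few single-peaked events to fit a distribution
        loc, scale = stats.rayleigh.fit(lifetimes[single], floc=0)
        _, p = stats.kstest(lifetimes[single], "rayleigh", args=(loc, scale))
        results.append((float(p), h, w, d, int(single.sum())))
    return max(results) if results else None

# Hypothetical filtered DNM2 traces: one Gaussian burst per event.
rng = np.random.default_rng(2)
lifetimes = np.maximum(rng.rayleigh(scale=30.0, size=200), 20.0).round()
traces = []
for life in lifetimes:
    t = np.arange(int(life))
    amplitude = rng.uniform(50.0, 300.0)
    traces.append(amplitude * np.exp(-0.5 * ((t - 0.5 * life) / 3.0) ** 2))

# Coarse grid for speed; the full ranges in the text are much denser.
best = sweep(traces, lifetimes, heights=[50, 125, 200], widths=[1, 5], distances=[1, 17])
print(best)
```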

The optimal parameters to describe a DNM2 peak were a minimum height of 125 a.u., a minimum width of 5 seconds, and a minimum distance of 17 seconds between peaks; single peaks meeting these requirements were found in 42.61% of DNM2+ events. The p-value of the KS test was 0.6457, the p-value of the Chi-squared test was 0.9381, and the SSE between the observed and estimated probability density functions was 0.000125094. The minimum peak height is in close agreement with the peak amplitude found in interpolation-based cohort plots across all lifetime bins. However, this interpolation method distorts the maximum of each event, since the maxima do not necessarily occur at the same index in the interpolated time domain. The minimum peak width is, to our knowledge, not a parameter that has been measured in the field. The minimum distance between peaks is close to the minimum recorded CME lifetime. This last result is difficult to compare closely, however, since the minimum CME lifetimes referred to were imaged with single-molecule sensitivity, making their 15 second minimum a more accurate lifetime because the first and last clathrin molecules between initiation and vesicle uncoating were captured.

The single-peaked DNM2+ events will hereon be referred to as CCPs to differentiate them from multi-peaked hot-spots and non-peaked events without detectable DNM2 bursts. We randomly plotted 25 examples of CCPs and 25 examples of non-CCP DNM2+ events to demonstrate that our results agree with the expected phenotypes. Although occasional extra “peaks” show up in CCPs, we note that our experiments cannot (and perhaps few feasible studies can) effectively capture a functional readout of the precise signature of vesicle scission. These uniform, model-mined parameters are a significant step towards using consistent heuristics in selecting relevant events in studies of CME. Importantly, when making comparisons across various experimental conditions, it is imperative to establish criteria for what the properties of wild-type assembly dynamics are. While this approach does not eliminate the need for or undermine the importance of molecule counting, it does offer a possible workaround for determining successful scission based on reverse modeling of expected lifetime distributions. Further studies will be necessary to validate the results of the selected DNM2 peak characteristics. After all, there is still no consensus in the CME field on how many DNM2 molecules are necessary for scission. Two conflicting reports have indicated that either integer multiples of helices or 1.5 helices are necessary for scission, although these studies were carried out in different cell types. Additionally, neither study assayed whether the DNM2-recruiting vesicles recruited downstream CME proteins. These modeling efforts aim to bridge the gap between the need for robust functional read-outs of vesicle scission and the feasibility of such experiments.

Now that we have CCP requirements, we find that of the 59,239 valid tracks used to build the model, 10.08% of all events are DNM2+ and 4.30% of all events are CCPs. The remainder of the DNM2+ events had either no peaks (46.78%) or two or more peaks (10.60%). Binned and aligned-to-peak-DNM2 CCPs showed the expected phenotype of AP2/DNM2 dynamics: gradual AP2 assembly accompanied by weak DNM2 recruitment that bursts near the peak of AP2 recruitment. To verify that DNM2- clusters did not contain sufficient DNM2 peaks to mediate vesicle scission, we searched for DNM2 peaks across the other GMM component events. Less than one percent of “cluster 2” events contained DNM2 peaks and they were entirely absent in other clusters. As a reminder, “cluster 2” was labeled as putative “DNM2-carrying visitors”, so the presence of occasional bright DNM2 signals was not entirely surprising. 

To our surprise, the CCPs were situated near the geometric center of DNM2+ events in principal component space, the non-peaked events were situated largely near “cluster 2”, and an increasing number of peaks was positively correlated with increasing PC-0 and PC-1. Using a RandomForestClassifier (scikit-learn) with default settings (100 trees) trained on raw event features, non-peaked events, CCPs, and multi-peaked events among all DNM2+ events were classified with accuracies of 84.33%, 92.03%, and 92.30%, respectively. However, when we trained classifiers on the two-dimensional principal component projections of the same events, we obtained accuracies of only 82.85%, 74.08%, and 90.35%, respectively. These results suggest that the principal component projection loses information encoding the number of DNM2 peaks in each event, and that this information can be recovered by using all training features. 
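
The information loss described above can be illustrated with a deliberately contrived example: the label depends on a low-variance feature that a two-component PCA discards. The data, the binary label, and the hold-out split are assumptions for illustration; the notebook's comparison uses three event classes and real features.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical features: two high-variance columns dominate the variance,
# while the label is encoded in a low-variance column.
rng = np.random.default_rng(3)
X = rng.normal(size=(600, 10))
X[:, :2] *= 5.0
y = (X[:, -1] > 0).astype(int)

# Two-dimensional projection, analogous to the PC-0/PC-1 space.
X_pcs = PCA(n_components=2).fit_transform(X)

def holdout_accuracy(features, labels):
    """Train a default 100-tree random forest and score it on held-out events."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, labels, test_size=0.25, random_state=0, stratify=labels
    )
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    return clf.fit(X_tr, y_tr).score(X_te, y_te)

acc_raw = holdout_accuracy(X, y)
acc_pcs = holdout_accuracy(X_pcs, y)
print(f"raw features: {acc_raw:.3f}, PC projections: {acc_pcs:.3f}")
```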

We verified the average assembly dynamics of non-peaked DNM2+ events and found a surprising result: the lifetime-binned and aligned-to-DNM2-peak events showed characteristic AP2/DNM2 recruitment. However, it is important to note that when comparing the longest-lived non-peaked cohort (>=80 second AP2 lifetime) and the shortest single-DNM2-peaked cohort (<40 second AP2 lifetime), the two had a similar peak AP2 brightness but the single-peaked events had nearly double the DNM2 brightness. A random sample of 25 non-peaked events shows that a select few events contain what one might call a “DNM2 peak”; however, we note and accept an inescapable consequence of automation: no algorithm is perfect. The standards for selecting events relevant to a study of CME dynamics often involve arbitrary event selection and filtering criteria with few easy-to-interpret standards. Cherry-picking relevant data is an easy way to fall into traps of misinterpretation and false conclusions. Therefore, we fully accept the limitations of these results and expect they may recover a large portion of events that behave as expected; our work leaves room for error that can be understood in terms of quantifiable differences between algorithmic selection and hand-drawn events. We hope these efforts will move the studies of CME dynamics in a direction where biological insights can be made clearer by removing events that can be classified as noise. 


### fit candidate CCPs and hotspot lifetime distribution to Rayleigh distribution

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/dnm2_positive_events_fit_rayleigh.png', height=500, width=500)

### visualize all frequencies of DNM2 signal

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/fft_of_dnm2_signals.png', height=500, width=500)

### percentage of models that consider event a CCP

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/percent_models_consider_event_ccp_overlaid_pcs.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/percent_models_consider_event_ccp_overlaid_pcs_zoom_dnm2cluster.png', height=500, width=500)

### percentage of models that consider event a CCP, only considering models with good Rayleigh fit

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/percent_models_sigKS_consider_event_ccp_overlaid_pcs.png', height=500, width=500)

### percentage of models that consider event a hot-spot, only considering models with good Rayleigh fit

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/percent_models_sigKS_consider_event_hotspot_overlaid_pcs.png', height=500, width=500)

### percentage of models that consider event non-peaked, only considering models with good Rayleigh fit

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/percent_models_sigKS_consider_event_nonpeaked_overlaid_pcs.png', height=500, width=500)

### number of single-peaked events across candidate models

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/num_single_peaked.png', height=500, width=500)

### KS-test p-value across candidate models

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ks_val_per_ccpmodel.png', height=500, width=500)

### Chi-squared p-value across candidate models

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/cspval_val_per_ccpmodel.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/csstat_val_per_ccpmodel.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/chisqstat_vs_chisqpval.png', height=500, width=500)

### comparisons between goodness-of-fit statistics and numbers of CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/numsingle_peaked_vs_csgofstat_colored_csgofpval.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/chi_squared_gof_pval_vs_num_single_peaked_with_width_colored_chi_squared_gof_stat.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/significance_position_with_width_vs_num_single_peaked_with_width_colored_chi_squared_gof_stat.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/significance_position_with_width_vs_num_single_peaked_with_width_colored_chi_squared_gof_pval.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/three_stats_tests_ccp_finding.png', height=500, width=500)

### slices of heatmaps of the number of CCPs along varying values of the search parameters' axes: minimum peak height, peak width, and peak-to-peak distance

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/distance_height_width_distheight_projection_hotspot_parameter_sweep.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/distance_height_width_distwidth_projection_hotspot_parameter_sweep_numccps.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/distance_height_width_distwidth_projection_hotspot_parameter_sweep_widthdist_cspval.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/distance_height_width_distwidth_projection_hotspot_parameter_sweep_widthdist_sse.png', height=500, width=500)

In [None]:
display(Image(filename=unique_user_saved_outputs+'/plots/distance_height_width_distwidth_projection_hotspot_parameter_sweep_widthdist_ks.png', height=500, width=500))

### lifetime distribution of CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ccp_events_fit_rayleigh.png', height=500, width=500)

### lifetime distribution of non-peaked and multi-peaked events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/hotspot_events_fit_rayleigh.png', height=500, width=500)

### plot an example CCP and example hot-spot

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/example_ccp_hotspot_subplot.png', height=500, width=500)

### examples of CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/sample_ccps_kept.png', height=500, width=500)

### examples of hot-spots and non-peaked events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/sample_hotspots_and_nonpeaked_events_discarded.png', height=500, width=500)

### plot the decision boundaries of 5-means clusters, cluster centroids, and log-scale counts of PC components

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/PC_overlay_with_cluster_boundaries.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/PC_overlay_with_cluster_boundaries_paperversion.png', height=500, width=500)

### number of peaks in DNM2+ events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/num_peaks_best_ccpmodel.png', unconfined=True, height=700, width=700)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/num_peaks_overlaid_pcs.png', unconfined=True, height=500, width=500)

### the number of peaks overlaid on the events' principal component projections

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/probs_dnm2plus_overlaidpcs.png', unconfined=True, height=500, width=500)

### comparing principal component projections of events with the probability that events are DNM2+

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/num_peaks_vs_probsdnm2plus.png', unconfined=True, height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/pc0_vs_numpeaks.png', unconfined=True, height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/pc1_vs_numpeaks.png', unconfined=True, height=500, width=500)

### cohort plots of CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_2colorcellline_cohorts_centered_zero.png', height=500, width=500)

### cohort plots of all non-peaked DNM2+ events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_2colorcellline_cohorts_centered_zero_non_peaked.png', height=500, width=500)

### examples of non-peaked events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/sample_non_peaked_events_discarded.png', height=500, width=500)

### cohort plots of events in clusters

In [None]:
for cluster_num in range(number_of_clusters):
    display(Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_2colorcellline_cohorts_centered_zero_cluster_'+str(cluster_num)+'.png', height=500, width=500))

# Notebook 7: incorporating new data for analysis


Now that we have established standards for selecting CCPs, imaging data acquired identically to the data used to generate the model can be incorporated into this analysis stream. It is important to note that of the 13 ARPC3-tagged movies, 8 were imaged on two separate dates, back-to-back with the 2 color cell line. The other 5 ARPC3-tagged movies were imaged independently but with the same microscope settings used in the aforementioned movies. The 9 AP2/DNM2/N-WASP movies were also not imaged back-to-back with the control AP2/DNM2 cell lines.

Similar to the procedure used in notebook 1, the ProcessedTracks.mat objects from the tracked AP2/DNM2 channels of the 3 color cell lines were loaded. The third channels were not tracked simultaneously with AP2 and DNM2. Feature extraction was applied to each track, and the principal component axes generated from the 2 color cell line were used to find the principal component projections of the 3 color cell line events. The previously fit GMM was used to find DNM2+ events, and the same DNM2 filtering and peak-finding parameters were used to find CCPs. CCP predictions, GMM classifications, merged raw track arrays, and feature dataframes were saved for use in subsequent notebooks.
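
The key point of this step is that nothing is re-fit on the new data: the transforms learned from the 2 color cell line are only applied. A minimal sketch, assuming a scaler, a two-component PCA, and a five-component GMM were fit on the training features (the random feature matrices and the choice of `StandardScaler` are placeholders for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrices: 30 features per track.
rng = np.random.default_rng(4)
train_features = rng.normal(size=(500, 30))   # 2 color cell line (model data)
new_features = rng.normal(size=(80, 30))      # 3 color cell line (new data)

# Fit every stage on the training data only.
scaler = StandardScaler().fit(train_features)
pca = PCA(n_components=2).fit(scaler.transform(train_features))
train_pcs = pca.transform(scaler.transform(train_features))
gmm = GaussianMixture(n_components=5, random_state=0).fit(train_pcs)

# Apply the fitted stages to the new data; no fit or fit_transform here.
new_pcs = pca.transform(scaler.transform(new_features))
new_clusters = gmm.predict(new_pcs)

print(new_pcs.shape, np.bincount(new_clusters, minlength=5))
```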


# Notebook 8: compare cell lines

### compare contributions to principal components from separate experiments used to generate model (2 color cell line)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/pc01_trainingdata_nb7.png', height=500, width=500)

### compare contributions to principal components from separate experiments in ARPC3-tagged cell line

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/pc01_inferencearpc3linedata_nb7.png', height=500, width=500)

### compare contributions to principal components from separate imaging dates

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/pc01_imagingdates.png', height=500, width=500)

### compare contributions to principal components from separate cell lines

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/pc01_celllines.png', height=500, width=500)

### compare contributions to principal components from separate cell lines, per experiment

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/pc01_celllines_by_exp.png', height=500, width=500)

### compare features between cell lines' DNM2+ events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/feature_comparison_dnm2_positive.png',unconfined=False)   

### feature comparison between cell lines' CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/feature_comparison_ccps.png',height=500, width=500)    

### compare lifetimes between 2 and 3 color experiments

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/lifetime_comparison_boxplot_celllines.png', height=500, width=500)

### compare max AP2 intensities between 2 and 3 color experiments

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2max_comparison_boxplot_celllines.png', height=500, width=500)

### compare max DNM2 intensities between 2 and 3 color experiments

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/dnm2max_comparison_boxplot_celllines.png', height=500, width=500)

### compare AP2 lifetimes of 2 cell lines

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/lifetime_histogram_comparison_celllines.png', height=500, width=500)

### compare CCP lifetimes of 2 cell lines

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/lifetime_cdf_histogram_comparison_celllines.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2initiationdnm2peak_comparing_celllines.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2dnm2peak_cdf_histogram_comparison_celllines.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2dnm2peak_and_lifetime_cdf_histogram_comparison_celllines_merged.png', height=500, width=500)

### observe where the lifetime distributions of the two cell lines diverge

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/lifetime_cellline_comparison_heatmap.png', height=500, width=500)

### the frequency of initiation of CCPs and all detected events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_event_initiation_over_frames_cdf.png', height=500, width=500)

### measure the relative distribution of mixture components across experiments

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/proportions_gmm_clusters_between_experiments_hue_cellline.png', height=500, width=500)

### measure the rate of events from all mixture components across all experiments

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/proportions_gmm_clusters_between_experiments_hue_cellline.png', height=500, width=500)

### calculate initiation rates between 2 cell lines

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/dnm2plus_initiation_per_exp_celllinecomp.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ccp_initiation_per_exp_celllinecomp.png', height=500, width=500)

# Notebook 9: create ARPC3 KDTrees

outline:

* load independently-tracked ARPC3 tracks
* create a KDTree of track (x, y) positions for every frame of the movie
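
One way to sketch the per-frame trees is shown below. The track layout (dicts with `frames`, `x`, and `y` arrays) and the helper name `build_frame_trees` are assumptions for illustration, not the notebook's data structures.

```python
import numpy as np
from scipy.spatial import cKDTree

def build_frame_trees(tracks, n_frames):
    """Build one KD-tree of (x, y) positions per movie frame.

    Returns the trees (None for frames with no detections) and, per frame,
    the track index that owns each point in that frame's tree.
    """
    points = [[] for _ in range(n_frames)]
    owners = [[] for _ in range(n_frames)]
    for track_id, track in enumerate(tracks):
        for frame, x, y in zip(track["frames"], track["x"], track["y"]):
            if np.isnan(x) or np.isnan(y):
                continue  # skip gap frames without a fitted position
            points[frame].append((x, y))
            owners[frame].append(track_id)
    trees = [cKDTree(np.array(p)) if p else None for p in points]
    return trees, owners

# Two hypothetical ARPC3 tracks in a three-frame movie.
tracks = [
    {"frames": [0, 1, 2], "x": [10.0, 10.5, np.nan], "y": [20.0, 20.5, np.nan]},
    {"frames": [1, 2], "x": [40.0, 41.0], "y": [5.0, 5.0]},
]
trees, owners = build_frame_trees(tracks, n_frames=3)

# Which tracks have a detection within 2 pixels of (10, 20) in frame 1?
hits = trees[1].query_ball_point([10.0, 20.0], r=2.0)
print([owners[1][i] for i in hits])
```

Keeping the owner list alongside each tree lets a spatial hit be mapped back to the ARPC3 track it came from.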

# Notebook 10: parameter sweep for merging CCPs with ARPC3 tracks

outline:
    
* calculate the fraction of ARPC3+ CCPs as a function of KDTree search radius and minimum number of overlapping AP2 and ARPC3 frames
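
A minimal sketch of such a sweep, assuming positions in pixels and counting a frame as overlapping when any ARPC3 detection falls within the search radius of the AP2 position. The function name, the data layout, and the choice not to require a single continuous ARPC3 track are assumptions for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

def fraction_arpc3_positive(ccps, trees, radius, min_frames):
    """Fraction of CCPs with at least min_frames frames near an ARPC3 detection."""
    n_positive = 0
    for ccp in ccps:
        overlapping = 0
        for frame, x, y in zip(ccp["frames"], ccp["x"], ccp["y"]):
            tree = trees.get(frame)
            if tree is not None and tree.query_ball_point([x, y], r=radius):
                overlapping += 1
        n_positive += overlapping >= min_frames
    return n_positive / len(ccps)

# Hypothetical ARPC3 detections per frame and two stationary CCPs.
arpc3_by_frame = {
    0: np.array([[11.0, 10.0]]),
    1: np.array([[11.0, 10.0]]),
    2: np.array([[13.0, 10.0]]),
}
trees = {frame: cKDTree(xy) for frame, xy in arpc3_by_frame.items()}
ccps = [
    {"frames": range(5), "x": [10.0] * 5, "y": [10.0] * 5},
    {"frames": range(5), "x": [50.0] * 5, "y": [50.0] * 5},
]

# Sweep the search radius and the minimum number of overlapping frames.
radii = [1.5, 3.5]
min_frame_counts = [1, 2, 3]
fractions = np.array(
    [[fraction_arpc3_positive(ccps, trees, r, m) for m in min_frame_counts] for r in radii]
)
print(fractions)
```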

### measuring the effect of search radius: logging the fraction ARPC3+ as a function of radius and minimum associated frames

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/percent_arpc3_positive_radius_num.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/percent_arpc3_positive_radius_num_matrix.png', height=500, width=500)

### the fraction of ARPC3+ events as a function of search radius, holding overlapping frames constant

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/percent_arpc3_positive_separate_minframes.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/percent_arpc3_varying_radius_and_average.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/percent_arpc3_varying_radius_and_average_overlaidlinesphases.png', height=500, width=500)

### fraction ARPC3+ across all possible model combinations

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/percent_arpc3_histogram_all.png', height=500, width=500)

### fraction ARPC3+ for minimum 1 frame overlap, varying radius

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/percent_arpc3_histogram_just_num_assoc_one.png', height=500, width=500)

# Notebook 11: merge AP2 with ARPC3, 'nan' padding

outline:
    
* find ARPC3+/- events
* measure the effect of ARPC3 recruitment on CCP motility
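
The summary does not spell out what 'nan' and 'zero' padding mean. One plausible reading, sketched below, is that ARPC3 amplitudes are placed on the AP2 track's frame axis and frames without an ARPC3 detection are filled with either `nan` or `0`. The helper `pad_to_ap2_frames` and the toy track are hypothetical.

```python
import numpy as np


def pad_to_ap2_frames(ap2_frames, arpc3_frames, arpc3_amplitudes, fill_value):
    """Place ARPC3 amplitudes on the AP2 frame axis, filling uncovered frames."""
    padded = np.full(len(ap2_frames), fill_value, dtype=float)
    lookup = dict(zip(arpc3_frames, arpc3_amplitudes))
    for i, frame in enumerate(ap2_frames):
        if frame in lookup:
            padded[i] = lookup[frame]
    return padded


# hypothetical AP2 track spanning frames 10-14, ARPC3 detected in frames 12-13
ap2_frames = [10, 11, 12, 13, 14]
arpc3_frames = [12, 13]
arpc3_amplitudes = [4.0, 6.0]

nan_padded = pad_to_ap2_frames(ap2_frames, arpc3_frames, arpc3_amplitudes, np.nan)
zero_padded = pad_to_ap2_frames(ap2_frames, arpc3_frames, arpc3_amplitudes, 0.0)

# nan padding ignores uncovered frames; zero padding pulls the average down
nan_mean = np.nanmean(nan_padded)
zero_mean = zero_padded.mean()
```

Under this reading, the choice of padding mainly changes averaged quantities such as cohort plots, which is why the two variants are worth comparing side by side.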

### AP2 motility before and after scission, comparing ARPC3+/-

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2movementbeforeafterarpc3plusminus_nanpadding.png', height=500, width=500)

### AP2 motility before and after scission, comparing ARPC3+/-, only ARPC3 events present at scission

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2movementbeforeafterarpc3plusminus_onlysigarpc3atscission_nanpadding.png', height=500, width=500)

### AP2 motility before and after scission, comparing ARPC3+/-, only ARPC3 events not present at scission

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2movementbeforeafterarpc3plusminus_onlynonsigarpc3atscission_nanpadding.png', height=500, width=500)

# Notebook 11b: merge AP2 with ARPC3, 'zero' padding

outline:
    
* find ARPC3+/- events
* measure the effect of ARPC3 recruitment on CCP motility

### AP2 motility before and after scission, comparing ARPC3+/-

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2movementbeforeafterarpc3plusminus_zeropadding.png', height=500, width=500)

### AP2 motility before and after scission, comparing ARPC3+/-, only ARPC3 events present at scission

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2movementbeforeafterarpc3plusminus_onlysigarpc3atscission_zeropadding.png', height=500, width=500)

### AP2 motility before and after scission, comparing ARPC3+/-, only ARPC3 events not present at scission

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2movementbeforeafterarpc3plusminus_onlynonsigarpc3atscission_zeropadding.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3minusccps_cohorts_centered_zero_zeropadding_randomized.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3plusccps_cohorts_centered_zero_zeropadding_allarpc3events_randomized.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3plusccps_cohorts_centered_zero_zeropadding_sigatdnm2peakarpc3events_randomized.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3plusccps_cohorts_centered_zero_zeropadding_nonsigatdnm2peakarpc3events_randomized.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/arpc3_lifetimes_randomized.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/exp_pairs_randomized_arpc3association.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/fractionarpc3pos_true_random.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/fractionarpc3pos_and_std_competingccpmodels.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/fractionarpc3pos_true_altccp_random.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/numframesvsmeansep_randomtruealtccp.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/numframesvsmeansep_randomtruealtccp_colorizedbyarpsigatdnm2peakoverlaid.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/numframesvsmeansep_randomtruealtccp_colorizedbyarpsigatdnm2peaksplitapart.png', height=500, width=500)

# Notebook 13a: analyze all ARPC3+ CCPs, 'nan' padding

### AP2 lifetimes of ARPC3+/- events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2lifetimes_plusminus_arpc3_histogram_counts_nanpadding.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2lifetimes_plusminus_arpc3_histogram_density_nanpadding.png', height=500, width=500)

### CCP lifetimes (AP2 initiation to DNM2 peak) of ARPC3+/- events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ccplifetimes_plusminus_arpc3_histogram_counts_nanpadding.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ccplifetimes_plusminus_arpc3_histogram_density_nanpadding.png', height=500, width=500)

### histogram of ARPC3 lifetimes

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/arpc3_lifetimes_nanpadding.png', height=500, width=500)

### ECDF of AP2 and CCP lifetimes, merged view

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/merged_lifetimes_arpc3_ecdfs_nanpadding.png', height=500, width=500)
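
For reference, an empirical CDF of lifetimes can be computed as in the sketch below. The function name `ecdf` and the lifetime values are made up for illustration; the plotting code used in the notebook is not shown in this summary.

```python
import numpy as np


def ecdf(values):
    """Return sorted values and the fraction of observations <= each value."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


# hypothetical lifetimes in seconds
lifetimes_s = [60, 20, 35, 90, 35]
x, y = ecdf(lifetimes_s)
```

Plotting `x` against `y` as a step function for each group (AP2 lifetime, CCP lifetime, ARPC3+/-) gives a merged view that does not depend on histogram bin choices.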

### aligned AP2/DNM2 intensities to DNM2 peaks, stacked, for ARPC3- CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/arpc3minus_cohorts_centered_on_single_dnm2_max_spot_nanpadding.png', height=500, width=500)

### aligned AP2/DNM2/ARPC3 intensities to DNM2 peaks, stacked, for ARPC3+ CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/arpc3plus_cohorts_centered_on_single_dnm2_max_spot_nanpadding.png', height=500, width=500)
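
Aligning traces to their DNM2 peak can be sketched as below: each trace is shifted so that its DNM2 maximum lands in the same column, uncovered positions are left as `nan`, and the stack is averaged with `np.nanmean`. The function `align_to_peak`, the window size, and the toy traces are assumptions, not the notebook's code.

```python
import numpy as np


def align_to_peak(dnm2_traces, window):
    """Stack traces so each DNM2 maximum sits at column index `window`."""
    aligned = np.full((len(dnm2_traces), 2 * window + 1), np.nan)
    for row, trace in enumerate(dnm2_traces):
        trace = np.asarray(trace, dtype=float)
        peak = int(np.argmax(trace))
        for offset in range(-window, window + 1):
            idx = peak + offset
            # frames outside the track stay nan
            if 0 <= idx < len(trace):
                aligned[row, offset + window] = trace[idx]
    return aligned


# hypothetical DNM2 intensity traces of different lengths
traces = [[1, 2, 5, 2], [0, 3, 9, 4, 1], [7, 2]]
aligned = align_to_peak(traces, window=2)

# average over tracks, ignoring frames where a track has no data
mean_trace = np.nanmean(aligned, axis=0)
```

The same peak indices could be reused to shift the AP2 and ARPC3 channels of each event, so that all channels share one time axis centered on the DNM2 peak.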

### cohort plots of ARPC3- CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3minusccps_cohorts_nanpadding.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3minusccps_cohorts_centered_zero_nanpadding.png', height=500, width=500)

### ARPC3- cohort with AP2/DNM2 separation

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3minusccps_cohorts_overlaid_separation_nanpadding.png', height=500, width=500)

### proof that ARPC3 separation from AP2 is not due to frame delays

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/proof_that_arpc3_separation_exists_nanpadding.png', height=500, width=500)

### cohort plots of ARPC3+ CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3plusccps_cohorts_centered_zero_nanpadding.png', height=500, width=500)

### ARPC3+ cohort with AP2/DNM2 and AP2/ARPC3 separation

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3plusccps_cohorts_overlaid_separation_nanpadding.png', height=500, width=500)

### comparing ARPC3+/- aligned intensities

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_cohorts_stacked_arpc3_no_distance_nanpadding.png', height=500, width=500)

### comparing ARPC3+/- aligned intensities with separations

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_cohorts_stacked_arpc3_nanpadding.png', height=500, width=500)

# Notebook 13b: analyze all ARPC3+ CCPs, 'zero' padding

### aligned AP2/DNM2 intensities to DNM2 peaks, stacked, for ARPC3- CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/arpc3minus_cohorts_centered_on_single_dnm2_max_spot_zeropadding.png', height=500, width=500)

### aligned AP2/DNM2/ARPC3 intensities to DNM2 peaks, stacked, for ARPC3+ CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/arpc3plus_cohorts_centered_on_single_dnm2_max_spot_zeropadding.png', height=500, width=500)

### cohort plots of ARPC3- CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3minusccps_cohorts_zeropadding.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3minusccps_cohorts_centered_zero_zeropadding.png', height=500, width=500)

### ARPC3- cohort with AP2/DNM2 separation

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3minusccps_cohorts_overlaid_separation_zeropadding.png', height=500, width=500)

### cohort plots of ARPC3+ CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3plusccps_cohorts_centered_zero_zeropadding.png', height=500, width=500)

### ARPC3+ cohort with AP2/DNM2 and AP2/ARPC3 separation

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3plusccps_cohorts_overlaid_separation_zeropadding.png', height=500, width=500)

### comparing ARPC3+/- aligned intensities

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_cohorts_stacked_arpc3_no_distance_zeropadding.png', height=500, width=500)

### comparing ARPC3+/- aligned intensities with separations

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_cohorts_stacked_arpc3_zeropadding.png', height=500, width=500)

# Notebook 14

### AP2 lifetimes of ARPC3+/- events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2lifetimes_plusminus_arpc3_histogram_density_nonsigdnm2peak_zeropadding.png', height=500, width=500)

### CCP lifetimes (AP2 initiation to DNM2 peak) of ARPC3+/- events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ccplifetimes_plusminus_arpc3_histogram_density_nonsigdnm2peak_zeropadding.png', height=500, width=500)

### histogram of ARPC3 lifetimes

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/arpc3_lifetimes_nonsigdnm2peak_zeropadding.png', height=500, width=500)

### ECDF of AP2 and CCP lifetimes, merged view

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/merged_lifetimes_arpc3_ecdfs_nonsigdnm2peak_zeropadding.png', height=500, width=500)

### cohort plots of ARPC3+ CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3plusccps_cohorts_centered_zero_nonsigdnm2peak_zeropadding.png', height=500, width=500)

### ARPC3+ cohort with AP2/DNM2 and AP2/ARPC3 separation

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3plusccps_cohorts_overlaid_separation_zeropadding_nonsig.png', height=500, width=500)

### calculating "straightness index" of tracks, +/- ARPC3

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/straightness_index_merged_zero_nonsig.png', height=500, width=500)
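
The exact definition used for the "straightness index" is not given in this summary. A common definition, assumed here, is the net displacement of a track divided by its total path length, so that a perfectly straight track scores 1 and a meandering track scores closer to 0. The function name and toy tracks below are illustrative.

```python
import numpy as np


def straightness_index(xs, ys):
    """Net displacement divided by total path length (nan for immobile tracks)."""
    points = np.column_stack([xs, ys]).astype(float)
    steps = np.linalg.norm(np.diff(points, axis=0), axis=1)
    path_length = steps.sum()
    if path_length == 0:
        return np.nan
    net_displacement = np.linalg.norm(points[-1] - points[0])
    return net_displacement / path_length


# hypothetical tracks: one moving in a straight line, one taking a staircase path
straight = straightness_index([0, 1, 2, 3], [0, 0, 0, 0])
zigzag = straightness_index([0, 1, 1, 2], [0, 0, 1, 1])
```

Computing this per track and comparing the distributions for ARPC3+ and ARPC3- events is one way to ask whether ARPC3 recruitment coincides with more directed movement.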

# Notebook 15

### AP2 lifetimes of ARPC3+/- events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2lifetimes_plusminus_arpc3_histogram_density_sigdnm2peak_zeropadding.png', height=500, width=500)

### CCP lifetimes (AP2 initiation to DNM2 peak) of ARPC3+/- events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ccplifetimes_plusminus_arpc3_histogram_density_sigdnm2peak_zeropadding.png', height=500, width=500)

### histogram of ARPC3 lifetimes

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/arpc3_lifetimes_sigdnm2peak_zeropadding.png', height=500, width=500)

### ECDF of AP2 and CCP lifetimes, merged view

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/merged_lifetimes_arpc3_ecdfs_sigdnm2peak_zeropadding.png', height=500, width=500)

### cohort plots of ARPC3+ CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3plusccps_cohorts_centered_zero_sigdnm2peak_zeropadding.png', height=500, width=500)

### ARPC3+ cohort with AP2/DNM2 and AP2/ARPC3 separation

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_arpc3plusccps_cohorts_overlaid_separation_zeropadding_sig.png', height=500, width=500)

### comparing ARPC3+/- aligned intensities

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_cohorts_sigdnm2peak_stacked_arpc3_no_distance_zeropadding.png', height=500, width=500)

### comparing ARPC3+/- aligned intensities with separations

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_cohorts_sigdnm2_stacked_arpc3_zeropadding.png', height=500, width=500)

### calculating "straightness index" of tracks, +/- ARPC3

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/straightness_index_merged_zero_sig.png', height=500, width=500)

# Notebook 16

### AP2 lifetimes of N-WASP+/- events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2lifetimes_plusminus_nwasp_histogram_counts.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ap2lifetimes_plusminus_nwasp_histogram_density.png', height=500, width=500)

### CCP lifetimes (AP2 initiation to DNM2 peak) of N-WASP+/- events

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ccplifetimes_plusminus_nwasp_histogram_counts.png', height=500, width=500)

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/ccplifetimes_plusminus_nwasp_histogram_density.png', height=500, width=500)

### histogram of N-WASP lifetimes

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/nwasp_lifetimes.png', height=500, width=500)

### ECDF of AP2, CCP, and N-WASP lifetimes, merged view

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/merged_lifetimes_nwasp_ecdfs.png', height=500, width=500)

### cohorts of N-WASP- CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_nwaspminusccps_cohorts_centered_zero.png', height=500, width=500)

### cohorts of N-WASP+ CCPs

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_nwaspplusccps_cohorts_centered_zero.png', height=500, width=500)

### comparing N-WASP+/- aligned intensities

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_cohorts_stacked_no_distance_nwasp.png', unconfined=False, height=500, width=500)

### N-WASP+ cohort with AP2/DNM2 and AP2/N-WASP separation

In [None]:
Image(filename=unique_user_saved_outputs+'/plots/all_overlaid_ap2dnm2_nwaspplusccps_cohorts_overlaid_separation.png', height=500, width=500)