# Introduction

This notebook combines the results of moving-area segmentation, registration analysis and spatial auto-correlation analysis to visualise summary statistics per experiment:

- Visualise Speed Summary Stats: plots the average speed in each well for each knockout type, in both the moving regions and the coordinated regions.
- Distribution of speeds: shows the distribution of speeds per well and per knockout.
- Visualise Mask Summary Stats (Coordination vs Segmentation): compares the proportion of moving region with the proportion of coordinated region, per well and per knockout type.

The calculations required for the above visualisations are carried out in [1_segment_moving_regions](hbec/1_segment_moving_regions.ipynb), [3_registration](hbec/3_registration.ipynb) and [3b_spatial_auto_correlation](hbec/3b_spatial_auto_correlation.ipynb).

# Imports

In [None]:
import numpy as np
import pandas as pd
import scipy as sp
import os
import re
import cv2

from matplotlib import pyplot as plt
from tqdm import tqdm
from fam13a import utils

import altair as alt
from altair import datum

# Constants

In [None]:
PROJ_ROOT = utils.here()
# declare the data input directory
HBEC_ROOT = os.path.join(PROJ_ROOT, 'data', 'processed', 'hbec')
# print list of experiment IDs
print(os.listdir(HBEC_ROOT))

In [None]:
# choose experiment data to load
EXP_ID = 'N67030-66_6_perc'

# Process

In [None]:
summary_df, zero_mask_df = utils.generate_mask_and_speed_summary_df(EXP_ID)

In [None]:
zero_mask_df

`zero_mask_df` lists the data points that will be missing from the speed plots below. If `mask_type` is `1_motion`, the data point has no average speed value for either the motion mask or the coordination mask. If `mask_type` is `2_coordination`, the average speed value is missing for the coordination mask only.
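
The rule above can be sketched on a toy table. This is only an illustration: the real `zero_mask_df` is produced by `utils.generate_mask_and_speed_summary_df`, and the `group_id` and `batch_id` columns, the example values and the `MISSING_IN` lookup used here are assumptions, not part of that function's documented output.

```python
import pandas as pd

# Toy stand-in for zero_mask_df (column names other than mask_type are assumed).
toy_zero_mask_df = pd.DataFrame({
    'group_id': ['NT', 'KO_A'],
    'batch_id': ['b1', 'b2'],
    'mask_type': ['1_motion', '2_coordination'],
})

# Hypothetical lookup: which speed plots lose a point for each kind of empty mask.
MISSING_IN = {
    '1_motion': ['1_motion', '2_coordination'],  # empty motion mask: both are missing
    '2_coordination': ['2_coordination'],        # empty coordination mask: only that one
}

# One row per (data point, plot in which it will be missing).
toy_zero_mask_df['missing_in'] = toy_zero_mask_df['mask_type'].map(MISSING_IN)
exploded = toy_zero_mask_df.explode('missing_in')
print(exploded[['group_id', 'batch_id', 'missing_in']])
```

Expanding the table this way makes it easy to cross-check the gaps in each speed plot against the rows that caused them.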

# Visualise Mask Summary Stats (Coordination vs Segmentation)

In [None]:
mask_summary_df = summary_df.copy().drop(columns=['speed'])
mask_summary_df = mask_summary_df.drop_duplicates()

# mean and standard deviation of the masked proportion per group and mask type
mask_agg_df = mask_summary_df.groupby(
    ['group_id', 'mask_type']
).mask_ratio.agg(['mean', 'std']).reset_index()

# the same statistics, additionally split by batch
mask_agg_df_per_group_per_batch = mask_summary_df.groupby(
    ['group_id', 'mask_type', 'batch_id']
).mask_ratio.agg(['mean', 'std']).reset_index()

In [None]:
# for the motion mask, normalise the masked proportion by the mean proportion of the NT wells
mask_t = '1_motion'
is_mask_t = mask_summary_df.mask_type == mask_t
is_nt = mask_summary_df.group_id.str.contains('NT')
nt_average = mask_summary_df.loc[is_mask_t & is_nt, 'mask_ratio'].mean()
mask_summary_df.loc[is_mask_t, 'mask_ratio'] /= nt_average
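
As a quick check of what the normalisation step above does, here is a minimal sketch on made-up numbers. The toy values and group names are assumptions chosen only so the arithmetic is easy to follow; the column names mirror those used in `mask_summary_df`.

```python
import pandas as pd

# Made-up data with the same columns as mask_summary_df.
toy = pd.DataFrame({
    'group_id': ['NT_1', 'NT_2', 'KO_A', 'NT_1', 'KO_A'],
    'mask_type': ['1_motion', '1_motion', '1_motion',
                  '2_coordination', '2_coordination'],
    'mask_ratio': [0.4, 0.6, 0.25, 0.3, 0.1],
})

is_motion = toy['mask_type'] == '1_motion'
is_nt = toy['group_id'].str.contains('NT')

# Mean motion-mask proportion over the NT rows: (0.4 + 0.6) / 2 = 0.5
nt_average = toy.loc[is_motion & is_nt, 'mask_ratio'].mean()

# Divide only the motion-mask rows; coordination rows are left unchanged.
toy.loc[is_motion, 'mask_ratio'] /= nt_average
print(toy)
```

After this step the NT motion rows average 1, so every other motion value reads as a multiple of the NT level, while the `2_coordination` rows remain raw proportions.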

In [None]:
base = alt.Chart(
    mask_summary_df
).properties(
    width=600,
    height=200
)

error_bars = base.mark_errorbar(extent='stdev').encode(
  x=alt.X('mask_ratio:Q', scale=alt.Scale(zero=False), axis=alt.Axis(title='masked proportion')),
  y=alt.Y('group_id:N')
)

points = base.mark_point(filled=False, color='black', size=50).encode(
  x=alt.X('mask_ratio:Q', aggregate='mean', axis=alt.Axis(title='masked proportion')),
  y=alt.Y('group_id:N'),
)

all_points = base.mark_square(size=50).encode(
    y=alt.Y('group_id:N', axis=alt.Axis(title='experiment')),
    x=alt.X('mask_ratio:Q'),
    color='batch_id'
)

(error_bars + points + all_points).facet(
    'mask_type:N',
    columns = 1
).configure_axis(
    labelFontSize=16,
    titleFontSize=16
).configure_legend(
    labelFontSize = 16
).configure_header(
    labelFontSize=20
).resolve_scale(
    x='independent'
)