# Analysis of experimental bacterial cell orientation in microchannels
Analyse the cell orientation distribution in microchannels from the experiment [1].

## Purpose
Compare simulation and experimental cell orientation data to calibrate simulation parameters.

## Methodology
1. Read experimental data from `data/Ecoli_in_microchannels_EXPER_DATA/*_wid_len_orient_EXP.txt` as a string.
2. Strip string characters and convert the string array into a numerical array.
3. Save the `orient` data from each file in a dictionary.
4. Plot a cell orientation histogram for each file.
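
Steps 1 to 3 can be sketched on a single line of text as follows. This is only an illustration under stated assumptions: the helper name `parse_array_line`, the sample line, and the dictionary key `example_file` are hypothetical and are not taken from the notebook.

```python
import numpy as np

def parse_array_line(line):
    """Turn a line such as 'orient = [75.4, 102.6]' into (name, float array)."""
    # Split the variable name from the bracketed list of values
    name, _, values = line.partition("=")
    # Strip surrounding whitespace and the square brackets
    values = values.strip().strip("[]")
    # Convert the comma-separated string values into a numerical array
    numbers = np.array([float(v) for v in values.split(",") if v.strip()])
    return name.strip(), numbers

# Hypothetical sample line in the same layout as the experimental files
name, orient = parse_array_line("orient = [75.4713, 102.6692, 94.99006]   ")

# Store the orientation array in a dictionary keyed by a (hypothetical) file label
angles = {"example_file": orient}
```

The notebook itself reads the files with `np.loadtxt` and `np.fromstring`; the sketch uses plain string methods only to make each step visible.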

### Filename explanation
Example: `4X9_20200609_15_01ch_wid_len_orient_EXP.txt`
- `4X9` ... average number of cells in a channel (height and width)
- `20200609` ... date of recording
- `15` ... ???
- `01ch` ... channel number
- `wid_len_orient` ... calculated variable(s)
- `EXP` ... experimental
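
A minimal sketch of how these fields might be extracted from a filename is given here. The helper name `parse_experiment_filename` and the returned keys are illustrative assumptions. Because the meaning of the third field is not documented, the sketch keeps it together with the channel label instead of interpreting it.

```python
from os.path import basename

SUFFIX = "_wid_len_orient_EXP.txt"

def parse_experiment_filename(path):
    """Split an experimental filename into its descriptive parts."""
    stem = basename(path).replace(SUFFIX, "")
    parts = stem.split("_")
    # First field, e.g. '4X9': two integers separated by 'X'
    height, width = (int(x) for x in parts[0].split("X"))
    # Second field, e.g. '20200609': date of recording as YYYYMMDD
    raw_date = parts[1]
    date = "{}-{}-{}".format(raw_date[:4], raw_date[4:6], raw_date[6:])
    # Remaining fields (undocumented field plus channel number) kept as one label
    channel_name = "_".join(parts[2:])
    return {"height": height, "width": width, "date": date, "channel_name": channel_name}

info = parse_experiment_filename("4X9_20200609_15_01ch_wid_len_orient_EXP.txt")
```

For the example filename this yields height 4, width 9, date `2020-06-09`, and the label `15_01ch`.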

### File structure
Each `wid_len_orient` file contains three lines: width, length and orientation.
``` 
width = [5.60917, 5.787379, 4.9629235, ...]
length = [13.823369, 33.754707, 17.643686, ...]
orient = [75.4713, 102.6692, 94.99006, ...]   
```
All arrays have the same length, equal to the total number of cells tracked over the course of the experiment/imaging.  
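
The layout above can be read into a dictionary of arrays and checked for equal lengths, as sketched below. The function name `read_wid_len_orient` is hypothetical, and the sample text is a shortened stand-in for a real file (only the first three values of each array are used).

```python
import io
import numpy as np

def read_wid_len_orient(handle):
    """Read 'name = [v1, v2, ...]' lines into a {name: float array} dictionary."""
    arrays = {}
    for line in handle:
        if "=" not in line:
            continue  # skip blank or malformed lines
        name, values = line.split("=", 1)
        values = values.strip().strip("[]")
        arrays[name.strip()] = np.array([float(v) for v in values.split(",")])
    return arrays

# Shortened, hypothetical file content in the documented three-line layout
sample = io.StringIO(
    "width = [5.60917, 5.787379, 4.9629235]\n"
    "length = [13.823369, 33.754707, 17.643686]\n"
    "orient = [75.4713, 102.6692, 94.99006]   \n"
)
cells = read_wid_len_orient(sample)

# All three arrays should describe the same set of tracked cells
lengths = {len(values) for values in cells.values()}
```

If `lengths` contains more than one value, the file does not follow the structure described above.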

To find the total imaging time, use the `numb_uniq` file for the same experiment: take the length of its `num_uniq` 1D array and multiply it by 3 min, the interval between frames.
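
As a small worked example of this calculation, assuming one `num_uniq` entry per imaged frame (the function name and the sample values are hypothetical):

```python
DT_MIN = 3.0  # interval between frames [min], as stated in the text

def total_imaging_time(num_uniq, dt_min=DT_MIN):
    """Return total imaging time in minutes: number of frames times the interval."""
    return len(num_uniq) * dt_min

# Hypothetical num_uniq array with four frames
num_uniq = [12, 14, 15, 17]
elapsed = total_imaging_time(num_uniq)  # 4 frames * 3 min = 12.0 min
```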

## WIP - improvements
Use this section only if the notebook is not final.

## Notable TODOs:

- Plot orientation histogram for each simulation step.
- Use `plotly` to visualize multiple steps in one figure.

## Results
The single file analyzed so far shows a normal distribution of angles.

## Suggested next steps
- Compare experimental and simulation data plots

# Setup
## Library import

In [1]:
import os
from os.path import join as join_paths, basename as get_basename
import glob

import numpy as np
from scipy.stats import gaussian_kde

from matplotlib import pyplot as plt
%matplotlib inline
import plotly.express as px

In [2]:
os.chdir("/home/i/igors-dubanevics/projects/bacteria-microchannel")
os.getcwd()

'/home/i/igors-dubanevics/projects/bacteria-microchannel'

## Parameter definition

In [3]:
data_dir_path = "data/Ecoli_in_microchannels_EXPER_DATA"

out_dir_path = join_paths("scratch", get_basename(data_dir_path))

dt = 3.0 # Time interval between consecutive frames [min]

## Pre-data import

In [4]:
# Create output dir
os.makedirs(out_dir_path, exist_ok=True)

## Data import

In [5]:
# Read and process orientation files
# Orientation stored in a dictionary with format {filename: orientation_array}
angles = {}

# Find data in directory
# filenames = sorted(glob.glob(join_paths(data_dir_path, '4X9_20200609_9_01ch_wid_len_orient_EXP.txt')))
filenames = sorted(glob.glob(join_paths(data_dir_path, '*_wid_len_orient_EXP.txt')))


for filename in filenames:
    # Get filename part that describes experiment
    file_info = get_basename(filename).replace("_wid_len_orient_EXP.txt","")
    name_vars = file_info.split("_")

    # Extract height and width of the channel
    (height, width) = [int(x) for x in name_vars[0].split('X')]
    # Date
    date = "{}-{}-{}".format(name_vars[1][:4],name_vars[1][4:6],name_vars[1][6:])
    # Channel name
    if len(name_vars[2:]) > 1:
        channel_name = '_'.join(name_vars[2:])
    else:
        channel_name = name_vars[2]

    # Load only orientation line, and skip width and length lines 
    data = np.loadtxt(filename, skiprows=2, dtype=str)
    data = ''.join(data)
    for string in ["orient=[","]"]:
        data = data.replace(string,'')
    data = np.fromstring(data, dtype=float, sep=',')
    # Align orientation parallel to the channel length (shift by -90 deg)
    data = data - 90

    angles[file_info] = data

In [18]:
# Get total elapsed experimental time
# Store time in dictionary {filename: time}
exp_time = {}

# Calculate total elapsed experiment time
filenames = sorted(glob.glob(join_paths(data_dir_path, "*_numb_uniq_EXP.txt")))
for filename in filenames:
    # Get filename part that describes experiment
    file_info = get_basename(filename).replace("_numb_uniq_EXP.txt","")
    name_vars = file_info.split("_")

    data = np.loadtxt(filename, skiprows=0, dtype=str)
    frames = ''.join(data)
    for string in ["num_uniq=[","]"]:
        frames = ''.join(frames).replace(string,'')
    frames = np.fromstring(frames, dtype=float, sep=',')

    # Calculate total imaging time
    # Multiply the number of frames by the imaging interval
    time = len(frames) * dt

    exp_time[file_info] = time


## Data processing

In [7]:
# Perform Kernel Density Estimation (KDE)
angles_kde = {}
for key, value in angles.items():
    kde = gaussian_kde(value, bw_method='scott', weights=None)
    angles_kde[key] = kde

In [8]:
x_grid = np.linspace(-90,90,100)

In [22]:
# for key in [list(angles.keys())[0]]:
for key in angles.keys():
    fig, ax = plt.subplots(figsize=(5,3.5))
    plt.title("filename: {}".format(key), loc='left')
    bin_num = 90
    ax.hist(angles[key], bin_num, fc='gray', histtype='stepfilled', alpha=0.3, 
        density=True, label='EXP ({:d} bins)'.format(bin_num))
    ax.plot(x_grid, angles_kde[key](x_grid), linewidth=2, label='KDE')

    plt.xlabel(r"cell orientation ($\arccos{\hat{r}\cdot\hat{x}}$) [deg]")
    plt.ylabel("Density (N={:d})".format(len(angles[key])))
    plt.tight_layout()

    # Put elapsed experiment time
    plt.text(.01, .99, "EXP time: {} min".format(exp_time[key]), ha='left', va='top', transform=ax.transAxes)
    
    plt.legend(loc='upper right', frameon=False)
    # Save figure
    for ext in ['png']:
        fig_outname =  "{file_info}_orient_EXP.{ext}".format(file_info=key, ext=ext)
        fig.savefig(join_paths(out_dir_path, fig_outname), dpi=300)
    plt.close()

In [20]:
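# Plot an interactive histogram of cell orientation for one file
# `angles` is keyed by filename stem; the first file is used here as an example
key = list(angles.keys())[0]
fig = px.histogram(angles[key], 
    title="Cell orientation along x-axis")
fig.update_layout(bargap=0.2,
    xaxis={"title":  "Orientation [deg]"},
    yaxis={"title":  "Count (N=%d)"%(len(angles[key]))})
fig.show()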

## References
1. Koldaeva, Anzhelika, et al. "Population genetics in microchannels." Proceedings of the National Academy of Sciences 119.12 (2022): e2120821119.