# Scaling Analysis for Co-Culture Quantification

We have developed a co-culture analysis notebook to evaluate a single time-lapse sequence. Now we scale this analysis across multiple time-lapse sequences and extract quantitative insights across multiple cell populations.

# 1. Setup

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
# Install dependencies

%pip uninstall acia -y
%pip install acia==0.3.0

# dependencies for Cellpose segmentation
%pip uninstall -y cellpose
%pip install --use-pep517 git+https://www.github.com/mouseland/cellpose.git@8ef88040d9aec85737e12c3f2c2969ecf149f7f0

## 1.1 Parameters

In [None]:
from pathlib import Path

analysis_script = str(Path("../../case_studies/02_FluorescenceCoCulture/FluorescenceLabeling.ipynb").absolute().resolve())

In [None]:
import os
print(os.getcwd())

In [None]:
import os

# place to store the data
dataset_folder = Path("02_CoCulture")

# make sure the data exists (otherwise download)
if not dataset_folder.is_dir():
    !wget -O 02_co_culture_dataset.zip https://fz-juelich.sciebo.de/s/V1Mo0VTRYDuDy2r/download
    !unzip 02_co_culture_dataset.zip

## 1.2 Specify the analysis script

Specify the analysis script (here the `analysis_script` variable set above) that you want to apply to the image data.

**Note:** If the analysis script is not located in the same folder as this notebook, you need to specify the path to it.

In [None]:
print(Path(analysis_script).resolve().absolute(), Path(analysis_script).exists())
assert Path(analysis_script).exists(), f"The notebook '{analysis_script}' does not exist!"

# 2. Information about the underlying data

We summarize the underlying data by listing all image sequences found in the dataset folder.

In [None]:
image_ids = [str(p.absolute()) for p in dataset_folder.glob("*.tiff")]

## TODO: give an overview about the data
print(image_ids)
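
The overview can go beyond printing the raw paths. The sketch below is one possible way to list each sequence with its size on disk. `summarize_files` and `print_overview` are hypothetical helpers, not part of `acia`, and they only assume that `image_ids` holds file paths as strings.

```python
from pathlib import Path


def summarize_files(paths):
    """Return (file name, size in MiB) for every existing path, sorted by name."""
    rows = []
    for p in map(Path, paths):
        if p.is_file():
            rows.append((p.name, p.stat().st_size / 2**20))
    return sorted(rows)


def print_overview(paths):
    """Print the number of sequences, their total size, and one line per file."""
    rows = summarize_files(paths)
    total = sum(size for _, size in rows)
    print(f"{len(rows)} image sequence(s), {total:.1f} MiB in total")
    for name, size in rows:
        print(f"  {name}: {size:.1f} MiB")
```

Calling `print_overview(image_ids)` would then replace the plain `print(image_ids)` above.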

In [None]:
#!rm -r automated_executions
#!rm -r 02_CoCulture

# 3. Scale the analysis script to all image sequences

Now we apply the analysis script to every image sequence individually 🚀! You can lean back and enjoy the working computer 😎 🥂

**Note:** For heavy analysis scripts or large datasets, this process may take a while (from minutes to hours or even days). The top-level progress bar shows the overall progress and indicates how long the run will take. For large image data volumes, we recommend running it overnight 🌔!

In [None]:
os.environ["JYPN_NO_DEP_INSTALL"] = "True"

In [None]:
from datetime import datetime
from pathlib import Path
from acia.analysis import scale

# set the base path for all results
stem = Path(analysis_script).stem
output_path = Path("./automated_executions")

print(f"Results are stored in: {output_path.absolute()}")

In [None]:

# scale your analysis script to many images
result = scale(
    output_path,
    analysis_script=analysis_script,
    image_ids=image_ids,
    exist_ok=True,
    execution_naming=lambda iid: f"execution_{Path(iid).stem}",
    kernel_name="python3")
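
Once the run has finished, it can be worth checking that every execution produced its expected outputs before loading them. The following is a minimal sketch, not an `acia` feature: `find_incomplete_executions` is a hypothetical helper, and the folder layout (`execution_*/output/...`) and file names are taken from how the results are read in section 4.

```python
from pathlib import Path

# Output files that section 4 reads from each execution folder
EXPECTED_FILES = ("growth_estimates.csv", "allcells.csv", "complete_summary.png")


def find_incomplete_executions(output_path, expected_files=EXPECTED_FILES):
    """Map each execution folder name to the expected output files it lacks."""
    incomplete = {}
    for exec_dir in sorted(Path(output_path).glob("execution_*")):
        missing = [f for f in expected_files if not (exec_dir / "output" / f).is_file()]
        if missing:
            incomplete[exec_dir.name] = missing
    return incomplete
```

`find_incomplete_executions(output_path)` returns an empty dictionary when every execution folder contains all three files.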

# 4. Inspect your analysis results


In [None]:
import pandas as pd
import cv2

df_growth_estimates = []
df_all_cells = []
im_summaries = []

for res_path in sorted(output_path.glob("execution_*")):
    out_dir = res_path / "output"

    # growth estimates of this sequence (read once and reused below)
    df_ge = pd.read_csv(out_dir / "growth_estimates.csv")
    df_growth_estimates.append(df_ge)

    # rendered summary figure of this execution
    im_summaries.append(cv2.imread(str(out_dir / "complete_summary.png")))

    # per-cell measurements, tagged with the image id of this sequence
    df_ac = pd.read_csv(out_dir / "allcells.csv")
    df_ac["image_id"] = df_ge.iloc[0]["image_id"]
    df_all_cells.append(df_ac)
    
df_growth_estimates = pd.concat(df_growth_estimates)
df_all_cells = pd.concat(df_all_cells)

In [None]:
df_all_cells.loc[df_all_cells.label_name == "e2_crimson", "label_name"] = "e2-crimson"

In [None]:
# sum only numeric columns so that text columns are not concatenated
df_areas = df_all_cells.groupby(["image_id", "time", "label_name"]).sum(numeric_only=True).reset_index()
df_areas

In [None]:
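# average only numeric columns (the mean of text columns raises an error in recent pandas versions)
df_fl = df_all_cells.groupby(["image_id", "time", "label_name"]).mean(numeric_only=True).reset_index()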
df_fl.loc[df_fl.label_name == "e2_crimson", "label_name"] = "e2-crimson"
df_fl

In [None]:
import matplotlib.gridspec as gridspec
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
#fig, axes = plt.subplots(3, 1, figsize=(3, 7))

fig = plt.figure(figsize=(3, 7))

gs0 = gridspec.GridSpec(2, 1, figure=fig)

gs00 = gridspec.GridSpecFromSubplotSpec(2, 1, subplot_spec=gs0[0:2], hspace=0.05)

ax1 = fig.add_subplot(gs00[0])
ax2 = fig.add_subplot(gs00[1])

#gs00.tight_layout(fig)

#ax3 = fig.add_subplot(gs0[2])

axes = [ax1, None, ax2]

plt.setp(ax1.get_xticklabels(), visible=False)


colors = {"mvenus": "blue", "e2-crimson": "red"}

image_ids = np.unique(df_areas["image_id"])
label_names = np.unique(df_areas["label_name"])

for image_id in image_ids:
    for label_name in label_names:
        local_df = df_areas[(df_areas.image_id == image_id) & (df_areas.label_name == label_name)]
        axes[0].plot(local_df["time"], local_df["area"], color = colors[label_name], linewidth=1)
        
axes[0].set_ylabel(r"TSCA [$\mu m^2$]")
axes[0].set_yscale("log")
axes[0].grid(True)

#axes[1].set_ylabel("Average Fluorescence Intensity\n[a.u.]")
#axes[1].grid(True)
#axes[1].set_xlabel("Time [h]")
        
#for image_id in image_ids:
#    for label_name in label_names:
#        local_df = df_fl[(df_fl.image_id == image_id) & (df_fl.label_name == label_name)]
#        axes[1].plot(local_df["time"], local_df[label_name], color = colors[label_name], linewidth=1)
        
sns.boxplot(df_growth_estimates[df_growth_estimates.label_name=="mvenus"], x="label_name", y="mu", ax=axes[2], color="gray")
sns.stripplot(df_growth_estimates[df_growth_estimates.label_name=="mvenus"], x="label_name", y="mu", ax=axes[2], color="blue")
sns.boxplot(df_growth_estimates[df_growth_estimates.label_name=="e2_crimson"], x="label_name", y="mu", ax=axes[2], color="gray")
sns.stripplot(df_growth_estimates[df_growth_estimates.label_name=="e2_crimson"], x="label_name", y="mu", ax=axes[2], color="red")
#sns.stripplot(df_growth_estimates, x="label_name", y="mu", ax=axes[2])

axes[2].set_ylabel(r"$\mu_{TSCA}$ [$h^{-1}$]")
axes[2].grid(True)
axes[2].set_xlabel("Labeled strain")

axes[2].set_xticklabels(["mVenus", "E2_Crimson"])

plt.tight_layout()

plt.savefig("summary.png", dpi=300)

# 5. Generate Summary Statistics

In this section, you can generate custom summary statistics that combine the results of all experiment analyses. Design the analysis script that you scaled above so that it writes its results to local files. Those results can then be loaded here, merged, and further processed or visualized!
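
As one example of such a combined statistic, the per-strain growth rates could be summarized across all executions. This is a sketch under assumptions: it expects a table with the `label_name` and `mu` columns used in the plot above, and `summarize_growth_rates` is a hypothetical helper name.

```python
import pandas as pd


def summarize_growth_rates(df):
    """Return count, mean, std and median of mu for each labeled strain."""
    return (
        df.groupby("label_name")["mu"]
        .agg(["count", "mean", "std", "median"])
        .reset_index()
    )
```

With the merged table from section 4, this would be called as `summarize_growth_rates(df_growth_estimates)`.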

In [None]:
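# one panel per execution; squeeze=False keeps `axes` 2-D even for a single image
n_images = len(im_summaries)
fig, axes = plt.subplots(1, n_images, figsize=(14 * n_images, 20), squeeze=False)

for ax, im in zip(axes[0], im_summaries):
    # cv2.imread returns BGR, while matplotlib expects RGB
    ax.imshow(cv2.cvtColor(im, cv2.COLOR_BGR2RGB))
    ax.axis("off")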
    
plt.savefig("total_summary.png")
plt.savefig("total_summary.pdf")

## 🔁 Reproducibility Information

Details of the pip and conda environments used for this run:

In [None]:
%pip freeze

In [None]:
%mamba env export