# Scaling Fluorescence Analysis

In this notebook we scale the fluorescence analysis :rocket: to distinguish aerobic and anaerobic growth conditions using a GFP fluorescence signal.

## Defining image sequences

Here we have two types of experiments, with `aerobic` and `anaerobic` cultivation conditions. We have selected five image sequences for each condition.

In [None]:
aerobic_images = [17406, 17407, 17408, 17409, 17410]
anaerobic_images = [17411, 17412, 17413, 17414, 17415]

## Helper function to execute analyses on images

We write a helper function that executes the fluorescence analysis for the different image sequences.

In [None]:
import papermill as pm
import shutil
import os

from pathlib import Path
from datetime import datetime

def analyze_image(script_to_execute, image_id, timestamp=None):
    """Copy the analysis notebook into a fresh folder and execute it for one image sequence."""
    stem = Path(script_to_execute).stem
    if timestamp is None:
        timestamp = datetime.today()

    # storage folder: automated_executions/<notebook name>/<timestamp>/execution_<image id>
    output_path = Path("./automated_executions") / stem / timestamp.isoformat()
    execution_path = output_path / f"execution_{image_id}"

    # the parameters for the notebook
    parameter_list = dict(
        input_image_file=str((Path("data") / f"{image_id}.tif").absolute()),
        output_path=str(execution_path.absolute()),
    )

    notebook_file = execution_path / "notebook.ipynb"

    # exist_ok=False: fail instead of overwriting an earlier execution
    os.makedirs(execution_path, exist_ok=False)
    shutil.copy(script_to_execute, notebook_file)

    # execute the copied notebook in place, with its own folder as working directory
    pm.execute_notebook(
        notebook_file,
        notebook_file,
        parameters=parameter_list,
        cwd=notebook_file.parent,
    )
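
To make the folder layout used by the helper easier to follow, here is a minimal sketch that only builds the target path, without copying or executing anything. The function name `execution_dir` and the fixed timestamp are illustrative assumptions, not part of the notebook.

```python
from datetime import datetime
from pathlib import Path


def execution_dir(script, image_id, timestamp, root="automated_executions"):
    """Build <root>/<notebook stem>/<ISO timestamp>/execution_<image id> (hypothetical helper)."""
    return Path(root) / Path(script).stem / timestamp.isoformat() / f"execution_{image_id}"


# a fixed, made-up timestamp so that the result is reproducible
stamp = datetime(2024, 1, 31, 12, 0, 0)
target = execution_dir("FluorescenceAnalysis.ipynb", 17406, stamp)
print(target.as_posix())
# automated_executions/FluorescenceAnalysis/2024-01-31T12:00:00/execution_17406
```

Because all runs of one batch share the same timestamp, their results end up side by side in one folder. Note that the ISO timestamp contains colons, which some file systems (for example on Windows) do not accept in folder names.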

## Execute the segmentation and data extraction for all images

Now, we execute our analysis script including segmentation and data extraction on all selected image sequences!

In [None]:
from tqdm.auto import tqdm

now = datetime.today()

script_to_execute = "FluorescenceAnalysis.ipynb"

for image_id in tqdm(aerobic_images + anaerobic_images):
    analyze_image(script_to_execute, image_id, timestamp=now)
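
If one execution raises an exception, the loop above stops and the remaining image sequences are not processed. One possible variation, sketched here under the assumption that partial results are acceptable, records the failures and continues. The names `run_all` and `fake_analyze` are hypothetical, and `fake_analyze` merely stands in for `analyze_image`.

```python
def run_all(image_ids, analyze):
    """Call `analyze` for every image id and collect failures instead of stopping."""
    failed = {}
    for image_id in image_ids:
        try:
            analyze(image_id)
        except Exception as exc:  # keep going, but remember what went wrong
            failed[image_id] = repr(exc)
    return failed


def fake_analyze(image_id):
    """Stand-in for the real analysis; fails for one made-up id."""
    if image_id == 17408:
        raise RuntimeError("simulated failure")


failures = run_all([17406, 17407, 17408, 17409], fake_analyze)
print(failures)
```

With the real helper, the call could look like `run_all(aerobic_images + anaerobic_images, lambda i: analyze_image(script_to_execute, i, timestamp=now))`.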

## Analyze the outcomes

### 1. Make units ready

In [None]:
from acia import ureg
from pint import set_application_registry
import pint_pandas

set_application_registry(ureg)
# define fluorescence unit
ureg.define("fluorescence = [au] = fluor = fluorescence")

### 2. Define helper function to collect the data from analyses

In [None]:
import numpy as np
import pandas as pd

def collect_data(image_ids):
    """Load `allcells.csv` for every image id and attach pint units to the columns."""
    # relies on `script_to_execute` and `now` from the execution cell above
    result_root = Path("automated_executions") / Path(script_to_execute).stem / now.isoformat()

    datas = []
    for image_id in image_ids:
        print(image_id)

        data = pd.read_csv(result_root / f"execution_{image_id}" / "allcells.csv", decimal=',', sep=';', index_col=0)

        # the first row (index label "unit") holds the unit of every column
        units = data.iloc[0].copy()
        units["my_area"] = "pixel ** 2"
        data = data.drop(["unit"])

        # replace decimal commas (DataFrame.applymap is named DataFrame.map from pandas 2.1 on)
        data = data.applymap(lambda val: val.replace(',', '.'))

        # convert to floats
        data = data.astype(float)

        # convert to pint units
        data = data.astype({col: f"pint[{unit}]" for col, unit in zip(data.columns, units)})

        data["image_id"] = image_id
        datas.append(data)

    return pd.concat(datas)
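
The handling of the unit row in `collect_data` can be tried out on a small in-memory table. The following sketch assumes the same file layout (semicolon separator, decimal commas, and a first data row labelled `unit`). The column names and values are made up for illustration, and the pint conversion is left out so that only pandas is needed.

```python
import io

import pandas as pd

# made-up file content in the assumed layout of allcells.csv
raw = "\n".join([
    ";time;area",
    "unit;hour;micrometer ** 2",
    "0;0,0;1,5",
    "1;0,5;2,25",
])

# read everything as text first, because the unit row mixes strings with numbers
table = pd.read_csv(io.StringIO(raw), sep=";", index_col=0, dtype=str)

# split the unit row from the measurements
units = table.loc["unit"]
values = table.drop(index="unit")

# turn decimal commas into points and convert to floats
values = values.apply(lambda col: col.str.replace(",", ".", regex=False)).astype(float)

print(units.to_dict())
print(values)
```

Each entry of `units` can then be used to build a `pint[...]` dtype string for the matching column, as done in `collect_data`.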

### 3. Extract data

We collect all the information from the analyzed image sequences.

In [None]:
# collect all the aerobic data
dataset = collect_data(aerobic_images)

# collect all the anaerobic data
an_dataset = collect_data(anaerobic_images)

### 4. Visualize data

Now we visualize the aerobic and anaerobic experiments by plotting their GFP signal over time. The error bars indicate the standard deviation of the GFP signal across cells at each time step.

In [None]:
# define the maximum time we consider
time_limit = 9  # in hours

In [None]:
import pandas as pd

# prepare aerobic
frame = pd.DataFrame({'time': np.array(dataset["time"].pint.magnitude), 'gfp': np.array(dataset["mean gfp / area"].pint.magnitude)})
frame = frame[frame['time'] <= time_limit]

# prepare anaerobic
an_frame = pd.DataFrame({'time': np.array(an_dataset["time"].pint.magnitude), 'gfp': np.array(an_dataset["mean gfp / area"].pint.magnitude)})
an_frame = an_frame[an_frame['time'] <= time_limit]
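
To see which numbers stand behind a line with error bars, the mean and standard deviation per time point can be computed by hand. This is a rough sketch on made-up values: it uses the pandas default (sample standard deviation), and the estimator used by the plotting library may differ in detail.

```python
import pandas as pd

# made-up measurements: two cells at each of two time points
toy = pd.DataFrame({
    "time": [0.0, 0.0, 0.5, 0.5],
    "gfp": [1.0, 3.0, 2.0, 6.0],
})

# one row per time point with the mean and the spread across cells
summary = toy.groupby("time")["gfp"].agg(["mean", "std"])
print(summary)
```

Applied to the notebook data, `frame.groupby("time")["gfp"].agg(["mean", "std"])` would give the corresponding table for the aerobic experiments.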

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt
fig, axes = plt.subplots(2, 1, figsize=(15, 20))

# plot aerobic
axes[0].axvspan(0, time_limit, facecolor='#add8e6', alpha=0.25)
sns.lineplot(data=frame, x="time", y="gfp", errorbar="sd", err_style="bars", color="green", ax=axes[0], marker='o')
axes[0].set_title("Aerobic GFP development", fontsize=20)
axes[0].set_ylabel(r"GFP fluorescence [$\frac{a.u.}{pixel^2}$]", fontsize=15)
axes[0].set_xlabel(f'Time [${dataset["time"].pint.u:~L}$]', fontsize=15)
axes[0].set_xlim((0, time_limit))


# plot anaerobic
axes[1].axvspan(8, 9, facecolor='#add8e6', alpha=0.25)
axes[1].axvspan(0, 8, facecolor='#e9baaa', alpha=0.25)
sns.lineplot(data=an_frame, x="time", y="gfp", errorbar="sd", err_style="bars", color="green", ax=axes[1], marker='o')
axes[1].set_xlim((0, time_limit+.05))
#axes[1].set_ylim((0,70))
axes[1].set_ylabel(r"GFP fluorescence [$\frac{a.u.}{pixel^2}$]", fontsize=15)
axes[1].set_title("Anaerobic GFP development", fontsize=20)
axes[1].set_xlabel(f'Time [${an_dataset["time"].pint.u:~L}$]', fontsize=15)

fig.patch.set_facecolor('white')
plt.tight_layout()
plt.savefig("result.png")