# Grouping 

In [None]:
%run "./workflow.ipynb"

Group by biological replicate. Edit `group` to change the grouping keys, and edit the `.agg(...)` arguments to choose which columns to keep and how to summarize them.

In [None]:
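group = ["bio rep"]

# One row per (bio rep, time point): summarize the wells within each biological replicate.
by_bio_rep = (
    with_mu.groupby(group + ["time_hours"], as_index=False)
    .agg(
        strain=("strain", "first"),   # keep the strain label of each bio rep
        od_mean=("od", "mean"),       # note: the mean uses raw "od" ...
        od_std=("od_smooth", "std"),  # ... while the std uses "od_smooth"
        mu_mean=("mu", "mean"),
        mu_std=("mu", "std"),
        n=("od_smooth", "size"),      # number of rows (wells) in each group
    )
)
by_bio_rep.head()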

To group further by condition (here, strain), first group by bio rep and only then by condition. This gives each bio rep equal weight even when the number of wells per bio rep differs (for example, after you removed some wells).

In [None]:
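group2 = ["strain"]

# Start from the per-bio-rep table, not from with_mu, so each bio rep counts once.
by_strain = (
    by_bio_rep.groupby(group2 + ["time_hours"], as_index=False)
    .agg(
        od_std=("od_mean", "std"),    # spread between bio reps
        od_mean=("od_mean", "mean"),  # mean of the bio-rep means
        mu_std=("mu_mean", "std"),
        mu_mean=("mu_mean", "mean"),
        n=("od_mean", "size"),        # number of bio reps per strain and time point
    )
)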
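
Why the order of grouping matters can be seen on a toy table. The sketch below uses made-up numbers (not data from this workflow): one bio rep keeps three wells and the other keeps one. It compares the pooled mean over all wells with the two-stage mean (wells within bio rep, then bio reps within strain).

```python
import pandas as pd

# Hypothetical toy data at a single time point:
# bio rep 1 has three wells, bio rep 2 has one (as if two wells were removed).
toy = pd.DataFrame({
    "strain": ["WT"] * 4,
    "bio rep": [1, 1, 1, 2],
    "time_hours": [0.0] * 4,
    "od": [0.25, 0.25, 0.25, 0.75],
})

# Pooled mean: every well counts once, so bio rep 1 dominates.
pooled = toy.groupby(["strain", "time_hours"])["od"].mean()

# Two-stage mean: average wells within each bio rep, then average the bio reps.
per_rep = toy.groupby(["strain", "bio rep", "time_hours"], as_index=False)["od"].mean()
two_stage = per_rep.groupby(["strain", "time_hours"])["od"].mean()

print(pooled.iloc[0])     # 0.375
print(two_stage.iloc[0])  # 0.5
```

With equal well counts per bio rep the two numbers agree; they diverge only when the design is unbalanced.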

# Plotting Helpers

In [None]:
from plot_helpers import *

`plot_grouped`

Plots all groups on the same axes.

`df`: The DataFrame to plot from. To plot only particular wells, time points, or conditions, create a boolean mask first and pass `df[mask]` (shown in the example).

`group_by`: The column that sets the color; each unique value gets its own color.

`plot_by`: Optional. If provided, each unique value gets its own line. Use a lower level of organization than `group_by`, such as "well", or "bio rep" if the data is already grouped by bio rep. If not provided, the plot shows the mean with a shaded standard-deviation band.

`ylim`, `xlim`: Optional. Lower and upper bounds for each axis, in case you want to set them manually.

In [None]:
threshold_od = 0.01  # Set your OD threshold if you want one
mask = with_mu["od_smooth"] > threshold_od

plot_grouped(df=with_mu[mask],
             group_by="strain",
             x_col="time_hours",
             y_col="od_smooth",
             x_label="time (hours)",
             y_label="OD",  # y_col is od_smooth, so label the axis as OD, not mu
             title="OD vs Time for selected wells",
             plot_by=None,
             marker=None,
             log=True,
             font_size=14,
             ylim=None,
             save=True,
             palette=["green", "orange", "red"]
             )
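
Masks can be combined when you need more than an OD threshold. The sketch below uses a small made-up table; the column `well` and the well names are assumptions for illustration, so substitute the columns your DataFrame actually has.

```python
import pandas as pd

# Hypothetical toy table; "well" and its values are illustrative only.
toy = pd.DataFrame({
    "well": ["A1", "A1", "A2", "A2", "B1", "B1"],
    "strain": ["WT", "WT", "WT", "WT", "mutant", "mutant"],
    "time_hours": [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
    "od_smooth": [0.005, 0.02, 0.004, 0.03, 0.02, 0.05],
})

above_threshold = toy["od_smooth"] > 0.01      # drop readings at or below the OD threshold
selected_wells = toy["well"].isin(["A1", "B1"])  # keep only these wells
in_window = toy["time_hours"] <= 1.0           # keep only this time window

# Combine with & (and) or | (or); each condition is already a boolean Series.
mask = above_threshold & selected_wells & in_window
subset = toy[mask]
print(subset)
```

The resulting `subset` can be passed as `df` in the same way as `with_mu[mask]`.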

`plot_separate_by_group`

Use this to plot different conditions in the same figure, each in its own panel:

`group_by`: each group gets its own subplot

`df`: your DataFrame

`title`: one title for the whole figure

`plot_by`: draws one line per `plot_by` value within each `group_by` panel; for example, within each strain, one line per well. If left as `None`, a single mean line is drawn with the standard deviation shaded. Note that the function only aggregates across rows at each time point, so to plot by bio rep within each strain you first have to aggregate `df` by bio rep.

`log`: if True, plots on a log scale

`col_wrap`: number of columns in the subplot grid

`sharey`: share the y axis across panels (uniform scale)

In [None]:

plot_separate_by_group(
    group_by="strain",
    df=with_mu[mask],
    x_col="time_hours",
    y_col="od_smooth",     
    x_label="Time (hours)",
    y_label="OD",
    title="OD vs Time by Strain",
    plot_by=None,     
    marker=None,
    log=False,
    font_size=9,
    col_wrap=3, 
    sharey=True,
    save=True,
    palette= ["green", "orange", "red"]
)
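
To draw one line per bio rep inside each strain panel, the data first has to be reduced to one row per strain, bio rep and time point. A minimal sketch on made-up data follows; it keeps the column name `od_smooth` so the same `y_col` still applies, and the column `n_wells` is a hypothetical addition for bookkeeping.

```python
import pandas as pd

# Hypothetical toy data: bio rep 1 has two wells, bio rep 2 has one.
toy = pd.DataFrame({
    "strain": ["WT"] * 6,
    "bio rep": [1, 1, 2, 1, 1, 2],
    "well": ["A1", "A2", "B1", "A1", "A2", "B1"],
    "time_hours": [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    "od_smooth": [0.25, 0.75, 0.5, 1.0, 2.0, 1.0],
})

# Collapse wells so each (strain, bio rep, time point) is a single row.
per_bio_rep = (
    toy.groupby(["strain", "bio rep", "time_hours"], as_index=False)
    .agg(
        od_smooth=("od_smooth", "mean"),  # mean over the wells of this bio rep
        n_wells=("well", "size"),         # how many wells went into the mean
    )
)
print(per_bio_rep)
```

The aggregated table can then be passed as `df` with `plot_by="bio rep"`.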