Step 2 - Plot Conditions

This script plots growth curves as line plots: each subfigure shows all strains with a given modification (as mean values across replicates), and the subfigures are split according to the applied environmental conditions.

This is a first visualisation of your data, designed to give you a quick overview of the results you obtained :)

You only need to run the following cell once. Then you can comment it out (#), because installing the package again takes a long time that you don't need to waste.

In [11]:
import altair as alt
from datetime import datetime
import glob
import pandas as pd
import os
import vl_convert as vlc

If you want to use the latest Excel file generated by the 'Step 1' script, you don't need to change anything in the following cell.

However, you might want to go back to one of the files you generated earlier. In that case, wrap the first part of the cell in triple quotes, remove the triple quotes around the second part, and enter the name of the file you want to use.

In the output of the following cell you can see which file is used for the analysis.

In [12]:
# Import the data

IMPORT_PATH = os.path.join(os.getcwd(), "output_data")

# Use the latest excel file as an input (default option)
available_results = glob.glob(os.path.join(IMPORT_PATH, '*growth_data.xlsx'))
latest_file = max(available_results)
data = pd.read_excel(latest_file)

'''
Or enter the exact filename generated in the 01_data_import.ipynb script as an input

data = pd.read_excel('XXX_growth_data.xlsx')
'''

print(latest_file)

\\wsl.localhost\Ubuntu\home\marysia\stress_resistance_msc\output_data\2025-05-13_growth_data.xlsx
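
The default option above relies on the date prefix of the filename: because names such as 2025-05-13_growth_data.xlsx start with a date written as YYYY-MM-DD, the alphabetically largest name is also the most recent one. Below is a minimal sketch of that selection, with an explicit error when no file is found. The helper name `pick_latest` and the example filenames are hypothetical and not part of the original script.

```python
import os

def pick_latest(paths):
    """Return the path whose filename sorts last, i.e. the newest dated file."""
    if not paths:
        raise FileNotFoundError("No '*growth_data.xlsx' file found")
    # Compare by filename only, so the directory part cannot affect the order
    return max(paths, key=os.path.basename)

# Hypothetical filenames, used only to illustrate the ordering
example_paths = [
    os.path.join("output_data", "2025-04-30_growth_data.xlsx"),
    os.path.join("output_data", "2025-05-13_growth_data.xlsx"),
    os.path.join("output_data", "2024-12-01_growth_data.xlsx"),
]
print(pick_latest(example_paths))
```

This only works as long as every file keeps the YYYY-MM-DD prefix; with other naming schemes the alphabetical order no longer matches the chronological one.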


In [13]:
data_copy = data.copy()

# Separate control data
control_data = data_copy[data_copy['modification'] == 'control']

# Duplicate control data for each modification
control_for_oe = control_data.copy()
control_for_oe['plot_config'] = 'OE'

control_for_oe_prot = control_data.copy()
control_for_oe_prot['plot_config'] = 'OE_prot'

control_for_ko = control_data.copy()
control_for_ko['plot_config'] = 'KO'

# Prepare OE and KO data with new facet group
oe_data = data_copy[data_copy['modification'] == 'OE'].copy()
oe_data['plot_config'] = 'OE'

oe_prot_data = data_copy[data_copy['modification'] == 'OE_prot'].copy()
oe_prot_data['plot_config'] = 'OE_prot'

ko_data = data_copy[data_copy['modification'] == 'KO'].copy()
ko_data['plot_config'] = 'KO'

# Combine everything
data_for_plotting = pd.concat([control_for_oe, control_for_ko, control_for_oe_prot, oe_data, ko_data, oe_prot_data], ignore_index=True)
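
The cell above copies the control rows once per modification, so that the control curve can be drawn in every row of the figure. If you add more modifications later, the same idea can be written as a loop. This is only a sketch on toy data: the function name `add_plot_config` and the TF values are hypothetical, and the rows come out in a different order than in the cell above (which does not matter for plotting).

```python
import pandas as pd

# Toy data with hypothetical values, only to illustrate the reshaping
toy = pd.DataFrame({
    "modification": ["control", "control", "OE", "KO"],
    "TF": ["wt", "wt", "tf1", "tf1"],
    "growth": [0.10, 0.12, 0.30, 0.05],
})

def add_plot_config(df, modifications, control_label="control"):
    """Pair each modification with a copy of the control rows and label the group."""
    control = df[df["modification"] == control_label]
    parts = []
    for mod in modifications:
        group = pd.concat([control, df[df["modification"] == mod]])
        # assign() returns a new DataFrame, so the input data stay unchanged
        parts.append(group.assign(plot_config=mod))
    return pd.concat(parts, ignore_index=True)

toy_for_plotting = add_plot_config(toy, ["OE", "KO"])
print(toy_for_plotting)
```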

In [14]:
width = 300
height = 300

In the following cell you can set the order in which the rows (type of modification) and columns (variant of condition) appear.
By default they are displayed alphabetically. To keep the default, pass empty brackets; otherwise specify the order.
To do so, enter the exact names of the modifications/conditions in quotes, separated by commas, e.g. ['test3', 'test', 'test2'].

To show you which values you can choose from, the following code also prints all available modifications and conditions (do not include 'control' when specifying the rows, as it is plotted in every row regardless).

In [15]:
print(data['modification'].unique())
print(data['condition'].unique())

row_order = ['OE', 'OE_prot', 'KO']
column_order = []

['control' 'OE' 'KO' 'OE_prot']
['test' 'test2' 'test3']
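
Because the names have to match exactly, a typo in `row_order` or `column_order` is easy to miss. The sketch below shows one possible check that lists the entered names that do not occur in the data. The helper `unknown_names` is hypothetical; the example values are the ones printed above.

```python
def unknown_names(requested, available):
    """Return the requested names that do not occur in the data."""
    available = set(available)
    return [name for name in requested if name not in available]

modifications = ["control", "OE", "KO", "OE_prot"]
conditions = ["test", "test2", "test3"]

print(unknown_names(["OE", "OE_prot", "KO"], modifications))  # []
print(unknown_names(["test3", "Test", "test2"], conditions))  # ['Test']
```

With the real data you could pass `data['modification'].unique()` and `data['condition'].unique()` as the second argument.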


In [18]:
base = alt.Chart(data_for_plotting).encode(
    x=alt.X('time:Q', title='Time [h]'),
    y=alt.Y('average(growth):Q', title='Average Growth [OD\u2086\u2080\u2080]'),
    color=alt.Color('TF:N', title='TF')
).properties(
    width=width,    # subplot size set in the cell above
    height=height
)

# Step 3: Control layer
control_layer = alt.Chart(data_for_plotting).transform_filter(
    alt.datum.modification == 'control'
).mark_line(strokeDash=[4, 4], color='black').encode(
    x=alt.X('time:Q', title='Time [h]'),
    y=alt.Y('average(growth):Q', title='Average Growth [OD\u2086\u2080\u2080]'),
    detail='TF:N'
)

# Step 4: Modified strains layer (OE/OE_prot/KO)
mod_layer = base.transform_filter(
    alt.datum.modification != 'control'
).mark_line(point=True)

# Step 5: Layer and facet by plot_config (rows) and condition (columns)
layered = alt.layer(mod_layer, control_layer)

chart = layered.facet(
    row=alt.Row('plot_config:N', sort = row_order, title='Modification'),
    column=alt.Column('condition:N', sort = column_order, title='Condition'),
    spacing=10
).configure_view(
    continuousWidth=width,
    continuousHeight=height
).resolve_scale(
    y='independent'
)

chart
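
The `average(growth):Q` encoding in the chart above is what turns the replicates into a single line: as far as I understand the aggregation, the mean of `growth` is taken over all rows that share the same subfigure, TF and time point. The pandas sketch below shows the equivalent calculation on toy data, in case you want to inspect the plotted values as a table. The numbers and the TF name are made up for illustration.

```python
import pandas as pd

# Two hypothetical replicates of one strain, measured at two time points
replicates = pd.DataFrame({
    "plot_config": ["OE"] * 4,
    "condition": ["test"] * 4,
    "TF": ["tf1"] * 4,
    "time": [0, 0, 1, 1],
    "growth": [0.25, 0.75, 0.5, 1.0],
})

# One mean value per subfigure (plot_config, condition), strain (TF) and time point
means = (
    replicates
    .groupby(["plot_config", "condition", "TF", "time"], as_index=False)["growth"]
    .mean()
)
print(means)
```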

You can change the name of the figure in the following cell, as well as the resolution in pixels per inch (ppi).
Remember that each time you run the code, the figure is saved under the same name and overwrites the previous file, so if you want to keep earlier versions of your figures, change the name each time.

In [17]:
VISUALISATION_PATH = os.path.join(os.getcwd(), "visualisations")
os.makedirs(VISUALISATION_PATH, exist_ok=True)  # create the folder if it does not exist yet

current_date = datetime.now().strftime("%Y-%m-%d")
output_filename = f"{current_date}_figure_1.png"
output_path = os.path.join(VISUALISATION_PATH, output_filename)

chart.save(output_path, ppi=600)
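
If you would rather not rename the figure by hand every time, one option is to let the code look for a filename that is not taken yet. The helper below is a hypothetical sketch and not part of the original script: it keeps the dated name used above and appends `_v2`, `_v3`, ... when a file with that name already exists. The demo runs in a temporary folder so that it does not touch your real figures.

```python
import os
import tempfile
from datetime import datetime

def next_free_path(directory, stem, ext=".png"):
    """Return a dated path in `directory` that does not exist yet."""
    date = datetime.now().strftime("%Y-%m-%d")
    candidate = os.path.join(directory, f"{date}_{stem}{ext}")
    counter = 2
    # Keep adding a version suffix until the name is free
    while os.path.exists(candidate):
        candidate = os.path.join(directory, f"{date}_{stem}_v{counter}{ext}")
        counter += 1
    return candidate

with tempfile.TemporaryDirectory() as tmp:
    first = next_free_path(tmp, "figure_1")
    open(first, "w").close()  # pretend the first figure was saved
    second = next_free_path(tmp, "figure_1")
    print(os.path.basename(first))
    print(os.path.basename(second))
```

In the cell above you could then write `output_path = next_free_path(VISUALISATION_PATH, "figure_1")` before calling `chart.save`.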