# Plotting 2.0

In the last session, 003_Plotting, we introduced scatter plots. In this session we continue with more real data and more complex scatter plot configurations, including:
1) Multiple axes
2) Error bands (vs error bars)
3) Saving your figure

__Additional Concepts__:
- Skipping rows when reading in data(frames)
- MultiIndexing in headers (https://pandas.pydata.org/docs/user_guide/advanced.html)

*Last edited: Isabella Casini 20.06.2025*

## 1) Import required packages

In [None]:
# import relevant packages (entire line is commented out)

import pandas as pd # call pandas "pd" for short (midline comment)

import matplotlib.pyplot as plt # import pyplot from matplotlib and call it "plt"

import numpy as np

## 2) Download data to graph

Go to the following link and save the Data_S3.xlsx file (*remember where you save it*).

https://github.com/isacasini/Casini_2023_GEM/tree/main

Take a few minutes to look at the "ReadMe" sheet (in Excel).

## 3) Read in data of interest as a Pandas dataframe

We will use "Sheet14" (biomass - gCDW), "Sheet17" (hydrogen specific uptake), "Sheet19" (carbon dioxide specific uptake), and "Sheet21" (methane specific production).

The sheets are not very clean, so we need to adjust the read-in parameters to load the data in a useful form.

In [None]:
# Path to the file (change your path to where you save your file)
pathin = r"C:\Users\uqicasin\Documents\Teaching\Program_Workshop\Sample_Data\Data_S3.xlsx"
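
A hard-coded Windows path only works on the machine it was written for. As a minimal sketch, assuming the file was saved in a hypothetical `Sample_Data` folder inside your home directory (the folder name and the variable names `data_dir` and `pathin_alt` are illustrative, not part of the original notebook), `pathlib` can build the path and check that the file is there:

```python
from pathlib import Path

# Hypothetical location: adjust to wherever you saved Data_S3.xlsx
data_dir = Path.home() / "Sample_Data"
pathin_alt = data_dir / "Data_S3.xlsx"

# Check that the file is there before trying to read it
if pathin_alt.exists():
    print(f"Found: {pathin_alt}")
else:
    print(f"File not found, check the path: {pathin_alt}")
```

`pd.read_excel` accepts a `Path` object as well as a string, so a path built this way could be used in place of `pathin`.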

### 3.1) Read in biomass data

In [None]:
# Read in the biomass data ("skiprows" -> drop the first two rows; "header" -> use the next two rows as a combined header, i.e. a MultiIndex)
gcdw_df = pd.read_excel(pathin, sheet_name='Sheet14', skiprows=[0,1], header=[0,1])
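
To see what `skiprows` and a two-row `header` do without opening the Excel file, here is a small sketch on made-up text data. It uses `pd.read_csv` as a stand-in for `pd.read_excel` (both take `skiprows` and `header`); the rows and values are invented for illustration only.

```python
from io import StringIO

import pandas as pd

# Made-up stand-in for a messy sheet: two junk rows, then two header rows
raw_text = """junk title,,
more junk,,
Reactor,ZZAVG,ZZSTD
Day,gCDW,gCDW
0,0.5,0.125
1,1.5,0.25
"""

# Skip the two junk rows, then combine the next two rows into one header
demo_df = pd.read_csv(StringIO(raw_text), skiprows=[0, 1], header=[0, 1])

print(demo_df.columns.tolist())    # each column name is now a tuple
print(demo_df[("ZZAVG", "gCDW")])  # select one column with its tuple
```

Note that the row numbers in `header` are counted after the skipped rows have been removed.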

In [None]:
# Take a look at the dataframe
gcdw_df

In [None]:
# Look at the column names
print(gcdw_df.columns)

### 3.2) Extract out biomass data that we want to graph

In [None]:
# Note the structure of the column names: ('level1', 'level2')
columns_to_copy = [('Reactor','Elasped Days'),('ΔHAVG','gCDW'),('ΔHSTD','gCDW'),('ZZAVG','gCDW'),('ZZSTD','gCDW'),('MMAVG','gCDW'),('MMSTD','gCDW')]
gcdw_graph_df = gcdw_df[columns_to_copy].iloc[:9].copy() # select the rows with .iloc
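
The selection above combines three steps: pick columns with a list of tuples, pick rows with `.iloc`, and make an independent copy. A minimal sketch on a toy dataframe (all names and values are made up) shows each step and why `.copy()` matters:

```python
import pandas as pd

# Toy dataframe with two-level column names
toy_columns = pd.MultiIndex.from_tuples(
    [("Reactor", "Day"), ("ZZAVG", "gCDW"), ("ZZSTD", "gCDW"), ("ZZAVG", "OD")]
)
toy_df = pd.DataFrame(
    [[0, 0.10, 0.01, 0.5], [1, 0.25, 0.02, 1.1], [2, 0.40, 0.03, 1.8]],
    columns=toy_columns,
)

# Columns by tuple, first two rows by position, then an independent copy
wanted = [("Reactor", "Day"), ("ZZAVG", "gCDW")]
subset_df = toy_df[wanted].iloc[:2].copy()

# Because of .copy(), editing the subset leaves toy_df untouched
subset_df.iloc[0, 1] = 99.0
print(toy_df[("ZZAVG", "gCDW")].iloc[0])
```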

In [None]:
# Check your dataframe
gcdw_graph_df

### 3.3) Read in H2 Data

In [None]:
# Read in the H2 data ("skiprows" -> drop the first two rows and the fourth row (index 3); single header row)
h2_df = pd.read_excel(pathin, sheet_name='Sheet17', skiprows=[0,1,3], header=[0])

In [None]:
# Check your dataframe
h2_df

In [None]:
# Check the columns: when the dataframe was read in, pandas added suffixes (.1, .2) to repeated column names to prevent duplicates
print(h2_df.columns)

# Use the duplicated command to check for duplicate columns
print(h2_df.columns.duplicated())
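
The suffixes can be reproduced on a tiny made-up table. This sketch again uses `pd.read_csv` on text as a stand-in for the Excel sheet; the column names mimic the H2 sheet but the values are invented.

```python
from io import StringIO

import pandas as pd

# Two strains share the same column names in the raw header
raw_text = """Day,Average H2,STDP H2,Average H2,STDP H2
0,1.0,0.1,2.0,0.2
1,1.5,0.1,2.5,0.3
"""
dup_df = pd.read_csv(StringIO(raw_text))

# pandas renames the repeats, so no duplicates are left
print(dup_df.columns.tolist())
print(dup_df.columns.duplicated().any())
```

The suffix only records the order in which the repeats appeared, which is why the renaming step later adds the strain ID explicitly.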

### 3.4) Extract out the H2 data we want to graph

Average and STDP columns for the three microbes ($\Delta$ H, ZZ, MM)

In [None]:
# Note the different column names: plain strings here, not ('level1', 'level2') tuples
columns_to_copy = ['Elasped Days','Average H2','STDP H2','Average H2.1','STDP H2.1','Average H2.2','STDP H2.2']
h2_graph_df = h2_df[columns_to_copy].iloc[:9].copy() # select the rows with .iloc


In [None]:
# Check your dataframe
h2_graph_df

In [None]:
# Rename the columns using a dictionary (oldname:newname)
rename_dict = {'Average H2':'Average H2 DH','STDP H2':'STDP H2 DH','Average H2.1':'Average H2 ZZ',
               'STDP H2.1':'STDP H2 ZZ','Average H2.2':'Average H2 MM','STDP H2.2':'STDP H2 MM'}

h2_graph_df.rename(columns=rename_dict, inplace=True)

In [None]:
# Check your dataframe
h2_graph_df

### 3.5) Read in CO2 Data

In [None]:
# Read in the CO2 data ("skiprows" -> drop the first two rows and the fourth row (index 3); single header row)
co2_df = pd.read_excel(pathin, sheet_name='Sheet19', skiprows=[0,1,3], header=[0])

In [None]:
# Check your dataframe
co2_df

### 3.6) Extract out CO2 data that we want to graph

Average and STDP columns for the three microbes ($\Delta$ H, ZZ, MM)

In [None]:
# Note the different column names: plain strings here, not ('level1', 'level2') tuples
columns_to_copy = ['Elasped Days','Average CO2','STDP CO2','Average CO2.1','STDP CO2.1','Average CO2.2','STDP CO2.2']
co2_graph_df = co2_df[columns_to_copy].iloc[:9].copy() # select the rows with .iloc


In [None]:
# Rename the columns using a dictionary (oldname:newname)
rename_dict = {'Average CO2':'Average CO2 DH','STDP CO2':'STDP CO2 DH','Average CO2.1':'Average CO2 ZZ',
               'STDP CO2.1':'STDP CO2 ZZ','Average CO2.2':'Average CO2 MM','STDP CO2.2':'STDP CO2 MM'}

co2_graph_df.rename(columns=rename_dict, inplace=True)

In [None]:
# Check your dataframe
co2_graph_df

### 3.7) Read in CH4 data

In [None]:
# Read in the CH4 data ("skiprows" -> drop the first two rows and the fourth row (index 3); single header row)
ch4_df = pd.read_excel(pathin, sheet_name='Sheet21', skiprows=[0,1,3], header=[0])

In [None]:
# Check your dataframe
ch4_df

### 3.8) Extract out CH4 data that we want to graph

Average and STDP columns for the three microbes ($\Delta$ H, ZZ, MM)

In [None]:
# Note the different column names: plain strings here, not ('level1', 'level2') tuples
columns_to_copy = ['Elasped Days','Average CH4','STDP CH4','Average CH4.1','STDP CH4.1','Average CH4.2','STDP CH4.2']
ch4_graph_df = ch4_df[columns_to_copy].iloc[:9].copy() # select the rows with .iloc

In [None]:
# Rename the columns using a dictionary to add the microbe ID (oldname:newname)
rename_dict = {'Average CH4':'Average CH4 DH','STDP CH4':'STDP CH4 DH','Average CH4.1':'Average CH4 ZZ',
               'STDP CH4.1':'STDP CH4 ZZ','Average CH4.2':'Average CH4 MM','STDP CH4.2':'STDP CH4 MM'}

ch4_graph_df.rename(columns=rename_dict, inplace=True)

In [None]:
ch4_graph_df
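
The three rename dictionaries above follow the same pattern, so they could also be generated. This is only a sketch: `make_rename_dict` is a hypothetical helper, and it assumes the suffix order ('', '.1', '.2') corresponds to the strains DH, ZZ, MM, as in the dictionaries written by hand.

```python
def make_rename_dict(compound, strains=("DH", "ZZ", "MM")):
    """Map pandas' suffixed names (e.g. 'Average H2.1') to strain-tagged names."""
    rename_dict = {}
    for position, strain in enumerate(strains):
        # The first occurrence has no suffix, later ones get .1, .2, ...
        suffix = "" if position == 0 else f".{position}"
        for stat in ("Average", "STDP"):
            rename_dict[f"{stat} {compound}{suffix}"] = f"{stat} {compound} {strain}"
    return rename_dict


print(make_rename_dict("CH4"))
```

The result can be passed to `rename(columns=...)` exactly like the hand-written dictionaries; writing them out by hand, as above, is easier to read when there are only a few columns.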

## 4) Graph strain ZZ data on a single graph, with multiple axes

In [None]:
# Define some graph parameters

# Colors for each compound
color_gcdw = "#00bb3e"
color_h2 = "#00a0a0"
color_co2 = "#3678fb"
color_ch4 = "#fc0966"

### 4.1) Start with biomass data

In [None]:
# Print the columns
gcdw_graph_df.columns

In [None]:
# Graph
# Create the figure
fig, ax1 = plt.subplots(figsize=(8, 5))  # set the figure dimensions in inches

#-----------------------------------------------------------------------------------
# Plot Strain ZZ (gCDW) with error bars
ax1.errorbar(gcdw_graph_df[('Reactor','Elasped Days')], gcdw_graph_df[('ZZAVG','gCDW')], yerr=gcdw_graph_df[('ZZSTD','gCDW')], 
             color=color_gcdw, label="gDCW", marker='o', markersize=4, 
             linewidth=1, linestyle='-', capsize=3)

#-----------------------------------------------------------------------------------
# Keep the error bars out of the legend by passing a marker-only proxy handle (a Line2D that is never drawn on the axes)
ax1.legend([plt.Line2D([0], [0], color=color_gcdw, marker='o', linestyle='', linewidth=1, markersize=4)],
           ["gDCW"])

#-----------------------------------------------------------------------------------
# Add grid
ax1.grid(True, which='both', linestyle='--', linewidth=0.5, alpha=0.7)

# Set axes labels and colors
ax1.set_xlabel("Time [d]")

# Label the first y-axis
ax1.set_ylabel("gDCW", color=color_gcdw)
ax1.tick_params(axis='y', labelcolor=color_gcdw)
#--------------------------------------------------------------------------------------
# Add a title
ax1.set_title("Average Biomass")

# Clean up the layout
fig.tight_layout()

# Show the plot
plt.show()
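
An alternative to building the legend handle by hand is to reuse the handle that `errorbar` already created and keep only its data line. The sketch below uses made-up numbers; unlike the marker-only handle above, the legend entry here also shows the connecting line.

```python
import matplotlib.pyplot as plt
from matplotlib.container import ErrorbarContainer

# Made-up data
days = [0, 1, 2, 3]
avg = [0.10, 0.25, 0.40, 0.45]
std = [0.01, 0.02, 0.03, 0.02]

fig, ax = plt.subplots(figsize=(6, 4))
ax.errorbar(days, avg, yerr=std, label="toy biomass", marker="o", capsize=3)

handles, labels = ax.get_legend_handles_labels()
# An ErrorbarContainer behaves like a tuple; element 0 is the data line
line_handles = [h[0] if isinstance(h, ErrorbarContainer) else h for h in handles]
ax.legend(line_handles, labels)
```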


### 4.2) Add in CO2 on a second axis

In [None]:
co2_graph_df.columns

In [None]:
# Graph
# Create the figure
fig, ax1 = plt.subplots(figsize=(8, 5))  # set the figure dimensions in inches

#-----------------------------------------------------------------------------------
# Plot Strain ZZ gCDW with error bars
ax1.errorbar(gcdw_graph_df[('Reactor','Elasped Days')], gcdw_graph_df[('ZZAVG','gCDW')], yerr=gcdw_graph_df[('ZZSTD','gCDW')], 
             color=color_gcdw, label="gDCW", marker='o', markersize=4, 
             linewidth=1, linestyle='-', capsize=3)

#-----------------------------------------------------------------------------------
# Plot CO2

# Create second y-axis
ax2 = ax1.twinx()

# Plot Strain ZZ CO2 with error bars
ax2.errorbar(co2_graph_df['Elasped Days'], co2_graph_df['Average CO2 ZZ'], yerr=co2_graph_df['STDP CO2 ZZ'], 
             color=color_co2, label="CO2", marker='x', markersize=4, 
             linewidth=1, linestyle='-', capsize=3)

#-----------------------------------------------------------------------------------
# Legend
# Manually create legend handles for both datasets
gcdw_handle = plt.Line2D([0], [0], color=color_gcdw, marker='o', linestyle='', linewidth=1, markersize=4)
co2_handle = plt.Line2D([0], [0], color=color_co2, marker='x', linestyle='', linewidth=1, markersize=4)

ax1.legend([gcdw_handle, co2_handle], ["gDCW", "CO2"],loc="lower right")
#-----------------------------------------------------------------------------------

# Add grid
ax1.grid(True, which='both', linestyle='--', linewidth=0.5, alpha=0.7)


# Set axes labels and colors
ax1.set_xlabel("Time [d]")

# Label the first y-axis
ax1.set_ylabel("gDCW", color=color_gcdw)
ax1.tick_params(axis='y', labelcolor=color_gcdw)

# Label the second y-axis
ax2.set_ylabel("CO$_{2}$ [mmol/g/h]", color=color_co2)
ax2.tick_params(axis='y', labelcolor=color_co2)
#--------------------------------------------------------------------------------------
# Add a title
ax1.set_title("Average Specific Uptake and Production Rates")

# Clean up the layout
fig.tight_layout()

# Show the plot
plt.show()
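
Stripped of the styling, the second axis comes down to one call to `twinx()`. The sketch below uses made-up numbers and also shows another way to build a shared legend: collect the handles from both axes instead of creating them manually.

```python
import matplotlib.pyplot as plt

# Made-up data
days = [0, 1, 2, 3]
biomass = [0.10, 0.25, 0.40, 0.45]
co2_rate = [5.0, 12.0, 9.0, 7.5]

fig, ax_left = plt.subplots(figsize=(6, 4))
ax_right = ax_left.twinx()  # shares the x-axis, gets its own y-axis on the right

ax_left.plot(days, biomass, color="green", marker="o", label="biomass")
ax_right.plot(days, co2_rate, color="blue", marker="x", label="CO2")

# Collect the handles from both axes and draw a single legend
h_left, l_left = ax_left.get_legend_handles_labels()
h_right, l_right = ax_right.get_legend_handles_labels()
ax_left.legend(h_left + h_right, l_left + l_right, loc="lower right")
```

Each axis scales its y-values independently, which is what lets quantities with very different magnitudes share one figure.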


### 4.3) Add CH4 on a third axis

In [None]:
# Graph
# Create the figure
fig, ax1 = plt.subplots(figsize=(8, 5))  # set the figure dimensions in inches

#-----------------------------------------------------------------------------------
# Plot Strain ZZ gCDW with error bars
ax1.errorbar(gcdw_graph_df[('Reactor','Elasped Days')], gcdw_graph_df[('ZZAVG','gCDW')], yerr=gcdw_graph_df[('ZZSTD','gCDW')], 
             color=color_gcdw, label="gDCW", marker='o', markersize=4, 
             linewidth=1, linestyle='-', capsize=3)

#-----------------------------------------------------------------------------------
# Plot CO2
# Create second y-axis
ax2 = ax1.twinx()

# Plot Strain ZZ CO2 with error bars
ax2.errorbar(co2_graph_df['Elasped Days'], co2_graph_df['Average CO2 ZZ'], yerr=co2_graph_df['STDP CO2 ZZ'], 
             color=color_co2, label="CO2", marker='x', markersize=4, 
             linewidth=1, linestyle='-', capsize=3)
#-----------------------------------------------------------------------------------
# Plot CH4
# Create a third y-axis
ax3 = ax1.twinx()
ax3.spines['right'].set_position(('outward', 60))  # Offset the third axis


ax3.errorbar(ch4_graph_df['Elasped Days'], ch4_graph_df['Average CH4 ZZ'], yerr=ch4_graph_df['STDP CH4 ZZ'], 
             color=color_ch4, label="CH4", marker='v', markersize=4, 
             linewidth=1, linestyle='-', capsize=3)

#-----------------------------------------------------------------------------------
# Legend
# Manually create legend handles for all three datasets
gcdw_handle = plt.Line2D([0], [0], color=color_gcdw, marker='o', linestyle='', linewidth=1, markersize=4)
co2_handle = plt.Line2D([0], [0], color=color_co2, marker='x', linestyle='', linewidth=1, markersize=4)
ch4_handle = plt.Line2D([0], [0], color=color_ch4, marker='v', linestyle='', linewidth=1, markersize=4)

ax1.legend([gcdw_handle, co2_handle,ch4_handle], ["gDCW", "CO2","CH4"],loc='upper left')
#-----------------------------------------------------------------------------------

# Add grid
ax1.grid(True, which='both', linestyle='--', linewidth=0.5, alpha=0.7)

# Set axes labels and colors
ax1.set_xlabel("Time [d]")

# Label the first y-axis
ax1.set_ylabel("gDCW", color=color_gcdw)
ax1.tick_params(axis='y', labelcolor=color_gcdw)

# Label the second y-axis
ax2.set_ylabel("CO$_{2}$ [mmol/g/h]", color=color_co2)
ax2.tick_params(axis='y', labelcolor=color_co2)

# Label the third y-axis
ax3.set_ylabel("CH$_{4}$ [mmol/g/h]", color=color_ch4)
ax3.tick_params(axis='y', labelcolor=color_ch4)
#--------------------------------------------------------------------------------------
# Add a title
ax1.set_title("Average Specific Uptake and Production Rates")

# Clean up the layout
fig.tight_layout()

# Show the plot
plt.show()


### 4.4) Add H2 on a fourth axis

In [None]:
# Graph
# Create the figure
fig, ax1 = plt.subplots(figsize=(8, 5))  # set the figure dimensions in inches

#-----------------------------------------------------------------------------------
# Plot Strain ZZ gCDW with error bars
ax1.errorbar(gcdw_graph_df[('Reactor','Elasped Days')], gcdw_graph_df[('ZZAVG','gCDW')], yerr=gcdw_graph_df[('ZZSTD','gCDW')], 
             color=color_gcdw, label="gDCW", marker='o', markersize=4, 
             linewidth=1, linestyle='-', capsize=3)

#-----------------------------------------------------------------------------------
# Plot CO2
# Create second y-axis
ax2 = ax1.twinx()

# Plot Strain ZZ CO2 with error bars
ax2.errorbar(co2_graph_df['Elasped Days'], co2_graph_df['Average CO2 ZZ'], yerr=co2_graph_df['STDP CO2 ZZ'], 
             color=color_co2, label="CO2", marker='x', markersize=4, 
             linewidth=1, linestyle='-', capsize=3)
#-----------------------------------------------------------------------------------
# Plot CH4
# Create a third y-axis
ax3 = ax1.twinx()
ax3.spines['right'].set_position(('outward', 60))  # Offset the third axis


ax3.errorbar(ch4_graph_df['Elasped Days'], ch4_graph_df['Average CH4 ZZ'], yerr=ch4_graph_df['STDP CH4 ZZ'], 
             color=color_ch4, label="CH4", marker='v', markersize=4, 
             linewidth=1, linestyle='-', capsize=3)

#-----------------------------------------------------------------------------------
# Plot H2
# Create a fourth y-axis
ax4 = ax1.twinx()
ax4.spines['left'].set_position(('outward', 60))  # Offset the fourth axis
ax4.yaxis.set_label_position('left')
ax4.yaxis.set_ticks_position('left')

ax4.errorbar(h2_graph_df['Elasped Days'], h2_graph_df['Average H2 ZZ'], yerr=h2_graph_df['STDP H2 ZZ'], 
             color=color_h2, label="H2", marker='s', markersize=4, 
             linewidth=1, linestyle='-', capsize=3)
#-----------------------------------------------------------------------------------
# Legend
# Manually create legend handles for all four datasets
gcdw_handle = plt.Line2D([0], [0], color=color_gcdw, marker='o', linestyle='', linewidth=1, markersize=4)
co2_handle = plt.Line2D([0], [0], color=color_co2, marker='x', linestyle='', linewidth=1, markersize=4)
ch4_handle = plt.Line2D([0], [0], color=color_ch4, marker='v', linestyle='', linewidth=1, markersize=4)
h2_handle = plt.Line2D([0], [0], color=color_h2, marker='s', linestyle='', linewidth=1, markersize=4)


ax1.legend([gcdw_handle, co2_handle,ch4_handle,h2_handle], ["gDCW", "CO2","CH4","H2"],loc='lower right',framealpha=1)
#-----------------------------------------------------------------------------------
# Add grid
ax1.grid(True, which='both', linestyle='--', linewidth=0.5, alpha=0.7)


# Set axes labels and colors
ax1.set_xlabel("Time [d]")

# Label the first y-axis
ax1.set_ylabel("gDCW", color=color_gcdw)
ax1.tick_params(axis='y', labelcolor=color_gcdw)

# Label the second y-axis
ax2.set_ylabel("CO$_{2}$ [mmol/g/h]", color=color_co2)
ax2.tick_params(axis='y', labelcolor=color_co2)

# Label the third y-axis
ax3.set_ylabel("CH$_{4}$ [mmol/g/h]", color=color_ch4)
ax3.tick_params(axis='y', labelcolor=color_ch4)

# Label the fourth y-axis
ax4.set_ylabel("H$_{2}$ [mmol/g/h]", color=color_h2)
ax4.tick_params(axis='y', labelcolor=color_h2)

#--------------------------------------------------------------------------------------
# Add a title
ax1.set_title("Average Specific Uptake and Production Rates")

# Clean up the layout
fig.tight_layout()

# Show the plot
plt.show()
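
The third and fourth axes are set up with the same few lines, differing only in the side they sit on. As a sketch, those lines could be wrapped in a small function; `add_offset_axis` is a hypothetical helper name, and the 60-point offset is simply the value used above.

```python
import matplotlib.pyplot as plt


def add_offset_axis(base_ax, side="right", offset_points=60):
    """Return a twin y-axis whose spine sits offset_points outside the given side."""
    new_ax = base_ax.twinx()
    new_ax.spines[side].set_position(("outward", offset_points))
    new_ax.yaxis.set_label_position(side)
    new_ax.yaxis.set_ticks_position(side)
    return new_ax


fig, ax1 = plt.subplots(figsize=(8, 5))
ax2 = ax1.twinx()                         # second axis, default right side
ax3 = add_offset_axis(ax1, side="right")  # third axis, pushed further right
ax4 = add_offset_axis(ax1, side="left")   # fourth axis, pushed out on the left
```

Plotting and labelling then work on `ax3` and `ax4` exactly as in the cell above.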


## 5) Graph using a different style of error bar: error bands

__Note: This type of graph requires NumPy numbers (*e.g.* np.float64), so we need to convert our datatypes.__
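
One possible reason the conversion is needed (an assumption, since it depends on how the sheet was read) is that a column can end up stored as generic Python objects rather than floats. The sketch below builds such a column by hand and converts it with `to_numpy`, which is an alternative to calling `np.float64` on the column.

```python
import numpy as np
import pandas as pd

# Toy column stored as generic Python objects (made-up values)
toy_series = pd.Series([0.10, 0.25, 0.40], dtype="object")
print(toy_series.dtype)

# Convert to a NumPy array of 64-bit floats
as_float = toy_series.to_numpy(dtype=np.float64)
print(as_float.dtype)
```

Checking `.dtype` (or `df.dtypes` for a whole dataframe) before plotting shows whether a conversion is needed at all.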

In [None]:
# Create the figure
fig, ax1 = plt.subplots(figsize=(8, 5))  # inches

#-----------------------------------------------------------------------------------
# Plot Biomass with error band
ax1.plot(gcdw_graph_df[('Reactor','Elasped Days')], gcdw_graph_df[('ZZAVG','gCDW')], 
             color=color_gcdw, label="gDCW", marker='o', markersize=4, 
             linewidth=1, linestyle='-')

# CONVERT TO NUMPY FLOAT (each time for the bands)
ax1.fill_between(np.float64(gcdw_graph_df[('Reactor','Elasped Days')]),
                 np.float64(gcdw_graph_df[('ZZAVG','gCDW')] - gcdw_graph_df[('ZZSTD','gCDW')]),
                 np.float64(gcdw_graph_df[('ZZAVG','gCDW')] + gcdw_graph_df[('ZZSTD','gCDW')]),
                 color=color_gcdw, alpha=0.2)
#----------------------------------------------------------------
# Plot CO2 with error band

# Create second y-axis
ax2 = ax1.twinx()

ax2.plot(co2_graph_df['Elasped Days'], co2_graph_df['Average CO2 ZZ'], 
             color=color_co2, label="CO2", marker='x', markersize=4, 
             linewidth=1, linestyle='-')

ax2.fill_between(np.float64(co2_graph_df['Elasped Days']),
                 np.float64(co2_graph_df['Average CO2 ZZ'] - co2_graph_df['STDP CO2 ZZ']),
                 np.float64(co2_graph_df['Average CO2 ZZ'] + co2_graph_df['STDP CO2 ZZ']),
                 color=color_co2, alpha=0.2)
#----------------------------------------------------------------
# Plot CH4 with error band

# Create third y-axis
ax3 = ax1.twinx()
ax3.spines['right'].set_position(('outward', 60))  # Offset the third axis

ax3.plot(ch4_graph_df['Elasped Days'], ch4_graph_df['Average CH4 ZZ'], 
             color=color_ch4, label="CH4", marker='v', markersize=4, 
             linewidth=1, linestyle='-')

ax3.fill_between(np.float64(ch4_graph_df['Elasped Days']),
                 np.float64(ch4_graph_df['Average CH4 ZZ'] - ch4_graph_df['STDP CH4 ZZ']),
                 np.float64(ch4_graph_df['Average CH4 ZZ'] + ch4_graph_df['STDP CH4 ZZ']),
                 color=color_ch4, alpha=0.2)
#-----------------------------------------------------------------------------------
# Plot H2 with error band

# Create fourth y-axis
ax4 = ax1.twinx()
ax4.spines['left'].set_position(('outward', 60))  # Offset the fourth axis
ax4.yaxis.set_label_position('left')
ax4.yaxis.set_ticks_position('left')

ax4.plot(h2_graph_df['Elasped Days'], h2_graph_df['Average H2 ZZ'], 
             color=color_h2, label="H2", marker='s', markersize=4, 
             linewidth=1, linestyle='-')

ax4.fill_between(np.float64(h2_graph_df['Elasped Days']),
                 np.float64(h2_graph_df['Average H2 ZZ'] - h2_graph_df['STDP H2 ZZ']),
                 np.float64(h2_graph_df['Average H2 ZZ'] + h2_graph_df['STDP H2 ZZ']),
                 color=color_h2, alpha=0.2)
#-----------------------------------------------------------------------------------
# Legend
# Manually create legend handles for all four datasets
gcdw_handle = plt.Line2D([0], [0], color=color_gcdw, marker='o', linestyle='', linewidth=1, markersize=4)
co2_handle = plt.Line2D([0], [0], color=color_co2, marker='x', linestyle='', linewidth=1, markersize=4)
ch4_handle = plt.Line2D([0], [0], color=color_ch4, marker='v', linestyle='', linewidth=1, markersize=4)
h2_handle = plt.Line2D([0], [0], color=color_h2, marker='s', linestyle='', linewidth=1, markersize=4)


ax1.legend([gcdw_handle, co2_handle,ch4_handle,h2_handle], ["gDCW", "CO2","CH4","H2"],loc='lower right',framealpha=1)
#-----------------------------------------------------------------------------------
# Add grid
ax1.grid(True, which='both', linestyle='--', linewidth=0.5, alpha=0.7)


# Set axes labels and colors
ax1.set_xlabel("Time [d]")

# Label the first y-axis
ax1.set_ylabel("gDCW", color=color_gcdw)
ax1.tick_params(axis='y', labelcolor=color_gcdw)

# Label the second y-axis
ax2.set_ylabel("CO$_{2}$ [mmol/g/h]", color=color_co2)
ax2.tick_params(axis='y', labelcolor=color_co2)

# Label the third y-axis
ax3.set_ylabel("CH$_{4}$ [mmol/g/h]", color=color_ch4)
ax3.tick_params(axis='y', labelcolor=color_ch4)

# Label the fourth y-axis
ax4.set_ylabel("H$_{2}$ [mmol/g/h]", color=color_h2)
ax4.tick_params(axis='y', labelcolor=color_h2)

#--------------------------------------------------------------------------------------
# Add a title
ax1.set_title("Average Specific Uptake and Production Rates")

# Clean up the layout
fig.tight_layout()

# Save the figure
# Give the file location, name, and extension (.svg) --> change this to suit your system
pathout = r"C:\Users\uqicasin\Documents\Teaching\Program_Workshop\Uptake_Production_figure.svg"
fig.savefig(pathout, dpi=300, bbox_inches='tight') # dpi sets the resolution of raster output (e.g. .png); vector formats like .svg scale freely


# Show the plot
plt.show()
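
The output format is chosen by the file extension, so the same figure can be saved more than once. This sketch draws a small error-band plot from made-up numbers and writes it as both `.svg` and `.png` into a temporary folder (used here only so the example runs anywhere; in practice you would pick your own folder).

```python
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np

# Made-up data
days = np.array([0.0, 1.0, 2.0, 3.0])
avg = np.array([0.10, 0.25, 0.40, 0.45])
std = np.array([0.01, 0.02, 0.03, 0.02])

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(days, avg, marker="o")
ax.fill_between(days, avg - std, avg + std, alpha=0.2)  # the error band

out_dir = Path(tempfile.mkdtemp())  # temporary folder, for this demo only
saved_files = []
for extension in (".svg", ".png"):
    target = out_dir / f"toy_figure{extension}"
    fig.savefig(target, dpi=300, bbox_inches="tight")
    saved_files.append(target)

print([f.name for f in saved_files])
```

A vector format such as `.svg` stays sharp at any zoom level, while `.png` is a fixed-resolution image whose size is controlled by `dpi`.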
