# Spectrophotometric Analysis
This notebook analyzes data from a colorimetric technique. The data must be in the proper format to be read by this script.

In [None]:
import numpy as np
import pandas as pd
from IPython import display
from scipy.optimize import curve_fit

# How many samples do you have? What kind of experiment was this?

In [None]:
# replace the value with the number of samples you have to analyze
samples = 18

# name the experiment 'bacteria' or 'mineral'
experiment = 'bacteria'

# creation of a dataframe to hold your data plus one additional line for experimental parameters
df = pd.DataFrame(
    index=range(0, samples + 1),
    columns=[
        "solidconc_gl",
        "std",
        "abs",
        "sample_name",
        "pH",
        "sampabs",
        "tot_zn_ppm",
    ],
)

# Data input
Replace the data for `sample_name`, `pH`, `stdabs`, and `sampabs` with your own data. Other parameters are already filled in for you.

In [None]:
# Solid concentration in your experiment (already correct)
solidconc = 5
# Total concentration of zinc in the experiment (already correct)
tot_zn_ppm = 3
# Zinc concentrations of the standards (already correct)
standards = [0.46, 1.03, 3.02, 6.04]
# Replace the data with the absorbance of your standards, lowest concentration to highest
stdabs = [0.012, 0.034, 0.112, 0.230]
# Replace the data with your sample names, be sure the names are within quotes
sample_name = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]
# Replace the data with your sample pH values
pH = [2.25, 2.33, 3.05, 4.33, 4.84, 5.04, 5.06, 5.77, 5.99, 6.16, 6.83]
# Replace the data with the absorbance values for your samples
sampabs = [0.235, 0.220, 0.234, 0.17, 0.123, 0.055, 0.054, 0.038, 0.009, 0.001, 0.004]

# Check input data
Run the next cell and make sure the output looks like the example data above.

In [None]:
df.loc[0, "solidconc_gl"] = solidconc
df.loc[0, "tot_zn_ppm"] = tot_zn_ppm
df.loc[0:3, "std"] = standards
df.loc[0:3, "abs"] = stdabs
df.loc[0 : len(sample_name) - 1, "sample_name"] = sample_name
df.loc[0 : len(sample_name) - 1, "pH"] = pH
df.loc[0 : len(sample_name) - 1, "sampabs"] = sampabs
df

# Calibration curve
Now that the data is entered, we need to perform a linear regression to create the calibration curve. The regression is based on the Beer-Lambert law (or Beer's law), where $A$ is absorbance, $\epsilon$ is the molar absorptivity, $l$ is the path length, and $C$ is the concentration:

$$A=\epsilon{l}C$$

Beer's law is the equation of a straight line with an intercept of zero. Theoretically this makes sense: if no absorbing molecules are present, the absorbance should be zero. In practice this doesn't happen. For example, analytical instruments don't behave ideally (i.e., there is always some background noise in the signal) and blank subtractions aren't perfect, so our best-fit line will include an intercept. A calibration shouldn't be forced through the origin unless it has been demonstrated, by calculating a t-ratio, that the intercept is not statistically different from zero. For our spectrophotometric analysis the intercept will be statistically different from zero, so we must include it.
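The t-ratio test mentioned above can be sketched as follows. This is an illustration rather than part of the analysis workflow: it assumes a two-sided test at the 95 % confidence level, takes the intercept's standard error from the covariance matrix returned by `curve_fit`, and reuses the example standards from the data-input cell. The names `line`, `conc`, and `absorbance` are chosen here for illustration.

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit


def line(x, slope, intercept):
    """Straight line used for the calibration fit."""
    return x * slope + intercept


# Example standards and absorbances copied from the data-input cell
conc = np.array([0.46, 1.03, 3.02, 6.04])
absorbance = np.array([0.012, 0.034, 0.112, 0.230])

popt, pcov = curve_fit(line, conc, absorbance, p0=[0.003, 0])

# With curve_fit's default settings, the square roots of the diagonal
# of pcov are the standard errors of the fitted parameters
slope_se, intercept_se = np.sqrt(np.diag(pcov))

# t-ratio: the intercept divided by its standard error
t_ratio = popt[1] / intercept_se

# Degrees of freedom: number of standards minus two fitted parameters
dof = len(conc) - 2

# Critical t value for a two-sided test at 95 % confidence (assumed level)
t_crit = stats.t.ppf(0.975, dof)

print(f"intercept = {popt[1]:.4f} +/- {intercept_se:.4f}")
print(f"t-ratio = {t_ratio:.2f}, critical t = {t_crit:.2f}")
if abs(t_ratio) > t_crit:
    print("Intercept differs from zero: keep it in the calibration.")
else:
    print("Intercept is not statistically different from zero.")
```

If the absolute t-ratio exceeds the critical value, the intercept is kept in the calibration equation.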

There are many different ways to fit linear models in Python. I've chosen to use the SciPy module just to demonstrate a different way of fitting than the one found in the ferrozine analysis script.
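As a cross-check on whichever fitting route is used, a degree-1 `numpy.polyfit` should return the same slope and intercept as any other least-squares straight-line fit. This is a minimal sketch, separate from the notebook's own workflow; the arrays repeat the example standards from the data-input cell, and the variable names are chosen here for illustration.

```python
import numpy as np

# Example standards and absorbances copied from the data-input cell
conc = np.array([0.46, 1.03, 3.02, 6.04])
absorbance = np.array([0.012, 0.034, 0.112, 0.230])

# A degree-1 polynomial fit returns the coefficients highest power first,
# i.e. [slope, intercept]
slope, intercept = np.polyfit(conc, absorbance, 1)

print(f"y = {slope:.4f}*x + {intercept:.4f}")
```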

In [None]:
# Use SciPy to determine our calibration curve


# define a function for the equation of a straight line
def func(x, p1, p2):
    return x * p1 + p2


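# rename the calibration data x and y and select the rows holding data, leaving out the NaNs
# the dataframe was created empty, so its columns have object dtype; cast to float for the fit
x = df["std"][0:4].astype(float)
y = df["abs"][0:4].astype(float)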
# run the regression and save the output parameter (slope and intercept) in popt and save the covariance matrix in pcov
popt, pcov = curve_fit(func, x, y, p0=[0.003, 0])
# calculate the coefficient of determination also known as r^2
residuals = y - func(x, popt[0], popt[1])
ss_res = np.sum(residuals**2)
ss_tot = np.sum((y - np.mean(y)) ** 2)
r_squared = 1 - (ss_res / ss_tot)

# intercept
b = popt[1]
# slope
m = popt[0]
# print fit parameters and error
print("Calibration equation\t r-squared")
print("y = %.4f*x + %.4f\t %f" % (m, b, r_squared))
print("\n")
# inverse prediction of concentration from absorbance measurements
df["sampconc"] = (df["sampabs"] - b) / m
print(df)

# Zinc adsorbed
Now you have the sample concentration, which in this case is the zinc concentration in ppm. To calculate the amount of zinc adsorbed to the geomedia (bacteria or Hfo), we need to subtract the concentration of zinc remaining in solution from the total concentration of zinc in the experiment. You'll be given total zinc with your dataset.
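
Written out, the two quantities calculated in the next cell are shown below. The symbols are introduced here for illustration: $C_{\mathrm{tot}}$ is the total zinc concentration, $C_{\mathrm{aq}}$ is the zinc concentration remaining in solution, and $C_{\mathrm{ads}}$ is the zinc adsorbed, all in ppm.

$$C_{\mathrm{ads}} = C_{\mathrm{tot}} - C_{\mathrm{aq}}$$

$$\%\ \mathrm{Zn\ adsorbed} = \frac{C_{\mathrm{ads}}}{C_{\mathrm{tot}}} \times 100$$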

In [None]:
# total zinc for this experiment (in ppm) is stored in the first row of the tot_zn_ppm column
totZn = df["tot_zn_ppm"][0]
# zinc adsorbed
df["Znads"] = totZn - df["sampconc"]
# percent Zn adsorbed
df["Znadsper"] = df["Znads"] / totZn * 100
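# Automatically remove nonsensical data (negative concentrations, or more zinc than was added)
df = df.drop(df.index[df["sampconc"] < 0])
# compare against totZn, because row 0 may have been dropped by the previous line
df = df.drop(df.index[df["sampconc"] > totZn])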

# Now I want to plot this data and save the plot as a PDF to include in a LaTeX document.
# I'll need to import some modules for plotting and saving the plot.
import matplotlib.pyplot as plt

# to make the plots look nicer we will import the seaborn module
import seaborn as sns

sns.set()

# Plot percent zinc adsorbed against pH using the columns calculated earlier
plt.scatter(df["pH"], df["Znadsper"])

# Label the x and y axes
plt.xlabel("pH")
plt.ylabel("Zn (% adsorbed)")
plt.ylim([0, 100])

# Save and download the plot
plt.savefig("ads_plot.pdf")
plt.savefig("ads_plot.png")

# save the dataframe
df.to_csv("df_" + experiment + ".csv")