# Manipulating Titration Data

We will walk through this notebook as a class and discuss how to format your data so you can use it in your post-lab calculations!

First, let's just take a look at the pH as a function of titrant added, and make sure our data imported correctly!



In [None]:
import math
import matplotlib.pyplot as plt
import numpy as np


# This is how we'll import our data; it should be saved as a .csv file, in the same folder as this notebook. 
# Make sure you have volume in the first column and pH in the second column, with no headings on the data!

csv = np.genfromtxt('sample_titration.csv', delimiter=",")
volume = csv[:,0]
pH = csv[:,1]

plt.plot(volume, pH, 'ro')

# Add labels on the x and y axis, always including units.
plt.xlabel('titrant added (mL)')
plt.ylabel('pH')


# let's just take a look!
plt.show()
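
If the plot looks wrong, a quick check of the imported array can save some head-scratching. The sketch below is one way to do that, not the only one. It reads a few made-up numbers from an in-memory stand-in for the file (the values and the name `fake_file` are hypothetical), so swap in your own file name to try it on real data.

```python
import io
import numpy as np

# Hypothetical stand-in for 'sample_titration.csv': volume (mL), pH, no header row
fake_file = io.StringIO("0.0,2.85\n1.0,3.10\n2.0,3.40\n3.0,3.75\n")

data = np.genfromtxt(fake_file, delimiter=",")

# We expect exactly two columns: volume and pH
has_two_columns = data.ndim == 2 and data.shape[1] == 2

# genfromtxt turns anything it cannot read as a number (like a header) into NaN
has_missing = bool(np.isnan(data).any())

# The volume of titrant added should only ever go up
volume_increasing = bool(np.all(np.diff(data[:, 0]) > 0))

print("two columns:", has_two_columns)
print("missing values:", has_missing)
print("volume strictly increasing:", volume_increasing)
```

If "missing values" comes back `True` on your own file, the most likely culprit is a heading row or a stray blank cell in the spreadsheet.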


        

## Finding equivalence points

First, let's take a closer look at the data.

1. Based on the graph above, do you think this acid is monoprotic, diprotic, or triprotic?


2. How many equivalence points should we find?


3. Can you see them? Estimate by eye approximately where you think those equivalence points are.




It can be hard to see the equivalence points. Remember, our pH meters can be slow to respond, and the equivalence point is where the pH is changing the fastest! We can manipulate this data to make it a little easier to find the equivalence point. There are two strategies we can use.

1. First or second derivatives
2. Gran plot

Let's start with the derivative method!

In [None]:
from numpy import diff
# take the derivative of pH with respect to volume

dpHdV = diff(pH)/diff(volume)

# each slope is measured between two readings, so plot it at the midpoint of that volume interval
volume_update = (volume[:-1] + volume[1:]) / 2

plt.plot(volume_update, dpHdV, 'ro')

# Add labels on the x and y axis, always including units.
plt.xlabel('titrant added (mL)')
plt.ylabel('d(pH)/dV (1/mL)')


# let's just take a look!
plt.show()

Notice that the sharp peaks are at the equivalence points! Remember, a derivative is just a measure of how quickly your function is changing, so it is largest where the titration curve is steepest! This makes the equivalence points a lot easier to see!
We can automatically report the volume of the highest peak using a very simple numpy command. Notice that it may return the 1st or the 2nd equivalence point, depending on which peak is higher, so be sure to keep them straight!

In [None]:
# finding the index of the highest value in the first derivative array
ind = np.argmax(dpHdV)

# finding the volume at the same index
endpoint_volume = volume_update[ind]

print("the endpoint volume is " + str(endpoint_volume) + " mL")
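
If your acid has more than one equivalence point, you may want all of the peaks rather than just the tallest. One possible approach is `scipy.signal.find_peaks`, sketched below on a made-up, noise-free diprotic curve (the curve shape, the step positions at 10.25 and 20.25 mL, and the `prominence` setting are all assumptions for illustration). This sketch assigns each slope to the midpoint of its volume interval.

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic titration curve (NOT real data): two smooth pH jumps on a gentle slope
volume = np.arange(0, 30.5, 0.5)
pH = (2.5 + 0.05 * volume
      + 2.0 / (1 + np.exp(-(volume - 10.25) / 0.4))
      + 3.0 / (1 + np.exp(-(volume - 20.25) / 0.4)))

# First derivative, plotted at the midpoint of each volume interval
dpHdV = np.diff(pH) / np.diff(volume)
volume_mid = (volume[:-1] + volume[1:]) / 2

# prominence is how far a peak must stand above its surroundings to count;
# 0.5 is a guess that suits this synthetic curve, so expect to tune it
peaks, properties = find_peaks(dpHdV, prominence=0.5)
peak_volumes = volume_mid[peaks]

for number, v in enumerate(peak_volumes, start=1):
    print("equivalence point " + str(number) + " is near " + str(v) + " mL")
```

On real, noisy data a small `prominence` will report lots of tiny bumps, and a large one may miss a weak equivalence point, so always compare the result with what you see on the plot.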

Your textbook also discusses 2nd derivatives as an option for finding an endpoint. This can work very nicely, because the endpoints are where the second derivative crosses zero! Take a look:

In [None]:
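# take the derivative of the first derivative to get the second derivative
d2pHdV = diff(dpHdV)/diff(volume_update)

# as before, plot each value at the midpoint of the interval it was computed over
volume_update2 = (volume_update[:-1] + volume_update[1:]) / 2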

plt.plot(volume_update2, d2pHdV, 'ro')

# Add labels on the x and y axis, always including units.
plt.xlabel('titrant added (mL)')
plt.ylabel('d2(pH)/dV2 (1/mL^2)')


# let's just take a look!
plt.show()

This is your second derivative plot! Now our equivalence point is where the plot crosses zero. Unfortunately, there is a lot of noise in the data, so pinning down these zero crossings automatically is a harder computational problem than I would like to address here. Still, this plot can be a good reality check on the values you got from the first derivative plot!
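
For the curious, here is a minimal sketch of one way to locate a zero crossing, under friendly assumptions: the curve is synthetic and noise-free (one made-up pH jump centered at 10.25 mL), and we only search a few points on either side of the tallest first-derivative peak. The window size of 3 points is an arbitrary choice. On noisy data you would likely need to smooth first, which this sketch does not attempt.

```python
import numpy as np

# Synthetic, noise-free curve (NOT real data) with one pH jump at 10.25 mL
volume = np.arange(0, 20.5, 0.5)
pH = 3.0 + 0.05 * volume + 4.0 / (1 + np.exp(-(volume - 10.25) / 0.4))

# First and second derivatives, each plotted at interval midpoints
dpHdV = np.diff(pH) / np.diff(volume)
volume_mid1 = (volume[:-1] + volume[1:]) / 2
d2pHdV = np.diff(dpHdV) / np.diff(volume_mid1)
volume_mid2 = (volume_mid1[:-1] + volume_mid1[1:]) / 2

# Only search a small window around the tallest first-derivative peak
ind = np.argmax(dpHdV)
first = max(ind - 3, 0)
last = min(ind + 3, len(d2pHdV) - 1)

crossing = None
for i in range(first, last):
    y0, y1 = d2pHdV[i], d2pHdV[i + 1]
    # at an equivalence point the second derivative goes from positive to negative
    if y0 > 0 and y1 <= 0:
        x0, x1 = volume_mid2[i], volume_mid2[i + 1]
        # draw a straight line between the two points and find where it hits zero
        crossing = x0 - y0 * (x1 - x0) / (y1 - y0)
        break

print("second derivative crosses zero near " + str(crossing) + " mL")
```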

## Gran Plots

Our second method for finding endpoints is the Gran plot. This works by turning the data just before the equivalence point into a straight line, then following that line down to where it hits zero. <b> It is helpful to already have a rough idea of where your endpoint is before you begin this section. </b>


Gran plots can be helpful because they take advantage of additional points leading up to the endpoint. Since the pH reading is least reliable near the endpoint, including these additional points in your calculation can help reduce some of the error coming from the sluggish response of the pH meter itself.
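
Here is a sketch of where that straight line comes from, assuming a monoprotic weak acid HA titrated with a strong base and ignoring activity coefficients. The symbols $V_b$ (volume of base added), $V_e$ (equivalence volume) and $K_a$ (acid dissociation constant) are our notation for this sketch. Before the equivalence point, each bit of base converts HA into A$^-$, so the ratio of the two is set by how far along the titration is:

$$ \frac{[A^-]}{[HA]} \approx \frac{V_b}{V_e - V_b} $$

Substituting this into $K_a = [H^+][A^-]/[HA]$ and using $[H^+] = 10^{-pH}$ gives

$$ V_b \cdot 10^{-pH} \approx K_a \left( V_e - V_b \right) $$

So plotting $V_b \cdot 10^{-pH}$ against $V_b$ should give a line with slope $-K_a$ that reaches zero at $V_b = V_e$. For a polyprotic acid, the same reasoning applies approximately to the region before the first equivalence point, with $K_{a1}$ in place of $K_a$.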




In [None]:
# calculate the Gran function, Vb * 10^(-pH), for every point in the titration
# we'll plot all of it first, then trim it down to the region before the 1st equivalence point in the next step



Gran_y = volume * 10**(-pH)

plt.plot(volume, Gran_y, 'ro')

# Add labels on the x and y axis, always including units.
plt.xlabel('titrant added (mL)')
plt.ylabel('Vb * 10^(-pH) ')


# let's just take a look!
plt.show()

The endpoint is where the steep, roughly linear part of this graph heads to zero. To find it, we fit a line by linear regression to the data just before the endpoint (in this case, ~10-17 mL looks about right) and extend that line to the x-axis.



In [None]:
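# trim the arrays to the region between halfway to the first equivalence point and the first equivalence point itself

# ind is the index of the tallest first-derivative peak; halving it lands near the first equivalence point
# only if that tallest peak is the second equivalence point and your volume steps are evenly spaced
# these are just a starting point, and you can adjust them based on the plot
trim = ind//2
start = trim//2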

Gran_trim = Gran_y[start:trim]
volume_trim = volume[start:trim]

plt.plot(volume_trim, Gran_trim, 'ro')

# Add labels on the x and y axis, always including units.
plt.xlabel('titrant added (mL)')
plt.ylabel('Vb * 10^(-pH) ')


# let's just take a look!
plt.show()

Now we want the x-intercept of this plot, which will be our first equivalence point!

Remember, if $y = mx + b$, we find the x-intercept by plugging in zero for y and solving for x.
Rearranging, you'll get $$ x_{\text{intercept}} = -\frac{b}{m} $$

In [None]:
# getting the equation of the line

import scipy.stats

# the linear regression function in the scipy stats module returns 5 values: slope, intercept,
# the correlation coefficient r (square it to get R-squared), a p-value, and the standard error of the slope s_m
# we'll ignore the last two for the moment, since all we really need right now is the equation of the line
m, b, r, p, s_m = scipy.stats.linregress(volume_trim, Gran_trim)

# solve for the x-intercept (where y = 0)

x_intercept = (-b/m)

print("the equivalence point determined by the Gran plot is " + str(x_intercept) + " mL")
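
One way to convince yourself the method works is to try it on data where you already know the answer. The sketch below builds ideal points from the textbook weak-acid relation $V_b \cdot 10^{-pH} = K_a (V_e - V_b)$, which we assume holds exactly here, using made-up values for $K_a$ and $V_e$. It then checks that the regression gets them back. Real data will not be this tidy.

```python
import numpy as np
import scipy.stats

# Made-up "true" values used to generate ideal data (assumptions, not measurements)
Ka_true = 1.0e-5
Ve_true = 15.0  # mL

# Ideal pH values for a handful of volumes before the equivalence point
Vb = np.linspace(8.0, 14.0, 13)
pH = -np.log10(Ka_true * (Ve_true - Vb) / Vb)

# Same Gran function as in the notebook
Gran_y = Vb * 10**(-pH)

# linregress returns an object whose fields include slope, intercept and rvalue
fit = scipy.stats.linregress(Vb, Gran_y)

x_intercept = -fit.intercept / fit.slope   # should recover Ve_true
Ka_estimate = -fit.slope                   # the slope of an ideal Gran plot is -Ka

print("recovered equivalence volume: " + str(x_intercept) + " mL")
print("recovered Ka: " + str(Ka_estimate))
```

If your own Gran plot gives an x-intercept far from the first-derivative peak, try adjusting `start` and `trim` so that only the straight part of the curve is included in the fit.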

## Results and Analysis notes

Use both methods to determine an endpoint for your data, and then compare the results you get for the formula weight of your compound! Which one got you closer to the true value? 
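
As a starting point for that comparison, here is a hedged sketch of the formula weight arithmetic. The function name `formula_weight`, the sample mass, the base concentration, and both endpoint volumes are hypothetical, so replace them with your own values. It assumes the titrant is a strong base of known molarity and that you know which equivalence point each volume belongs to.

```python
def formula_weight(mass_g, base_molarity, v_eq_mL, eq_point=1):
    """Formula weight (g/mol) of the acid from an equivalence volume.

    eq_point says which equivalence point v_eq_mL is (1 for the first,
    2 for the second, ...), because reaching the k-th equivalence point
    takes k moles of base per mole of acid.
    """
    moles_base = base_molarity * v_eq_mL / 1000.0   # mL -> L
    moles_acid = moles_base / eq_point
    return mass_g / moles_acid


# Hypothetical numbers for illustration only
mass_g = 0.2500          # grams of unknown acid weighed out
base_molarity = 0.1000   # mol/L of titrant

fw_derivative = formula_weight(mass_g, base_molarity, 12.25)  # endpoint from derivative method
fw_gran = formula_weight(mass_g, base_molarity, 12.20)        # endpoint from Gran plot

# how much the two methods disagree, as a percent of their average
percent_difference = 100 * abs(fw_derivative - fw_gran) / ((fw_derivative + fw_gran) / 2)

print("formula weight (derivative method): " + str(round(fw_derivative, 2)) + " g/mol")
print("formula weight (Gran plot): " + str(round(fw_gran, 2)) + " g/mol")
print("percent difference: " + str(round(percent_difference, 2)) + " %")
```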