# Interpolating and fitting data using Python


Often, data are provided in the form of plots.  While this is useful if we are interested in qualitative trends, it makes using the data in calculations awkward.  In these situations, we can use digitization programs, such as
[Web Plot Digitizer](https://apps.automeris.io/wpd/),
to convert images into tables of data, which are more convenient to use.

In this workbook, we will use Python to fit and interpolate data that are provided in tabular format, such as in CSV files.  Fitting and interpolation encode the data in a form that the computer can evaluate easily and that can then be used in more sophisticated calculations.

First we cover fitting of mathematical models using the method of least squares.  This will use the optimization routine `minimize` from the `scipy.optimize` library.  Then we will look at interpolating data using cubic splines.  This utilizes the routines `splrep` and `splev` from the `scipy.interpolate` library.



## Least squares fit of data



Let's suppose that, based on some theory, we know that the data should follow the mathematical expression
$$
y(x) = A e^{-a x} + b
$$
where $A$, $a$, and $b$ are parameters of the model, which are unknown to us.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize


# Synthetic data: exact curve y = exp(-x) (A = 1, a = 1, b = 0) plus Gaussian noise
x_data = np.linspace(0.0, 10.0, 100)
y_exact = np.exp(-x_data)
sig = 0.1   # standard deviation of the noise
y_data = y_exact + np.random.normal(0.0, sig, size=len(y_exact))


plt.plot(x_data, y_data, 'o')
plt.show()




In [None]:
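def model(x, params):
    # exponential decay with offset: y = A*exp(-a*x) + b
    A, a, b = params
    return A*np.exp(-a*x) + b


def fobj(params):
    # objective function: sum of squared residuals between data and model
    residuals = np.asarray(y_data) - model(np.asarray(x_data), params)
    return np.sum(residuals**2)


# initial guess for [A, a, b]
params0 = [10, 3, 1]

# minimize returns an OptimizeResult; the fitted parameters are in popt['x']
popt = minimize(fobj, params0)

print(popt)
print(popt['x'])

x_fit = np.linspace(0.0, 10.0, 100)
y_fit = model(x_fit, popt['x'])
plt.plot(x_fit, y_fit)
plt.plot(x_data, y_data, 'o')
plt.show()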
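
As a cross-check on the `minimize` approach, the same least squares fit can be sketched with `curve_fit` from `scipy.optimize`, which also returns a covariance matrix from which rough parameter uncertainties can be estimated.  This is only an illustrative alternative: the names `model_cf`, `x_demo`, and `y_demo`, the fixed random seed, and the initial guess are assumptions made for this sketch and are not part of the workbook.

```python
import numpy as np
from scipy.optimize import curve_fit


def model_cf(x, A, a, b):
    # same model as above, but with the parameters as separate arguments,
    # which is the calling convention that curve_fit expects
    return A * np.exp(-a * x) + b


# synthetic data generated with A = 1, a = 1, b = 0 and Gaussian noise
rng = np.random.default_rng(0)
x_demo = np.linspace(0.0, 10.0, 100)
y_demo = model_cf(x_demo, 1.0, 1.0, 0.0) + rng.normal(0.0, 0.1, size=x_demo.size)

# p_fit holds the best-fit parameters, p_cov their estimated covariance matrix
p_fit, p_cov = curve_fit(model_cf, x_demo, y_demo, p0=[1.0, 0.5, 0.0])
p_err = np.sqrt(np.diag(p_cov))

for name, val, err in zip(["A", "a", "b"], p_fit, p_err):
    print(f"{name} = {val:.3f} +/- {err:.3f}")
```

The fitted values should lie close to the parameters used to generate the data, within a few multiples of the reported uncertainties.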
    

## Interpolation of data




In [None]:
x_data = np.random.uniform(0.0, 10.0, 20)
x_data.sort()
y_data = [np.sin(x)*np.exp(-x) for x in x_data]


plt.plot(x_data, y_data, 'o')
plt.xlabel(r'$x$')
plt.ylabel(r'$y$')
plt.show()

In [None]:
from scipy.interpolate import splrep, splev, splint


# splrep returns the knots, coefficients, and degree of an interpolating cubic spline
fit = splrep(x_data, y_data)

# splev evaluates the spline; it accepts an array of points
x_interp = np.linspace(0.0, 10, 100)
y_interp = splev(x_interp, fit)
y_exact = np.sin(x_interp)*np.exp(-x_interp)

plt.plot(x_data, y_data, 'o')
plt.plot(x_interp, y_interp, label='cubic interpolation')
plt.plot(x_interp, y_exact, color='blue', ls='dashed', label='exact')
plt.xlabel(r'$x$')
plt.ylabel(r'$y$')
plt.legend()
plt.show()



We can also obtain integrals over our spline fit function.  To get the integral between points $a$ and $b$ of the interpolated function, we just type: 

`splint(a, b, fit)`

This will return the value of the integral of the function given by `fit` between the limits $a$ and $b$.  
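
As a quick sanity check of `splint`, the sketch below integrates a spline built from samples of $\sin x$ between $0$ and $\pi$, for which the exact integral is 2.  The names `x_knots`, `tck`, and `area` and the choice of 20 sample points are assumptions made for this illustration.

```python
import numpy as np
from scipy.interpolate import splrep, splint

# sample sin(x) at 20 evenly spaced points on [0, pi]
x_knots = np.linspace(0.0, np.pi, 20)
tck = splrep(x_knots, np.sin(x_knots))

# integrate the spline over the whole interval; the exact answer is 2
area = splint(0.0, np.pi, tck)
print(area)
```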

In [None]:
def Y(x):
    return -0.5*np.exp(-x)*(np.sin(x)+np.cos(x))

x0 = 5

yint_interp = [splint(x0, x, fit) for x in x_interp]
yint_exact = [Y(x) - Y(x0) for x in x_interp]

plt.plot(x_interp, yint_interp, label='cubic interpolation', color='orange')
plt.plot(x_interp, yint_exact, color='blue', ls='dashed', label='exact')
plt.xlabel(r'$x$')
plt.ylabel(r"$\int_{x_0}^x dx'\,y(x')$")
plt.legend()
plt.show()




In [None]:
y1_interp = [splev(x, fit, der=1) for x in x_interp]
y1_exact = [np.cos(x)*np.exp(-x)-np.sin(x)*np.exp(-x) for x in x_interp]

plt.plot(x_interp, y1_interp, label='cubic interpolation', color='orange')
plt.plot(x_interp, y1_exact, color='blue', ls='dashed', label='exact')
plt.xlabel(r'$x$')
plt.ylabel(r'$dy/dx$')
plt.legend()
plt.show()

## Fitting adsorption isotherms

In this exercise, we will examine adsorption data for water vapor in zeolite 13X as published by [Zabielska et al. (2020)](https://dx.doi.org/10.24425/cpe.2020.132542).  

To begin, digitize the data shown in Fig. 3 of the paper, using [Web Plot Digitizer](https://apps.automeris.io/wpd/), to create a CSV file with the adsorption of water $q$ (in units of mol/kg) as a function of partial pressure (in units of Pa) for different temperatures.  



Once you have uploaded the adsorption data, first try to fit the various isotherms with the Langmuir model: 
$$
\begin{align*}
\frac{q}{q_{\rm max}}
&= \frac{K p}{1+K p}
\end{align*}
$$
where the parameters of the model are $q_{\rm max}$, the maximum adsorption, and $K$, an adsorption constant related to the free energy of adsorption.
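
Before working with the digitized data, it can help to test the fitting procedure on synthetic data where the answer is known.  The sketch below generates noisy points from a Langmuir isotherm and recovers the parameters with `curve_fit`.  The "true" values `qmax_true` and `K_true`, the pressure range, and the noise level are arbitrary assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit


def langmuir(p, qmax, K):
    # Langmuir isotherm: q = qmax * K p / (1 + K p)
    return qmax * K * p / (1.0 + K * p)


# synthetic data with arbitrary (hypothetical) parameters
qmax_true, K_true = 10.0, 2.0e-3
rng = np.random.default_rng(1)
p_syn = np.linspace(50.0, 3000.0, 25)
q_syn = langmuir(p_syn, qmax_true, K_true) + rng.normal(0.0, 0.05, p_syn.size)

# initial guess taken from the data: largest adsorption and inverse of a typical pressure
guess = [q_syn.max(), 1.0 / np.median(p_syn)]
(qmax_fit, K_fit), _ = curve_fit(langmuir, p_syn, q_syn, p0=guess)

print(qmax_fit, K_fit)
```

Estimating the initial guess from the data, rather than hard-coding it, is one way to keep the fit robust when $K$ and $q_{\rm max}$ differ by several orders of magnitude.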

In [None]:
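import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy.optimize import curve_fit, minimize


# Load the digitized data for one isotherm.  The file name and column
# names are placeholders; replace them with those of your own CSV file.
df = pd.read_csv('isotherm.csv')
p_data = df['p'].to_numpy()   # partial pressure / Pa
q_data = df['q'].to_numpy()   # adsorption / (mol/kg)


# Langmuir isotherm
def get_q(p, params):
    qmax, K = params
    q = qmax*K*p / (1.0 + K*p)
    return q


def fobj(params):
    # sum of squared residuals between data and model
    err2 = 0.0
    for p, q in zip(p_data, q_data):
        err = q - get_q(p, params)
        err2 += err*err
    return err2


# initial guess estimated from the data: qmax from the largest measured
# adsorption, K from the inverse of a typical pressure
params0 = [max(q_data), 1.0/np.median(p_data)]
popt = minimize(fobj, params0)
print(popt['x'])


p_fit = np.linspace(0.0, max(p_data), 100)
plt.plot(p_data, q_data, 'o')
plt.plot(p_fit, get_q(p_fit, popt['x']))
plt.show()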



Now try to fit the adsorption data with the BET isotherm
$$
\begin{align*}
\frac{q}{q_{\rm max}}
&= \frac{c p}{(1-p/p_0)(p_0+p(c-1))}
\end{align*}
$$
where the parameters of the model are $q_{\rm max}$, the adsorption of a monolayer, $p_0$, the condensation pressure, and $c$, the BET constant. 
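
A possible starting point for the BET fit is sketched below, assuming the standard BET form written in terms of the relative pressure $x=p/p_0$, that is $q/q_{\rm max} = c\,x/[(1-x)(1+(c-1)x)]$.  The synthetic data, the "true" parameter values, the initial guess, and the bounds (which keep $p_0$ above the largest pressure in the data so that $1-x$ stays positive) are all assumptions made for this illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def bet(p, qmax, p0, c):
    # BET isotherm in terms of the relative pressure x = p/p0
    x = p / p0
    return qmax * c * x / ((1.0 - x) * (1.0 + (c - 1.0) * x))


# synthetic data with arbitrary (hypothetical) parameters
qmax_true, p0_true, c_true = 5.0, 3000.0, 50.0
rng = np.random.default_rng(2)
p_syn = np.linspace(50.0, 2000.0, 30)
q_syn = bet(p_syn, qmax_true, p0_true, c_true) + rng.normal(0.0, 0.02, p_syn.size)

# bounds keep p0 above the largest measured pressure, where the model is finite
lower = [0.0, 1.01 * p_syn.max(), 0.0]
upper = [np.inf, np.inf, np.inf]
guess = [0.5 * q_syn.max(), 2.0 * p_syn.max(), 10.0]

(qmax_fit, p0_fit, c_fit), _ = curve_fit(bet, p_syn, q_syn, p0=guess,
                                         bounds=(lower, upper))
print(qmax_fit, p0_fit, c_fit)
```

A useful check on any implementation of this form is that $q=q_{\rm max}$ exactly when $x = 1/(1+\sqrt{c})$.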

## Interpolation of the enthalpy of mixtures

The figure below plots the specific enthalpy of mixtures of sulfuric acid and water at various temperatures.

![Enthalpy of mixing](./sulfuricacid-water.png)


In this exercise, we will first digitize the data in the figure, isotherm by isotherm, using
[Web Plot Digitizer](https://apps.automeris.io/wpd/), and download the data as a CSV file.  Afterward, load the data into a Pandas dataframe.  Note that it will be convenient to edit the header of the CSV file before reading it with Pandas.
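
As an alternative to editing the header by hand, the header rows can be skipped and replaced with column names of your choosing when the file is read.  The sketch below assumes a CSV layout with two header rows followed by one pair of columns per isotherm; the layout, the column names, and the numbers in `csv_text` are placeholders for illustration only, so check them against your own exported file.

```python
import io
import pandas as pd

# placeholder CSV text standing in for a digitized file with two isotherms;
# the numbers are dummy values, not data from the figure
csv_text = """T1,,T2,
X,Y,X,Y
0.0,1.0,0.0,2.0
0.5,1.5,0.5,2.5
"""

# skip the two original header rows and supply our own column names
cols = ["w_T1", "H_T1", "w_T2", "H_T2"]
df = pd.read_csv(io.StringIO(csv_text), skiprows=2, header=None, names=cols)
print(df)
```

For a file on disk, the `io.StringIO(csv_text)` argument would be replaced with the path to the CSV file.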





1. Plot the variation of the heat capacity with temperature of an 80 wt% sulfuric acid solution.  Use SI units.  Note that $1\,{\rm Btu}=1055.06\,{\rm J}$, and $1\,{\rm lb}=0.453592\,{\rm kg}$.

2. One kilogram of water at $0^\circ{\rm C}$ is mixed with one kilogram of a 90 wt% sulfuric acid solution at $25^\circ{\rm C}$.  If this is performed adiabatically, what is the final temperature of the mixture?
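
For the first question, the heat capacity at fixed composition is the temperature derivative of the specific enthalpy, $c_p = (\partial H/\partial T)_p$, which can be obtained from a spline with `splev(..., der=1)`.  The sketch below shows the unit conversion and the differentiation on placeholder data.  It assumes that the enthalpy is read off in Btu/lb and the temperature in $^\circ{\rm F}$, and the linear enthalpy `H_btu_lb` is a dummy stand-in for the digitized values, not data from the figure.

```python
import numpy as np
from scipy.interpolate import splrep, splev

BTU_TO_J = 1055.06     # J per Btu
LB_TO_KG = 0.453592    # kg per lb

# placeholder data: temperatures (assumed deg F) and a dummy linear enthalpy in Btu/lb
T_F = np.linspace(32.0, 212.0, 10)
H_btu_lb = 0.5 * T_F

# convert to SI units: kelvin and J/kg
T_K = (T_F - 32.0) * 5.0 / 9.0 + 273.15
H_J_kg = H_btu_lb * BTU_TO_J / LB_TO_KG

# heat capacity as the derivative of the enthalpy spline, in J/(kg K)
tck = splrep(T_K, H_J_kg)
cp = splev(T_K, tck, der=1)
print(cp)
```

Because the dummy enthalpy is linear in temperature, the resulting heat capacity is constant; with the digitized data it will vary with temperature.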

### Questions