### Neshyba & de Miguel, 2023

# An Introduction to Statistical Mechanics

## Introduction
Statistical Mechanics is the name coined by Josiah Willard Gibbs back in 1884 to describe a set of methods people were inventing at the time, which were aimed at using *microscopic* properties of substances to explain their *macroscopic* properties (see https://en.wikipedia.org/wiki/Statistical_mechanics). Nowadays, those microscopic properties tend to be atomic-level properties that we can observe from spectroscopy experiments (like IR absorption experiments), or that we can predict from calculations (e.g., by solving Schrödinger's Equation). The macroscopic properties that StatMech tries to explain span a huge range of possibilities, including heat capacities and thermal conductivities -- really, any bulk property you can measure.

To use StatMech for the application we are going to tackle here, we'll need to develop some foundational tools, starting with how molecules vibrate.

## The frequency of molecular vibrations
You may recall from your O-chem days that molecular vibrations can be separated into various vibrational "modes." An example is shown in Fig. 1 -- this is called a "scissor" vibrational mode. Other modes involve stretching of bonds, others bending; in yet other cases, we can identify a kind of extended vibrational mode involving multiple molecules, vibrating synchronously with one another. (An extreme example of the latter occurs in crystals, in which case the vibrational modes are called *phonon modes* -- see https://en.wikipedia.org/wiki/Phonon). For our present purpose, the important thing is that each vibrational mode vibrates with a certain frequency, typically expressed in wavenumbers and given the symbol $\overline \nu$. A popular unit of $\overline \nu$ is $cm^{-1}$.

<p style='text-align: center;'>
<img src="http://webspace.pugetsound.edu/facultypages/nesh/Notebook/glycine in water, NH2 scissors.gif" height="275" width="275"/>
<strong>Figure 1</strong>. NH$_2$ scissors vibrational mode of aqueous glycine in which the water solvent is represented as a homogeneous dielectric medium, corresponding to a high-temperature limit. The frequency of this vibration is referred to in the text as $\overline \nu_{HT}$. 
</p>

In some circumstances, the wavenumber of the motion shown in Fig. 1 is temperature-dependent. Crazy, yes. How could such a thing happen? One way is via mechanical coupling to solvent molecules, as shown in Fig. 2. The thinking goes like this: when the temperature is low, molecules are moving around slowly enough that the motion of some of the surrounding solvent molecules has become synchronized with the motion of the solute. Since this "delocalization" of motion is different from the motion of an isolated molecule, it's to be expected that $\overline \nu$ would be different.

<p style='text-align: center;'>
<img src="http://webspace.pugetsound.edu/facultypages/nesh/Notebook/glycine with 5 waters, NH2 scissors.gif" height="400" width="400"/>
<strong>Figure 2</strong>. NH$_2$ scissors motion of glycine in which the solute's vibrational motion is mechanically coupled to vibrations of nearby solvent molecules, corresponding to a low-temperature limit. The frequency of this vibration is referred to in the text as $\overline \nu_{LT}$. 
</p>

Our first task is to determine values of the high-temperature wavenumber shown in Fig. 1, which we'll call $\overline \nu_{HT}$, and the low-temperature wavenumber shown in Fig. 2, which we'll call $\overline \nu_{LT}$. We'll do that with the help of an electronic structure package (e.g., Spartan$^{TM}$).

## Thermophoresis
*Thermophoresis* is a phenomenon first documented in the mid-1800s, but of considerable contemporary interest too. But what is it? As described in https://en.wikipedia.org/wiki/Thermophoresis,

"The phenomenon is observed at the scale of one millimeter or less. An example that may be observed by the naked eye with good lighting is when the hot rod of an electric heater is surrounded by tobacco smoke: the smoke goes away from the immediate vicinity of the hot rod. As the small particles of air nearest the hot rod are heated, they create a fast flow away from the rod, down the temperature gradient. They have acquired higher kinetic energy with their higher temperature. When they collide with the large, slower-moving particles of the tobacco smoke they push the latter away from the rod."

The kind of thermophoresis just described is called *thermophobic*, because the smoke moves *away* from the warmth. There's also *thermophilic* thermophoresis, in which particles are drawn *toward* the warmer side.

Here we'll consider the thermophoresis of solutes dissolved in a solvent that is sandwiched between two thermal reservoirs, as shown in Fig. 3.

<p style='text-align: center;'>
<img src="http://webspace.pugetsound.edu/facultypages/nesh/Notebook/Thermophoretic forces.jpg" height="500" width="500"/>
<strong>Figure 3</strong>. Thermophoretic forces in a temperature gradient.
</p>

Here, we'll assume $T$ depends linearly on $x$,

$$
T(x) = T_{cold} + (T_{hot}-T_{cold}){x \over L} \ \ \ \ (1)
$$
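To make Eq. 1 concrete, here is a minimal sketch that evaluates the linear profile at a few positions. The function name `linear_temperature` and the numerical values (a 1 mm gap between reservoirs at 273 K and 350 K) are illustrative assumptions; plain floats are used instead of unit-aware quantities to keep the sketch short.

```python
import numpy as np

def linear_temperature(x, length, t_cold, t_hot):
    """Eq. 1: temperature at position x, with the cold reservoir at x=0 and the hot one at x=length."""
    return t_cold + (t_hot - t_cold) * x / length

# Illustrative values (assumptions for this sketch)
length_mm = 1.0
x_mm = np.linspace(0.0, length_mm, 5)
T_K = linear_temperature(x_mm, length_mm, t_cold=273.0, t_hot=350.0)

for x, T in zip(x_mm, T_K):
    print(f"x = {x:.2f} mm  ->  T = {T:.2f} K")
```

The profile starts at $T_{cold}$ at $x=0$, ends at $T_{hot}$ at $x=L$, and passes through the average of the two at the midpoint.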

## Quantization of vibrational motion
In Quantum Mechanics (and in real life), molecular vibrations exhibit a property called *energy quantization*, in which the amount of energy a given molecule can have is restricted to a set of discrete values. An example is shown in Fig. 4.

<p style='text-align: center;'>
<img src="https://www.researchgate.net/profile/Safwan_Al_Shara/publication/228531722/figure/fig1/AS:302054049370112@1449026705369/Energy-Levels-of-the-one-dimensional-harmonic-oscillator.png" height="400" width="400"/>
<strong>Figure 4</strong>. Vibrational energy levels.
</p>

Figure 4 is typical for vibrational motion, in that the vibrational energy levels form a kind of ladder, in which the rungs of the ladder indicate energies of *vibrational states* of the vibration. The first rung of the ladder is called the *ground vibrational state*, and is assigned a quantum number $n=0$. The next rung up is the *first excited vibrational state*, with $n=1$, and so on. It turns out that the gaps between successive energy levels (rungs on the ladder) are more or less constant for any given vibrational mode. What's more, that gap is found to be a function of the frequency of vibration we were just talking about: it's $\hbar \omega$, where $\omega = 2 \pi c \overline \nu$, and $\hbar$ is called the *reduced Planck's constant*. Here, we'll be using a per-mole form of this gap by multiplying by Avogadro's number:

$$
E_{gap} = N_A \hbar \omega  = N_A \hbar 2 \pi c \overline \nu \ \ \ \ (2)
$$

That means $E_n = E_o + \Delta E_n$, where $n=0,1,2$ and so on. Because the energy gaps are all the same, we can write the energies above the ground state as 

$$
\Delta E_n = n E_{gap}  \ \ \ \ (3)
$$
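As a quick numerical check of Eqs. 2 and 3, the sketch below converts a wavenumber into a per-mole energy gap and lists the first few rungs of the ladder. It uses plain SI floats rather than unit-aware quantities, and the wavenumber of 1600 cm$^{-1}$ is just an assumed round number for illustration; the function names are also made up for this sketch.

```python
import math

# Physical constants (SI, except c in cm/s so that wavenumbers in 1/cm plug in directly)
H_PLANCK = 6.62607015e-34          # J s
HBAR = H_PLANCK / (2 * math.pi)    # J s
C_LIGHT = 2.99792458e10            # cm/s
N_AVOGADRO = 6.02214076e23         # 1/mol

def e_gap_kj_per_mol(nubar_per_cm):
    """Eq. 2: per-mole gap between adjacent vibrational levels, in kJ/mol."""
    omega = 2 * math.pi * C_LIGHT * nubar_per_cm   # angular frequency in rad/s
    return N_AVOGADRO * HBAR * omega / 1000.0      # J/mol -> kJ/mol

def delta_e_n(n, nubar_per_cm):
    """Eq. 3: energy of level n above the ground state, in kJ/mol."""
    return n * e_gap_kj_per_mol(nubar_per_cm)

nubar = 1600.0  # 1/cm, assumed for illustration
print(f"E_gap = {e_gap_kj_per_mol(nubar):.2f} kJ/mol")
for n in range(4):
    print(f"n = {n}: Delta E_n = {delta_e_n(n, nubar):.2f} kJ/mol")
```

For a vibration in this wavenumber range the gap comes out to roughly 19 kJ/mol.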

## A new state space
Because of the foregoing, it'll be convenient to define a state space consisting of one dimension ($x$) that says where the solute is in our thermophoresis experiment (Fig. 3), and another dimension ($n$) that says which vibrational quantum state the solute is in (Fig. 4). We'll call this our $(x,n)$ state space.


## The Partition Function
The overarching goal of this CGI is to predict whether the solute of interest -- such as the one shown in Figs. 1 and 2 -- is thermophobic or thermophilic. To do that, we're going to need the apparatus of Statistical Mechanics. And at the very foundation of Statistical Mechanics is a quantity called the *partition function*, which we'll represent with the symbol $Z$.

What's a partition function? It turns out that we can think of it as having two parts, $Z=Z' \times exp\bigl ({-E_o \over RT} \bigr )$, where $Z'$ (the *unscaled partition function*) has a convenient physical interpretation: it is the total number of quantum states available to a molecule at any given temperature. An example is shown in Fig. 5.

<p style='text-align: center;'>
<img src="http://webspace.pugetsound.edu/facultypages/nesh/Notebook/PartitionFunctionOfHO.png" height="500" width="500"/>
<strong>Figure 5</strong>. Unscaled ($Z'$) part of the partition function $Z$ for a vibrating molecule having an $E_{gap}=1 {kJ \over mol}$.
</p>

Let's see if we can make sense of this figure. We'll start out with the idea that a molecule of interest is in its ground state (bottom rung of Fig. 4). Now another molecule hits it. Will that collision be able to bump our molecule up to the next rung up? We can make the following educated guesses:

- The Equipartition Theorem tells us that, on average, intermolecular collisions have an energy of ${3 \over 2}RT$. At low temperatures, ${3 \over 2}RT \ll E_{gap}$, so intermolecular collisions would hardly ever have enough energy to bump molecules out of their ground vibrational state. So we expect $Z'\approx 1$ at low temperature. That's what we see on the left-hand side of Fig. 5: $Z'\approx 1$.
- As we raise the temperature, we'll eventually reach a point at which enough intermolecular collisions have enough energy to bump molecules out of the lowest-energy vibrational states, and into higher-energy states (remember, ${3 \over 2}RT$ is an *average*, so some collisions will have more energy than that). With more states available to our molecule, $Z'$ should start to rise. In Fig. 5, we see this start to happen at around $25 K$ (but in other cases, it's higher, because $E_{gap}$ is bigger).
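To put rough numbers on these two guesses, the sketch below compares the average collision energy ${3 \over 2}RT$ with the $E_{gap}=1 {kJ \over mol}$ of Fig. 5 at a few temperatures. The temperatures chosen and the function name are arbitrary illustrations.

```python
R = 8.314e-3   # gas constant in kJ/(mol K)
E_GAP = 1.0    # kJ/mol, the gap quoted in the caption of Fig. 5

def mean_collision_energy(T):
    """Average collision energy from the Equipartition Theorem, in kJ/mol."""
    return 1.5 * R * T

for T in (5.0, 25.0, 100.0, 300.0):
    energy = mean_collision_energy(T)
    print(f"T = {T:5.0f} K: (3/2)RT = {energy:.3f} kJ/mol, ratio to E_gap = {energy / E_GAP:.2f}")
```

At 25 K the average is still only about a third of the gap, so it's the energetic tail of the collision distribution that starts lifting $Z'$ above 1 there; by 300 K the average collision energy exceeds the gap several times over.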

It turns out that the quantitative formulation of $Z'$ is surprisingly simple. We start with

$$
Z'_n = e^{-\Delta E_n / RT} \ \ \ \ (4)
$$

Then $Z'$ is the sum of $Z'_n$ over all the quantum states (rungs in the ladder) shown in Fig. 4:

$$
Z' = \sum_{n} Z'_n \ \ \ \ (5)
$$
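Here is a minimal sketch of Eqs. 4 and 5 for a ladder with a fixed $E_{gap}=1 {kJ \over mol}$, as in Fig. 5. Since the terms form a geometric series, the sum can be compared against the closed form $1/(1-e^{-E_{gap}/RT})$, which is a standard result rather than something derived in this notebook. The function names and the cutoff `n_max=200` are assumptions made for the sketch.

```python
import numpy as np

R = 8.314e-3  # gas constant in kJ/(mol K)

def z_prime(T, e_gap, n_max=200):
    """Eqs. 4-5: sum of exp(-n*E_gap/(R*T)) over n = 0..n_max, for each temperature in T."""
    T = np.atleast_1d(np.asarray(T, dtype=float))
    n = np.arange(n_max + 1)
    # terms[i, j] = exp(-n_j * e_gap / (R * T_i)), i.e. Z'_n of Eq. 4
    terms = np.exp(-np.outer(1.0 / (R * T), n * e_gap))
    return terms.sum(axis=1)

def z_prime_closed_form(T, e_gap):
    """Geometric-series limit of the same sum (infinitely many rungs)."""
    T = np.atleast_1d(np.asarray(T, dtype=float))
    return 1.0 / (1.0 - np.exp(-e_gap / (R * T)))

temps = np.array([5.0, 25.0, 100.0, 300.0])
print("summed:     ", z_prime(temps, 1.0))
print("closed form:", z_prime_closed_form(temps, 1.0))
```

Both versions give $Z'$ essentially equal to 1 at the lowest temperature and rising as more rungs become reachable.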

With these ingredients for making a vibrational partition function in hand, our next step is to show how it sheds light on thermophoresis. We'll tackle how to do that next.

## Using the partition function to determine a solute's thermophoretic properties 
How does Statistical Mechanics help us determine whether our solute molecule is thermophobic or thermophilic? The route is through the relationship between $Z$ and a quantity from Thermodynamics, the *internal energy of solvation*, which we'll call $\Delta U_{sol}$. First we construct the actual partition function from the unscaled $Z'$ we mentioned above,

$$
Z = Z' \times exp\bigl ({-E_o \over RT} \bigr ) \ \ \ \ (6)
$$

where (according to Fig. 4) $E_o={E_{gap} \over 2}$. From there we can compute 

$$
\Delta U_{sol} = \bigl (RT^2 {d \over dx} lnZ \bigr ) \times {dx \over dT} \ \ \ \ (7)
$$

The idea is, if $\Delta U_{sol}$ slopes down toward the cold temperature side (to the left in Fig. 3), then solute molecules trying to find their lowest energy will tend to move in that direction. That's *thermophobic* behavior! Alternatively, if $\Delta U_{sol}$ slopes down toward the warm side, solute molecules will tend to move in *that* direction, so *thermophilic*!

The equations presented so far provide all the formulas we need to calculate $\Delta U_{sol}(x)$, but there are two derivatives in Eq. 7 that we do have to evaluate, and those are worth talking about a bit:

- ${dx \over dT}$ obviously depends on particulars of the thermal gradient, Eq. 1. An easy way to find ${dx \over dT}$ is to take the derivative ${dT \over dx}$, and invert it.
- ${d \over dx} \ln Z$ can be obtained analytically, but it's easier to do so numerically. That can be accomplished with a Numpy function called np.gradient (an example is given in the code below).
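As a sanity check on Eq. 7, the sketch below runs the whole chain (Eqs. 1, 4, 5, 6, 7) for the simplest possible case: an $E_{gap}$ that does *not* vary with temperature. In that case the result should match the textbook internal energy of a harmonic oscillator, $E_o + E_{gap}/(e^{E_{gap}/RT}-1)$, which is brought in here only as a reference and isn't derived in this notebook. All numerical values are assumptions chosen for illustration.

```python
import numpy as np

R = 8.314e-3                      # gas constant in kJ/(mol K)
E_GAP = 19.0                      # kJ/mol, assumed constant along x for this check
E_0 = E_GAP / 2                   # ground-state energy, as in Eq. 6
L_MM, T_COLD, T_HOT = 1.0, 273.0, 350.0

x = np.linspace(0.0, L_MM, 201)
T = T_COLD + (T_HOT - T_COLD) * x / L_MM            # Eq. 1

# ln Z from Eqs. 4-6, summing over levels n = 0..10
n = np.arange(11)
z_prime = np.exp(-np.outer(1.0 / (R * T), n * E_GAP)).sum(axis=1)
ln_z = np.log(z_prime) - E_0 / (R * T)

# The two derivatives that appear in Eq. 7
dlnz_dx = np.gradient(ln_z, x, edge_order=2)        # numerical d(ln Z)/dx
dx_dT = L_MM / (T_HOT - T_COLD)                     # inverse of dT/dx from Eq. 1
u_numeric = R * T**2 * dlnz_dx * dx_dT              # Eq. 7

# Closed-form reference for a harmonic oscillator with a fixed gap
u_exact = E_0 + E_GAP / np.expm1(E_GAP / (R * T))

print("largest |numeric - exact| in kJ/mol:", np.max(np.abs(u_numeric - u_exact)))
```

With a fixed gap, the energy rises very gently from the cold side to the hot side, so any more interesting shape in $\Delta U_{sol}(x)$ has to come from the temperature dependence of $\overline \nu$.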


## Idea of this CGI
The idea of this CGI is to use Statistical Mechanics to predict the thermophoretic properties of a solute. There are Spartan and Python components:

1. In Spartan, you'll construct a solute molecule containing an amine group, and find the frequency of its scissors vibration assuming water as a solvent. You'll do that first with an "implicit" solvent as shown in Fig. 1, representing high-temperature vibrations; that'll get you $\overline \nu _{HT}$. Then you'll set up another Spartan run,  with explicit representation of the solvent as shown in Fig. 2, representing low-temperature vibrations; that'll get you $\overline \nu _{LT}$. 
1. In Python, you'll construct an $(x,n)$ state space, and compute vibrational wavenumbers on that state space in a way that transitions smoothly between $\overline \nu _{LT}$ and $\overline \nu _{HT}$. Then you'll find the partition function of your solute over a range of temperatures corresponding to the hot and cold temperatures shown in Fig. 3. From the partition function you'll find $\Delta U_{sol}$ across that same span of temperatures, and therefore be able to make a prediction about thermophoresis.


## Learning goals
1. I can explain what thermophoresis is.
1. I can compute the energies, partition functions, and internal energy of vibrational motion. 
1. I can use the shape of $\Delta U_{sol}(T)$ to predict whether a solute will exhibit thermophobic or thermophilic thermophoresis.

In [1]:
import pint; from pint import UnitRegistry; AssignQuantity = UnitRegistry().Quantity
import numpy as np
import sys; sys.path.append('/home'); import PchemLibrary as PL
import matplotlib.pyplot as plt

In [2]:
%matplotlib notebook

In [3]:
# Constants
hbar = AssignQuantity(1,'atomic_unit_of_time * hartree')
R = AssignQuantity(8.314e-3,'kjoule/mol/K')
NA = AssignQuantity(6.02e23,'1/mol')
c = AssignQuantity(3.0e8,'m/s')

### Specifying our system's state space
In the cell below, we create an $(x,n)$ state space consisting of 200 distances ($x$) spanning 0 to 1 mm, and 11 dimensionless vibrational quantum numbers ($n$) spanning 0 to 10.

In [4]:
# This defines the state space
L = AssignQuantity(1,'mm')
nmax = 10
xgrid,ngrid = PL.Statespace([0,L,200],[0,nmax,nmax+1])
xgrid = AssignQuantity(xgrid,L.units)
ngrid = AssignQuantity(ngrid,'dimensionless')
print(xgrid.units)
print(ngrid.units)

# This will be handy later on
xarray = xgrid[:,0]

millimeter
dimensionless


### Visualizing $T(x)$
In the cell below, use Eq. 1 to calculate Tgrid as a function of xgrid. Then use PL.plot_surface to visualize it in our $(x,n)$ state space.

In [5]:
# The temperature range
T_cold = AssignQuantity(273,'K')
T_hot = AssignQuantity(350,'K')

# Calculate Tgrid as a function of xgrid
### BEGIN SOLUTION
Tgrid = T_cold + (T_hot-T_cold)*xgrid/L
ax = PL.plot_surface(xgrid,ngrid,Tgrid)
### END SOLUTION

# Annotating
ax.set_xlabel('x ('+str(xgrid.units)+')')
ax.set_ylabel('n')
ax.set_zlabel('Temperature '+str(Tgrid.units))

# This will be handy later on
Tarray = Tgrid[:,0]


### Visualizing $\overline \nu(x,n)$
In the cell below, assign values for the low- and high-temperature wavenumbers you got from Spartan.

Below that (done for you) we use a specially-prepared sigmoid function to construct $\overline \nu (x,n)$ in our $(x,n)$ state space, and PL.plot_surface to visualize it. 
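The internals of PL.f_sigmoid aren't shown in this notebook, so the sketch below is *not* that function. It's just one plausible way to build a smooth switch between two wavenumbers, using a logistic curve. The function name `logistic_wavenumber` and the parameters `T_mid` (where the switch is centered) and `T_width` (how gradual it is) are hypothetical, as are the numerical values.

```python
import numpy as np

def logistic_wavenumber(T, nubar_lt, nubar_ht, T_mid, T_width):
    """Hypothetical smooth switch: nubar_lt for T well below T_mid, nubar_ht for T well above."""
    weight_ht = 1.0 / (1.0 + np.exp(-(T - T_mid) / T_width))   # goes from 0 to 1 across T_mid
    return nubar_lt + (nubar_ht - nubar_lt) * weight_ht

# Illustrative values only (all in 1/cm and K)
T = np.linspace(273.0, 350.0, 8)
nubar = logistic_wavenumber(T, nubar_lt=1615.0, nubar_ht=1581.0, T_mid=311.5, T_width=5.0)

for Ti, nu in zip(T, nubar):
    print(f"T = {Ti:6.1f} K  ->  nubar = {nu:8.2f} 1/cm")
```

A smaller `T_width` makes the transition sharper, and moving `T_mid` shifts where along the temperature gradient it happens.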

In [6]:
# Specify nubar_LT and nubar_HT
### BEGIN SOLUTION
# nubar_LT = AssignQuantity(1610,'1/cm')  # superseded by the values below
# nubar_HT = AssignQuantity(1588,'1/cm')  # superseded by the values below
nubar_LT = AssignQuantity(1615,'1/cm')
nubar_HT = AssignQuantity(1581,'1/cm')

# These are other ways to shape nubar(x,T)
# nubar = PL.f_sigmoid(nubar_LT, nubar_HT, Tgrid, AssignQuantity,T_interval_magnitude=2)
# nubar = PL.f_sigmoid(nubar_LT, nubar_HT, Tgrid, AssignQuantity,T_interval_magnitude=10)
# nubar = PL.f_sigmoid(nubar_LT, nubar_HT, Tgrid, AssignQuantity,T_interval_magnitude=1000)
# nubar = PL.f_sigmoid(nubar_LT, nubar_HT, Tgrid, AssignQuantity,T_transition_magnitude=330)

### END SOLUTION

# Use our sigmoid function to generate a surface of nubar values
nubar = PL.f_sigmoid(nubar_LT, nubar_HT, Tgrid, AssignQuantity)


# Plot nubar(x,n)
ax = PL.plot_surface(xgrid,ngrid,nubar)
ax.set_xlabel("x ("+str(xgrid.units)+")")
ax.set_ylabel("n")
ax.set_zlabel('nubar '+str(nubar.units))


Text(0.5, 0, 'nubar 1 / centimeter')

### Visualizing $E_{gap}(x,n)$
In the cell below, use Eq. 2 to calculate $E_{gap}$ in our $(x,n)$ state space, and PL.plot_surface to visualize it.

In [7]:
### BEGIN SOLUTION
Egap = NA*hbar*2*np.pi*c*nubar
Egap.ito('kJ/mol')
ax = PL.plot_surface(xgrid,ngrid,Egap)
ax.set_xlabel("x ("+str(xgrid.units)+")")
ax.set_ylabel("n")
ax.set_zlabel("Egap "+str(Egap.units))
### END SOLUTION


Text(0.5, 0, 'Egap kilojoule / mole')

### Pause for analysis
Compare the gap shown in this figure with the energy available in a typical thermal collision at room temperature. Is it bigger? Is it smaller? What does that imply about what you'd expect the value of the partition function to be?

### BEGIN SOLUTION

$19 {kJ \over mol}$ is much bigger than ${3\over2} RT$ at room temperature, so we expect $Z'\approx1$.

### END SOLUTION

### Visualizing $\Delta E_n(x,n)$
In the cell below, your task is to use Eq. 3 to construct $\Delta E_n$ in our $(x,n)$ state space. Then use PL.plot_surface to visualize it.

In [8]:
### BEGIN SOLUTION

DeltaEn = ngrid*Egap

DeltaEn.ito('kJ/mole')
ax = PL.plot_surface(xgrid,ngrid,DeltaEn)
ax.set_xlabel("x ("+str(xgrid.units)+")")
ax.set_ylabel("n")
ax.set_zlabel("DeltaEn "+str(DeltaEn.units))
### END SOLUTION


Text(0.5, 0, 'DeltaEn kilojoule / mole')

### Visualizing $Z'_n$ 
In the cell below, your task is to use Eq. 4 to construct $Z'_n$ (call it "Zprime_n") in our $(x,n)$ state space. Then use PL.plot_surface to visualize it.

In [9]:
### BEGIN SOLUTION

Zprime_n = np.exp(-DeltaEn/R/Tgrid)

ax = PL.plot_surface(xgrid,ngrid,Zprime_n)
ax.set_xlabel("x ("+str(xgrid.units)+")")
ax.set_ylabel("n")
ax.set_zlabel("Z'_n "+str(Zprime_n.units))

# Not part of the solution, but what if we eliminate the dependence on n>0?
# for i in range(0,nmax+1):
#     print(ngrid[0,i],Zprime_n[1,i])
# Zprime_n[:,1:nmax+1]=0
# for i in range(0,nmax+1):
#     print(ngrid[0,i],Zprime_n[1,i])
# ax = PL.plot_surface(xgrid,ngrid,Zprime_n)
# ax.set_xlabel("x ("+str(xgrid.units)+")")
# ax.set_ylabel("n")
# ax.set_zlabel("Z'_n "+str(Zprime_n.units))

### END SOLUTION


Text(0.5, 0, "Z'_n dimensionless")

### Visualizing $Z'(x)$
In the cell below, your task is to use Eq. 5 to construct $Z'(x)$ by carrying out the summation shown in that equation. How do you do that? For a 2-dimensional array like $Z'_n$, summation over the first index is accomplished by

    Zprime_array = np.sum(Zprime_n,axis=0)

Summation over the second index is accomplished by 

    Zprime_array = np.sum(Zprime_n,axis=1)
    
Since we want to sum over $n$-values, the second of these is the way to go. 

Because Zprime_array is just a function of $x$ (not $n$), we plot it as a function of xarray.
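If the axis convention is hard to keep straight, a tiny example helps. The array below is a made-up stand-in for Zprime_n with 3 positions (first index, $x$) and 4 quantum numbers (second index, $n$); the numbers themselves mean nothing.

```python
import numpy as np

# Toy stand-in for Zprime_n: rows are x positions (index 0), columns are n values (index 1)
toy = np.array([[1.0, 0.1, 0.01, 0.001],
                [1.0, 0.2, 0.04, 0.008],
                [1.0, 0.3, 0.09, 0.027]])

sum_over_x = np.sum(toy, axis=0)   # collapses the rows: one value per n, shape (4,)
sum_over_n = np.sum(toy, axis=1)   # collapses the columns: one value per x, shape (3,)

print("axis=0 result has shape", sum_over_x.shape)
print("axis=1 result has shape", sum_over_n.shape)
print("sum over n at each x:", sum_over_n)
```

The `axis=1` result has one entry per position, which is the shape we need in order to plot against xarray.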

In [10]:
### BEGIN SOLUTION
Zprime_array = np.sum(Zprime_n,axis=1)
### END SOLUTION

# Visualizing Zprime_array
plt.figure()
plt.plot(xarray,Zprime_array)
plt.xlabel("x (" +str(xarray.units)+")")
plt.ylabel("Z' ("+str(Zprime_array.units)+")")
plt.grid(True)
plt.ylim(1-.0001,1.0015)


(0.9999, 1.0015)

### Pause for analysis
You'll probably notice that $Z'(x)$ hardly budges from 1. What's the physical reason behind that? *Hint: it has to do with ${3 \over 2}RT$*.

### BEGIN SOLUTION
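Because the energy gap (about $19 {kJ \over mol}$) is a lot bigger than ${3 \over 2}RT$ (about $3.4$ to $4.4 {kJ \over mol}$ over this temperature range), collisions almost never bump molecules out of the ground vibrational state, so only the $n=0$ term contributes appreciably to $Z'$.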
### END SOLUTION

### Visualizing $\Delta U_{sol}(x)$
The cell below has you take the last few steps to getting to our goal. The first step, $E_o=E_{gap}/2$, is done for you. After that:

- Calculate $Z(x)$ using Eq. 6
- Calculate $\Delta U_{sol}(x)$ using Eq. 7

For the calculation of $\Delta U_{sol}(x)$, you'll need to take some derivatives. As mentioned above, $dx/dT$ can be obtained analytically, but for ${d \over dx} \ln Z$, it's convenient to use a numerical method. Something like

    dlogZarray_dx = np.gradient(logZ_array,dx,edge_order=2)

will do the trick.
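Why bother with `edge_order=2`? The small experiment below, on a test function whose derivative we know exactly, shows what that argument changes. The test function $f(x)=x^2$ and the grid are arbitrary choices for illustration.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 11)     # uniform grid with spacing 0.1
f = x**2                          # test function; its exact derivative is 2*x
dx = x[1] - x[0]

d1 = np.gradient(f, dx, edge_order=1)   # first-order one-sided differences at the two ends
d2 = np.gradient(f, dx, edge_order=2)   # second-order one-sided differences at the two ends

err1 = np.abs(d1 - 2 * x)
err2 = np.abs(d2 - 2 * x)
print("edge_order=1 errors at the ends:", err1[0], err1[-1])
print("edge_order=2 errors at the ends:", err2[0], err2[-1])
print("largest interior error (either setting):", err1[1:-1].max())
```

In the interior both settings use the same central differences, so the only difference is at the two endpoints, which here correspond to the cold and hot walls.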

In [11]:
# Getting the ground-state vibrational energies across x
E0 = Egap[:,0]/2

# Getting Z(x) and U(x)
### BEGIN SOLUTION
Z_array = Zprime_array * np.exp(-E0/(R*Tarray))

# logZarray = np.log(Zarray)
logZ_array = np.log(Z_array)

dx = xarray[1]-xarray[0]
dlogZarray_dx = np.gradient(logZ_array,dx,edge_order=2)
dx_dT = L/(T_hot-T_cold)
Uarray = R*Tarray**2*dlogZarray_dx*dx_dT
plt.figure()
plt.plot(xarray,Uarray)
plt.grid(True)
plt.xlabel("x ("+str(xarray.units)+")")
plt.ylabel("Delta U_sol ("+str(Uarray.units)+")")
plt.title("nu_LT = "+str(nubar_LT)+' and nu_HT = '+str(nubar_HT))

# Not part of the solution, but curious
Uarray2 = E0 - Tarray*np.gradient(E0,dx,edge_order=2)*dx_dT
plt.figure()
plt.plot(xarray,Uarray2)
plt.grid(True)
plt.xlabel("x ("+str(xarray.units)+")")
plt.ylabel("Delta U_sol (Z\'=1) ("+str(Uarray.units)+")")
plt.title("nu_LT = "+str(nubar_LT)+' and nu_HT = '+str(nubar_HT))

plt.figure()
plt.plot(xarray,Uarray2-Uarray)
plt.grid(True)
plt.xlabel("x ("+str(xarray.units)+")")
plt.ylabel("Error ("+str(Uarray.units)+")")
plt.title("nu_LT = "+str(nubar_LT)+' and nu_HT = '+str(nubar_HT))

### END SOLUTION


Text(0.5, 1.0, 'nu_LT = 1615 / centimeter and nu_HT = 1581 / centimeter')

### Pause for analysis
1. After studying the picture of $\Delta U_{sol}$ displayed above, what's your conclusion? Is your solute thermophilic or thermophobic? Or a little bit of both? How do you come to that conclusion?
1. Do a little experimentation: what if $\overline \nu$ had no temperature dependence (i.e., $\overline \nu_{LT}=\overline \nu_{HT}$)? Would the solute be thermophilic or thermophobic?

### BEGIN SOLUTION

1. The answer depends on which is greater, $\overline \nu_{LT}$ or $\overline \nu_{HT}$:
- If $\overline \nu_{LT} > \overline \nu_{HT}$, the solute is thermophobic to the left of the bump in the middle, thermophilic to the right. Solute will vacate the middle.
- If $\overline \nu_{LT} < \overline \nu_{HT}$, the solute is thermophilic to the left of the dip in the middle, thermophobic to the right. Solute will collect in the middle.

2. When $\overline \nu_{LT}$ is set to equal $\overline \nu_{HT}$, strictly thermophobic behavior results.

### END SOLUTION

### Refresh/save/validate/close/submit/logout