### Computational Guided Inquiry for PChem (Neshyba, 2024)

# Moments of probability densities

## Introduction

An important quantitative use of probability densities is to calculate averages -- like a molecule's average speed, for example, or its average kinetic energy. To get to that idea, we're first going to explore a necessary precondition, namely, that probability densities have to be *normalized*.

## The idea of normalization
Normalization means the area under a probability density isotherm equals one. For the Boltzmann probability density we'd express this property as

$$ 
\int\limits_{-\infty}^{\infty} f_B(T,v_x) \ dv_x = 1 \ \ \ \ (1) 
$$

while for the Maxwell probability density we'd write

$$ 
\int\limits_0^{\infty} f_M(T,v) \ dv = 1 \ \ \ \ (2) 
$$
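As a quick illustration of Eq. (1), the sketch below builds a one-dimensional Boltzmann density on a grid and integrates it numerically. It assumes the standard Gaussian form of $f_B$ and uses illustrative values (N2 at 300 K, a grid from -3000 to 3000 m/s) that are not taken from the course data files.

```python
import numpy as np

# Trapezoidal integrator: np.trapezoid in NumPy >= 2.0, np.trapz in older versions
trapz = getattr(np, "trapezoid", None) or np.trapz

# Assumed illustrative values (not from the course data files): N2 at 300 K
kB = 1.380649e-23            # Boltzmann constant, J/K
m = 0.028 / 6.02214076e23    # mass of one N2 molecule, kg
T = 300.0                    # temperature, K

# One-dimensional Boltzmann (Gaussian) density in the velocity component vx
vx = np.linspace(-3000.0, 3000.0, 2001)   # m/s; wide enough that the tails are negligible
fB = np.sqrt(m / (2 * np.pi * kB * T)) * np.exp(-m * vx**2 / (2 * kB * T))

# Area under the isotherm; a normalized density should give a value very close to 1
area = trapz(fB, vx)
print(area)
```

If the grid were too narrow (say, -300 to 300 m/s), the printed area would fall noticeably short of one, because part of the density would be left out of the integral.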

## Getting moments from probability densities
*Moments* in thermodynamics are denoted using the notation $\langle ...\rangle$. For example, the first moment of the speed is given by

$$ 
\langle v \rangle = \int\limits_0^{\infty} f_M(T,v) \ v \ dv  \ \ \ \ (3) 
$$

Generalizing this idea, we could write

$$ 
\langle v^n \rangle = \int\limits_0^{\infty} f_M(T,v) \ v^n \ dv \ \ \ \ (4) 
$$

where obviously $n=1$ gives the first moment of the speed.

These moments can be evaluated analytically, which means a closed-form expression is available; integral tables exist for that purpose. You can also evaluate them numerically, which provides a check on your skill at using an integral table. The trapezoidal rule, which approximates the area under a curve as a sum of trapezoids between neighboring grid points, works well for this (see, e.g., https://en.wikipedia.org/wiki/Trapezoidal_rule). In this CGI, we're going to focus on numerical integration; we'll take an analytical approach later.
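To make the trapezoidal rule concrete, here is a minimal hand-written sketch of it, applied to a test integral whose answer is known exactly ($\int_0^3 x^2 \ dx = 9$). The function name `trapezoid_rule` is made up for this illustration; in the exercises you would normally call numpy's built-in version instead.

```python
import numpy as np

def trapezoid_rule(y, x):
    """Sum the areas of the trapezoids between neighboring grid points."""
    widths = np.diff(x)                    # x[i+1] - x[i]
    mean_heights = 0.5 * (y[1:] + y[:-1])  # average of the two edge heights
    return np.sum(widths * mean_heights)

# Test integral with a known answer: the integral of x**2 from 0 to 3 is exactly 9
for npoints in (4, 31, 301):
    x = np.linspace(0.0, 3.0, npoints)
    print(npoints, trapezoid_rule(x**2, x))
```

The printed values approach 9 as the grid gets finer. For this integrand the error is proportional to the square of the grid spacing, so a tenfold finer grid gives roughly a hundredfold smaller error.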

## Getting averages from moments
These moments have different dimensions, and therefore different units: the units of $\langle v \rangle$ in SI would be $m/s$, whereas $\langle v^2 \rangle$ would be $m^2/s^2$. That makes it difficult to compare them to one another. To get around that, we can raise the moments to  appropriate exponents (like 1, 1/2, 1/3, etc.). When we do that, we also assign special names to the results:

- The first moment of the speed raised to the power "1" is just the *average speed*. We symbolize it as $\bar c$,

$$
\bar c = \langle v \rangle \ \ \ \ (5)
$$ 

- The second moment of the speed raised to the power "1/2" is the *root mean square* speed. We symbolize it as $c$,

$$
c = \langle v^2 \rangle ^\frac{1}{2} \ \ \ \ (6)
$$ 

- The third moment of the speed raised to the power "1/3" is the *cubed-root-mean-cubed speed*. We symbolize it as $\tilde c$,

$$
\tilde c = \langle v^3 \rangle ^\frac{1}{3} \ \ \ \ (7)
$$

## Analytical expressions for moments of the Maxwell density
It's possible to derive *analytical* expressions for the preceding moments. For example,

$$
\bar c = \bigl( \frac{8RT}{\pi M} \bigr)^{1/2} \ \ \ \ (8)
$$ 

and

$$ 
c = \bigl( \frac{3RT}{M} \bigr)^{1/2} \ \ \ \ (9)
$$ 

You'll use these expressions in the last part of this exercise to compare to your numerical results.
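As a rough sketch of how Eqs. (8) and (9) are evaluated, the snippet below uses plain floating-point numbers for argon at 300 K. These values are assumptions chosen for illustration, not the ones used later in this exercise. Note that the molar mass has to be in kg/mol for the result to come out in m/s.

```python
import math

# Assumed illustrative values (not the ones used later in this exercise): argon at 300 K
R = 8.314      # gas constant, J/(mol K)
T = 300.0      # temperature, K
M = 0.03995    # molar mass of Ar, kg/mol

cbar = math.sqrt(8 * R * T / (math.pi * M))   # Eq. (8), average speed
c = math.sqrt(3 * R * T / M)                  # Eq. (9), root mean square speed

print(f"c-bar = {cbar:.1f} m/s, c = {c:.1f} m/s, c/c-bar = {c / cbar:.4f}")
```

Dividing Eq. (9) by Eq. (8) shows that the ratio $c/\bar c$ equals $(3\pi/8)^{1/2}$, independent of temperature and molar mass, which is a handy consistency check on any pair of results.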

## The kinetic energy connection
Averages like those described here are useful in various ways. One that we'll explore here is the connection to kinetic energy of molecules -- which, as you might imagine, is based on the idea that the kinetic energy of something (like a molecule) moving in the $x$-direction equals ${1 \over 2} m v_x^2$. More math to follow.


## The idea of metadata
*Metadata* is information about a dataset, as opposed to the data itself. For example, you might be supplied with a grid of speeds, but in what units? The datasets we used in a previous CGI (fB.txt, fM.txt, etc.) have simple metadata attached. Here, we extract that metadata from those files by means of some Linux (operating system) commands.
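For readers who would like to see the same idea without operating system commands, here is a small pure-Python sketch. The file name `demo_speeds.txt` and its header lines are made up for this illustration; the real course files may format their metadata differently.

```python
import os
import tempfile
import numpy as np

# Write a small demo file; the header text here is made up for illustration
path = os.path.join(tempfile.mkdtemp(), "demo_speeds.txt")
with open(path, "w") as f:
    f.write("# quantity: speed\n# units: m/s\n0.0\n100.0\n200.0\n")

# Pure-Python counterpart of: cat demo_speeds.txt | grep "#"
with open(path) as f:
    header = [line.rstrip("\n") for line in f if "#" in line]
print(header)

# np.loadtxt treats '#' as a comment marker by default, so only the numbers load
data = np.loadtxt(path)
print(data)
```

This is why metadata stored in `#` lines does not interfere with loading: the numbers and the notes about them can live in the same file.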

## Learning Goals
1. I can explain what it means to say that a probability density should be *normalized*, and how to test whether a given function really is normalized.
1. I can write integral formulas for moments of the speed (and velocity components).
1. I can describe how the trapezoidal rule works.
1. I can numerically evaluate integrals describing moments (using np.trapz), and can describe what it means to verify whether such integrations are converged.
1. I am familiar with deriving and evaluating analytical expressions for $\bar c$, $c$, $\tilde c$, and $\langle \epsilon \rangle$.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from pint import UnitRegistry; AssignQuantity = UnitRegistry().Quantity
import warnings; warnings.filterwarnings("ignore", "The unit of the quantity is stripped when downcasting to ndarray")
import sys; sys.path.append('/home'); import PchemLibrary as PL

In [None]:
%matplotlib notebook

### Checking out the metadata
The code below executes a Linux command -- i.e., an operating system command -- that reads metadata from two of the files we created in a previous exercise. You're prompted to duplicate it for other gridded variables from that exercise.

In [None]:
# Extract metadata for the velocity component
%cat 'vx.txt' | grep "#"

# Do the same for TB and fB
# Your code here 


# Extract metadata for the speed
%cat 'v.txt' | grep "#"

# Do the same for TM and fM
# Your code here 


### Loading the state space and probability density
Now that we know what the units are, we'll load all six of the data files, and attach units using AssignQuantity. It'll smooth things along if you name these variables "TB", "fB", "TM", and "fM".

In [None]:
# Load the velocity component file and attach units
vx = np.loadtxt('vx.txt'); vx = AssignQuantity(vx,'m/s'); print(np.shape(vx))

# Do the same for TB and fB
# Your code here 



# Load the speed file & attach units
v = np.loadtxt('v.txt'); v = AssignQuantity(v,'m/s'); print(np.shape(v))

# Do the same for TM and fM
# Your code here 


### Graphing
In the cell below, we make a surface plot (using PL.plot_surface1) of $f_B(T_B,v_x)$. This is mainly to make sure that what we loaded really is what we think it is.

In [None]:
# Graphing fB(T,vx)
llist = ["TB","vx","fB"]
PL.plot_surface1(TB, vx, fB, color='blue',title='fB(T,vx)',labellist=llist).show()

### Your turn
Do something similar, but this time a surface plot of $f_M(T_M,v)$

In [None]:
# Graphing fM(T,v)
# Your code here 


### Visualizing isotherms
The cell below takes slices of the first and last temperatures of $f_B(T_B,v_x)$ and graphs them on the same plot. The first slice is at a lower temperature, so it's colored blue; the last is colored red because it's hot.

In [None]:
# Slicing
vxfirst = vx[0,:]
fBfirst = fB[0,:]
vxlast = vx[-1,:]
fBlast = fB[-1,:]

# Specifying labels 
xlabel = 'velocity component ' + str(vxfirst.units)
ylabel = 'Boltzmann probability density ' + str(fBfirst.units)

# Plot first and last using the label/legend method
plt.figure()
plt.plot(vxfirst,fBfirst,'blue',label='Low T')
plt.plot(vxlast,fBlast,'red',label='High T')
plt.legend()
plt.grid(True)
plt.xlabel(xlabel)
plt.ylabel(ylabel)

### Your turn
Now slice and plot the first and last isotherms of $f_M(T_M,v)$. Use the "label/legend" method to annotate.

In [None]:
# Slice out the first and last temperatures of the Maxwell function
# Your code here 


# Graph them together (using the label/legend method)
# Your code here 


### Verifying normalization
Before we look at the moments of the velocity components, let's have a look at whether our Boltzmann functions are actually *normalized*. For this purpose, we'll use numpy's *trapz* function, which implements the trapezoidal rule for integration. Because this is a numerical method, we don't expect the area to be exactly one -- but we'll be happy if it's pretty close.

In [None]:
# Testing for normalization of fB, low-temperature
integrand = fBfirst
Area_under_fBfirst = np.trapz(integrand,vxfirst)
print(Area_under_fBfirst)

# Testing for normalization of fB, high-temperature
integrand = fBlast
Area_under_fBlast = np.trapz(integrand,vxlast)
print(Area_under_fBlast)

### Your turn
Below, test for normalization of our two Maxwell curves (low-temperature and high-temperature).

In [None]:
# Test for normalization of fM, low-temperature
# Your code here 


# Test for normalization of fM, high-temperature
# Your code here 


### Moments 
Now we'll take a look at the *first moment* of the Maxwell density function. As you'll be able to see from the cell below, we can do this numerically using the same trapezoidal rule that we used to test for normalization of $f_M$: if you compare Eqs. (2) and (3), you'll see that the only difference is that the integrand is $f_M \times v$ (rather than $f_M$).

You'll also see in the cell below that we're graphing this integrand ($f_M \times v$) as a function of $v$. The purpose of graphing the integrand in this way is to examine whether the integration is converged -- so that we know we're not missing anything.
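To see what "converged" means in practice, here is a self-contained sketch, separate from the notebook cell that follows. It assumes the standard analytical form of the Maxwell density and illustrative values (N2 at 300 K), and integrates $f_M \times v$ out to three different upper limits. The reference value comes from Eq. (8), written per molecule using $k_B/m = R/M$.

```python
import numpy as np

# Trapezoidal integrator: np.trapezoid in NumPy >= 2.0, np.trapz in older versions
trapz = getattr(np, "trapezoid", None) or np.trapz

# Assumed illustrative values (not from the course data files): N2 at 300 K
kB = 1.380649e-23            # Boltzmann constant, J/K
m = 0.028 / 6.02214076e23    # mass of one N2 molecule, kg
T = 300.0                    # temperature, K

def maxwell(v):
    """Maxwell speed density (standard analytical form), in s/m."""
    return 4 * np.pi * (m / (2 * np.pi * kB * T))**1.5 * v**2 * np.exp(-m * v**2 / (2 * kB * T))

# Eq. (8) in per-molecule form
cbar_exact = np.sqrt(8 * kB * T / (np.pi * m))

ratios = []
for vmax in (500.0, 1000.0, 3000.0):
    v = np.linspace(0.0, vmax, 2001)
    moment1 = trapz(maxwell(v) * v, v)   # first moment, truncated at vmax
    ratios.append(moment1 / cbar_exact)
    print(vmax, moment1, ratios[-1])
```

When the grid stops too early, the integrand has not yet decayed to zero at the right-hand edge and the numerical moment falls short of the analytical one. That is exactly what a plot of the integrand lets you spot by eye.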

In [None]:
# Labels and a title
xlabel = 'speed (m/s)'
ylabel = 'fM x v'
title = 'Integrands for calculating the first moment of speed'

# Computing the integrand of the low-temperature speed moment, and integrating
integrandfirst = fMfirst*vfirst
moment1 = np.trapz(integrandfirst,vfirst)
cbar = moment1
print('c-bar low T ', cbar)

# Same in the high-temperature limit
integrandlast = vlast*fMlast
moment1 = np.trapz(integrandlast,vlast)
cbar = moment1
print('c-bar high T ', cbar)

# Graphing the integrands
plt.figure()
plt.plot(vfirst,integrandfirst,'blue',label='Low T')
plt.plot(vlast,integrandlast,'red',label='High T')
plt.xlabel(xlabel)
plt.ylabel(ylabel)
plt.title(title)
plt.grid(True)
plt.legend()

### Your turn: the second moment of the speed, and the rms speed ($c$)
Now do something analogous for the *second* moment of the speed, at the first and last temperatures. Start with computing and plotting the relevant integrand. You should also print $c$, i.e., the *square root* of $\langle v^2 \rangle$, for each temperature (see Eq. 6).

In [None]:
# Labels and a title
xlabel = 'speed (m/s)'
ylabel = 'fM x v^2'
title = 'Integrands for calculating the second moment of speed'

# Compute the second moment of the speed, and the square root of it, at low temperature
# Your code here 


# Same but at high temperature
# Your code here 


# Graphing the integrands
# Your code here 


### Your turn (again)
Now find values of the *third* moment, for the first and last temperatures. Also, print $\tilde c$, i.e., the corresponding *cube root* of $\langle v^3 \rangle$ (see Eq. 7), at both temperatures.

In [None]:
# Labels and a title
xlabel = 'speed (m/s)'
ylabel = 'v^3 x fM'
title = 'Integrands for calculating the third moment of speed'

# Compute the third moment of the speed, and the cube root of it, at low temperature
# Your code here 


# Same but at high temperature
# Your code here 


# Graphing the integrands
# Your code here 


### Comparison to analytical results
In the cell below, calculate values of $\bar c$, $c$, and $\tilde c$ from analytical expressions (with a minimum of parentheses), and print the results. For the first two of these, see the introduction. For $\tilde c$, use your own analytical result.

In [None]:
# Constants for T=500 K, M=28 g/mol
T = AssignQuantity(500,'K')          # Assuming the high-temperature results
M = AssignQuantity(0.028,'kg/mol')   # Assuming N2 gas
R = AssignQuantity(8.314,'J/mol/K')  # SI gas constant

# cbar
# Your code here 


# c
# Your code here 


# ctilde
# Your code here 


### Pause for analysis
One check on your code and theory here is whether you got values close to your numerical results. Another check is to verify that $\bar c < c < \tilde c$. If you find any big discrepancies along these lines, you might want to go back and see what might have gone wrong.

### The kinetic energy connection
Here we'll develop the connection that you have probably already intuited, that an understanding of moments gives us insight into the kinetic energy of molecules. We already noted that the kinetic energy of a molecule whose mass is $m$, moving in the $x$-direction with velocity $v_x$, equals ${1 \over 2} m v_x^2$. Assuming similar expressions apply in the $y$ and $z$ directions, this means we can write the molecule's *average* kinetic energy as

$$
\langle \epsilon \rangle _{one \ molecule} =  \langle {1 \over 2} m v_x^2 \rangle +  \langle {1 \over 2}m v_y^2 \rangle +  \langle {1 \over 2}m v_z^2 \rangle \ \ \ \ (10)
$$ 

where, by definition, $\langle {1 \over 2} m v_x^2 \rangle =  \int\limits_{-\infty}^{\infty} f_B(T,v_x) \ {1 \over 2} m v_x^2 \ dv_x$ (and similarly for $v_y$ and $v_z$). 

There are two simplifications here that will help us out a lot. First, since it doesn't matter whether we multiply by ${1 \over 2} m$ before or after we solve these integrals, we can write each as ${1 \over 2} m \langle  v_x^2 \rangle$ (etc.). And second, if we suppose that space is isotropic, the averages in $y$ and $z$ should be the same as in $x$. That all adds up to

$$
\langle \epsilon \rangle _{one \ molecule} = {3 \over 2}m \langle v_x^2 \rangle \ \ \ \ (11)
$$

We know how to use integral tables to solve integrals like this; the result looks like $\langle \epsilon \rangle _{one \ molecule} = const \times k_B  T$. To scale up to a mole of molecules, we would need to multiply by Avogadro's number. Writing that molar quantity as $\langle \epsilon \rangle$, 

$$
\langle \epsilon \rangle = {3 \over 2} R T  \ \ \ \ (12)
$$
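The route from Eq. (11) to Eq. (12) can also be checked numerically. The sketch below assumes the standard Gaussian form of $f_B$ and illustrative values (N2 at 300 K), evaluates $\langle v_x^2 \rangle$ with the trapezoidal rule, and compares the resulting per-molecule kinetic energy to $k_B T$.

```python
import numpy as np

# Trapezoidal integrator: np.trapezoid in NumPy >= 2.0, np.trapz in older versions
trapz = getattr(np, "trapezoid", None) or np.trapz

# Assumed illustrative values (not from the course data files): N2 at 300 K
kB = 1.380649e-23            # Boltzmann constant, J/K
m = 0.028 / 6.02214076e23    # mass of one N2 molecule, kg
T = 300.0                    # temperature, K

# One-dimensional Boltzmann (Gaussian) density on a wide grid
vx = np.linspace(-3000.0, 3000.0, 2001)   # m/s
fB = np.sqrt(m / (2 * np.pi * kB * T)) * np.exp(-m * vx**2 / (2 * kB * T))

vx2_mean = trapz(fB * vx**2, vx)   # <vx^2>, in m^2/s^2
eps = 1.5 * m * vx2_mean           # Eq. (11), in J per molecule

# Ratio of the average kinetic energy to kB*T; Eq. (12) implies a value of 3/2
print(eps / (kB * T))
```

The mass cancels out of the final ratio, which is why the molar result in Eq. (12) depends only on temperature.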

Your task in the following cell is to use Eq. (12) to evaluate $\langle \epsilon \rangle$ (your choice of molecule) at a temperature of $298 \ K$. Report (print) your result in units of $kJ/mol$.

In [None]:
# Your code here 


### Refresh/save/validate/submit/logout
Almost done! To double-check everything is OK, repeat the "Three steps for refreshing and saving your code," and press the "Validate" button (as usual). Then close, submit and log out.