# Building predominance diagrams using ThermoFun

## Run the next two code cells if using Google Colab (setting up virtual environment)

The Google Colab virtual environment comes with many Python libraries pre-installed. The geochemistry libraries used in this notebook are less widely used and are not among them. To add them, we must first install `condacolab` in the virtual environment.

In [None]:
!pip install -q condacolab
import condacolab
condacolab.install_miniforge()

We can now use `conda`, a widely used package manager, to install the geochemistry libraries of interest in the virtual environment. 

In [None]:
!conda install reaktoro
!conda install thermofun
!conda install thermohubclient

## Main body of the notebook

Import the necessary Python libraries into the notebook.

In [None]:
import reaktoro as rkt # water saturation pressure
import thermofun as fun # thermo. properties at T,P
import thermohubclient as client # get thermo. data

import numpy as np # facilitate math 
import matplotlib.pyplot as plt # plotting

Create a database object from the mines16 data file, which `thermohubclient` downloads from the ThermoHub server.

In [None]:
# download data file 
dbc = client.DatabaseClient()
dbc.saveDatabase("mines16")

# create database object from data file
database = fun.Database('./mines16-thermofun.json') 

For reference, the cell below produces a list of the symbols used to represent each substance in the database. 

In [None]:
substances = database.mapSubstances()
print(substances.keys())

Create a ThermoFun "engine" object to calculate thermodynamic properties at the T,P of interest. 

In [None]:
engine = fun.ThermoEngine(database)

Set the temperature and pressure at which thermodynamic properties should be calculated. We'll use these variables throughout the rest of the notebook. 

In [None]:
# set temperature and pressure
T = 150 + 273.15 # [K]
P = rkt.waterSaturatedPressureWagnerPruss(T).val # [Pa] saturated vapor pressure of water

Specify the range of $pH$ and $log(fO_2)$ values for the plot. `np.arange` creates an array of values, starting at the first argument and stopping one before the second, in increments of one.

In [None]:
pH_range = np.arange(1,14) 
logfO2_range = np.arange(-50,-19) 
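
To make the end-exclusive behaviour of `np.arange` concrete, here is a small sketch. The names `pH_values` and `fine_pH_values` are only illustrative and are not used elsewhere in the notebook.

```python
import numpy as np

# np.arange(start, stop) includes start but excludes stop
pH_values = np.arange(1, 14)
print(pH_values[0], pH_values[-1], pH_values.size)

# an optional third argument sets the step size
fine_pH_values = np.arange(1, 14, 0.5)
print(fine_pH_values[-1], fine_pH_values.size)
```

This is why the upper limits above are written as 14 and -19: the arrays end at pH 13 and $log(fO_2)$ = -20.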

Take the following reaction as an example: 

$$2O_{2(aq)} + HS^-_{(aq)} \leftrightarrow SO_{4(aq)}^{-2} + H^+_{(aq)}$$

Note that we're interested in $O_{2(g)}$ rather than $O_{2(aq)}$; we can convert to the former by adding the reaction $O_{2(g)} \leftrightarrow O_{2(aq)}$ twice (once for each $O_2$ consumed) to the one above, which gives: 

$$2O_{2(g)} + HS^-_{(aq)} \leftrightarrow SO_{4(aq)}^{-2} + H^+_{(aq)}$$
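
As a brief aside on what this step does to the equilibrium constant: when reactions are added, their logK values add as well. Using the labels $K_{aq}$ (reaction written with $O_{2(aq)}$), $K_{diss}$ (the reaction $O_{2(g)} \leftrightarrow O_{2(aq)}$) and $K_{g}$ (reaction written with $O_{2(g)}$), which are introduced here only for illustration, and noting that the dissolution reaction is counted twice:

$$logK_{g} = logK_{aq} + 2logK_{diss}$$

The rest of the notebook works with the gas-phase form of the reaction only.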


We can write an expression for the equilibrium constant, $K$, as follows: 

$$K = \frac{a_{SO_{4(aq)}^{-2}}a_{H^+_{(aq)}}}   {f_{O_{2(g)}}^2 a_{HS^-_{(aq)}}}$$

By taking the log of each side, we can create a linear relationship between the log of each variable: 

$$logK = loga_{SO_{4(aq)}^{-2}} + loga_{H^+_{(aq)}} - 2logf_{O_{2(g)}} - loga_{HS^-_{(aq)}}$$

and replace $loga_{H^+_{(aq)}}$ with pH using the definition for pH ($pH = -loga_{H^+_{(aq)}}$): 

$$logK = loga_{SO_{4(aq)}^{-2}} -pH - 2logf_{O_{2(g)}} - loga_{HS^-_{(aq)}}$$


To plot this reaction in $log(fO_2)$-pH space, we need to reduce the number of variables to two (i.e., $log(fO_2)$ and pH). ThermoFun gives us the logK of the reaction at a specific temperature and pressure, which removes one variable. We remove the remaining two by treating the reaction as a predominance boundary for the S species: on either side of the boundary one of the two S species is more abundant, and at the boundary they are present in equal proportions. Taking the activities of the two species as equal at the boundary, $loga_{SO_{4(aq)}^{-2}}$ and $loga_{HS^-_{(aq)}}$ cancel, and we are left with: 

$$logK = -pH - 2logf_{O_{2(g)}}$$
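
To see which species predominates on each side of the boundary, the full log expression can be rearranged to $loga_{SO_{4(aq)}^{-2}} - loga_{HS^-_{(aq)}} = logK + pH + 2logf_{O_{2(g)}}$. The sketch below evaluates the sign of that difference. The function names are illustrative, and `logK_example` is a placeholder value, not a ThermoFun result.

```python
def log_activity_ratio(pH, logfO2, logK):
    """Return log(a_SO4-2 / a_HS-) = logK + pH + 2*logfO2."""
    return logK + pH + 2.0 * logfO2


def predominant_species(pH, logfO2, logK):
    """Name the S species with the higher activity at the given conditions."""
    ratio = log_activity_ratio(pH, logfO2, logK)
    if ratio > 0:
        return "SO4-2"
    if ratio < 0:
        return "HS-"
    return "boundary"


logK_example = 90.0  # placeholder for illustration only, NOT calculated by ThermoFun

print(predominant_species(7.0, -40.0, logK_example))  # more oxidizing conditions
print(predominant_species(7.0, -50.0, logK_example))  # more reducing conditions
```

Sulfate is favoured at higher $log(fO_2)$ and higher pH, so it occupies the field above the boundary line and bisulfide the field below it.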

Rearranging so that our dependent (i.e., y-axis) variable is on the left and our independent (i.e., x-axis) variable is on the right, we have: 

$$logf_{O_{2(g)}} = -\frac{1}{2}pH - \frac{1}{2}logK$$

We can then calculate $logK$ for the reaction with ThermoFun and use that value to calculate $logf_{O_{2(g)}}$ over the range of pH values: 

In [None]:
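# logK of the boundary reaction at T and P
logK = engine.thermoPropertiesReaction(T, P, "2O2 + HS- = SO4-2 + H+").log_equilibrium_constant.val

# log fO2 along the boundary: logfO2 = -0.5*pH - 0.5*logK
logfO2 = -0.5*pH_range - 0.5*logK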
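
As a quick sanity check on the rearranged equation, the sketch below evaluates the boundary line without ThermoFun and substitutes the result back into $logK = -pH - 2logf_{O_{2(g)}}$. The helper `boundary_logfO2` and the value `logK_example` are assumptions made for illustration; in the notebook itself, logK comes from the ThermoFun call above.

```python
import numpy as np


def boundary_logfO2(pH, logK):
    """log fO2 along the HS-/SO4-2 boundary: logfO2 = -0.5*pH - 0.5*logK."""
    return -0.5 * np.asarray(pH, dtype=float) - 0.5 * logK


logK_example = 90.0  # placeholder for illustration only, NOT calculated by ThermoFun
pH_values = np.arange(1, 14)
logfO2_line = boundary_logfO2(pH_values, logK_example)

# substitute back into logK = -pH - 2*logfO2; the residual should be zero everywhere
residual = -pH_values - 2.0 * logfO2_line - logK_example
print(np.allclose(residual, 0.0))
```

The slope of -1/2 means the boundary drops by half a log unit of $fO_2$ for every unit increase in pH, regardless of the value of logK.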

Finally, we can plot the results as follows: 

In [None]:
fig, ax  = plt.subplots(figsize=[12,8]) # create figure and axis objects

ax.plot(pH_range, logfO2) # plot values

# set x and y axis labels
ax.set_xlabel('pH')
ax.set_ylabel('logfO2')
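
One possible next step is to shade the two predominance fields on either side of the line. The sketch below uses matplotlib's `fill_between` for this. It is self-contained, so it uses a placeholder `logK_example` and hypothetical axis limits `y_min` and `y_max` rather than the values calculated above.

```python
import matplotlib.pyplot as plt
import numpy as np

logK_example = 90.0  # placeholder for illustration only, NOT calculated by ThermoFun
pH_values = np.arange(1, 14)
boundary = -0.5 * pH_values - 0.5 * logK_example

y_min, y_max = -60.0, -30.0  # hypothetical plot limits for log fO2

fig, ax = plt.subplots(figsize=[12, 8])
ax.plot(pH_values, boundary, color="black")  # predominance boundary

# higher fO2 (above the line) favours sulfate; lower fO2 favours bisulfide
ax.fill_between(pH_values, boundary, y_max, alpha=0.3, label="SO4-2 predominant")
ax.fill_between(pH_values, y_min, boundary, alpha=0.3, label="HS- predominant")

ax.set_ylim(y_min, y_max)
ax.set_xlabel("pH")
ax.set_ylabel("logfO2")
ax.legend()
```

To use this with the notebook's own results, replace `boundary` with the `logfO2` array and `pH_values` with `pH_range`.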