# lab 10d

In [1]:
# Import necessary packages.
import re
import chemparse
import pandas as pd

In [2]:
# Helper func for styling dataframes.
def print_styled_df(df, caption):
    styled = df.style.set_caption(caption).format(precision=2)
    display(styled)

Let's read in the reactant data (both what we were given and what we procured ourselves) and take a look at it:

In [3]:
# Reading in external data.
solutionset_1 = pd.read_csv('solutionset-1.csv').set_index('Stock solutions')
solutionset_2 = pd.read_csv('solutionset-2.csv').set_index('Stock solutions')
solutionset_3 = pd.read_csv('solutionset-3.csv').set_index('Stock solutions')
pubchem_elem = pd.read_csv('pubchem-elements.csv').set_index('Symbol')

Note: the $CuCl_2$ used in Solution Set I was anhydrous.

In [4]:
print_styled_df(solutionset_1, 'Solution Set I')

Stock solutions,Mass,Volume
Cu(NO3)2,0.87,30.86
CuCl2,0.5,31.18
K2CO3,0.71,42.7
NaNO3,0.52,29.69
KCl,0.45,30.45


Note that we state $BaCl_2$ here, despite the official listing being $Ba(NO_3)_2$. $BaCl_2$ is what was on the label of the bottle of solution used in our experiment.

In [5]:
print_styled_df(solutionset_2, 'Solution Set II')

Stock solutions,Mass,Volume
(Na2SO4)(10H2O),1.16,29.75
Sr(NO3)2,0.91,30.0
BaCl2,0.8,30.53
AlCl3,0.86,30.54
K2SO4,1.11,30.6


In [6]:
print_styled_df(solutionset_3, 'Solution Set III')

Stock solutions,Mass,Volume
(Fe(NO3)3)(9H2O),1.25,29.9
Co(NO3)2,1.12,30.48
CoCl2,0.7,30.24
NaOH,0.25,30.1
KOH,0.34,30.0
NaNO3,0.5,29.87


In [7]:
pubchem_elem

Symbol,AtomicNumber,Name,AtomicMass,CPKHexColor,ElectronConfiguration,Electronegativity,AtomicRadius,IonizationEnergy,ElectronAffinity,OxidationStates,StandardState,MeltingPoint,BoilingPoint,Density,GroupBlock,YearDiscovered
H,1,Hydrogen,1.008000,FFFFFF,1s1,2.20,120.0,13.598,0.754,"+1, -1",Gas,13.81,20.28,0.000090,Nonmetal,1766
He,2,Helium,4.002600,D9FFFF,1s2,,140.0,24.587,,0,Gas,0.95,4.22,0.000179,Noble gas,1868
Li,3,Lithium,7.000000,CC80FF,[He]2s1,0.98,182.0,5.392,0.618,+1,Solid,453.65,1615.00,0.534000,Alkali metal,1817
Be,4,Beryllium,9.012183,C2FF00,[He]2s2,1.57,153.0,9.323,,+2,Solid,1560.00,2744.00,1.850000,Alkaline earth metal,1798
B,5,Boron,10.810000,FFB5B5,[He]2s2 2p1,2.04,192.0,8.298,0.277,+3,Solid,2348.00,4273.00,2.370000,Metalloid,1808
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Fl,114,Flerovium,290.192000,,[Rn]7s2 7p2 5f14 6d10 (predicted),,,,,"6, 4,2, 1, 0",Expected to be a Solid,,,,Post-transition metal,1998
Mc,115,Moscovium,290.196000,,[Rn]7s2 7p3 5f14 6d10 (predicted),,,,,"3, 1",Expected to be a Solid,,,,Post-transition metal,2003
Lv,116,Livermorium,293.205000,,[Rn]7s2 7p4 5f14 6d10 (predicted),,,,,"+4, +2, -2",Expected to be a Solid,,,,Post-transition metal,2000
Ts,117,Tennessine,294.211000,,[Rn]7s2 7p5 5f14 6d10 (predicted),,,,,"+5, +3, +1, -1",Expected to be a Solid,,,,Halogen,2010


## Calculating Molarity of Reactants

Let's calculate the molarity of each stock solution, starting with Solution Set I. We need to:

- find the molar mass of the compound

In [8]:
# Finding molar mass of a given chemical formula.
def formula_to_molar_mass(formula):
    # chemparse returns a mapping of element symbol -> count.
    parsed_formula = chemparse.parse_formula(formula)
    total_molar_mass = 0
    for element, amount in parsed_formula.items():
        atomic_mass = pubchem_elem.loc[element, 'AtomicMass']
        total_molar_mass += atomic_mass * amount
    return total_molar_mass
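
Hydrates such as `(Na2SO4)(10H2O)` are worth checking by hand, since it is not obvious from the formula string alone how a parser treats the leading `10`. The sketch below computes a molar mass from an explicit element count rather than a parsed formula. The `ATOMIC_MASS` values are rounded standard atomic masses typed in for illustration (they are not read from `pubchem-elements.csv`), and `molar_mass_from_counts` is a hypothetical helper, not part of this notebook.

```python
# Rounded standard atomic masses in g/mol (illustrative; not read from the PubChem CSV).
ATOMIC_MASS = {'H': 1.008, 'O': 15.999, 'Na': 22.990, 'S': 32.06}

def molar_mass_from_counts(element_counts):
    """Sum atomic mass x count over an explicit {element: count} mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in element_counts.items())

# Na2SO4 with 10 H2O written out by hand: 2 Na, 1 S, 4 + 10 = 14 O, 20 H.
decahydrate = {'Na': 2, 'S': 1, 'O': 14, 'H': 20}
anhydrous = {'Na': 2, 'S': 1, 'O': 4}

print(f"{molar_mass_from_counts(decahydrate):.2f} g/mol")  # 322.19 g/mol
print(f"{molar_mass_from_counts(anhydrous):.2f} g/mol")    # 142.04 g/mol
```

Comparing these figures with `formula_to_molar_mass('(Na2SO4)(10H2O)')` shows whether the waters of hydration were counted.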

- find the number of mols of the compound in the solution
- find the number of mols per L

In [9]:
# Finding molarity from molar mass.
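def molar_mass_to_molarity(symbol, molar_mass, solutionset_df):
    # Mass is in grams, so mass / molar mass gives mols of the compound.
    mols_count = solutionset_df.loc[symbol, 'Mass'] / molar_mass
    # Volume is in mL; divide by 1000 to get litres.
    molarity = mols_count / (solutionset_df.loc[symbol, 'Volume'] / 1000)
    return molarity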
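
As a worked check of the two steps above, here is a minimal standalone sketch for the first row of Solution Set I. It assumes `Mass` is in grams and `Volume` is in millilitres; `molarity_from_mass` is a hypothetical name, and the molar mass is built from rounded standard atomic masses rather than the PubChem table.

```python
def molarity_from_mass(mass_g, molar_mass_g_per_mol, volume_ml):
    """Molarity (mol/L) = (mass / molar mass) / (volume in litres)."""
    moles = mass_g / molar_mass_g_per_mol
    litres = volume_ml / 1000
    return moles / litres

# Cu(NO3)2 from Solution Set I: 0.87 g in 30.86 mL.
# Molar mass from rounded atomic masses: Cu + 2 * (N + 3 * O).
cu_nitrate_molar_mass = 63.546 + 2 * (14.007 + 3 * 15.999)

print(round(molarity_from_mass(0.87, cu_nitrate_molar_mass, 30.86), 2))  # 0.15
```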

These operations need to be performed over the whole set:

In [10]:
def solution_set_to_molarity(ss_df):
    molarity_set = {}

    # The index holds the chemical formula of each stock solution.
    for formula in ss_df.index:
        molar_mass = formula_to_molar_mass(formula)
        molarity_set[formula] = molar_mass_to_molarity(formula, molar_mass, ss_df)

    ss_molarity_df = pd.DataFrame.from_dict(
        molarity_set,
        orient='index',
        columns=['Molarity (M)']
    )
    return ss_molarity_df

Let's look at our results:

In [11]:
ss1_molarities = solution_set_to_molarity(solutionset_1)
print_styled_df(ss1_molarities, 'Molarity in Solution Set I')

,Molarity (M)
Cu(NO3)2,0.15
CuCl2,0.12
K2CO3,0.12
NaNO3,0.21
KCl,0.2


In [12]:
ss2_molarities = solution_set_to_molarity(solutionset_2)
print_styled_df(ss2_molarities, 'Molarity in Solution Set II')

,Molarity (M)
(Na2SO4)(10H2O),0.27
Sr(NO3)2,0.14
BaCl2,0.13
AlCl3,0.21
K2SO4,0.21


In [13]:
ss3_molarities = solution_set_to_molarity(solutionset_3)
print_styled_df(ss3_molarities, 'Molarity in Solution Set III')

,Molarity (M)
(Fe(NO3)3)(9H2O),0.22
Co(NO3)2,0.2
CoCl2,0.18
NaOH,0.21
KOH,0.2
NaNO3,0.2


## Calculating Trial Ion Products

Let's look at the double replacements for Set I:

In [14]:
set1_col = [
    'Cu(NO3)2',
    'CuCl2',
    'K2CO3',
    'NaNO3',
    'KCl'
]

set1_drs = {
    'Cu(NO3)2': ['N/A', ['Cu(NO3)2', 'CuCl2'], ['CuCO3', 'K2(NO3)2'], ['Cu(NO3)2', '2NaNO3'], ['CuCl2', '2KCl']],
    'CuCl2': ['Same', 'N/A', ['CuCO3', '2KCl'], ['Cu(NO3)2', '2NaCl'], ['CuCl2', '2KCl']],
    'K2CO3': ['Same', 'Same', 'N/A', ['2KNO3', 'Na2CO3'], ['K2CO3', '2KCl']],
    'NaNO3': ['Same', 'Same', 'Same', 'N/A', ['NaCl', 'KNO3']],
    'KCl': ['Same', 'Same', 'Same', 'Same', 'N/A']
}

ss1_matrix = pd.DataFrame.from_dict(set1_drs, orient='index', columns=set1_col)
print_styled_df(ss1_matrix, 'Solution Set I Trials')

,Cu(NO3)2,CuCl2,K2CO3,NaNO3,KCl
Cu(NO3)2,,"['Cu(NO3)2', 'CuCl2']","['CuCO3', 'K2(NO3)2']","['Cu(NO3)2', '2NaNO3']","['CuCl2', '2KCl']"
CuCl2,Same,,"['CuCO3', '2KCl']","['Cu(NO3)2', '2NaCl']","['CuCl2', '2KCl']"
K2CO3,Same,Same,,"['2KNO3', 'Na2CO3']","['K2CO3', '2KCl']"
NaNO3,Same,Same,Same,,"['NaCl', 'KNO3']"
KCl,Same,Same,Same,Same,


For each product, we need to calculate the Trial Ion Product ($Q_{sp}$) and compare it to the Solubility Product Constant ($K_{sp}$). To calculate a trial ion product, we need to:
- Look up the molarity of each reactant from the previous molarity calculations
- Raise each molarity to the power of its ion count and multiply the results together

First, let's read in a dataset of the products and the ions that make them up:

In [15]:
ss1_reactant_ions = pd.read_csv("ss1-reactant-ions.csv")
print_styled_df(ss1_reactant_ions, "Solution Set I Reactant Ions")

,Product,First Reactant Ion,Second Reactant Ion,First Reactant,Second Reactant,Soluble?
0,Cu(NO3)2,Cu,(NO3)2,CuCl2,Cu(NO3)2,True
1,CuCl2,Cu,Cl2,Cu(NO3)2,CuCl2,False
2,CuCO3,Cu,CO3,Cu(NO3)2,K2CO3,False
3,K2(NO3)2,K2,(NO3)2,K2CO3,Cu(NO3)2,True
4,CuCO3,Cu,CO3,CuCl2,K2CO3,False
5,2KCl,2K,Cl,K2CO3,CuCl2,True
6,Cu(NO3)2,Cu,(NO3)2,Cu(NO3)2,NaNO3,True
7,2NaNO3,2Na,NO3,NaNO3,Cu(NO3)2,True
8,Cu(NO3)2,Cu,(NO3)2,CuCl2,NaNO3,True
9,2NaCl,2Na,Cl,NaNO3,CuCl2,True


Next, let's filter out the products that we know are soluble from the solubility table, and then remove the `Soluble?` column for easier viewing.

In [16]:
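# Chaining .drop() returns a new DataFrame, so the later column inserts
# operate on an independent copy rather than a slice of the original.
ss1_reactant_ions = ss1_reactant_ions[ss1_reactant_ions['Soluble?'] == False].drop(columns='Soluble?')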
ss1_reactant_ions

,Product,First Reactant Ion,Second Reactant Ion,First Reactant,Second Reactant
1,CuCl2,Cu,Cl2,Cu(NO3)2,CuCl2
2,CuCO3,Cu,CO3,Cu(NO3)2,K2CO3
4,CuCO3,Cu,CO3,CuCl2,K2CO3
12,CuCl2,Cu,Cl2,Cu(NO3)2,KCl
14,CuCl2,Cu,Cl2,CuCl2,KCl


Let's compute the amount of each reactant ion for each product:

In [17]:
def ion_counter(ion):
    # Find the rightmost number in a formula.
    nums = re.findall(r'\d+', ion)
    try:
        return int(nums[-1])
    except IndexError:
        return 1
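
One caveat with the rightmost-number heuristic: for an unparenthesised polyatomic ion such as `CO3` or `SO4`, the trailing digit is a subscript inside the ion, so `ion_counter` returns 3 or 4 where the ion itself appears only once in the formula unit. The following is a minimal sketch of an alternative, assuming parenthesised ions carry their count after the closing parenthesis, single elements carry it as a trailing subscript, and any leading number is a balancing coefficient. `ion_multiplicity` is a hypothetical name; the tables in this notebook were produced with `ion_counter`, not with this function.

```python
import re

def ion_multiplicity(ion):
    """Number of times an ion appears in one formula unit (sketch, see assumptions above)."""
    # Drop a leading balancing coefficient such as the 2 in '2K'.
    ion = ion.lstrip('0123456789')
    # Parenthesised polyatomic ion: the count follows the ')', e.g. '(OH)3' -> 3.
    paren = re.fullmatch(r'\((.+)\)(\d*)', ion)
    if paren:
        return int(paren.group(2) or 1)
    # Single element: the trailing subscript is the count, e.g. 'Cl2' -> 2.
    single = re.fullmatch(r'[A-Z][a-z]?(\d*)', ion)
    if single:
        return int(single.group(1) or 1)
    # Unparenthesised polyatomic ion such as 'CO3': digits belong to the ion itself.
    return 1

for ion in ['Cu', 'Cl2', 'CO3', 'SO4', '(OH)3', '(NO3)2']:
    print(ion, ion_multiplicity(ion))
```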

In [18]:
def add_ion_counts_to_df(df):
    first_ions_amnt = []
    second_ions_amnt = []

    for index, row in df.iterrows():
        ion1 = row['First Reactant Ion']
        ion2 = row['Second Reactant Ion']
        first_ions_amnt.append(ion_counter(ion1))
        second_ions_amnt.append(ion_counter(ion2))

    df.insert(
        len(df.columns),
        "Amount of First Ion",
        first_ions_amnt
    )
    df.insert(
        len(df.columns),
        "Amount of Second Ion",
        second_ions_amnt
    )
    return df

In [19]:
ss1_reactant_ions = add_ion_counts_to_df(ss1_reactant_ions)
ss1_reactant_ions

,Product,First Reactant Ion,Second Reactant Ion,First Reactant,Second Reactant,Amount of First Ion,Amount of Second Ion
1,CuCl2,Cu,Cl2,Cu(NO3)2,CuCl2,1,2
2,CuCO3,Cu,CO3,Cu(NO3)2,K2CO3,1,3
4,CuCO3,Cu,CO3,CuCl2,K2CO3,1,3
12,CuCl2,Cu,Cl2,Cu(NO3)2,KCl,1,2
14,CuCl2,Cu,Cl2,CuCl2,KCl,1,2


Let's run a lookup for the ion molarities, and then calculate Trial Ion Product for each product:

In [20]:
def find_molarity(reactant_df, reactant):
    return reactant_df['Molarity (M)'][reactant]

In [21]:
def calculate_qsps(row, molarities_df):
    reactant1 = row['First Reactant']
    ion1_count = row['Amount of First Ion']
    reactant2 = row['Second Reactant']
    ion2_count = row['Amount of Second Ion']
    # Each reactant's molarity is used directly as the molarity of the ion it supplies.
    molarity_ion1 = find_molarity(molarities_df, reactant1)
    molarity_ion2 = find_molarity(molarities_df, reactant2)
    # Q_sp = [ion1]^(ion1 count) * [ion2]^(ion2 count)
    qsp = ((molarity_ion1 ** ion1_count) * (molarity_ion2 ** ion2_count))
    return qsp
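
`calculate_qsps` takes each reactant's molarity as the ion molarity. The methodology notes at the end of this notebook also mention scaling by the ratio of ions per formula unit; the sketch below shows that step for a single salt, assuming complete dissociation. Both function names and the input values are hypothetical and chosen only for illustration.

```python
def ion_molarity(compound_molarity, ions_per_formula_unit):
    """Molarity of one ion released by a fully dissociated salt."""
    return compound_molarity * ions_per_formula_unit

def trial_ion_product(cation_molarity, cation_count, anion_molarity, anion_count):
    """Q_sp = [cation]^a * [anion]^b for a salt with a cations and b anions."""
    return cation_molarity ** cation_count * anion_molarity ** anion_count

# Illustrative salt MX2 at 0.10 M: one M ion and two X ions per formula unit.
m_molarity = ion_molarity(0.10, 1)   # 0.10 M
x_molarity = ion_molarity(0.10, 2)   # 0.20 M
q = trial_ion_product(m_molarity, 1, x_molarity, 2)
print(f"{q:.2E}")  # 4.00E-03
```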

In [22]:
ss1_qsps = []

for index, row in ss1_reactant_ions.iterrows():
    qsp = calculate_qsps(row, ss1_molarities)
    ss1_qsps.append(f'{qsp:.2E}')

ss1_reactant_ions.insert(
    len(ss1_reactant_ions.columns),
    "Trial Ion Product",
    ss1_qsps
)

ss1_reactant_ions

,Product,First Reactant Ion,Second Reactant Ion,First Reactant,Second Reactant,Amount of First Ion,Amount of Second Ion,Trial Ion Product
1,CuCl2,Cu,Cl2,Cu(NO3)2,CuCl2,1,2,0.00214
2,CuCO3,Cu,CO3,Cu(NO3)2,K2CO3,1,3,0.000262
4,CuCO3,Cu,CO3,CuCl2,K2CO3,1,3,0.000208
12,CuCl2,Cu,Cl2,Cu(NO3)2,KCl,1,2,0.00591
14,CuCl2,Cu,Cl2,CuCl2,KCl,1,2,0.00469


Let's read in a standard Solubility Product Table:

In [23]:
KSP_DF = pd.read_csv('ksp-table.csv').set_index('Substance')
KSP_DF

Substance,$K_{sp}$ at $25^{o}C$
Al(OH)3,2.00E-32
BaCO3,1.60E-09
BaC2O4*2H2O,1.10E-07
BaSO4,2.30E-08
BaCrO4,8.50E-11
...,...
Tl(OH)3,6.30E-46
Sn(OH)2,3.00E-27
SnS,1.00E-26
Sn(OH)4,1.00E-57


For each of our precipitates, we can compare the $Q_{sp}$ to the corresponding $K_{sp}$:

In [24]:
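def compare_qsp_and_ksp(row):
    product = row['Product']
    qsp = row['Trial Ion Product']
    ksp_column = KSP_DF['$K_{sp}$ at $25^{o}C$']
    try:
        ksp = ksp_column[product]
    except KeyError:
        # Fallback: retry with the last character of the formula removed
        # (e.g. 'CuCl2' is looked up as 'CuCl'), which may match a different substance.
        ksp = ksp_column[product[:-1]]
    # Q_sp is stored as a formatted string, so convert both values before comparing.
    return (ksp, float(qsp) > float(ksp))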
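
Because the `KeyError` fallback drops the last character of the formula, a missing entry can silently resolve to a different substance. A minimal sketch of a stricter variant follows, which reports a missing $K_{sp}$ instead of guessing. `KSP_SAMPLE` is a stand-in for `ksp-table.csv` holding a few values copied from this notebook's output tables, and both helper names are hypothetical.

```python
# K_sp values copied from this notebook's output tables; a stand-in for ksp-table.csv.
KSP_SAMPLE = {'CuCO3': 2.5e-10, 'BaSO4': 2.3e-08, 'SrSO4': 3.2e-07}

def lookup_ksp(product, ksp_table):
    """Return the K_sp for an exact formula match, or None when it is not listed."""
    return ksp_table.get(product)

def is_oversaturated(qsp, ksp):
    """True when Q_sp > K_sp; None when there is no K_sp to compare against."""
    if ksp is None:
        return None
    return float(qsp) > float(ksp)

print(is_oversaturated('2.62E-04', lookup_ksp('CuCO3', KSP_SAMPLE)))  # True
print(is_oversaturated('2.14E-03', lookup_ksp('CuCl2', KSP_SAMPLE)))  # None
```

Returning `None` keeps rows without a tabulated $K_{sp}$ visible so they can be reviewed by hand.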

In [25]:
def add_ksp_comparison_to_df(df):
    corresponding_ksps = []
    theoretically_oversaturated = []

    for index, row in df.iterrows():
        ksp, comparison = compare_qsp_and_ksp(row)
        corresponding_ksps.append(ksp)
        theoretically_oversaturated.append(comparison)

    df.insert(
        len(df.columns),
        'Solubility Product Constant',
        corresponding_ksps
    )
    df.insert(
        len(df.columns),
        'Theoretically Oversaturated?',
        theoretically_oversaturated
    )

    return df

In [26]:
ss1_reactant_ions = add_ksp_comparison_to_df(ss1_reactant_ions)
print_styled_df(ss1_reactant_ions, 'Precipitates in Solution Set I')

,Product,First Reactant Ion,Second Reactant Ion,First Reactant,Second Reactant,Amount of First Ion,Amount of Second Ion,Trial Ion Product,Solubility Product Constant,Theoretically Oversaturated?
1,CuCl2,Cu,Cl2,Cu(NO3)2,CuCl2,1,2,0.00214,1.2e-06,True
2,CuCO3,Cu,CO3,Cu(NO3)2,K2CO3,1,3,0.000262,2.5e-10,True
4,CuCO3,Cu,CO3,CuCl2,K2CO3,1,3,0.000208,2.5e-10,True
12,CuCl2,Cu,Cl2,Cu(NO3)2,KCl,1,2,0.00591,1.2e-06,True
14,CuCl2,Cu,Cl2,CuCl2,KCl,1,2,0.00469,1.2e-06,True


Finally, we can strip the intermediate data to produce a cleaner table:

In [27]:
ss1_reactant_ions.drop(columns=['First Reactant Ion', 'Second Reactant Ion', 'First Reactant', 'Second Reactant', 'Amount of First Ion', 'Amount of Second Ion'], inplace=True)
print_styled_df(ss1_reactant_ions, 'Precipitates in Solution Set I')

,Product,Trial Ion Product,Solubility Product Constant,Theoretically Oversaturated?
1,CuCl2,0.00214,1.2e-06,True
2,CuCO3,0.000262,2.5e-10,True
4,CuCO3,0.000208,2.5e-10,True
12,CuCl2,0.00591,1.2e-06,True
14,CuCl2,0.00469,1.2e-06,True


---
Repeating this analysis for Solution Set II:

In [28]:
ss2_reactant_ions = pd.read_csv("ss2-reactant-ions.csv")
print_styled_df(ss2_reactant_ions, "Solution Set II Reactant Ions")

,Product,First Reactant Ion,Second Reactant Ion,First Reactant,Second Reactant,Soluble?
0,K2SO4,K2,SO4,K2SO4,(Na2SO4)(10H2O),True
1,(Na2SO4)(10H2O),Na2,SO4,(Na2SO4)(10H2O),K2SO4,True
2,2KNO3,2K,NO3,K2SO4,Sr(NO3)2,True
3,SrSO4,Sr,SO4,Sr(NO3)2,K2SO4,False
4,2KCl,2K,Cl,K2SO4,BaCl2,True
5,BaSO4,Ba,SO4,BaCl2,K2SO4,False
6,6KCl,6K,Cl,K2SO4,AlCl3,True
7,Al2(SO4)3,Al2,(SO4)3,AlCl3,K2SO4,True
8,Al2(SO4)3,Al2,(SO4)3,AlCl3,(Na2SO4)(10H2O),True
9,6NaCl,6Na,Cl,(Na2SO4)(10H2O),2AlCl3,True


- Filtering to insoluble only, and then removing the `Soluble?` column
- Counting ions
- Calculating Trial Ion Product

In [29]:
# Chaining .drop() returns a new DataFrame rather than modifying a filtered slice in place.
ss2_reactant_ions = ss2_reactant_ions[ss2_reactant_ions['Soluble?'] == False].drop(columns='Soluble?')
ss2_reactant_ions = add_ion_counts_to_df(ss2_reactant_ions)

ss2_qsps = []

for index, row in ss2_reactant_ions.iterrows():
    qsp = calculate_qsps(row, ss2_molarities)
    ss2_qsps.append(f'{qsp:.2E}')
    
ss2_reactant_ions.insert(
    len(ss2_reactant_ions.columns),
    "Trial Ion Product",
    ss2_qsps
)

ss2_reactant_ions

,Product,First Reactant Ion,Second Reactant Ion,First Reactant,Second Reactant,Amount of First Ion,Amount of Second Ion,Trial Ion Product
3,SrSO4,Sr,SO4,Sr(NO3)2,K2SO4,1,4,0.000269
5,BaSO4,Ba,SO4,BaCl2,K2SO4,1,4,0.000236
14,BaSO4,Ba,SO4,BaCl2,(Na2SO4)(10H2O),1,4,0.000714
18,SrSO4,Sr,SO4,Sr(NO3)2,(Na2SO4)(10H2O),1,4,0.000814


Adding in comparisons to $K_{sp}$s and stripping intermediate data:

In [30]:
ss2_reactant_ions = add_ksp_comparison_to_df(ss2_reactant_ions)
ss2_reactant_ions.drop(columns=['First Reactant Ion', 'Second Reactant Ion', 'First Reactant', 'Second Reactant', 'Amount of First Ion', 'Amount of Second Ion'], inplace=True)
print_styled_df(ss2_reactant_ions, 'Precipitates in Solution Set II')

,Product,Trial Ion Product,Solubility Product Constant,Theoretically Oversaturated?
3,SrSO4,0.000269,3.2e-07,True
5,BaSO4,0.000236,2.3e-08,True
14,BaSO4,0.000714,2.3e-08,True
18,SrSO4,0.000814,3.2e-07,True


---
And finally, repeating this analysis for Solution Set III:

In [31]:
ss3_reactant_ions = pd.read_csv("ss3-reactant-ions.csv")
print_styled_df(ss3_reactant_ions, "Solution Set III Reactant Ions")

,Product,First Reactant Ion,Second Reactant Ion,First Reactant,Second Reactant,Soluble?
0,(Fe(NO3)3)(9H2O),Fe,(NO3)3,(Fe(NO3)3)(9H2O),NaNO3,True
1,3NaNO3,3Na,NO3,NaNO3,(Fe(NO3)3)(9H2O),True
2,Fe(OH)3,Fe,(OH)3,(Fe(NO3)3)(9H2O),KOH,False
3,3KNO3,3K,NO3,KOH,(Fe(NO3)3)(9H2O),True
4,Fe(OH)3,Fe,(OH)3,(Fe(NO3)3)(9H2O),NaOH,False
5,3NaNO3,3Na,NO3,NaOH,(Fe(NO3)3)(9H2O),True
6,2FeCl3,2Fe,Cl3,(Fe(NO3)3)(9H2O),CoCl2,True
7,3Co(NO3)2,3Co,(NO3)2,CoCl2,(Fe(NO3)3)(9H2O),True
8,2(Fe(NO3)3)(9H2O),2Fe,(NO3)3,(Fe(NO3)3)(9H2O),Co(NO3)2,True
9,3Co(NO3)2,3Co,(NO3)2,Co(NO3)2,(Fe(NO3)3)(9H2O),True


- Filtering to insoluble only, and then removing the `Soluble?` column
- Counting ions
- Calculating Trial Ion Product

In [32]:
# Chaining .drop() returns a new DataFrame rather than modifying a filtered slice in place.
ss3_reactant_ions = ss3_reactant_ions[ss3_reactant_ions['Soluble?'] == False].drop(columns='Soluble?')
ss3_reactant_ions = add_ion_counts_to_df(ss3_reactant_ions)

ss3_qsps = []

for index, row in ss3_reactant_ions.iterrows():
    qsp = calculate_qsps(row, ss3_molarities)
    ss3_qsps.append(f'{qsp:.2E}')
    
ss3_reactant_ions.insert(
    len(ss3_reactant_ions.columns),
    "Trial Ion Product",
    ss3_qsps
)

ss3_reactant_ions

,Product,First Reactant Ion,Second Reactant Ion,First Reactant,Second Reactant,Amount of First Ion,Amount of Second Ion,Trial Ion Product
2,Fe(OH)3,Fe,(OH)3,(Fe(NO3)3)(9H2O),KOH,1,3,0.00185
4,Fe(OH)3,Fe,(OH)3,(Fe(NO3)3)(9H2O),NaOH,1,3,0.00201
12,Co(OH)2,Co,(OH)2,Co(NO3)2,KOH,1,2,0.0082
14,Co(OH)2,Co,(OH)2,Co(NO3)2,NaOH,1,2,0.00866
20,Co(OH)2,Co,(OH)2,CoCl2,KOH,1,2,0.00728
22,Co(OH)2,Co,(OH)2,CoCl2,NaOH,1,2,0.00769


Adding in comparisons to $K_{sp}$s and stripping intermediate data:

In [33]:
ss3_reactant_ions = add_ksp_comparison_to_df(ss3_reactant_ions)
ss3_reactant_ions.drop(columns=['First Reactant Ion', 'Second Reactant Ion', 'First Reactant', 'Second Reactant', 'Amount of First Ion', 'Amount of Second Ion'], inplace=True)
print_styled_df(ss3_reactant_ions, 'Precipitates in Solution Set III')

,Product,Trial Ion Product,Solubility Product Constant,Theoretically Oversaturated?
2,Fe(OH)3,0.00185,4e-38,True
4,Fe(OH)3,0.00201,4e-38,True
12,Co(OH)2,0.0082,2.5e-16,True
14,Co(OH)2,0.00866,2.5e-16,True
20,Co(OH)2,0.00728,2.5e-16,True
22,Co(OH)2,0.00769,2.5e-16,True


## Notes on Computational Methodology

**[Gist link](https://gist.github.com/ThatNerdSquared/e9a8acce736474536cd2f0db5cd0ba55)**

As far as I can tell, we really need to find two things:
- Trial Ion Product
    - To find this, we need to find the molarity of each element
    - To find molarity of each element, we need to find molarity of the compound
    - To find molarity of the compound, we need to find mols of compound
    - To find mols of compound, we need to convert grams to mols via `grams over molar mass`
    - So the pipeline is basically:
        - Take grams, convert to mols using $\frac{\text{1 mol}}{\text{(molar mass of X)g}}$, convert to molarity by dividing that by volume in L (convert that from mL), find molarity of each element by looking at the ratios, and then calculate reaction quotient using the proper exponents
- Solubility Product Constant
    - This is experimentally determined, isn't it?
    - We could use the grams per L solubility from PubChem...
        - essentially follow [this method](https://www.chemteam.info/Equilibrium/Calc-Ksp-FromMolSolub.html)
        - convert it to mols per L (molar solubility)
        - calculate $K_{sp}$ via reaction quotient formula
        - We [can't grab solubility programmatically](https://github.com/mcs07/PubChemPy/issues/16) though so it could be a pain
- Possibly useful libs:
    - https://pypi.org/project/chemparse/
    - https://github.com/cgohlke/molmass
    - https://pypi.org/project/pyvalem/
    - automated chemical equation balancing: https://www.wikiwand.com/en/Chemical_equation#System_of_linear_equations, https://pythonnumericalmethods.berkeley.edu/notebooks/chapter14.05-Solve-Systems-of-Linear-Equations-in-Python.html
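
The solubility-based route to $K_{sp}$ outlined in the notes above can be sketched as follows. This is a minimal illustration, assuming the salt dissociates completely and that the solubility is given in grams per litre. `ksp_from_solubility` is a hypothetical helper and the numbers describe a made-up salt, not PubChem data.

```python
def ksp_from_solubility(solubility_g_per_l, molar_mass, cation_count, anion_count):
    """K_sp of a salt M_a X_b from its solubility, assuming complete dissociation.

    With molar solubility s (mol/L): [M] = a*s and [X] = b*s,
    so K_sp = (a*s)^a * (b*s)^b.
    """
    molar_solubility = solubility_g_per_l / molar_mass
    cation_molarity = cation_count * molar_solubility
    anion_molarity = anion_count * molar_solubility
    return cation_molarity ** cation_count * anion_molarity ** anion_count

# Hypothetical salt MX2: 0.10 g/L solubility, 100 g/mol molar mass, so s = 1.0e-3 mol/L.
print(f"{ksp_from_solubility(0.10, 100.0, 1, 2):.2E}")  # 4.00E-09
```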

## Notes on Data Cleaning

I managed to procure a CSV of the Solubility Product Table by taking the HTML data from the [provided table](https://openstax.org/books/chemistry-atoms-first-2e/pages/j-solubility-products) and performing a series of transformations on it using `vim`:

1) `:g/td colspan="2"/-1,+1d` to remove all element headings
2) `:g/<span class="MathJax_Preview"/.+1,+65d` to remove all MathJax styling
3) `:%s/<sub>//g` to remove all subscript begin elements
4) `:%s/<\/sub>//g` to remove all subscript end elements
5) `:%s/·/*/g` to replace the `cdot` character with a standard asterisk
6) `:%s/10<sup>/E/g` to convert the MathJax scientific notation into E-notation
7) `:g/<tr valign/d` to remove unnecessary HTML table data
8) `:g/<\/td>/.+1,+1d` to remove unnecessary HTML table data
9) `:g/<\/sup>/.+1,+1d` to remove exponentiation HTML data
10) `:%s/<\/sup>//g` to remove exponentiation HTML data
11) `:%s/<td data-align="left">//g` to remove table alignment data
12) to remove unnecessary table/styling data:
    - `:%s/<\/td>//g`
    - `:%s/ <span class="MathJax_Preview" style="color: inherit"><\/span//g`
    - `:%s/<em data-effect="italics">//g`
    - `:%s/<\/em>//g`
13) recorded a macro to `@m` resulting in the command string `$a,^[JxJxj0`, to merge lines into properly-formatted CSV rows
14) some manual edits:
    - for $\frac{1}{2}Ag_2O(Ag^+ + OH^-)$, there was a ton of MathJax styling so cleaning it by hand was unavoidable

## Remaining Todos
- [x] generalize $Q_{sp}$ calculation
- [x] run $Q_{sp}$ calculations on other data sets
- [x] clean up the notebook
- [x] upload a new version to gist
- [ ] figure out what to do for sets IV and V

## Results
- tables of actual data we collected
- tables w calculations comparing reaction quotients to $K_{sp}$s
- summary of what was obtained, basically

## Discussion

- hypothesis => hootonian intervention required
- results significance => how the results relate back to core point of lab ($K_{sp}$s? solubility?)
- scientific explanation => calculate $K_{sp}$s to explain why things played out the way they did
- error analysis => basically saying visual/contamination/temp, and mostly saying they're negligible