<left>
<table style="margin-top:0px; margin-left:0px;">
<tr>
  <td><img src="https://gitlab.com/worm1/worm-figures/-/raw/master/style/worm.png" alt="WORM" title="WORM" width="50"/></td>
  <td><h1 style=font-size:30px>Aqueous Geochemical Speciation</h1><h2>Exploration Notebook</h2></td>
</tr>
</table>
</left>

### Sample datasets:
- <u>peru.csv</u> - hot spring samples from the Peruvian Andes ([Newell & Scott 2020](https://doi.org/10.26022/IEDA/111569))
- <u>acidic_hotsprings.csv</u> - acidic hot springs from Yellowstone National Park
- <u>leong2021.csv</u> - pH 7-12 samples from serpentinizing systems in Oman ([Leong *et al.* 2021](https://doi.org/10.1029/2020JB020756))
- <u>S&C10vents.csv</u> - a variety of subsea hydrothermal vent fluids ([Shock & Canovas 2010](https://onlinelibrary.wiley.com/doi/full/10.1111/j.1468-8123.2010.00277.x))
- <u>singlesample.csv</u> - a single sample from the leong2021 dataset. Use this CSV as a template for adding new rows of custom sample data!

### Ideas to try

1) Open a sample dataset CSV and check out the kinds of variables used in a speciation calculation.
2) Refresh and run this notebook from top to bottom with Kernel > Restart Kernel and Run All Cells.
3) Look through the CSV that is generated ('report.csv' by default). How is it different from the sample dataset?
4) Plot the mass contribution of a different basis species, like Ca+2.
5) Try sort_by='Temperature' instead of 'pH' when creating a mass contribution plot. How does the plot change? What happens if you use another variable?
6) Purposefully mis-name a basis species and check what the error message says.
7) Plot the mineral saturation index of another sample.
8) Plot different variables against each other with a scatterplot. Try mis-naming a variable to see which are available.
9) Speciate another dataset. How did the results change?
10) Plot the mass contribution of Fe+2 in the acidic_hotsprings.csv sample set. What oxidation states are in the speciated complexes? How does this change if you add a column called 'Fe+3' (subheader 'Molality') to acidic_hotsprings.csv and assign a zero to each sample?
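
Idea 10 asks for a new 'Fe+3' column with a 'Molality' subheader and a zero for every sample. One way to make that edit without a spreadsheet program is sketched below with Python's standard `csv` module. It assumes the input CSV has a header row of variable names followed by a subheader row, as the 'Molality' subheader suggests. The helper `add_zero_column`, the sample values, and the output filename are hypothetical.

```python
import csv
import io

def add_zero_column(rows, name, subheader):
    """Return a copy of rows with a new column appended.

    rows[0] is assumed to be the header row and rows[1] the subheader row;
    every remaining row is treated as a sample and gets a value of '0'.
    """
    new_rows = []
    for i, row in enumerate(rows):
        if i == 0:
            new_rows.append(row + [name])
        elif i == 1:
            new_rows.append(row + [subheader])
        else:
            new_rows.append(row + ['0'])
    return new_rows

# Hypothetical two-sample table standing in for the real dataset.
example_csv = """Sample,pH,Fe+2
,pH,Molality
spring_A,2.5,1.0e-4
spring_B,3.1,2.0e-5
"""

rows = list(csv.reader(io.StringIO(example_csv)))
updated = add_zero_column(rows, 'Fe+3', 'Molality')

# Write to an in-memory buffer here; to write a real file, use
# open('my_copy.csv', 'w', newline='') instead (hypothetical filename).
buffer = io.StringIO()
csv.writer(buffer, lineterminator='\n').writerows(updated)
print(buffer.getvalue())
```

Working on a copy of the dataset keeps the original available for comparing the two speciation results.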

In [None]:
# this cell makes plots appear larger
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [10, 5]

In [None]:
import AqEquil
ae = AqEquil.AqEquil()

In [None]:
speciation = ae.speciate(input_filename='peru.csv',
                         exclude=['Year', 'Area'], # exclude metadata columns
                         report_filename='report.csv') # create a CSV of results

In [None]:
# look up which sections are in the speciation report
speciation.lookup()

In [None]:
# look up which variables are in a section
speciation.lookup("charge_balance")

In [None]:
# look up a variable
speciation.lookup("charge imbalance").head() # .head() shows only the first 5 results

In [None]:
# look up multiple variables at once
speciation.lookup(["Temperature", "pH", "Na+", "Cl-"]).head() # .head() shows only the first 5 results

In [None]:
speciation.scatterplot('pH', 'Temperature')

In [None]:
speciation.plot_mass_contribution('HCO3-', sort_by='pH', width=1)

In [None]:
speciation.plot_mineral_saturation('DNCB17-21') # sample name

In [None]:
speciation.scatterplot('pH', ['CO3-2', 'HCO3-', 'CO2', 'Ca(HCO3)+'], interactive=True)

In [None]:
speciation.barplot(['Na+', 'Cl-'], plot_width=7)

# How-To Reference

A few examples of how to get started with aqueous speciation on WORM are given below. For those with more Python experience, or those who wish to see the full range of options for each function, the documentation for the AqEquil Python package can be found [here](https://worm-portal.asu.edu/AqEquil-docs/AqSpeciation.html).

---

### Speciating water chemistry data

Import the AqEquil package and create an `AqEquil` instance with:

```python
import AqEquil
ae = AqEquil.AqEquil()
```

Speciate samples in 'myfile.csv' with:

```python
speciation = ae.speciate(input_filename='myfile.csv')
```

This creates the **speciation** object that stores the results. You can access these results in a few different ways, described below.

---

### Tables

Look up which sections are in the speciation results:

```python
speciation.lookup()
```

Look up which variables are in a section of the results:

```python
speciation.lookup("aq_distribution") # name of the section
```

Look up a desired variable with:

```python
speciation.lookup('O2')
```

Look up multiple variables at once by providing a *list* (several variable names separated by commas and enclosed in square brackets):

```python
speciation.lookup(['Mg+2', 'Mg(HCO3)+', 'Mg(HSiO3)+'])
```

Get the full speciation report with:

```python
speciation.report
```

The notebook may truncate the report. To view all of it, change the pandas display settings with:

```python
import pandas as pd
pd.set_option('display.max_rows', None, 'display.max_columns', None)
speciation.report
```

You can undo this with:

```python
pd.reset_option('display')
```
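
If you only need the full view in one cell, a context manager avoids having to reset anything afterwards. The sketch below uses `pd.option_context`, which applies the settings only inside the `with` block. A small made-up DataFrame stands in for `speciation.report` so the example runs on its own.

```python
import pandas as pd

# Hypothetical stand-in for speciation.report.
report = pd.DataFrame({'pH': range(100), 'Temperature': range(100)})

default_max_rows = pd.get_option('display.max_rows')

# The display settings apply only inside the with block.
with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    full_view = repr(report)  # every row is rendered

# Outside the block the previous settings are back in effect.
truncated_view = repr(report)

print(len(full_view.splitlines()), 'lines inside the block')
print(len(truncated_view.splitlines()), 'lines outside the block')
```

In a notebook, calling `display(speciation.report)` inside the `with` block should show the untruncated table.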

Another way to view the entire report is by saving it to a CSV file during speciation with:

```python
speciation = ae.speciate(input_filename='myfile.csv',
                         report_filename='myreport.csv')
```

The example above creates a file called 'myreport.csv' in the same directory as the Jupyter notebook.

---

### Visualization

Create a pH-temperature scatterplot with:

```python
speciation.scatterplot()
```

Or plot two variables against each other by naming them:

```python
speciation.scatterplot('Na+', 'Cl-')
```

Plot multiple series along the same x-axis by providing a list:

```python
speciation.scatterplot('pH', ['CO2', 'HCO3-', 'CO3-2'])
```

Create a bar plot comparing a variable across all samples:

```python
speciation.barplot('Fe+2')
```

Create a grouped bar plot by providing a list of variables:

```python
speciation.barplot(['Fe+2', 'Fe+3'])
```

Plot the percent contribution of aqueous species to the mass balance of a desired *basis species* with:

```python
speciation.plot_mass_contribution('HCO3-')
```
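
As a rough illustration of what this plot reports, the sketch below computes percent contributions by hand for one hypothetical sample. It assumes each species' contribution is its molality multiplied by the number of moles of the basis species it contains, divided by the total over all species. The molalities are made up, and AqEquil's internal calculation may differ in detail.

```python
# Hypothetical molalities (mol/kg) of carbonate-bearing species in one sample.
molalities = {'HCO3-': 2.0e-3, 'CO2': 5.0e-4, 'CO3-2': 1.0e-5, 'Ca(HCO3)+': 4.0e-5}

# Moles of the basis species HCO3- represented by one mole of each species
# (all 1 here; a species carrying two carbonate groups would use 2).
stoichiometry = {species: 1 for species in molalities}

total = sum(molalities[s] * stoichiometry[s] for s in molalities)
percent = {s: 100 * molalities[s] * stoichiometry[s] / total for s in molalities}

# Print species from largest to smallest contribution.
for species, value in sorted(percent.items(), key=lambda item: -item[1]):
    print(f'{species:>10}: {value:5.1f}%')
```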

Plot the mineral saturation indices of a sample with:

```python
speciation.plot_mineral_saturation('sample_name')
```

where 'sample_name' is the name of a sample.
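
For context, a mineral's saturation index is conventionally defined as log10(Q/K), where Q is the ion activity product and K is the equilibrium constant of the dissolution reaction. Positive values indicate supersaturation, negative values undersaturation, and values near zero equilibrium. The worked example below uses hypothetical activities and an approximate log K for calcite. It only illustrates the quantity being plotted and is not how AqEquil computes it.

```python
import math

def saturation_index(activity_product, log_k):
    """Return log10(Q/K) given the ion activity product Q and log10 of K."""
    return math.log10(activity_product) - log_k

# Hypothetical activities for the reaction CaCO3 = Ca+2 + CO3-2.
a_ca = 1.0e-3
a_co3 = 1.0e-5
log_k_calcite = -8.48  # approximate value at 25 degC; assumption for illustration

si = saturation_index(a_ca * a_co3, log_k_calcite)
state = 'supersaturated' if si > 0 else 'undersaturated' if si < 0 else 'at equilibrium'
print(f'SI = {si:.2f} ({state})')
```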

If you prefer a static plot to an interactive one, turn off interactivity with `interactive=False`. This works for scatterplots and mass contribution plots.

```python
speciation.plot_mass_contribution('HCO3-', interactive=False) # basis species, not a sample name
```

Save any kind of plot as a PNG by adding the parameter `save_as='my_figure'` to the plotting function.

---

### Database download links:

Download a CSV of the WORM database: https://worm-portal.asu.edu/wrm-db/wrm_data.csv

Or in data0 format: https://worm-portal.asu.edu/wrm-db/data0.wrm.txt