# Week 6 Problem Set: Ocean Acidification
Author: Juliana Olsen-Valdez

Date: October 8th, 2020

## Introduction
Ocean acidification is a hot topic when it comes to present-day climate change and global warming. In this problem set, we'll explore the how and why behind ocean acidification by using chemical reactions to explain the process and apply pH and saturation state calculations in Python.


## Learning goals
By the end of this exercise, you will be able to:
1. Explain what ocean acidification is and how it works
2. Derive the chemical reactions that relate to ocean acidification and carbonate dissolution
3. Use pandas to make a data frame with various values related to ocean acidification in seawater
4. Make calculations using your data frame values to compare the effects of ocean acidification on different seawater chemistries


## Prerequisites
Prior to completing this Problem Set, it is expected that students: 

1. Have a basic understanding of general chemistry, specifically with terms like acid, pH, and dissolution
2. Have a basic familiarity with the structure and syntax of code in Python
3. Know how to copy, cut, comment-out, and edit text and code in Python


## Background and Defining Terms
$\textbf{Ocean acidification}$ refers to the process by which $\textit{increasing levels of carbon dioxide}$ ($CO_2$) correlate to a $\textit{decrease in the pH of the ocean}$. The Earth has natural processes by which it sequesters carbon dioxide (in rocks, for example) and releases carbon dioxide (by trees, for example). However, anthropogenic ("human-caused") behavior has resulted in an unnatural increase in the amount of carbon dioxide in the atmosphere, which also leads to higher levels of $CO_2$ in the oceans.

<img src= "data/Ocean-Acidification_image.png" width = 700>

But, what's "pH" again? Well, acidity is controlled by how many $H^+$ ions are in a solution (acids release $H^+$ ions) and the pH scale is used to describe the $\textit{concentration}$ of $H^+$ ions, which is represented by the term $[H^+]$. A solution is said to be acidic when the pH is below 7, and basic when the pH is above 7. Even a small change on the pH scale indicates a large change in the $H^+$ concentration of a solution (like the ocean); this is due to the log relationship between pH and $[H^+]$.

$$[H^+]=10^{-pH}$$
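
As a quick illustration of this log relationship, the sketch below converts two pH values to $[H^+]$ and compares them. The function name `h_concentration` and the two example pH values are our own choices for illustration, not part of the data set.

```python
# Convert pH to hydrogen ion concentration using [H+] = 10^(-pH)
def h_concentration(pH):
    """Return the H+ concentration for a given pH."""
    return 10 ** (-pH)

# Illustrative pH values (assumed for this example): neutral, and one pH unit lower
neutral = h_concentration(7.0)
more_acidic = h_concentration(6.0)

# A drop of one pH unit multiplies [H+] by a factor of about 10
ratio = more_acidic / neutral
print(ratio)
```

So a change that looks small on the pH scale (7 to 6) corresponds to roughly ten times more $H^+$ in solution.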

Ocean acidification is actually a misleading name for the process by which the oceans become more "acidic." That's because the increase in $CO_2$ absorbed by the oceans has changed the ocean from alkaline (with a pH above 7) to $\textit{less alkaline}$, where the pH is still above 7 (so not acidic) but lower than it was before.

Now that we've defined some important terms and described $\textbf{how}$ ocean acidification works, let's look at some chemical reactions that break down $\textbf{why}$ it occurs. When looking at these reactions, remember that all of these are forward and reverse reactions.

The first chemical reaction describes what happens when carbon dioxide from the atmosphere mixes with seawater to form carbonic acid. Anthropogenic climate change has led to an increase in the amount of carbon dioxide in the atmosphere, which pushes this equation to the right.

$$1)\ CO_2 + H_2O \longleftrightarrow\  H_2CO_3$$

The second chemical reaction describes how carbonic acid breaks down to form hydrogen $(H^+)$ ions and bicarbonate $(HCO_3^{-})$. This increases the concentration of $(H^+)$ ions in the water, which would lower the pH (hence, ocean acidification!).

$$2)\ H_2CO_3 \longleftrightarrow\ H^+ + HCO_3^{-}$$

The third chemical reaction describes how bicarbonate breaks down to form $H^+$ ions and carbonate $(CO_3^{2-})$. This reaction is important because we'll be thinking about it in $\textbf{reverse}$. When there is a high concentration of $H^+$ ions in the ocean, they bond to carbonate to form more bicarbonate, leaving less carbonate in the water. 

$$3)\ HCO_3^{-} \longleftrightarrow\  H^+ + CO_3^{2-}$$

The fourth chemical reaction describes the process of $\textbf{calcification}$. Calcifying organisms use calcium $(Ca^{2+})$ and carbonate $(CO_3^{2-})$ suspended in the ocean to form their shells, skeletons, and corals. Carbonate in the ocean comes from the weathering of rocks and the breakdown of old carbonate shells or skeletons, but these processes take time. This forward reaction can only occur if carbonate is saturated in the water and the reverse reaction can occur if there are a lot of $H^+$ ions in the water since hydrogen can better bond to carbonate compared to calcium.

$$4)\ Ca^{2+} + CO_3^{2-} \longleftrightarrow\  CaCO_3$$
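
Whether Reaction 4 tends to run forward or in reverse is often summarized with a saturation state, $\Omega = [Ca^{2+}][CO_3^{2-}] / K_{sp}$, where values above 1 favor calcification and values below 1 favor dissolution. The sketch below is only an illustration: the function name `saturation_state` and every number in it (a rough seawater calcium concentration, two made-up carbonate concentrations, and an approximate solubility product for aragonite) are assumptions, not measurements from this Problem Set.

```python
def saturation_state(ca, co3, ksp):
    """Return Omega = [Ca2+][CO3 2-] / Ksp (unitless)."""
    return (ca * co3) / ksp

# Assumed, rounded values in mol/kg (illustrative only)
ca = 0.0103              # approximate calcium concentration in seawater
ksp_aragonite = 6.5e-7   # approximate solubility product for aragonite
co3_high = 200e-6        # hypothetical carbonate concentration, less acidified water
co3_low = 50e-6          # hypothetical carbonate concentration, more acidified water

omega_high = saturation_state(ca, co3_high, ksp_aragonite)
omega_low = saturation_state(ca, co3_low, ksp_aragonite)

for label, omega in [("higher carbonate", omega_high), ("lower carbonate", omega_low)]:
    if omega > 1:
        print(label, "-> supersaturated, calcification is favored")
    else:
        print(label, "-> undersaturated, dissolution is favored")
```

With these assumed numbers, removing carbonate from the water is enough to move $\Omega$ from above 1 to below 1, which is the switch from the forward to the reverse direction of Reaction 4.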

Now that we've broken down our four reactions, let's think about why they explain the process behind ocean acidification, and how it's detrimental to the ocean.

1. When carbon dioxide increases in the atmosphere, it reacts with seawater to form carbonic acid (Reaction 1)

2. Carbonic acid breaks down to hydrogen ions and bicarbonate, increasing the $[H^+]$ and lowering pH, which can affect plants and animals in the ocean that are sensitive to changes in pH (Reaction 2)

3. But wait, there's more; with a greater concentration of hydrogen in the water, any suspended carbonate bonds with hydrogen to form more bicarbonate (remember, this is because hydrogen is better at bonding with carbonate than calcium) (reverse Reaction 3)

4. Uh oh! Calcifying organisms that use carbonate to make their shells (Reaction 4) have a harder time or can't find any carbonate in the water since it's reacted with hydrogen to form bicarbonate!

5. But it gets worse; when the concentration of hydrogen in the water is so large and much of the carbonate has been reacted to form bicarbonate, the carbonate from organism shells or corals starts to break down and react with the hydrogen, leading to the corrosion of coral skeletons and shells (Reverse Reaction 4). 
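
One way to summarize steps 1 through 3 is to add Reaction 1, Reaction 2, and the reverse of Reaction 3. This is a sketch of the bookkeeping rather than an additional reaction from the Problem Set; carbonic acid and $H^+$ cancel because each appears once on both sides:

$$CO_2 + H_2O \longleftrightarrow\ H_2CO_3$$

$$H_2CO_3 \longleftrightarrow\ H^+ + HCO_3^{-}$$

$$H^+ + CO_3^{2-} \longleftrightarrow\ HCO_3^{-}$$

$$\text{Net:}\ CO_2 + H_2O + CO_3^{2-} \longleftrightarrow\ 2HCO_3^{-}$$

Written this way, each added $CO_2$ uses up one carbonate ion, which is the same carbonate that calcifying organisms need for Reaction 4.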

### Example
We now understand terms like pH and dissolution, and the chemical reactions that define ocean acidification. Now, let's jump back to the source of this process: the increasing atmospheric carbon dioxide (measured as $pCO_2$), which lowers the pH of seawater and wreaks havoc on calcifying organisms trying to make their calcium carbonate shells and exoskeletons!

We're going to investigate a data set that tracks the annual mean level of gaseous carbon dioxide measured in the atmosphere ($pCO_2$, also known as its partial pressure) in parts per million (ppm) over time (in years) (Source: NOAA Global Monitoring Laboratory, Mauna Loa):

In [59]:
# Import the libraries we'll use in this Problem Set
import numpy as np
import pandas as pd

We'll read in our 'txt' file with our $pCO_2$ information. When we read these files in, we do it from a local directory, meaning that the txt file has to be in the same folder or local environment as our Problem Set.

In [60]:
# read in our txt file from our folder
CO2_annual = pd.read_fwf('co2_annmean_mlo.txt', sep=" ", header=None) #header=None tells pandas that the file has no row of column names
CO2_annual.columns = ["Year", "CO2_ppm", "unc"] #We manually name the columns: Year, CO2_ppm (mean pCO2) and unc (uncertainty)

# Check the first few rows of our data frame out below 
CO2_annual.head()

| | Year | CO2_ppm | unc |
|---|---|---|---|
| 0 | 1959 | 315.97 | 0.12 |
| 1 | 1960 | 316.91 | 0.12 |
| 2 | 1961 | 317.64 | 0.12 |
| 3 | 1962 | 318.45 | 0.12 |
| 4 | 1963 | 318.99 | 0.12 |
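
If your copy of the NOAA file still has its descriptive header lines, the read step above may pick them up as data. One hedged alternative is sketched below. It assumes the header lines start with `#` and that the columns are separated by whitespace, and it uses a short in-memory sample (`sample_text`, a stand-in we made up from the first rows shown above) instead of the real file so that it runs on its own.

```python
import io
import pandas as pd

# Stand-in for the txt file: assumed '#' comment lines, then whitespace-separated columns
sample_text = """# hypothetical header line
# year mean unc
1959 315.97 0.12
1960 316.91 0.12
1961 317.64 0.12
"""

CO2_sample = pd.read_csv(
    io.StringIO(sample_text),
    sep=r"\s+",          # split columns on any run of whitespace
    comment="#",         # ignore lines that start with '#'
    header=None,         # the data rows carry no column names
    names=["Year", "CO2_ppm", "unc"],
)
print(CO2_sample.head())
```

To use this on the real file, you would pass the file name in place of `io.StringIO(sample_text)`, after checking that the file actually follows the layout assumed here.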


Let's use what's known as an if/else statement and a for loop on our data frame to comment on the magnitude of the atmospheric carbon dioxide (ppm) in different years. 
You won't be required to do this yourself, but I'll briefly explain what they are used for and what the code is doing (see the commented sections in the code lines). 

First, an $\textbf{if/else statement}$ can be applied to a list of values to check a *condition* on each value (e.g. in a list of colors, is the value blue?). The *expression* that follows says what to do when the condition is True (e.g. if the value is blue, print "it's blue"), so that every value that meets the if condition receives that expression. Further, an else statement can serve as a catch-all for anything that doesn't meet the condition (e.g. if the value is NOT blue, print "it's not blue").

Lastly, a $\textbf{for loop}$ can repeatedly apply the if/else statement to a column in a data frame, not just to a set of values that were written out by hand. That way, Python will go through and check the same condition and follow the same expression whenever that condition is True, and it can do it for HUGE data sets whose values you don't have to type out! It's a powerful tool!
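
Here is a minimal sketch of the color example described above, combining an if/else statement with a for loop. The list `colors` and its values are made up for illustration.

```python
# A small hand-written list of values (hypothetical example data)
colors = ["blue", "red", "blue", "green"]

messages = []
for color in colors:                      # the for loop visits each value in turn
    if color == "blue":                   # the condition we are checking
        messages.append("it's blue")      # runs when the condition is True
    else:                                 # catch-all for every other value
        messages.append("it's not blue")  # runs when the condition is False

print(messages)
```

Each value gets exactly one message, so `messages` ends up the same length as `colors`.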

Let's try it with the data frame we have from NOAA. We are going to use a for loop and if/else to make a conditional statement about atmospheric $CO_2$ (ppm). If you remember, it used to be assumed that the amount of atmospheric carbon dioxide would never reach above 400ppm and that if it did, the Earth would be in serious trouble. Well, it did... so we are going to make a for loop that looks through the "CO2_ppm" column in our data frame and prints a comment of "Less than 400ppm" if a value meets the condition "CO2_ppm < 400". If a value does not meet that requirement (i.e. "CO2_ppm >= 400"), it will print a comment of "Yikes, we made it past the 400ppm threshold". Follow along with the comments on each line of code to understand each step!

In [61]:
for i in range(len(CO2_annual)): #create a for loop that steps through every row index of the data frame
    if CO2_annual["CO2_ppm"][i] < 400: #create an if statement that checks whether the value in the "CO2_ppm" column is < 400
        print('Less than 400ppm') #if the condition is met, we ask Python to print this comment
    else:                         #we create an else statement that catches anything that doesn't meet the if condition
        print('Yikes, we made it past the 400ppm threshold') #if the condition is not met, we ask Python to print this comment instead

Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Less than 400ppm
Yikes, we made it past the 400ppm threshold
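Yikes, we made it past the 400ppm threshold
Yikes, we made it past the 400ppm threshold
Yikes, we made it past the 400ppm threshold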

We could also add a new column called 'comment' if we wanted these comments to come up in our data frame, which we've done below:

In [62]:
CO2_annual['comment'] = 'Less than 400ppm' #We create a new comment column with the default comment
CO2_annual.loc[CO2_annual['CO2_ppm'] >= 400, 'comment'] = 'Yikes, we made it past the 400ppm threshold' #We say that if CO2_ppm is 400 or more (the same rule as the else branch above), the comment in our new column will be this instead
CO2_annual #We display the data frame below

| | Year | CO2_ppm | unc | comment |
|---|---|---|---|---|
| 0 | 1959 | 315.97 | 0.12 | Less than 400ppm |
| 1 | 1960 | 316.91 | 0.12 | Less than 400ppm |
| 2 | 1961 | 317.64 | 0.12 | Less than 400ppm |
| 3 | 1962 | 318.45 | 0.12 | Less than 400ppm |
| 4 | 1963 | 318.99 | 0.12 | Less than 400ppm |
| ... | ... | ... | ... | ... |
| 56 | 2015 | 400.83 | 0.12 | Yikes, we made it past the 400ppm threshold |
| 57 | 2016 | 404.23 | 0.12 | Yikes, we made it past the 400ppm threshold |
| 58 | 2017 | 406.55 | 0.12 | Yikes, we made it past the 400ppm threshold |
| 59 | 2018 | 408.52 | 0.12 | Yikes, we made it past the 400ppm threshold |


And when we call the data frame, we see the 'comment' column with the statements we specified! If you look at the $CO_2$ (ppm) value for 2015 (400.83 ppm), the comment has switched from "Less than 400ppm" to "Yikes, we made it past the 400ppm threshold." Great, Python did its job!
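
Since we imported numpy at the start, another option is `np.where`, which applies the if/else rule to a whole column at once. This is a sketch on a tiny stand-in data frame (`CO2_small`, our own name, built from four of the rows shown above) so that it runs without the txt file.

```python
import numpy as np
import pandas as pd

# Tiny stand-in data frame using four rows from the table above
CO2_small = pd.DataFrame({
    "Year": [1959, 1960, 2015, 2016],
    "CO2_ppm": [315.97, 316.91, 400.83, 404.23],
})

# np.where(condition, value_if_True, value_if_False) works on the whole column
CO2_small["comment"] = np.where(
    CO2_small["CO2_ppm"] < 400,
    "Less than 400ppm",
    "Yikes, we made it past the 400ppm threshold",
)
print(CO2_small)
```

The condition and the two comments mirror the if and else branches of the for loop, so no explicit loop is needed.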

We can also use a $\textbf{while loop}$ to let Python make similar computations. A while loop keeps re-evaluating a condition and repeats its body for as long as that condition is True. For example, we can use a while loop to ask Python whether the values in the "Year" column are less than 2000 (i.e., at what point we transition to the 2000s). Python will start at the first row (i=0) and move down the column one value at a time, checking whether each value meets the condition (Year < 2000). It will keep going until it hits a value for which the condition is no longer True, i.e. Year >= 2000. For big data sets, this can be a really useful tool, but take caution: a while loop can end up looping infinitely if the condition never comes back False! We've included a backup for this that makes the while loop stop once it has done as many iterations as there are rows in the data frame.

Let's look at the code that corresponds to this example:

In [63]:
i = 1 #we tell Python what i will start at
while CO2_annual["Year"][i] < 2000:   #apply a while loop that says for all values of "Year" in our data frame, while "Year < 2000". Note we've tied "i" to our column "Year"
    print('Before 2000')      #apply an expression that will be included in the "Note" column if the while statement was TRUE|
    i += 1

Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000
Before 2000


We could also add a new column called 'Note' if we wanted these comments to come up in our data frame, which we've done below:

In [64]:
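CO2_annual['Note'] = 'Before 2000' #We create a new Note column with the default note
CO2_annual.loc[CO2_annual['Year'] >= 2000, 'Note'] = 'After 2000' #For the year 2000 and later, the note becomes this instead
CO2_annual #We display the data frame below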

| | Year | CO2_ppm | unc | comment | Note |
|---|---|---|---|---|---|
| 0 | 1959 | 315.97 | 0.12 | Less than 400ppm | Before 2000 |
| 1 | 1960 | 316.91 | 0.12 | Less than 400ppm | Before 2000 |
| 2 | 1961 | 317.64 | 0.12 | Less than 400ppm | Before 2000 |
| 3 | 1962 | 318.45 | 0.12 | Less than 400ppm | Before 2000 |
| 4 | 1963 | 318.99 | 0.12 | Less than 400ppm | Before 2000 |
| ... | ... | ... | ... | ... | ... |
| 56 | 2015 | 400.83 | 0.12 | Yikes, we made it past the 400ppm threshold | After 2000 |
| 57 | 2016 | 404.23 | 0.12 | Yikes, we made it past the 400ppm threshold | After 2000 |
| 58 | 2017 | 406.55 | 0.12 | Yikes, we made it past the 400ppm threshold | After 2000 |
| 59 | 2018 | 408.52 | 0.12 | Yikes, we made it past the 400ppm threshold | After 2000 |


When we call our data frame, we see that there is a new "Note" column with the expression "Before 2000" up until 1999, and then from 2000 onward the note reads "After 2000". Yay, this matches where the while loop stopped!

You won't have to use a for loop, while loop, or if/else statement in the questions, but hopefully you were able to get a sense for their utility in parsing through and completing iterative operations on big data sets. 


## Questions

1) Briefly explain (2-3 sentences), in your own words, why ocean acidification occurs, and how it affects ocean ecosystems, particularly calcifying organisms. Use the terms included in the Problem Set but keep it concise!

_______________________________________________________

2) Do your own research! Find a marine organism that has been affected by ocean acidification and write a short (2-3 sentence) description of how they're affected, using at least one term that was outlined in the Problem Set. If you don't know where to start, look at the sources provided at the bottom of the Problem Set. 

________________________________________________________

3) Though the pH of seawater remained mostly constant for millions of years before present, in the last hundred years, the pH of surface seawater has fallen from 8.2 to 8.1. Further, it's been hypothesized that the pH will drop another 0.4 (to a value of 7.7) by the year 2100 (Orr et al., 2005). Calculate the percent rise in acidity for the change in pH from past (8.2) to modern (8.1) and from modern to future (7.7), keeping in mind that $[H^{+}] = 10^{-pH}$. Make your calculation by creating a small data frame. Hint: if you're having trouble knowing where to start, go back to the code from the example in the Week 2 Problem Set. Once you get your percent changes, comment on the magnitude of the values; were they what you were expecting based on the 0.1 and 0.4 changes in pH?

_________________________________________________________

## Sources
- https://www.esrl.noaa.gov/gmd/ccgg/trends/data.html

- https://ocean.si.edu/ocean-life/invertebrates/ocean-acidification

- https://theotherco2problem.wordpress.com/what-happens-chemically/

- https://www.st.nmfs.noaa.gov/Assets/Nemo/documents/lessons/Lesson_3/Lesson_3-Teacher's_Guide.pdf

- https://www.oceanacidification.de/calcite-aragonite/?lang=en

## Solutions to Questions

1) Briefly explain (2-3 sentences), in your own words, why ocean acidification occurs, and how it affects ocean ecosystems, particularly calcifying organisms. Use the terms included in the Problem Set but keep it concise!

**Question 1 Solution**

Ocean acidification occurs when atmospheric $CO_2$ increases and leads to a greater amount of carbonic acid ($H_2CO_3$) in seawater. Carbonic acid dissociates in seawater to release $H^+$ ions that bond to suspended carbonate ($CO_3^{2-}$) and can dissolve carbonate that is bound in organism shells. Therefore, increased $[H^+]$ leads to lower pH (making seawater more acidic) and limits the $CO_3^{2-}$ available to calcifying organisms.

_______________________________________________________

2) Do your own research! Find a marine organism that has been affected by ocean acidification and write a short (2-3 sentence) description of how they're affected, using at least one term that was outlined in the Problem Set. If you don't know where to start, look at the sources provided at the bottom of the Problem Set. 

**Question 2 Solution**

Mussels, clams, urchins, starfish: These organisms are especially affected by ocean acidification because their shells are made from high-magnesium calcium carbonate, which is even more sensitive to dissolution than ordinary calcium carbonate. The weaker shell means they are at a higher risk of being eaten, crushed, or having their shells corrode. The Smithsonian states that, "Mussels and oysters are expected to grow less shell by 25 percent and 10 percent respectively by the end of the century." (Source: https://ocean.si.edu/ocean-life/invertebrates/ocean-acidification)

________________________________________________________

3) Though the pH of seawater remained mostly constant for millions of years before present, in the last hundred years, the pH of surface seawater has fallen from 8.2 to 8.1. Further, it's been hypothesized that the pH will drop another 0.4 (to a value of 7.7) by the year 2100 (Orr et al., 2005). Calculate the percent rise in acidity for the change in pH from past (8.2) to modern (8.1) and from modern to future (7.7), keeping in mind that $[H^{+}] = 10^{-pH}$. Make your calculation by creating a small data frame. Hint: if you're having trouble knowing where to start, go back to the code from the example in the Week 2 Problem Set. Once you get your percent changes, comment on the magnitude of the values; were they what you were expecting based on the 0.1 and 0.4 changes in pH?

**Question 3 Solution:**

In [65]:
d = {'Description':  ['past', 'modern','future'], #create a dictionary of descriptors and pH values
    'pH': [8.2, 8.1,7.7]}

pH_change = pd.DataFrame(data=d) #turn your dictionary into a pandas data frame
pH_change['pH']=pd.to_numeric(pH_change.pH) #make sure that the pH column is numeric 

pH_change #display the current data frame

| | Description | pH |
|---|---|---|
| 0 | past | 8.2 |
| 1 | modern | 8.1 |
| 2 | future | 7.7 |


In [66]:
pH_change["Hconc"] = 10 ** (-(pH_change["pH"])) #add a column with calculated H+ concentration

pH_change #display the updated data frame

| | Description | pH | Hconc |
|---|---|---|---|
| 0 | past | 8.2 | 6.309573e-09 |
| 1 | modern | 8.1 | 7.943282e-09 |
| 2 | future | 7.7 | 1.995262e-08 |


In [67]:
pH_change["diff_mod_past"] = pH_change.iloc[1,2] / pH_change.iloc[0,2] #add a column with the ratio of modern to past H+ concentration (row 1 divided by row 0 of the Hconc column)
pH_change["diff_fut_mod"] = pH_change.iloc[2,2] / pH_change.iloc[1,2] #add a column with the ratio of future to modern H+ concentration (row 2 divided by row 1 of the Hconc column)

pH_change #display the updated data frame

| | Description | pH | Hconc | diff_mod_past | diff_fut_mod |
|---|---|---|---|---|---|
| 0 | past | 8.2 | 6.309573e-09 | 1.258925 | 2.511886 |
| 1 | modern | 8.1 | 7.943282e-09 | 1.258925 | 2.511886 |
| 2 | future | 7.7 | 1.995262e-08 | 1.258925 | 2.511886 |


In [68]:
pH_change["diff_mod_past_perc"] = (pH_change["diff_mod_past"] - 1) * 100 #add a column with the calculated associated percent increase between modern and past
pH_change["diff_fut_mod_perc"] = (pH_change["diff_fut_mod"] - 1) * 100 #add a column with the calculated associated percent increase between future and modern 

pH_change #display the updated data frame

| | Description | pH | Hconc | diff_mod_past | diff_fut_mod | diff_mod_past_perc | diff_fut_mod_perc |
|---|---|---|---|---|---|---|---|
| 0 | past | 8.2 | 6.309573e-09 | 1.258925 | 2.511886 | 25.892541 | 151.188643 |
| 1 | modern | 8.1 | 7.943282e-09 | 1.258925 | 2.511886 | 25.892541 | 151.188643 |
| 2 | future | 7.7 | 1.995262e-08 | 1.258925 | 2.511886 | 25.892541 | 151.188643 |


We can see that the decrease in pH of 0.1 from past to modern and 0.4 from modern to predicted future results in a roughly 26% increase in hydrogen ion concentration (or acidity) from past to modern and a predicted 151% increase in acidity in the future. This is a lot more than might first be expected from a change in pH of less than 1 across both time ranges.
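
As a cross-check, the same percentages can be computed in one step with the pandas `pct_change` method, which compares each row to the row before it. The sketch below rebuilds the small data frame under a new name (`pH_check`, our own choice) so that it runs on its own.

```python
import pandas as pd

# Rebuild the three pH values from Question 3
pH_check = pd.DataFrame({
    "Description": ["past", "modern", "future"],
    "pH": [8.2, 8.1, 7.7],
})

pH_check["Hconc"] = 10 ** (-pH_check["pH"])                        # [H+] = 10^(-pH)
pH_check["perc_increase"] = pH_check["Hconc"].pct_change() * 100   # percent change from the previous row

print(pH_check)
```

The first row has no earlier row to compare to, so its `perc_increase` is NaN; the "modern" and "future" rows hold the two percent increases, one value per row instead of the same value repeated down a column.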