### Computational Guided Inquiry for Modeling Earth's Climate (Neshyba & Posta, 2024)

# CumulativeAnalysis

## Introduction
Here, we're going to focus on the accumulated sum of emissions over time, written mathematically as

$$
E(t) = \int_{-\infty}^t \epsilon(t') \ dt' \ \ \ \ \ (1)
$$

where $\epsilon(t)$ is the annual emission rate we created in ScheduledFlows.

Why would we care about this accumulation? There's a school of thought in climate science that what really counts is not so much how much carbon goes into the air in any given year, but the total amount humans have put there.

There's an easy way to do this integration in Python. The key is a numpy function called "np.cumsum," which adds up the emissions year after year. To convert the result of np.cumsum into a total accumulated emission in GtC, we need to multiply it by the time interval between steps in the array we're using to represent the emissions. Details are given in the code below.
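As a minimal sketch of the idea, the cell below applies np.cumsum to a made-up emission schedule (a constant 10 GtC/year sampled every half year). The names `t_demo`, `eps_demo`, `dt_demo`, and `E_demo` are hypothetical and are not part of your ScheduledFlows scenario.

```python
import numpy as np

# Hypothetical example: a constant emission rate of 10 GtC/year,
# sampled every half year (made-up values, not the ScheduledFlows scenario)
t_demo = np.arange(0.0, 5.0, 0.5)        # 10 time points: 0, 0.5, ..., 4.5 years
eps_demo = np.full_like(t_demo, 10.0)    # emission rate in GtC/year at each point

# Time interval between steps, in years
dt_demo = t_demo[1] - t_demo[0]

# Running sum of the rates, scaled by the time step to give GtC
E_demo = np.cumsum(eps_demo) * dt_demo

print('dt =', dt_demo)
print('Final accumulated emission (GtC):', E_demo[-1])
```

Without the factor of `dt_demo`, the running sum would simply add up rates and would overstate the total by a factor of two for this half-year grid.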

## Goals

- I can use pandas and dictionaries to read data with metadata.
- I can use np.cumsum to numerically integrate a function.
- I am familiar with quantitative features of cumulative anthropogenic carbon emissions.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import h5io

In [None]:
%matplotlib inline
plt.rcParams["figure.figsize"] = (12, 8)
plt.rcParams['font.size'] = 18

## Cumulative flow

### Load your scenario
To get the file you generated in *ScheduledFlows* into your current work space, _download_ it to your laptop or desktop, then _upload_ it to the current folder. 

Once you've done that, in the cell below, use h5io.read_hdf5 to load the scenario into Python as a dictionary named epsdictionary_fromfile (just as you did in *ScheduledFlows*). 

In [None]:
# Load in your scenario as a dictionary, using h5io.read_hdf5
# your code here 


### From dictionary -> dataframe -> Numpy data arrays
The cell below extracts time and emissions arrays in the scenario.

In [None]:
# Here we're using "display" to double-check the metadata in your dictionary
display(epsdictionary_fromfile)

# This extracts the dataframe from the dictionary
epsdf = epsdictionary_fromfile['dataframe']

# This extracts the time and emissions from the dataframe
time = np.array(epsdf['time'])
eps = np.array(epsdf['emissions'])
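To illustrate the dictionary -> dataframe -> array chain without needing your file, here is a small sketch that builds a stand-in dictionary by hand. The values, the `description` entry, and all `demo_` names are made up for illustration; only the `'dataframe'`, `'time'`, and `'emissions'` keys come from the cell above. It also checks that the time points are evenly spaced, which is an assumption behind using a single time step later on.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the dictionary returned by h5io.read_hdf5
# (made-up values; the real one comes from your ScheduledFlows file)
demo_dictionary = {
    'dataframe': pd.DataFrame({'time': [2020.0, 2021.0, 2022.0],
                               'emissions': [10.0, 10.5, 11.0]}),
    'description': 'made-up emissions scenario',  # hypothetical metadata entry
}

# Same extraction steps as above: dictionary -> dataframe -> Numpy arrays
demo_df = demo_dictionary['dataframe']
demo_time = np.array(demo_df['time'])
demo_eps = np.array(demo_df['emissions'])

# Sanity checks: matching lengths, and evenly spaced time points
same_length = len(demo_time) == len(demo_eps)
evenly_spaced = np.allclose(np.diff(demo_time), demo_time[1] - demo_time[0])

print('Same length:', same_length)
print('Evenly spaced:', evenly_spaced)
```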

### Plotting to see what we stored
In the cell below, plot the emissions we just extracted as a function of time (this will remind us what the scenario looks like, and also verify that the data haven't been corrupted).

In [None]:
# your code here 


### Calculating the accumulated anthropogenic carbon emission
The code below attempts to carry out the integration indicated in Eq. 1 numerically. It's *almost* correct. Execute the cell, then read on to see what the error is.

In [None]:
# This specifies the starting accumulated amount of anthropogenic carbon in the atmosphere
E = 0

# Initialize an empty numpy array that will hold new values over time
E_array = np.empty(0)

# Loop over all the times
for i in range(len(time)):
    E += eps[i]
    E_array = np.append(E_array,E)

# Graph it
plt.figure()
plt.plot(time,E_array)
plt.grid()
plt.xlabel('year')
plt.ylabel('GtC')
plt.title('(Improperly) integrated anthropogenic carbon')

### Your turn 
OK, what's the error in the above code? There's nothing in the loop that takes into account the length of each time step! (Like, is it a year? Half a year? What?) 

To fix that, we need to figure out what that time step is. An easy way to do that is to say 

    dt = time[1]-time[0]
    print('dt =', dt)
    
Then, in each pass through the loop, multiply the emission rate by that time interval, something along the lines of

    E += eps[i]*dt

In the cell below, make this correction, then graph your result.

In [None]:
# your code here 
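One way to cross-check a corrected loop is to compare it with np.cumsum on data where you control the time step. The sketch below does this for a made-up, linearly increasing emission rate on a half-year grid; the names ending in `_check` are hypothetical, and a uniform time step is assumed.

```python
import numpy as np

# Made-up emission rates (GtC/year) on a half-year time grid; not the real scenario
t_check = np.linspace(2000.0, 2010.0, 21)
eps_check = 0.5 * (t_check - 2000.0)

# Time step, assuming the grid is uniform
dt_check = t_check[1] - t_check[0]

# Loop version, with the time step included in each pass
E_running = 0.0
E_loop = np.empty(len(t_check))
for i in range(len(t_check)):
    E_running += eps_check[i] * dt_check
    E_loop[i] = E_running

# Vectorized version of the same sum
E_cumsum = np.cumsum(eps_check) * dt_check

print('dt =', dt_check)
print('Largest difference between the two:', np.max(np.abs(E_loop - E_cumsum)))
```

This version fills a preallocated array rather than calling np.append on every pass; both approaches give the same numbers, and the choice here is only a matter of style.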


### Pause for analysis
It is thought that known reserves of fossil carbon (mostly in the form of coal) tally up to around 4000 GtC (see https://www.nature.com/articles/nature14016). Hopefully, your cumulative total is less than that amount (otherwise, we have an unrealistic scenario!). In the cell below, calculate the percentage of known reserves of fossil carbon that your schedule *leaves in the ground*.

Some hints on how to do this: 
- The last value of the accumulated emission array you just calculated can be accessed by E_array[-1].
- The fraction of known reserves used by that time must be that number divided by 4000.
- The fraction of known reserves remaining by that time must be 1 minus that number; multiplying that by 100 will give us what we're looking for, namely, the percent of known reserves remaining in the ground by the time humans stop mining fossil fuels and burning them.
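The hints above can be sketched as a small helper function. The function name `percent_left_in_ground` and the example total of 1000 GtC are made up for illustration; in your own cell you would use E_array[-1] instead of the example value.

```python
# Known reserves of fossil carbon, in GtC (the figure quoted in the text above)
KNOWN_RESERVES_GTC = 4000.0

def percent_left_in_ground(cumulative_emission_gtc, reserves_gtc=KNOWN_RESERVES_GTC):
    """Percent of known reserves not yet emitted (hypothetical helper)."""
    fraction_used = cumulative_emission_gtc / reserves_gtc
    return (1.0 - fraction_used) * 100.0

# Made-up cumulative total, standing in for E_array[-1]
example_total_gtc = 1000.0
print('Percent left in the ground:', percent_left_in_ground(example_total_gtc))
```

A negative result would mean the cumulative total exceeds the known reserves, which is the unrealistic case mentioned above.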

In [None]:
# your code here 


### Refresh/save/validate
Almost done! To double-check everything is OK, repeat the "Three steps for refreshing and saving your code," and press the "Validate" button (as usual).

### Close/submit/logout
1. Close this notebook using the "File/Close and Halt" dropdown menu
1. Using the Assignments tab, submit this notebook
1. Press the Logout tab of the Home Page