# Oro Verde Carbon Offset Project

This notebook implements a full solution to the mini-case.

In [1]:
# let's import the pandas module to read files
import pandas as pd

# Data available from the case, stored as a dictionary
# 500,000 sq feet of land, 150,000 gallons of water annually, $3,000 budget for initial payments
available = { "Land" : 500000, "Water" : 150000, "Cost" : 3000 }

# Q1.
_Write Python code that reads the file “Oro_Verde_Data.csv” and creates a DataFrame with the index corresponding to the names of the tree species. Display the DataFrame and have a look!_

We create a DataFrame by reading from the CSV file using the `read_csv` function. Note the second argument `index_col = 0`, which tells `pandas` to get the index from the first column of the table.

In [2]:
# create the dataframe by reading from the CSV file
df = pd.read_csv('Oro_Verde_Data.csv', index_col=0)

# display the dataframe
display(df)

Tree_Type,Sequestration_Rate,Water,Land,Survival_Rate,Cost
Maple,4.3,76,130,0.7,10.0
Elm,3.7,48,3600,0.6,6.5
Spruce,2.8,34,400,0.8,9.0
Pine,3.5,58,200,0.65,8.0
Oak,5.2,80,1000,0.75,12.5
Birch,3.1,50,300,0.68,7.5
Redwood,12.0,150,10000,0.9,25.0
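To see what `index_col=0` changes, here is a minimal sketch that reads two rows of the table from an in-memory string instead of the file. The names `csv_text`, `df_default` and `df_indexed` are illustrative, and the header line is an assumption based on the table displayed above.

```python
import io
import pandas as pd

# Two rows copied from the table above; the header line is assumed
csv_text = """Tree_Type,Sequestration_Rate,Water,Land,Survival_Rate,Cost
Maple,4.3,76,130,0.7,10.0
Elm,3.7,48,3600,0.6,6.5
"""

# Without index_col, pandas adds a default integer index 0, 1, ...
df_default = pd.read_csv(io.StringIO(csv_text))
print(list(df_default.index))          # [0, 1]

# With index_col=0, the first column (Tree_Type) becomes the index
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(df_indexed.index))          # ['Maple', 'Elm']

# The index lets us look up a value by species name and column name
print(df_indexed.loc["Elm", "Land"])   # 3600
```

Having the species names as the index is what makes lookups such as `df.loc[t, "Cost"]` possible in the later questions.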


# Q2.
_Create a Python list with the names of the columns of the DataFrame and a separate list with the names of tree species. Print these lists to check that they are correct._

In [3]:
# create a list with the columns
columns = list(df.columns)
print(columns)

# and another list for the index
tree_types = list(df.index)
print(tree_types)

['Sequestration_Rate', 'Water', 'Land', 'Survival_Rate', 'Cost']
['Maple', 'Elm', 'Spruce', 'Pine', 'Oak', 'Birch', 'Redwood']


# Q3.
_Suppose Oro Verde wants to plant 40 Redwoods and 30 trees of every other tree type. Create a Python dictionary with keys corresponding to the tree species and values corresponding to the number of trees planted. Print the dictionary to see its contents._

In [4]:
# let's create the dictionary
planted_trees = {}             # first an empty dictionary
for t in tree_types:           # loop through every tree type
    if t == "Redwood":
        planted_trees[t] = 40     # 40 Redwoods
    else:
        planted_trees[t] = 30     # 30 each for the rest

print(planted_trees)

{'Maple': 30, 'Elm': 30, 'Spruce': 30, 'Pine': 30, 'Oak': 30, 'Birch': 30, 'Redwood': 40}
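As an alternative sketch (not part of the original solution), the same dictionary can be built in one line with a dictionary comprehension. The name `planted_trees_alt` is illustrative, and the species list is repeated here only so the snippet runs on its own.

```python
tree_types = ['Maple', 'Elm', 'Spruce', 'Pine', 'Oak', 'Birch', 'Redwood']

# 40 Redwoods, 30 of every other species
planted_trees_alt = {t: (40 if t == "Redwood" else 30) for t in tree_types}
print(planted_trees_alt)
```

The explicit loop is easier to read when the rule has many cases. The comprehension is more compact when the rule fits in one expression.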


# Q4.
_Suppose Oro Verde follows the planting scheme in Q3. Calculate and print the following quantities:_<br>
   _a) the expected total pounds of CO2 that would be sequestered annually when trees are mature;_<br>
   _b) the total cost of purchasing and planting seedlings._<br>
_Input the value that you obtained in (a) for CO2 sequestration in the Google sheet._

**IMPORTANT NOTE**. In this case, the number of trees that survive to become mature is **unknown**: the case gives us only a survival rate, so a full treatment would use a probabilistic model in which each seedling survives with some probability (we will do that later on in the course). For the purpose of calculating the expected amount of CO2 sequestered, however, only the expected number of surviving trees matters. For each tree type, that is the number of seedlings planted times the survival rate.
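A quick simulation can make this plausible. The sketch below is not part of the case solution: it assumes that each seedling survives independently with probability equal to the survival rate, and the seed and number of trials are arbitrary choices.

```python
import random

rng = random.Random(0)   # fixed seed so the run is repeatable
planted = 30             # Maple seedlings planted in Q3
survival_rate = 0.7      # Maple survival rate from the table
trials = 5000            # number of simulated plantings (arbitrary)

total_survivors = 0
for _ in range(trials):
    # each seedling survives independently with probability survival_rate (assumption)
    total_survivors += sum(rng.random() < survival_rate for _ in range(planted))

average_survivors = total_survivors / trials
print(f"Simulated average survivors: {average_survivors:.2f}")
print(f"planted * survival rate:     {planted * survival_rate:.2f}")
```

The simulated average should land close to `planted * survival_rate`, which is why the formulas in this question use that product directly.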

Let's use the following notation:
 - $\mbox{Tree-types} = \{Maple, Elm, ..., Redwood\}$ denotes the set of trees, indexed by $t$
 - $\mbox{co2}[t]$, $\mbox{survive}[t]$ : the annual CO2 sequestration rate of a mature tree of type $t$, and the survival rate of a seedling of type $t$
 - $\mbox{cost}[t]$ : the cost for a seedling of tree type $t$
 - $\mbox{planted}[t]$ : how many seedlings of type $t$ are planted

Then, the expressions we need to calculate are:
\begin{aligned}
 \mbox{Total CO2 sequestered} & = \sum_{t \in \mbox{Tree-types}} \mbox{planted}[t] \cdot \mbox{survive}[t] \cdot \mbox{co2}[t] \\
 \mbox{Total cost} & = \sum_{t \in \mbox{Tree-types}} \mbox{planted}[t] \cdot \mbox{cost}[t]
\end{aligned}
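
As a hand check, we can substitute the plan coded in Q3 (30 trees of each type, 40 Redwoods) and the values from the data table. The grouping of terms is only for readability:
\begin{aligned}
 \mbox{Total CO2 sequestered} & = 30 \cdot (0.7 \cdot 4.3 + 0.6 \cdot 3.7 + 0.8 \cdot 2.8 + 0.65 \cdot 3.5 + 0.75 \cdot 5.2 + 0.68 \cdot 3.1) + 40 \cdot 0.9 \cdot 12.0 \\
 & = 30 \cdot 15.753 + 432 = 904.59 \\
 \mbox{Total cost} & = 30 \cdot (10.0 + 6.5 + 9.0 + 8.0 + 12.5 + 7.5) + 40 \cdot 25.0 = 1605 + 1000 = 2605
\end{aligned}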

In [5]:
# it helps to re-print the head of the dataframe with a few rows, so we know how to loop!
df.head(3)

Tree_Type,Sequestration_Rate,Water,Land,Survival_Rate,Cost
Maple,4.3,76,130,0.7,10.0
Elm,3.7,48,3600,0.6,6.5
Spruce,2.8,34,400,0.8,9.0


**Option 1.** We can calculate the total sum using loops that go through all the tree types.

In [6]:
# calculate sequestration
total_sequestration = 0
for t in tree_types:
    total_sequestration += planted_trees[t] * df.loc[t,"Survival_Rate"] * df.loc[t,"Sequestration_Rate"]
print(f"Total expected sequestration: {total_sequestration:,.2f} pounds of CO2 annually.")

# calculate planting cost
total_cost = 0
for t in tree_types:
    total_cost += planted_trees[t] * df.loc[t,"Cost"]
print(f"Total cost: ${total_cost:,.2f}.")

Total expected sequestration: 904.59 pounds of CO2 annually.
Total cost: $2,605.00.


<font color=blue>**Note**. If you have never seen operators like `+=`: this is shorthand for adding the right-hand side to the left-hand side and storing the result there. So instead of writing `a = a + b`, we can simply write `a += b`.</font>
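
A tiny illustration of the operator, with made-up numbers and an illustrative variable name:

```python
running_total = 0
for amount in [10, 20, 30]:
    running_total += amount   # same as: running_total = running_total + amount
print(running_total)          # 60
```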

**Option 2.** We can apply the `sum` function to a list that holds one term per tree type. To create the list, we use a list comprehension.

In [7]:
# create a list that stores the amount sequestered by each tree type
sequestration_by_tree_type = [ planted_trees[t] * df.loc[t,"Survival_Rate"] * df.loc[t,"Sequestration_Rate"] for t in tree_types ]

# calculate the total
total_sequestration = sum(sequestration_by_tree_type)
print(f"Total expected sequestration: {total_sequestration:,.2f} pounds of CO2 annually.")

Total expected sequestration: 904.59 pounds of CO2 annually.


The two steps can be combined into one. In that case you don't even need to create a list: you can pass the terms directly to `sum`.

Here is how we do this, when calculating the total cost:

In [8]:
total_cost = sum( planted_trees[t] * df.loc[t,"Cost"] for t in tree_types )
print(f"Total cost: ${total_cost:,.2f}.")

Total cost: $2,605.00.


<font color=blue>**Note**. There is no bracket operator `[...]` inside the `sum()` function! We are just enumerating the terms we want to sum using a `for` statement.</font>
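
The difference between the two forms can be seen in a small sketch. It uses two seedling costs from the table and illustrative variable names. The form without brackets is called a generator expression in Python.

```python
costs = {"Maple": 10.0, "Elm": 6.5}    # two seedling costs from the table
planted = {"Maple": 30, "Elm": 30}     # quantities from Q3

# list comprehension: builds the whole list in memory, then sums it
terms = [planted[t] * costs[t] for t in costs]
total_from_list = sum(terms)

# generator expression: terms are produced one at a time, no list is stored
total_from_generator = sum(planted[t] * costs[t] for t in costs)

print(terms)                                     # [300.0, 195.0]
print(total_from_list == total_from_generator)   # True
```

Both give the same total. The list version is handy when you also want to inspect the individual terms.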

# Q5.
_Is the planting scheme in Q3 feasible, i.e., can it be executed when considering all the physical or financial resource constraints that Oro Verde is facing? If not, why not? (Answer in Google doc)_

To determine feasibility, we must first decide how to model the water and land requirements. A few thoughts: 
 - For **water**, the requirements are listed for both a seedling and a mature tree. Presumably, if a seedling does not get enough water, its growth suffers, so we should have enough water to support both the seedlings and the mature trees. Because there are always at least as many seedlings as mature trees (some seedlings do not survive), **we can simply write our water requirement for seedlings**.
 - For **land**, the requirement is listed for mature trees. One option is to ignore the seedlings and calculate the land requirement for the expected number of mature trees, taking the survival rate into account. However, if all the seedlings we planted were to mature, there might then not be enough land for them! A more conservative requirement is to ensure that we have enough land even if all the planted seedlings become mature. (That calculation ignores the survival rate.) **In what follows, we adopt the more conservative requirement: enough land must be available to support all planted seedlings as if they all survive to maturity.**

The comparisons we need to make then are:
 - **land**: is the total land used by **mature trees** -- <font color=red>**assuming that all planted seedlings survive**</font> -- exceeding the land available?
 - **water**: is the total water used by **seedlings** exceeding the water available?
 - **cost**: is the total cost exceeding the budget available?
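
To see how much the two land models discussed above can differ, here is a self-contained sketch. It hard-codes the land and survival columns from the data table and the Q3 plan of 30 trees per type and 40 Redwoods. The names `tree_data`, `land_conservative` and `land_expected` are illustrative.

```python
# (land per mature tree in sq ft, survival rate), copied from the data table
tree_data = {
    "Maple": (130, 0.7), "Elm": (3600, 0.6), "Spruce": (400, 0.8),
    "Pine": (200, 0.65), "Oak": (1000, 0.75), "Birch": (300, 0.68),
    "Redwood": (10000, 0.9),
}
planted = {t: (40 if t == "Redwood" else 30) for t in tree_data}
land_available = 500000

# conservative model: every planted seedling is assumed to mature
land_conservative = sum(planted[t] * land for t, (land, rate) in tree_data.items())

# expected-value model: only the expected number of survivors needs land
land_expected = sum(planted[t] * rate * land for t, (land, rate) in tree_data.items())

print(f"Conservative land need: {land_conservative:,.0f} (available: {land_available:,})")
print(f"Expected land need:     {land_expected:,.0f} (available: {land_available:,})")
```

With these numbers the conservative requirement is above the available land, while the expected-value requirement is below it. The modeling choice therefore changes the feasibility verdict for this plan.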

Let's calculate the total land, water, and cost using the second method from Q4:

In [9]:
what_to_compare = ["Land", "Water", "Cost"]

# for every element that we need to compare:
for resource in what_to_compare:
    # calculate what is used / needed
    usage = sum( planted_trees[t] * df.loc[t,resource] for t in tree_types )

    # compare with what is available
    if usage > available[resource]:
        print(f"Required {resource:<10s} is {usage}, which exceeds what is available, namely {available[resource]}")

Required Land       is 568900, which exceeds what is available, namely 500000


To determine whether the problem is feasible, we should also set up a boolean flag that becomes `False` if any of the requirements is not satisfied.

In [10]:
what_to_compare = ["Land", "Water", "Cost"]

# for every element that we need to compare:
problem_feasible = True
for resource in what_to_compare:
    # calculate what is used / needed
    usage = sum( planted_trees[t] * df.loc[t,resource] for t in tree_types )

    # compare with what is available
    if usage > available[resource]:
        print(f"Required {resource:<10s} is {usage:<8.0f}, which exceeds what is available, namely {available[resource]}")
        problem_feasible = False

if problem_feasible:
    print("The planting scheme is feasible.")
else:
    print("The planting scheme is infeasible.")    

Required Land       is 568900  , which exceeds what is available, namely 500000
The planting scheme is infeasible.
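
As an alternative sketch (not the approach used in this notebook), the same check can be done without explicit loops by letting `pandas` multiply and sum whole columns at once. The resource columns are copied from the data table so that the snippet runs on its own. The names `df_res`, `planted`, `available_s`, `usage` and `violated` are illustrative.

```python
import pandas as pd

# resource columns copied from the data table
df_res = pd.DataFrame(
    {"Land":  [130, 3600, 400, 200, 1000, 300, 10000],
     "Water": [76, 48, 34, 58, 80, 50, 150],
     "Cost":  [10.0, 6.5, 9.0, 8.0, 12.5, 7.5, 25.0]},
    index=["Maple", "Elm", "Spruce", "Pine", "Oak", "Birch", "Redwood"],
)
planted = pd.Series({"Maple": 30, "Elm": 30, "Spruce": 30, "Pine": 30,
                     "Oak": 30, "Birch": 30, "Redwood": 40})
available_s = pd.Series({"Land": 500000, "Water": 150000, "Cost": 3000})

# multiply each row by the number planted, then add up each column
usage = df_res.mul(planted, axis=0).sum()

# True for every resource whose usage exceeds what is available
violated = usage > available_s
print(usage)
print("Violated resources:", list(violated[violated].index))
print("Feasible:", not violated.any())
```

The loop version is easier to follow line by line. The vectorized version is shorter and reports all violated resources in one step.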


# Q6. 
_Consider again the planting scheme in Q3, but suppose Oro Verde is willing to change its scheme by exactly 30 trees of a single type, up or down. In other words, they are willing to choose one tree type for which to deviate from their plan by exactly 30 trees. (For instance, they could plant 40-30=10 Redwoods and 30 of every other type; or 40+30=70 Redwoods and 30 of every other type; or 40 Redwoods and 30+30=60 Oaks, and 30 of every other type, etc.) Write some code that prints all the changes of this kind that would result in a feasible planting scheme. (Answer in Google sheet.)_

We use exactly the same model and assumptions as in **Q5**. We embed that code inside two nested outer `for` loops, which check, for every tree type and each direction of change (up/down), what would happen if we changed the planting scheme.

In [11]:
what_to_compare = ["Land", "Water", "Cost"]

# An outer `for` loop that loops through all the types of tree
for change_tree in tree_types:

    # Another `for` loop that loops through possible changes
    for possible_change in [-30,30]:

        # Here, we just change the dictionary of planted trees. But we must be careful to switch it back to the original value after we're done!
        planted_trees[change_tree] += possible_change  # <--- make the change 
    
        # for every element that we need to compare:
        problem_feasible = True
        for resource in what_to_compare:
            # calculate what is used / needed
            usage = sum( planted_trees[t] * df.loc[t,resource] for t in tree_types )
        
            # compare with what is available
            if usage > available[resource]:
                # print(f"Required {resource:<10s} is {usage:<8.0f}, which exceeds what is available, namely {available[resource]}")
                problem_feasible = False
        
        if problem_feasible:
            print(f"The planting scheme is feasible if we change {change_tree} by {possible_change}, i.e., plant {planted_trees[change_tree]} instead of {planted_trees[change_tree]-possible_change}.")

        # Now, change the dictionary of planted trees back!
        planted_trees[change_tree] -= possible_change  # <--- undo the change 

The planting scheme is feasible if we change Elm by -30, i.e., plant 0 instead of 30.
The planting scheme is feasible if we change Redwood by -30, i.e., plant 10 instead of 40.
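
The code above changes `planted_trees` in place and then has to restore it. A possible alternative, sketched below under the same model as Q5, builds a modified copy of the plan for each candidate change, so the original plan is never touched. The data are hard-coded from the table so that the snippet is self-contained. The helper `is_feasible` and the other names are illustrative, and the check for a negative number of trees is an added safeguard that the original code does not have.

```python
# resource needs per tree, copied from the data table: (Land, Water, Cost)
needs = {
    "Maple": (130, 76, 10.0), "Elm": (3600, 48, 6.5), "Spruce": (400, 34, 9.0),
    "Pine": (200, 58, 8.0), "Oak": (1000, 80, 12.5), "Birch": (300, 50, 7.5),
    "Redwood": (10000, 150, 25.0),
}
available = (500000, 150000, 3000)     # Land, Water, Cost
base_plan = {t: (40 if t == "Redwood" else 30) for t in needs}

def is_feasible(plan):
    """Return True if the plan stays within every resource limit."""
    return all(
        sum(plan[t] * needs[t][i] for t in plan) <= available[i]
        for i in range(len(available))
    )

feasible_changes = []
for tree in base_plan:
    for change in (-30, 30):
        # build a modified copy; base_plan itself is never touched
        trial_plan = {**base_plan, tree: base_plan[tree] + change}
        if trial_plan[tree] >= 0 and is_feasible(trial_plan):
            feasible_changes.append((tree, change))

print(feasible_changes)   # [('Elm', -30), ('Redwood', -30)]
```

Working on a copy removes the risk of forgetting to undo a change, at the cost of creating a small dictionary for every candidate.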
