In [1]:
# Import dependencies
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
from IPython.display import Markdown, display
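from stats_helper import *  # brings gmean, mul_CI, CI_sum_prop and CI_prod_prop into scope

def dis_res(x):
    """Render a result string as a highlighted Markdown block."""
    display(Markdown('___\n##### **Result**: \n\n' + x + '\n___'))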

# Estimating the global mass of leaves

As part of our procedure for estimating the total number of Rubisco enzymes in the world, we first estimate the total mass of leaves globally.

To achieve a global estimate of leaf mass we rely on two independent methodologies. The first is based on measurements of the total plant biomass and of the mass fraction of leaves out of the total plant biomass. The second is based on estimating the total leaf area and converting it into leaf mass.

## Method 1 - leaf mass fraction


To estimate the total leaf mass based on mass fractions, we combine estimates from Erb et al. of the plant biomass in each biome with estimates of the average leaf mass fraction in each biome. Our estimates of the average leaf mass fraction in each biome are based on a recent meta-analysis which collected data on the leaf, shoot and root mass fractions in several different biomes [(Poorter et al. (2012))](http://dx.doi.org/10.1111/j.1469-8137.2011.03952.x). Here is the data:

In [2]:
# Load data from Poorter et al.
fractions = pd.read_excel('../data/literature_data.xlsx','Poorter',skiprows=1,index_col=0)
fractions

Biome,LMF,0.05,median,95,lower-fold,upper-fold,95% Std,N,95% SEM
Tundra,0.09,0.005698,0.031339,0.248575,5.5,7.931818,6.604922,15,1.628129
Grassland,0.17,0.009972,0.070513,0.509259,7.071429,7.222222,7.146428,10,1.862459
Boreal forest,0.04,0.00641,0.023504,0.095442,3.666667,4.060606,3.858612,40,1.238006
Temperate forest,0.03,0.00641,0.019231,0.049145,3.0,2.555556,2.768875,70,1.129446
Tropical forest,0.02,0.00641,0.019231,0.029915,3.0,1.555556,2.160247,40,1.129509
Woodland,0.06,0.033476,0.048433,0.10114,1.446809,2.088235,1.738182,15,1.153433
Shrubland,0.09,0.032051,0.096154,0.140313,3.0,1.459259,2.092314,15,1.21
Desert,0.09,0.023504,0.043447,0.262821,1.848485,6.04918,3.343923,10,1.46482


The data in Poorter et al. (2012) does not include values for croplands. To estimate the mean leaf mass fraction in crops, we use a more recent dataset published by the same authors ([Poorter et al. (2015)](https://doi.org/10.1111/nph.13571)). We calculate the geometric mean of the leaf mass fraction across the 20 largest crops based on FAO data.

In [3]:
# Define crop species
crop_species = ['Saccharum officinarum','Zea mays','Triticum aestivum','Triticum compactum','Triticum dicoccoides','Triticum dicoccoides x A. squarrosa','Triticum durum','Triticum monococcum','Triticum spelta','Triticum timopheevii','Triticum turgidum','Oryza sativa','Solanum tuberosum','Glycine max','Elaeis guineensis','Beta vulgaris','Manihot esculenta','Solanum lycopersicum','Hordeum vulgare','Musa spec.','Malus domestica','Cucumis sativus','Vitis vinifera']

# Load data from Poorter et al.
LMF_species = pd.read_excel('../data/literature_data.xlsx','Poorter2015')

# Lookup crop species in Poorter et al.
crop_LMF = LMF_species[LMF_species.Species.isin(crop_species)]

# Calculate the geometric mean of the leaf mass fraction and use it as the fraction for crops
fractions.loc['Cropland','LMF'] = gmean(crop_LMF.groupby('Species')['LMF'].mean())
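
As a minimal sketch of the averaging step above, the following uses made-up numbers (not values from Poorter et al. (2015)) to show why we first average within each species and only then take the geometric mean across species: a species with many measurements does not dominate the result. The `geometric_mean_sketch` helper is a hypothetical stand-in. We assume, without having inspected `stats_helper`, that its `gmean` behaves the same way.

```python
import numpy as np
import pandas as pd

def geometric_mean_sketch(values):
    """Geometric mean computed as exp(mean(log(x))); hypothetical stand-in for gmean."""
    values = np.asarray(values, dtype=float)
    return float(np.exp(np.log(values).mean()))

# Made-up measurements, for illustration only
toy_LMF = pd.DataFrame({
    'Species': ['Zea mays', 'Zea mays', 'Oryza sativa', 'Glycine max'],
    'LMF': [0.30, 0.50, 0.40, 0.20],
})

# Average replicate measurements within each species first ...
toy_species_means = toy_LMF.groupby('Species')['LMF'].mean()

# ... then take the geometric mean across species
toy_crop_LMF = geometric_mean_sketch(toy_species_means)
```

Here the two *Zea mays* rows collapse to a single species mean before entering the geometric mean, so each species carries equal weight.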

We calculate the weighted mean of the leaf mass fraction across biomes. As weights, we use the total plant biomass in each biome from [Erb et al.](https://doi.org/10.1038/nature25138). Here is the data from Erb et al.:

In [4]:
# Load data on the total plant biomass in each biome from Erb et al.
biomes = pd.read_excel('../data/literature_data.xlsx','Erb',skiprows=1,index_col=0)
biomes

Biome,Total biomass [Gt C],Categories included in Poorter,Remarks
Tropical forests,257.0637,Tropical forest,Includes tropical forests under managements an...
Temperate forests,39.458988,Temperate forest,Includes tropical forests under managements an...
Boreal forests,35.977312,Boreal forest,Includes tropical forests under managements an...
Cropland,10.0,Cropland,
Other wooded land,57.0,Shrubland,
Natural grasslands,19.0,Grassland,
Artifitial grasslands,7.0,Grassland,
Unused non-forest,16.5,Tundra,"Includes the category ""Wilderness, productive,..."


The biomes in Erb et al. do not fully match the biomes in Poorter et al., so we translate between the biomes in the two studies and then merge the data. After merging, we calculate the weighted average of the leaf mass fraction using the biomass of each biome as the weights:

In [5]:
# Merge LMF data with the biomass of each biome
biome_LMF = biomes.merge(fractions,left_on='Categories included in Poorter',right_index=True)

# Calculate the weighted average of the LMF
mean_LMF = np.average(biome_LMF['LMF'],weights=biome_LMF['Total biomass [Gt C]'])
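
To make the merge and weighting concrete, here is a small sketch restricted to three rows copied from the Erb et al. and Poorter et al. tables shown above. The `toy_` names are ours and not part of the analysis, and the resulting number is only the weighted mean of this subset, not the global value. The `how='left'` and `validate` arguments are an optional safeguard we assume is useful here: they surface any biome that fails to find a match instead of silently dropping it.

```python
import numpy as np
import pandas as pd

# Three rows copied from the Erb et al. table above
toy_biomes = pd.DataFrame(
    {'Total biomass [Gt C]': [19.0, 7.0, 16.5],
     'Categories included in Poorter': ['Grassland', 'Grassland', 'Tundra']},
    index=['Natural grasslands', 'Artifitial grasslands', 'Unused non-forest'])

# Matching leaf mass fractions copied from the Poorter et al. table above
toy_fractions = pd.DataFrame({'LMF': [0.17, 0.09]}, index=['Grassland', 'Tundra'])

# Several Erb biomes may map onto one Poorter category (many-to-one)
toy_merged = toy_biomes.merge(
    toy_fractions,
    left_on='Categories included in Poorter',
    right_index=True,
    how='left',
    validate='many_to_one')

# Every biome should have received a leaf mass fraction
assert toy_merged['LMF'].notna().all()

# Biomass-weighted mean: sum(B_i * LMF_i) / sum(B_i)
toy_mean_LMF = np.average(toy_merged['LMF'], weights=toy_merged['Total biomass [Gt C]'])
```

Both grassland categories receive the same leaf mass fraction, so together they contribute 26 Gt C of weight at LMF = 0.17.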

We also use the data in [Poorter et al. (2015)](https://doi.org/10.1111/nph.13571) to calculate the leaf mass fraction across all species of the grass family (Poaceae) other than our crop species. This gives an additional estimate of the leaf mass fraction in grasslands:

In [6]:
# Create a copy of the leaf mass fraction dataframe
biome_LMF2 = biome_LMF.copy()

# Take from Poorter et al. (2015) only grass species which are not crops
grassland_species = LMF_species.loc[~(LMF_species.Species.isin(crop_species)) & (LMF_species.Family == 'Poaceae')]

# Calculate the geometric mean of the LMF of the grass species and use it as the leaf mass fractions for grasslands
biome_LMF2.loc[biome_LMF2['Categories included in Poorter'] == 'Grassland','LMF'] = gmean(grassland_species.LMF)

# Calculate the weighted average of the LMF
mean_LMF2 = np.average(biome_LMF2['LMF'],weights=biome_LMF2['Total biomass [Gt C]'])

As our best estimate of the leaf mass fraction, we use the geometric mean of our two estimates: one with grassland values based on Poorter et al. (2012) and one with grassland values based on Poorter et al. (2015):

In [7]:
best_leaf_frac = gmean([mean_LMF2,mean_LMF])
dis_res('Our global average for the leaf mass fraction is ≈%.0f percent' %(best_leaf_frac*100))

___
##### **Result**: 

Our global average for the leaf mass fraction is ≈6 percent
___

To estimate the total mass of leaves, we rely on our estimate of the total plant biomass ([Bar-On et al.](https://doi.org/10.1073/pnas.1711842115)) of ≈450 Gt C, and we assume carbon is about 50% of the dry weight of plants. We thus estimate ≈900 Gt plant dry weight. We multiply the global leaf mass fraction by the total dry mass of plants to get an estimate for the total mass of leaves.

In [8]:
global_plant_mass = 900e15  # ≈900 Gt of plant dry weight, expressed in grams
global_leaf_mass = global_plant_mass*best_leaf_frac
dis_res('Our estimate for the global leaf mass based on leaf mass fraction measurements is ≈%.0f Gt' %(global_leaf_mass/1e15))

___
##### **Result**: 

Our estimate for the global leaf mass based on leaf mass fraction measurements is ≈52 Gt
___
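
The unit bookkeeping behind this step can be sketched as follows. The variable names are ours, and `implied_leaf_frac` is back-calculated from the rounded results reported above rather than taken from the analysis, so it is only approximate.

```python
import math

GRAMS_PER_GT = 1e15          # 1 Gt = 1e15 g

plant_carbon_gt = 450.0      # total plant biomass in Gt C (Bar-On et al.)
carbon_fraction = 0.5        # assumed carbon share of plant dry weight

# Convert carbon mass to dry weight, then express it in grams
plant_dry_mass_g = plant_carbon_gt / carbon_fraction * GRAMS_PER_GT

# Leaf mass fraction implied by the rounded results (≈52 Gt of ≈900 Gt)
implied_leaf_frac = 52.0 / 900.0

# Leaf mass in Gt recovered from the implied fraction
leaf_mass_gt = plant_dry_mass_g * implied_leaf_frac / GRAMS_PER_GT
```

The implied fraction of ≈0.058 is consistent with the ≈6 percent reported for the global leaf mass fraction.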

## Estimating the fraction of leaf mass in herbaceous plants
For our calculations in the next section (estimating the mass fraction of Rubisco out of leaf mass), we also calculate here the fraction of leaf mass found in woody versus herbaceous plants:

In [9]:
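# Calculate the total mass of leaves in each biome (once for each value of the leaf mass fraction of grasslands)
# The factor of 2 converts carbon mass to dry weight, assuming carbon is about 50% of plant dry weight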
biome_LMF['Total leaf mass [Gt]'] = biome_LMF['Total biomass [Gt C]']*biome_LMF['LMF']*2
biome_LMF2['Total leaf mass [Gt]'] = biome_LMF2['Total biomass [Gt C]']*biome_LMF2['LMF']*2

woody = ['Tropical forests','Temperate forests','Boreal forests','Other wooded land']

woody_frac = biome_LMF.loc[woody,'Total leaf mass [Gt]'].sum()/biome_LMF['Total leaf mass [Gt]'].sum()
woody_frac2 = biome_LMF2.loc[woody,'Total leaf mass [Gt]'].sum()/biome_LMF2['Total leaf mass [Gt]'].sum()
best_woody_frac = gmean([woody_frac,woody_frac2])
dis_res('Our best estimate is that leaves of herbaceous plants account for %.0f percent out of the total leaf mass based on leaf mass fraction' %((1-best_woody_frac)*100))

___
##### **Result**: 

Our best estimate is that leaves of herbaceous plants account for 49 percent out of the total leaf mass based on leaf mass fraction
___

We combine this estimate with our estimate of the fraction of leaf mass in herbaceous plants based on leaf area estimates, which we derive in the notebook [**```01_remote_sensing_based_leaf_mass.ipynb```**](01_remote_sensing_based_leaf_mass.ipynb) to be ≈23%. We use the geometric mean of the estimate based on leaf mass fraction and the estimate based on leaf area as our best estimate of the fraction of leaf mass that is herbaceous:

In [10]:
best_herb_frac = gmean([1-best_woody_frac,0.23])
dis_res('Our best estimate is that leaves of herbaceous plants account for %.0f percent out of the total leaf mass' %(best_herb_frac*100))

___
##### **Result**: 

Our best estimate is that leaves of herbaceous plants account for 34 percent out of the total leaf mass
___

In the notebook [**```01_remote_sensing_based_leaf_mass.ipynb```**](01_remote_sensing_based_leaf_mass.ipynb) we also estimate that C4 plant leaves account for ≈25% of the total herbaceous plant leaf mass. We apply this fraction here to arrive at our best estimate for the fraction of C4 leaf mass and C3 leaf mass out of the total leaf mass. This means C3 herbaceous plants account for ≈25% of the total leaf mass and C4 plants account for ≈9% of the total leaf mass.
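
The split can be checked with a few lines of arithmetic. This sketch starts from the rounded value of 34 percent reported above, so the results agree with the quoted ≈25% and ≈9% only up to rounding. The variable names are ours.

```python
herb_frac = 0.34           # herbaceous share of total leaf mass (rounded result above)
c4_share_of_herb = 0.25    # C4 share of herbaceous leaf mass (remote sensing notebook)

# Split the herbaceous leaf mass into C4 and C3 parts
c4_frac_of_total = herb_frac * c4_share_of_herb
c3_herb_frac_of_total = herb_frac * (1 - c4_share_of_herb)

# Whatever is not herbaceous is attributed to woody plants
woody_frac_of_total = 1 - herb_frac
```

The three fractions sum to one by construction, which is a quick sanity check when these values are reused in later notebooks.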

## Method 2 - Remote sensing based leaf mass

In the notebook [**```01_remote_sensing_based_leaf_mass.ipynb```**](01_remote_sensing_based_leaf_mass.ipynb) we estimate that the total mass of leaves is ≈20 Gt.

As our best estimate for the total mass of leaves, we use the geometric mean of the two methods:

In [11]:
best_leaf_mass = gmean([global_leaf_mass,20e15])
dis_res('Our best estimate for the global leaf mass is ≈%.0f Gt' %(best_leaf_mass/1e15))

___
##### **Result**: 

Our best estimate for the global leaf mass is ≈32 Gt
___
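
A short sketch of why the geometric mean is a natural choice here, using the rounded values of the two methods: the combined estimate sits the same fold-distance from each method, whereas an arithmetic mean would sit closer to the larger one in fold terms. The variable names are ours.

```python
import math

leaf_mass_lmf_gt = 52.0   # Method 1, leaf mass fraction (rounded)
leaf_mass_rs_gt = 20.0    # Method 2, remote sensing (rounded)

# Geometric mean of two values is the square root of their product
combined_gt = math.sqrt(leaf_mass_lmf_gt * leaf_mass_rs_gt)

# Fold-distance from the combined estimate to each method
fold_above = leaf_mass_lmf_gt / combined_gt
fold_below = combined_gt / leaf_mass_rs_gt
```

Both fold-distances are ≈1.6, and the combined value rounds to the ≈32 Gt reported above.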

# Uncertainty analysis
To project the uncertainty associated with the estimate of the total mass of leaves, we first calculate the uncertainty around our estimate of the total mass of leaves based on the leaf mass fraction in each biome. We had two different estimates for the mass of leaves based on the leaf mass fraction per biome, each with a different value for the leaf mass fraction in grasslands. For each of those estimates we calculate the associated uncertainty. In addition, because our best estimate for the leaf mass fraction was calculated as the geometric mean of the two estimates, we use the difference between the two estimates as a further measure of the uncertainty associated with our final estimate. We use the highest of these three uncertainties (one for each estimate and one based on the difference between the estimates) as our best projection for the uncertainty associated with our estimate of the total leaf mass based on biome leaf mass fractions.

In [12]:
# Calculate the multiplicative uncertainty of the leaf mass in each biome (once for each value of the leaf mass fraction of grasslands)
# We use the geometric mean of the standard deviation and standard error as our best projection of the uncertainty
biome_LMF['mul_CI'] = [gmean([x,y]) for x,y in zip(biome_LMF['95% Std'],biome_LMF['95% SEM'])]
biome_LMF2['mul_CI'] = [gmean([x,y]) for x,y in zip(biome_LMF2['95% Std'],biome_LMF2['95% SEM'])]

# For crops calculate the uncertainty based on the data in Poorter et al. (2015)
biome_LMF.loc['Cropland','mul_CI'] = mul_CI(crop_LMF.groupby('Species')['LMF'].mean())
biome_LMF2.loc['Cropland','mul_CI'] = mul_CI(crop_LMF.groupby('Species')['LMF'].mean())

# For the second estimate, in which grassland values are based on Poorter et al. (2015), calculate the uncertainty
# based on the values in Poorter et al. (2015)
biome_LMF2.loc[biome_LMF2['Categories included in Poorter'] == 'Grassland','mul_CI'] = mul_CI(grassland_species.LMF)

# Propagate the uncertainties to the final estimates
leaf_mass_frac_CI = CI_sum_prop(biome_LMF['Total leaf mass [Gt]'],biome_LMF['mul_CI'])
leaf_mass_frac_CI2 = CI_sum_prop(biome_LMF2['Total leaf mass [Gt]'],biome_LMF2['mul_CI'])

# Calculate uncertainty based on the difference between the estimates
inter_method_leaf_mass_fraction_CI = mul_CI([mean_LMF,mean_LMF2])

# Use the highest uncertainty as our best projection
best_leaf_mass_frac_CI = np.max([leaf_mass_frac_CI2,leaf_mass_frac_CI,inter_method_leaf_mass_fraction_CI])
dis_res('Our projection for the uncertainty associated with our estimate of the mass fraction of leaves is ≈%.1f-fold' %best_leaf_mass_frac_CI)


___
##### **Result**: 

Our projection for the uncertainty associated with our estimate of the mass fraction of leaves is ≈1.3-fold
___
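
The implementation of `CI_sum_prop` lives in `stats_helper` and is not shown in this notebook. As a rough illustration of how fold uncertainties of individual components could be propagated to their sum, the sketch below uses a Monte Carlo approach. It is an assumption on our part, not necessarily what `CI_sum_prop` does: each component is modeled as log-normal, with its fold value read as a 95% multiplicative interval. The function name and the toy inputs are hypothetical.

```python
import numpy as np

def sum_fold_uncertainty_sketch(estimates, fold_cis, n_draws=10000, seed=0):
    """Monte Carlo fold uncertainty of a sum of independent log-normal components."""
    rng = np.random.default_rng(seed)
    estimates = np.asarray(estimates, dtype=float)
    # Read each fold value as a 95% multiplicative interval of a log-normal
    sigmas = np.log(np.asarray(fold_cis, dtype=float)) / 1.96
    # One row per draw, one column per component
    draws = rng.lognormal(mean=np.log(estimates), sigma=sigmas,
                          size=(n_draws, len(estimates)))
    totals = draws.sum(axis=1)
    lower, upper = np.percentile(totals, [2.5, 97.5])
    # Symmetric fold half-width of the 95% interval of the sum
    return float(np.sqrt(upper / lower))

# Toy example: two equal components, each uncertain by 2-fold
toy_sum_fold = sum_fold_uncertainty_sketch([10.0, 10.0], [2.0, 2.0])
```

In this toy case the fold uncertainty of the sum is smaller than that of either component, because independent errors partly cancel when summed.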

For the total mass of plants, [Bar-On et al.](https://dx.doi.org/10.1073/pnas.1711842115) projected an uncertainty of ≈1.2-fold. We combine the uncertainty associated with our estimate of the leaf mass fraction with the uncertainty associated with the total mass of plants:

In [13]:
tot_leaf_mass_frac_CI = CI_prod_prop([best_leaf_mass_frac_CI,1.2])
dis_res('Our projection for the uncertainty associated with our estimate of the total mass of leaves based on leaf mass fraction is ≈%.1f-fold' %tot_leaf_mass_frac_CI)

___
##### **Result**: 

Our projection for the uncertainty associated with our estimate of the total mass of leaves based on leaf mass fraction is ≈1.4-fold
___
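
One common way to combine fold uncertainties of independent factors in a product is to add their logarithms in quadrature. The sketch below applies that rule to the rounded values above. It is our assumption about how such a combination can be done, and `prod_fold_uncertainty_sketch` is a hypothetical name. The actual `CI_prod_prop` is defined in `stats_helper` and may differ.

```python
import numpy as np

def prod_fold_uncertainty_sketch(fold_cis):
    """Combine fold uncertainties of independent factors by adding logs in quadrature."""
    log_cis = np.log10(np.asarray(fold_cis, dtype=float))
    return float(10 ** np.sqrt((log_cis ** 2).sum()))

# Leaf mass fraction (≈1.3-fold) times total plant mass (≈1.2-fold)
combined_fold = prod_fold_uncertainty_sketch([1.3, 1.2])
```

With these rounded inputs the sketch gives a value that rounds to 1.4-fold, in line with the result reported above.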

In the notebook [**```01_remote_sensing_based_leaf_mass.ipynb```**](01_remote_sensing_based_leaf_mass.ipynb) we estimate that the uncertainty associated with the estimate of the total mass of leaves based on remote sensing is ≈2-fold. We can also calculate the uncertainty from the difference between the estimates produced by the two methodologies (leaf mass fraction based and remote sensing based):


In [14]:
inter_method_CI = mul_CI([global_leaf_mass,20e15])
dis_res('Our projection for the uncertainty associated with our estimate of the total mass of leaves based on the difference between our two independent methodologies is ≈%.1f-fold' %inter_method_CI)

___
##### **Result**: 

Our projection for the uncertainty associated with our estimate of the total mass of leaves based on the difference between our two independent methodologies is ≈2.2-fold
___

Overall, we use the highest uncertainty out of the uncertainties reported for each estimate and the uncertainty based on the difference between the different methodologies, which is ≈2.2-fold.
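
The final selection can be sketched with the rounded values reported in this notebook. We additionally show the range implied by the chosen uncertainty, assuming an f-fold uncertainty is read as a multiplicative interval from estimate/f to estimate×f. That reading, and all variable names, are ours.

```python
best_leaf_mass_gt = 32.0   # best estimate of global leaf mass (rounded)

# Fold uncertainties reported above (rounded)
candidate_folds = {
    'leaf mass fraction method': 1.4,
    'remote sensing method': 2.0,
    'difference between methods': 2.2,
}

# Take the largest uncertainty as the overall projection
overall_fold = max(candidate_folds.values())

# Implied multiplicative range around the best estimate
lower_gt = best_leaf_mass_gt / overall_fold
upper_gt = best_leaf_mass_gt * overall_fold
```

Under this reading, the best estimate of ≈32 Gt corresponds to a range of roughly 15 to 70 Gt.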