# MS Analysis
By Cascade Tuholske, June 2020

This notebook calculates statistics and other information for the waste water MS. It will remain a notebook rather than a .py file because a notebook allows for easier documentation.

For global stats, we use the gdam boundaries rather than the watersheds data, because the gdam boundaries are used to produce the effluent rasters.

In [1]:
#### Dependencies
import numpy as np
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt

In [4]:
#### File Paths
DATA_IN = '../../data/'

In [5]:
#### Load Files
# countries_fn = 'processed/N_effluent_output/effluent_N_countries_all.shp' # has EEZs
watersheds_fn = 'processed/N_effluent_output/effluent_N_watersheds_all.shp'
countries_gdam_fn = 'processed/N_effluent_output/effluent_N_countries_gdam_all.shp'

# countries = gpd.read_file(DATA_IN+countries_fn)
watersheds = gpd.read_file(DATA_IN+watersheds_fn)
countries_gdam = gpd.read_file(DATA_IN+countries_gdam_fn)

In [6]:
countries_gdam.head()

Unnamed: 0,poly_id,ISO3,open_N,treated_N,septic_N,tot_N,open_N_pct,septic_N_p,treated_N_,tot_pct,geometry
0,0,ABW,564509.8,60470720.0,2269944.0,63305170.0,0.891728,3.585716,95.522561,100.000005,POLYGON ((-6910816.274421509 1537164.105192429...
1,1,AFG,11874620000.0,110714900.0,42028760.0,12027370000.0,98.729978,0.349443,0.920524,99.999945,"POLYGON ((6211413.408736277 3848509.713654268,..."
2,2,AGO,7808749000.0,301120000.0,407965700.0,8517835000.0,91.675286,4.789547,3.535171,100.000004,(POLYGON ((1159542.671178129 -2115445.01835647...
3,3,AIA,777235.4,6268219.0,2440849.0,9486304.0,8.193238,25.730242,66.07652,99.999999,(POLYGON ((-6134902.918835666 2230758.12862136...
4,4,ALA,2419805.0,35237530.0,107730.2,37765060.0,6.407524,0.285264,93.307213,100.000001,(POLYGON ((1389515.950805801 6852829.269928327...


In [7]:
print('There are', len(watersheds), 'watersheds')

There are 134846 watersheds


In [9]:
print('There are', len(watersheds[watersheds['tot_N'] > 1]), 'watersheds w/ more than 1 g N')

There are 66309 watersheds w/ more than 1 g N


# Check the data

In [None]:
countries_gdam['open_N'].sum() / 10**12

In [None]:
countries_gdam['treated_N'].sum() / 10**12

In [None]:
countries_gdam['septic_N'].sum() / 10**12

In [None]:
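# Each variable must read its matching column (values are g N, converted to Tg)
open_n = countries_gdam['open_N'].sum() / 10**12
septic_n = countries_gdam['septic_N'].sum() / 10**12
treated_n = countries_gdam['treated_N'].sum() / 10**12
tot_n = countries_gdam['tot_N'].sum() / 10**12
# Should be ~1 if the three treatment types add up to the total
(open_n + septic_n + treated_n) / tot_n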
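
The ratio above checks the global sums only. A minimal sketch of the same check done row by row is shown here, using a small made-up table in place of `countries_gdam`. The ISO3 codes, the values, and the `rtol` tolerance are illustrative assumptions, not results from the data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for countries_gdam; values are made up (grams of N)
toy = pd.DataFrame({
    'ISO3': ['AAA', 'BBB', 'CCC'],
    'open_N': [10.0, 0.0, 5.0],
    'treated_N': [20.0, 7.0, 5.0],
    'septic_N': [30.0, 3.0, 5.0],
    'tot_N': [60.0, 10.0, 20.0],  # last row is deliberately inconsistent
})

# Sum the three treatment columns row by row
parts = toy[['open_N', 'treated_N', 'septic_N']].sum(axis=1)

# Flag rows where the parts do not add up to the reported total
# rtol is an assumed tolerance, loose enough to absorb float rounding
toy['sums_ok'] = np.isclose(parts, toy['tot_N'], rtol=1e-4)

mismatched = toy.loc[~toy['sums_ok'], 'ISO3'].tolist()
print('Countries where parts != total:', mismatched)
```

Swapping `toy` for `countries_gdam` would list any country whose treatment columns do not add up to `tot_N`.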

# How much N is there total?
We are going to use the gdam totals.

**How does this compare to Total N into the ocean from other studies?**
Let's average the available global estimates:
1. Global riverine N and P transport to ocean increased during the 20th century despite increased retention along the aquatic continuum (Beusen 2016): 37 Tg/yr in 2000
2. Global river nutrient export: A scenario analysis of past and future trends (Seitzinger 2010): 43.2 Tg/yr in 2000
3. Sources and delivery of carbon, nitrogen, and phosphorus to the coastal zone: An overview of Global Nutrient Export from Watersheds (NEWS) models and their application (Seitzinger 2005): 66 Tg/yr in mid 1990s
4. Riverine nitrogen export from the continents to the coasts (Boyer 2006): 48 Tg/yr circa 2000
5. Global modeling of the fate of nitrogen from point and nonpoint sources in soils, groundwater and surface water (Van Drecht 2003): 54 Tg/yr in 2000
6. Pre-industrial and contemporary fluxes of nitrogen through rivers: a global assessment based on typology (Green 2004): 40 Tg/yr
7. Exploring changes in river nitrogen export to the world's oceans (Bouwman 2006): 46 Tg/yr

In [None]:
#print('countries Total N is: ', countries['tot_N'].sum()/10**12, 'Tg')
print('watersheds Total N is: ', watersheds['tot_N'].sum()/10**12, 'Tg')
print('countries_gdam Total N is: ', countries_gdam['tot_N'].sum()/10**12, 'Tg')
tot_n = countries_gdam['tot_N'].sum()/10**12

In [None]:
print('N from waste water is', tot_n / 36.5 * 100, 'pct of total (Beusen 2016)')
print('N from waste water is', tot_n / 43.2 * 100, 'pct of total (Seitzinger 2010)')
print('N from waste water is', tot_n / 43.2 * 100, 'pct of total (Mayorga 2010)')

In [None]:
global_avgN = (37+43.2+66+48+54+40+46)/7
print('Pct from Sewage of Total', tot_n/global_avgN*100, "using avg N from all main studies") 
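
To see how sensitive that percentage is to the choice of study, the same ratio can be computed against each estimate listed above. This is a sketch only: `tot_n_example` is a placeholder, not a result, and should be replaced with the `tot_n` computed from `countries_gdam`.

```python
import pandas as pd

# Global riverine N export estimates (Tg N / yr) from the studies listed above
estimates = pd.Series({
    'Beusen 2016': 37.0,
    'Seitzinger 2010': 43.2,
    'Seitzinger 2005': 66.0,
    'Boyer 2006': 48.0,
    'Van Drecht 2003': 54.0,
    'Green 2004': 40.0,
    'Bouwman 2006': 46.0,
})

# Placeholder only: substitute the tot_n computed from countries_gdam
tot_n_example = 1.0

# Waste water N as a percentage of each study's global estimate
pct_of_total = tot_n_example / estimates * 100

print(pct_of_total.round(2))
print('Mean estimate:', round(estimates.mean(), 2), 'Tg N / yr')
print('Pct range:', round(pct_of_total.min(), 2), 'to', round(pct_of_total.max(), 2))
```

Reporting the range alongside the mean-based figure makes it clear how much the answer depends on which global estimate is used as the denominator.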

# How much WW N compared to Ag N?

From Beusen 2016, for land inputs to surface water: "From 1900 to 2000 its contribution rose from 6 (19% of total) to 33 Tg N yr−1 (51% of total)". Total N into the oceans was 37 Tg/yr, so 51% * 37 Tg = 18.87 Tg.

In [None]:
print('Waste Water N vs. Ag =', tot_n / (36.5 * .51) * 100, 'using Beusen 2016 Ag numbers')

# How much WW N by treatment type?

In [None]:
treated_n = countries_gdam['treated_N'].sum()/10**12
septic_n = countries_gdam['septic_N'].sum()/10**12
open_n = countries_gdam['open_N'].sum()/10**12

In [None]:
print('Treated', treated_n, 'Tg N,', treated_n/tot_n * 100, 'pct')
print('Septic', septic_n, 'Tg N,', septic_n/tot_n * 100, 'pct')
print('Open', open_n, 'Tg N,', open_n/tot_n * 100, 'pct')

In [None]:
print('Treated', countries_gdam['treated_N'].sum()/10**12,'N,', countries_gdam['treated_N'].sum()/10**12 / tot_n * 100,'Pct')

# Which countries are the top producers of N?

In [None]:
# Total N
countries_gdam.sort_values('tot_N', ascending = False).head(25)

In [None]:
# Treated N
countries_gdam.sort_values('treated_N', ascending = False).head(25)

In [None]:
# Open N
countries_gdam.sort_values('open_N', ascending = False).head(25)

In [None]:
# Septic N
countries_gdam.sort_values('septic_N', ascending = False).head(25)
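
Sorting shows the ranking but not how concentrated the global total is. One way to add each country's share and cumulative share is sketched below, using a made-up table in place of `countries_gdam`. The ISO3 codes and values are hypothetical, and the `pct_of_global` and `cum_pct` column names are introduced here for illustration.

```python
import pandas as pd

# Hypothetical stand-in for countries_gdam; values are made up
toy = pd.DataFrame({
    'ISO3': ['AAA', 'BBB', 'CCC', 'DDD'],
    'tot_N': [50.0, 30.0, 15.0, 5.0],
})

# Top 3 rows by total N (the notebook uses the top 25)
top = toy.nlargest(3, 'tot_N')[['ISO3', 'tot_N']].copy()

# Share of the global total, then running share down the ranking
top['pct_of_global'] = top['tot_N'] / toy['tot_N'].sum() * 100
top['cum_pct'] = top['pct_of_global'].cumsum()

print(top)
```

The same pattern should work for `treated_N`, `septic_N`, or `open_N` by changing the column name.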

In [None]:
countries_gdam.columns