<a href="https://colab.research.google.com/github/AvantiShri/oceanography_colab_notebooks/blob/master/for_clkelly/Colette_N2O_Data.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import pandas

In [2]:
!wget --no-check-certificate 'https://docs.google.com/uc?export=download&id=1lzNG3-ClWIKWwTska9OPBaUb1o8uPA0h' -O 200413_nitrous_oxide_cycling_regimes_data_for_repositories.csv

--2020-07-30 03:11:27--  https://docs.google.com/uc?export=download&id=1lzNG3-ClWIKWwTska9OPBaUb1o8uPA0h
Resolving docs.google.com (docs.google.com)... 108.177.119.138, 108.177.119.113, 108.177.119.100, ...
Connecting to docs.google.com (docs.google.com)|108.177.119.138|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://doc-08-50-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/r3cjc52fa2h4b0dj57c3ojrlf8eu28pf/1596078675000/00395683668588961264/*/1lzNG3-ClWIKWwTska9OPBaUb1o8uPA0h?e=download [following]
--2020-07-30 03:11:28--  https://doc-08-50-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/r3cjc52fa2h4b0dj57c3ojrlf8eu28pf/1596078675000/00395683668588961264/*/1lzNG3-ClWIKWwTska9OPBaUb1o8uPA0h?e=download
Resolving doc-08-50-docs.googleusercontent.com (doc-08-50-docs.googleusercontent.com)... 108.177.126.132, 2a00:1450:4013:c01::84
Connecting to doc-08-50-docs.googleusercontent.com (doc-08

From Colette:

```
So this whole thing started with plots of N2O isotopomers (columns "d15N-N2Oa_mean (per mil vs. atm N2)", "d15N-N2Ob_mean 
(per mil vs. atm. N2)", and "d18O-N2O_mean (per mil vs. VSMOW)") vs. the inverse of N2O concentration (1/"N2O_mean (nM)"). They are 
a figure in my paper. These plots had a visible change point in them. Patrick has noticed a similar phenomenon in his data. The two 
clusters on a plot like this indicate two different pools of N2O produced from two different sources.

I strongly suspect that nitrite concentration ("Nitrite [uM]" in the spreadsheet) and oxygen ("Seabird Oxygen [umol/L]") also inform the clustering. 
Also the isotopes of nitrite and nitrate ("d18O-NO3 avg (per mil vs. VSMOW)" and so forth). Furthermore I feel like the degree to which they inform 
clustering actually gives us additional information as well. For example, if nitrite concentration is a strong predictor of whether a datapoint falls into one 
cluster or another, that tells me that nitrite is likely a substrate for one of these N2O pools.

If we could define a relationship between [N2O] and d18O-N2O, controlling for d15N-N2Oa, that could be interesting. From the rudimentary version of this 
clustering stuff in my paper, we see that d15N-N2Oa looks like it could be an N2O consumption signal. But d18O-N2O does not — or rather, d18O-N2O is more of 
a net production + consumption signal. In reductive waters, d18O-N2O and d15N-N2Oa are both tightly controlled by N2O consumption and thus are very well 
correlated. In these plots, we are making the assumption that these are NOT reductive waters, so it would be interesting to see if these two factors have 
relationships with [N2O] that are independent of each other.
```
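
The change point Colette describes in the isotopomer vs. 1/[N2O] plots could be located numerically. The sketch below is one simple option, not the method used in the paper: it tries every possible split of the sorted x values, fits a separate straight line to each side, and keeps the split with the lowest total squared error. The function name `two_segment_fit`, its `min_points` parameter, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def two_segment_fit(x, y, min_points=3):
    """Brute-force search for a single change point in y vs. x.

    Sorts by x, tries every split that leaves at least `min_points`
    samples on each side, fits an independent line to each side, and
    returns the split with the lowest total squared error.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best = None
    for i in range(min_points, len(xs) - min_points + 1):
        sse = 0.0
        coefs = []
        for seg_x, seg_y in ((xs[:i], ys[:i]), (xs[i:], ys[i:])):
            slope, intercept = np.polyfit(seg_x, seg_y, 1)
            resid = seg_y - (slope * seg_x + intercept)
            sse += float(resid @ resid)
            coefs.append((slope, intercept))
        if best is None or sse < best["sse"]:
            best = {
                "sse": sse,
                # Midpoint between the last "left" and first "right" sample
                "split_x": 0.5 * (xs[i - 1] + xs[i]),
                "left": coefs[0],
                "right": coefs[1],
            }
    return best

# Synthetic stand-in for an isotopomer vs. 1/[N2O] plot (NOT real data):
# two linear regimes that meet at 1/[N2O] = 0.05.
rng = np.random.default_rng(0)
inv_n2o = np.linspace(0.01, 0.10, 61)
true_break = 0.05
delta = np.where(inv_n2o < true_break, 5 + 200 * inv_n2o, 25 - 200 * inv_n2o)
delta = delta + rng.normal(scale=0.2, size=inv_n2o.size)

fit = two_segment_fit(inv_n2o, delta)
print("estimated change point:", fit["split_x"])
print("left (slope, intercept):", fit["left"])
print("right (slope, intercept):", fit["right"])
```

On the real data, `x` would be `inv_N2O_mean` and `y` one of the isotopomer columns, with missing values dropped first. The two intercepts are the quantities usually read off a plot of this kind as the source signatures of the two pools.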

In [3]:
from matplotlib import pyplot as plt
import numpy as np

# Short aliases (keys) for the original spreadsheet column names (values)
colnames_map = {'d15N_N2Oa_mean':"d15N-N2Oa_mean (per mil vs. atm N2)",
            'd15N_N2Ob_mean':"d15N-N2Ob_mean (per mil vs. atm. N2)",
            'd18O_N2O_mean':"d18O-N2O_mean (per mil vs. VSMOW)",
            'N2O_mean':"N2O_mean (nM)",
            'd18O_NO3_mean':'d18O-NO3 avg (per mil vs. VSMOW)',
            'd15N_NO3_mean':'d15N-NO3 avg (per mil vs. atm. N2)',
            'd15N_NO2': 'd15N-NO2 (per mil vs. atm N2)',
            'd18O_NO2': 'd18O-NO2 (per mil vs. VSMOW)',
            'Nitrite':"Nitrite [uM]",
            'Oxygen':"Seabird Oxygen [umol/L]",
            'NO3_mean':'NO3_mean (uM)',
            'Depth': 'Target Depth [m]'}

# Altair chokes on data frames that use some of the original column names,
# likely because Vega-Lite treats '.', '[' and ']' in a field name as
# nested-field access. So I am remapping the column names.
def remap_colnames(df, colnames_map):
  # Keys of colnames_map become the new column names; np.array drops the index.
  foraltair_df = pandas.DataFrame({
      new_col: np.array(df[orig_col])
      for new_col, orig_col in colnames_map.items()})
  return foraltair_df

df = pandas.read_csv("200413_nitrous_oxide_cycling_regimes_data_for_repositories.csv")
filtered_df = remap_colnames(df=df, colnames_map=colnames_map)
#create a column for the inverse of the N2O mean
filtered_df['inv_N2O_mean'] = 1/filtered_df['N2O_mean']
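
Colette's note asks about the relationship between [N2O] and d18O-N2O while controlling for d15N-N2Oa. One common way to approach that is a multiple linear regression with both variables as predictors, so that each coefficient describes the effect of one predictor with the other held fixed. The sketch below assumes a linear relationship, uses a hypothetical helper named `fit_with_control`, and runs on synthetic numbers rather than the spreadsheet data.

```python
import numpy as np

def fit_with_control(y, x, control):
    """Ordinary least squares for y ~ intercept + x + control.

    Rows with a missing value in any of the three arrays are dropped.
    Returns the fitted coefficients in a dict.
    """
    y, x, control = (np.asarray(a, dtype=float) for a in (y, x, control))
    keep = np.isfinite(y) & np.isfinite(x) & np.isfinite(control)
    design = np.column_stack([np.ones(keep.sum()), x[keep], control[keep]])
    coefs, *_ = np.linalg.lstsq(design, y[keep], rcond=None)
    return {"intercept": coefs[0], "x": coefs[1], "control": coefs[2]}

# Synthetic example (NOT real data): the control variable is itself
# correlated with x, which is the situation where controlling matters.
rng = np.random.default_rng(1)
n = 200
n2o = rng.uniform(10, 80, size=n)                       # stand-in for N2O_mean
d15n_a = 5 + 0.1 * n2o + rng.normal(scale=2.0, size=n)  # stand-in for d15N_N2Oa_mean
d18o = 40 + 0.05 * n2o + 0.8 * d15n_a + rng.normal(scale=0.3, size=n)
d18o[0] = np.nan  # a missing measurement, to exercise the mask

result = fit_with_control(y=d18o, x=n2o, control=d15n_a)
print(result)
```

With the real frame, the call would presumably look like `fit_with_control(filtered_df['d18O_N2O_mean'], filtered_df['N2O_mean'], filtered_df['d15N_N2Oa_mean'])`. Whether a linear model is appropriate for these tracers is a separate question that the sketch does not answer.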

In [4]:
import altair as alt

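# Brush selection shared by every panel: dragging a box in one plot
# highlights the same samples in all the others.
interval = alt.selection_interval()

# Base chart reused for each panel. Points inside the brush are colored by
# Depth; points outside it are drawn in light gray.
invN2O_x = alt.Chart(filtered_df).mark_point().encode(
  x='inv_N2O_mean',
  color=alt.condition(interval, 'Depth', alt.value('lightgray'),
                      scale=alt.Scale(scheme='goldgreen'))
).properties(
  selection=interval,
  width=250, height=250
)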

# Two rows of four linked panels; each panel overrides x and/or y on the base chart.
alt.vconcat(
(invN2O_x.encode(y='d15N_N2Oa_mean')
| invN2O_x.encode(y='d15N_N2Ob_mean')
| invN2O_x.encode(y='d18O_N2O_mean')
| invN2O_x.encode(x='d15N_N2Oa_mean', y='d18O_NO3_mean')
),

(invN2O_x.encode(x='d15N_NO2', y='Nitrite')
| invN2O_x.encode(x='d15N_NO2', y='d18O_NO2')
| invN2O_x.encode(x='Oxygen', y='NO3_mean')
| invN2O_x.encode(x='d15N_NO3_mean', y='d18O_NO3_mean')
),

)
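
Colette's note suggests that how strongly a variable such as nitrite predicts cluster membership is itself informative. Assuming each sample has already been assigned to one of two clusters by some means (the labels below are synthetic), one rough way to score a single variable is the fraction of cross-cluster pairs it orders correctly, which is the area under the ROC curve. The helper name `separation_auc` and all of the numbers are assumptions for illustration only.

```python
import numpy as np

def separation_auc(values, in_cluster_a):
    """Fraction of (cluster A, cluster B) pairs where the A value is larger.

    Ties count as half. 0.5 means the variable carries no information about
    cluster membership; values near 0 or 1 mean it separates the clusters well.
    """
    values = np.asarray(values, dtype=float)
    in_cluster_a = np.asarray(in_cluster_a, dtype=bool)
    a = values[in_cluster_a]
    b = values[~in_cluster_a]
    diff = a[:, None] - b[None, :]  # every A value against every B value
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

# Synthetic example (NOT real data)
rng = np.random.default_rng(2)
n = 400
labels = rng.random(n) < 0.5
# An informative variable: clearly higher in cluster A than in cluster B
informative = np.where(labels,
                       rng.normal(2.0, 0.5, size=n),
                       rng.normal(0.2, 0.1, size=n))
# An uninformative variable: same distribution in both clusters
uninformative = rng.normal(50.0, 10.0, size=n)

auc_informative = separation_auc(informative, labels)
auc_uninformative = separation_auc(uninformative, labels)
print("informative:", auc_informative)
print("uninformative:", auc_uninformative)
```

Applied to the real columns (`Nitrite`, `Oxygen`, and the nitrite and nitrate isotope columns), scores far from 0.5 would point to the variables most associated with one N2O pool or the other, after dropping rows with missing values.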