# Generate Flux Event Table

This notebook analyzes every event from every fluxbot to generate a single dataframe of our flux estimates. This dataframe is then written out as `all_events.csv`. The `.csv` file is the basis for subsequent analyses that will be included in our manuscript. As such, this notebook should be fully executable using only the files and code contained in our repository.

In [1]:
import pandas as pd
import numpy as np
from fluxbot import Fluxbot
%matplotlib inline

## Load Datafiles

Our raw datafiles are stored in the `/data` directory of this repository. A permanent link to these raw data can be found [here]().

In [2]:
import glob
do_avgP = True

if do_avgP:
    data_dir = 'data/avgP'
else:
    data_dir = 'data'

data_files = glob.glob("{dir}/*[A-Z][0-9].csv".format(dir=data_dir))


print(data_files)

['data/avgP/NMWC_OS1.csv', 'data/avgP/NO_UT2.csv', 'data/avgP/NO_UT3.csv', 'data/avgP/NMWC_OS2.csv', 'data/avgP/NO_UT1.csv', 'data/avgP/NMWC_OS3.csv', 'data/avgP/NMWC_UT3.csv', 'data/avgP/NMWC_UT2.csv', 'data/avgP/NO_OS1.csv', 'data/avgP/NO_OS3.csv', 'data/avgP/NMWC_UT1.csv', 'data/avgP/NO_OS2.csv', 'data/avgP/NO_OM3.csv', 'data/avgP/NO_OM2.csv', 'data/avgP/NO_OM1.csv', 'data/avgP/NMWC_OM2.csv', 'data/avgP/NMWC_OM3.csv', 'data/avgP/NMWC_OM1.csv']


In [None]:
# data_file = data_files[0]
# fluxbot = Fluxbot(filename=data_file, do_avgP=do_avgP)
# fluxbot.generate_output(valid_only=False)

In [None]:
# file = 'data/something/name'
# dir_name, file_name = file.split('/')[-2:]

In [None]:
# fluxbot.events[50].calculate_flux()
# fluxbot.events[50].output()

## Create a List of Fluxbots

This first step takes the most time. As each fluxbot is loaded, we parse the raw fluxbot datafile and extract events from it. These events are then analyzed to:

* extract ambient CO$_2$ concentrations from data for each event
* transform CO$_2$ concentrations into mass
* re-baseline CO$_2$ mass for each event as a difference from the initial condition
* fit polynomials to the changes in CO$_2$ mass using linear, 2nd-order, and cubic fits.
* save polynomial parameters, R$^2$ values, and parameter uncertainty estimates
* generate event output
* write out the flux calculations for each fluxbot
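The conversion, re-baselining, and fitting steps above can be sketched as follows. This is a minimal illustration and not the `Fluxbot` implementation: it assumes the ideal gas law for the concentration-to-mass conversion, ignores the humidity correction, and uses `numpy.polyfit` for the fits. The function names `ppm_to_mass_kg` and `fit_polynomial`, the chamber volume, and the synthetic readings are all hypothetical.

```python
import numpy as np

R_GAS = 8.314462618  # universal gas constant, J / (mol K)
M_CO2 = 44.01e-3     # molar mass of CO2, kg / mol


def ppm_to_mass_kg(co2_ppm, pressure_hpa, temp_degc, volume_cm3):
    """Convert a CO2 mole fraction (ppm) to mass (kg) with the ideal gas law.

    Assumption: dry air in a closed chamber; no humidity correction is applied.
    """
    pressure_pa = pressure_hpa * 100.0
    temp_k = temp_degc + 273.15
    volume_m3 = volume_cm3 * 1e-6
    total_mol = pressure_pa * volume_m3 / (R_GAS * temp_k)
    return np.asarray(co2_ppm) * 1e-6 * total_mol * M_CO2


def fit_polynomial(time_s, mass_kg, order):
    """Least-squares polynomial fit; returns (coefficients, R^2)."""
    coeffs = np.polyfit(time_s, mass_kg, order)  # highest power first
    fitted = np.polyval(coeffs, time_s)
    ss_res = np.sum((mass_kg - fitted) ** 2)
    ss_tot = np.sum((mass_kg - np.mean(mass_kg)) ** 2)
    return coeffs, 1.0 - ss_res / ss_tot


# Synthetic event with made-up numbers, for illustration only.
time_s = np.arange(0.0, 300.0, 10.0)
co2_ppm = 420.0 + 0.5 * time_s - 4e-4 * time_s ** 2

mass_kg = ppm_to_mass_kg(co2_ppm, pressure_hpa=830.0, temp_degc=25.0,
                         volume_cm3=2000.0)

# Re-baseline: express mass as a difference from the initial condition.
delta_mass_kg = mass_kg - mass_kg[0]

# Linear, 2nd-order, and cubic fits to the change in CO2 mass.
fits = {order: fit_polynomial(time_s, delta_mass_kg, order)
        for order in (1, 2, 3)}
for order, (coeffs, r_sq) in fits.items():
    print("order {}: R^2 = {:.6f}".format(order, r_sq))
```

Because the synthetic concentration curve is quadratic by construction, the 2nd-order and cubic fits should reproduce it almost exactly, while the linear fit leaves a visible residual.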


In [3]:
tag = 'humidity_correction_avgP'
for data_file in data_files:
    print("Doing calculations for {}".format(data_file))
    fluxbot = Fluxbot(filename=data_file, output_tag=tag, do_avgP=do_avgP)
    print("Generating output for {}".format(fluxbot.title))
    fluxbot.generate_output(valid_only=False)
    print("Writing output file {}".format(fluxbot.output_filename))
    fluxbot.write()

Doing calculations for data/avgP/NMWC_OS1.csv




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



Generating output for Northern MWC Plot, Open Soil Replicate 1
Generating output for events and bad events
Writing output file data/avgP/NMWC_OS1_output_20_humidity_correction_avgP.csv
Doing calculations for data/avgP/NO_UT2.csv
Generating output for Northern O Plot, Under Tree Replicate 2
Generating output for events and bad events
Writing output file data/avgP/NO_UT2_output_20_humidity_correction_avgP.csv
Doing calculations for data/avgP/NO_UT3.csv
Generating output for Northern O Plot, Under Tree Replicate 3
Generating output for events and bad events
Writing output file data/avgP/NO_UT3_output_20_humidity_correction_avgP.csv
Doing calculations for data/avgP/NMWC_OS2.csv



divide by zero encountered in double_scalars



Generating output for Northern MWC Plot, Open Soil Replicate 2
Generating output for events and bad events
Writing output file data/avgP/NMWC_OS2_output_20_humidity_correction_avgP.csv
Doing calculations for data/avgP/NO_UT1.csv
Generating output for Northern O Plot, Under Tree Replicate 1
Generating output for events and bad events
Writing output file data/avgP/NO_UT1_output_20_humidity_correction_avgP.csv
Doing calculations for data/avgP/NMWC_OS3.csv
Generating output for Northern MWC Plot, Open Soil Replicate 3
Generating output for events and bad events
Writing output file data/avgP/NMWC_OS3_output_20_humidity_correction_avgP.csv
Doing calculations for data/avgP/NMWC_UT3.csv



invalid value encountered in double_scalars



Generating output for Northern MWC Plot, Under Tree Replicate 3
Generating output for events and bad events
Writing output file data/avgP/NMWC_UT3_output_20_humidity_correction_avgP.csv
Doing calculations for data/avgP/NMWC_UT2.csv
Generating output for Northern MWC Plot, Under Tree Replicate 2
Generating output for events and bad events
Writing output file data/avgP/NMWC_UT2_output_20_humidity_correction_avgP.csv
Doing calculations for data/avgP/NO_OS1.csv
Generating output for Northern O Plot, Open Soil Replicate 1
Generating output for events and bad events
Writing output file data/avgP/NO_OS1_output_20_humidity_correction_avgP.csv
Doing calculations for data/avgP/NO_OS3.csv
Generating output for Northern O Plot, Open Soil Replicate 3
Generating output for events and bad events
Writing output file data/avgP/NO_OS3_output_20_humidity_correction_avgP.csv
Doing calculations for data/avgP/NMWC_UT1.csv
Generating output for Northern MWC Plot, Under Tree Replicate 1
Generating output for 

## Testing Area for Single Fluxbot

The code below tests the functionality for a single fluxbot.

In [None]:
# data_file = 'data/NO OS2.csv'
# fluxbot = Fluxbot(filename=data_file)

In [None]:
# fluxbot.generate_output(valid_only=False)

## Generate an `all_events.csv` file

The code below reads in all the `output.csv` files in the data directory and concatenates them into a single dataframe. We then export that dataframe to a new `.csv` file.

In [6]:
import glob

# Collect the per-fluxbot output files written in the previous step.
output_files = glob.glob("{dir}/*_output_20_humidity_correction_avgP.csv".format(
    dir='data/avgP'))
print(output_files)

# Read each output file and concatenate them all into a single dataframe.
df_list = [pd.read_csv(file) for file in output_files]
all_output = pd.concat(df_list)
all_output.to_csv('data/avgP/all_events_with_bad_20.csv', index=False)

['data/avgP/NMWC_OS2_output_20_humidity_correction_avgP.csv', 'data/avgP/NO_OM2_output_20_humidity_correction_avgP.csv', 'data/avgP/NO_OM1_output_20_humidity_correction_avgP.csv', 'data/avgP/NMWC_OS1_output_20_humidity_correction_avgP.csv', 'data/avgP/NMWC_OM1_output_20_humidity_correction_avgP.csv', 'data/avgP/NO_OS1_output_20_humidity_correction_avgP.csv', 'data/avgP/NO_OS2_output_20_humidity_correction_avgP.csv', 'data/avgP/NMWC_OM2_output_20_humidity_correction_avgP.csv', 'data/avgP/NO_UT2_output_20_humidity_correction_avgP.csv', 'data/avgP/NO_UT1_output_20_humidity_correction_avgP.csv', 'data/avgP/NMWC_UT1_output_20_humidity_correction_avgP.csv', 'data/avgP/NMWC_UT2_output_20_humidity_correction_avgP.csv', 'data/avgP/NMWC_UT3_output_20_humidity_correction_avgP.csv', 'data/avgP/NO_UT3_output_20_humidity_correction_avgP.csv', 'data/avgP/NO_OS3_output_20_humidity_correction_avgP.csv', 'data/avgP/NMWC_OM3_output_20_humidity_correction_avgP.csv', 'data/avgP/NMWC_OS3_output_20_humidity_

## Write function to export event data

This function will be in the `event` object, and will export a single-row `DataFrame` with standard columns:

- `timestamp`
- `year`
- `month`
- `day`
- `hour`
- `avg_temp_degC`
- `avg_pressure_hPa`
- `avg_rel_humidity`
- `ambient_CO2_kg`
- `ambient_CO2_ppm`
- `beta`
- `duration`
- `change_in_CO2_kg`
- `1st_order_beta_0`
- `1st_order_beta_0_error`
- `1st_order_r_sq`
- `2nd_order_beta_0`
- `2nd_order_beta_0_error`
- `2nd_order_beta_1`
- `2nd_order_r_sq`
- `flux_umol_m2_sec`
- `flux_umol_m2_sec_error`
- `qaqc_flags`

- OPTIONAL: Include the `CO2_mass` data and `time` data used to do the regression fitting.

The `event_output` function will be called from a `fluxbot_output` function. The `fluxbot_output` function will add the following columns to each `event`:

- `data_file`
- `fluxbot_hardware_version`
- `fluxbot_software_version`
- `chamber_volume_cm3`
- `chamber_area_cm2`
- `treatment`
- `block`
- `location`
- `replicate`

The `Fluxbot.write` function will create a single dataframe containing all these columns and one row per event for a fluxbot. The output will be written to a `.csv` file by default.
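The way per-event rows and fluxbot-level columns could come together is sketched below. This is an assumption-laden illustration, not the `Fluxbot` code: `parse_data_file` and `add_fluxbot_columns` are hypothetical helpers, the split of a name such as `NMWC_OS1` into block `N`, treatment `MWC`, location `OS`, and replicate `1` is a guess based on the plot titles printed earlier, and the event values are placeholders.

```python
import re

import pandas as pd


def parse_data_file(path):
    """Split a name like 'data/avgP/NMWC_OS1.csv' into metadata codes.

    Assumption: the first letter is the block and the remaining letters
    before the underscore are the treatment.
    """
    stem = path.split('/')[-1].rsplit('.', 1)[0]
    match = re.fullmatch(r'([A-Z])([A-Z]+)_([A-Z]+)(\d+)', stem)
    if match is None:
        raise ValueError("Unexpected file name: {}".format(path))
    block, treatment, location, replicate = match.groups()
    return {
        'data_file': path,
        'block': block,
        'treatment': treatment,
        'location': location,
        'replicate': int(replicate),
    }


def add_fluxbot_columns(event_rows, data_file):
    """Stack single-row event frames and attach fluxbot-level columns."""
    table = pd.concat(event_rows, ignore_index=True)
    for column, value in parse_data_file(data_file).items():
        table[column] = value  # same value for every event of this fluxbot
    return table


# Placeholder single-row event frames (only a few of the standard columns).
event_rows = [
    pd.DataFrame([{'timestamp': '2000-01-01 12:00:00', 'duration': 300,
                   'qaqc_flags': 0}]),
    pd.DataFrame([{'timestamp': '2000-01-01 13:00:00', 'duration': 300,
                   'qaqc_flags': 1}]),
]

table = add_fluxbot_columns(event_rows, 'data/avgP/NMWC_OS1.csv')
print(table)
```

Keeping the fluxbot-level values as ordinary columns means the per-fluxbot tables can later be concatenated into one file without losing track of which chamber each event came from.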

Eventually, there will also be an `Event.write` function, which will export a `.csv` file containing the parsed and smoothed data. This `.csv` file can then be read back in using the `Event.read_csv` function.

