### Purpose of this notebook

This notebook demonstrates how to process TSSE spreadsheets once you have created them (as done in the `01_....ipynb` file) and filled them in (using Excel).

### Let's go!

In [1]:
from pathlib import Path
from tsse_data.toc_measurement import process_toc_spreadsheet

Create and examine the path to the folder containing the spreadsheets. Ordinarily the path you define here would be the same one used in the create_sheets notebook. However, to avoid the files being accidentally overwritten during the tutorial, a separate directory with filled spreadsheets has been provided.

In [2]:
p = Path.cwd() / 'measurements_filled'
p

PosixPath('/Users/ianbillinge/dev/tsse_data/tsse_data/tutorial/measurements_filled')
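Before going further, it can help to confirm that the folder exists and holds the spreadsheets you expect. The following is a minimal sketch using only `pathlib`; the helper name `list_spreadsheets` is hypothetical and is not part of `tsse_data`.

```python
from pathlib import Path


def list_spreadsheets(folder):
    """Return the .xlsx files in `folder`, sorted by path.

    Hypothetical helper for illustration; not part of tsse_data.
    """
    folder = Path(folder)
    if not folder.is_dir():
        # Fail early with a clear message instead of a later read error.
        raise FileNotFoundError(f"Spreadsheet folder not found: {folder}")
    return sorted(folder.glob('*.xlsx'))
```

Calling `list_spreadsheets(p)` in this tutorial should return a list that includes the filled-in TOC spreadsheet.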

Define the dimensions that are described in the spreadsheet, as well as the `common_dims`: descriptors that apply to the whole spreadsheet and were therefore not entered into it.

In [3]:
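dims = ['amine', 'cation', 'anion', 'experimenter', 'replicate']  # dimensions entered in the spreadsheet
cd = {'T':25}  # common dimension that applies to every row
dipa = {'mw':101.19, 'nc':6}  # amine properties passed to the processing function
calibration = None  # no calibration object is passed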

Define the filename of the spreadsheet you will read.

In [4]:
filename = 'toc.xlsx'
path = str(p / filename)
path

'/Users/ianbillinge/dev/tsse_data/tsse_data/tutorial/measurements_filled/toc.xlsx'

Now we can process the spreadsheet. This will return an xarray.Dataset object.
The most relevant variables in the dataset are `w_a` and `dw_a`: the weight fraction of amine and the uncertainty in that weight fraction.

In [5]:
ds = process_toc_spreadsheet(path, dims, dipa, common_dims=cd, calib=calibration)
ds

  log_w_toc_calib[i] = float(b) + float(m) * float(log_w_toc[i])
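To pull individual values out of the result, you can select by coordinate label. The sketch below builds a small stand-in Dataset by hand rather than calling `process_toc_spreadsheet`, so the layout (one dimension per descriptor, including `T`) is an assumption based on the exported table in this tutorial; the numbers are copied from that table.

```python
import numpy as np
import xarray as xr

# Stand-in for the processed dataset (assumed layout, single measurement).
dim_names = ['amine', 'cation', 'anion', 'experimenter', 'replicate', 'T']
coords = {
    'amine': ['dipa'], 'cation': ['Na'], 'anion': ['Cl'],
    'experimenter': ['ib'], 'replicate': [1], 'T': [25],
}
shape = (1,) * len(dim_names)
toy = xr.Dataset(
    {
        'w_a': (dim_names, np.full(shape, 0.002296)),
        'dw_a': (dim_names, np.full(shape, 0.0)),
    },
    coords=coords,
)

# Select by label, then drop the remaining length-1 dimensions.
selected = toy['w_a'].sel(amine='dipa', cation='Na', anion='Cl').squeeze()
w_a_value = selected.item()
```

With the real `ds`, the same `.sel(...)` pattern should apply as long as the dimension names match those in `dims` and `common_dims`.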


The data can be saved in a variety of formats. Here, we create a new folder called `measurements_processed`.

In [6]:
out_dir = Path.cwd() / 'measurements_processed'

out_dir.mkdir(exist_ok=True)  # does nothing if the folder already exists

To export to Excel or CSV, convert the dataset to a pandas DataFrame, then use pandas's export methods:

In [7]:
df = ds.to_dataframe()
df.to_excel(out_dir / 'toc.xlsx')
df.to_csv(out_dir / 'toc.csv')
df

| amine | cation | anion | experimenter | replicate | T | m_sample | m_water | toc_raw | dtoc_raw | w_toc_raw | dw_toc_raw | w_toc_calibrated | dw_toc_calibrated | w_toc_adj_dil | w_a | dw_a |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| dipa | Na | Cl | ib | 1 | 25 | 0.1 | 40 | 10000 | 100 | 1e-05 | 1e-07 | 4e-06 | 4e-06 | 0.001635 | 0.002296 | 0.0 |
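When reading the CSV back later, the descriptor columns come in as ordinary columns unless you tell pandas to rebuild the index. The sketch below shows one way to do that with `index_col`; it uses a small stand-in frame and a temporary directory rather than the tutorial's files, and only two of the data columns are included.

```python
import tempfile
from pathlib import Path

import pandas as pd

index_cols = ['amine', 'cation', 'anion', 'experimenter', 'replicate', 'T']

# Stand-in for the exported DataFrame (values copied from the table above).
toy = pd.DataFrame({
    'amine': ['dipa'], 'cation': ['Na'], 'anion': ['Cl'],
    'experimenter': ['ib'], 'replicate': [1], 'T': [25],
    'w_a': [0.002296], 'dw_a': [0.0],
}).set_index(index_cols)

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / 'toc.csv'
    toy.to_csv(csv_path)
    # Rebuild the MultiIndex from the descriptor columns on load.
    restored = pd.read_csv(csv_path, index_col=index_cols)
```

For the tutorial's own file, the equivalent call would be `pd.read_csv(out_dir / 'toc.csv', index_col=index_cols)`.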


To export the dataset itself in netCDF format (`.nc`), call `to_netcdf`:


In [8]:
ds.to_netcdf(out_dir / 'toc.nc')

This data can be loaded as an xarray Dataset easily later:

In [12]:
import xarray as xr
ds2 = xr.load_dataset(out_dir / 'toc.nc')
ds2