# ***Create pickle*** -- preparing a single synchronised datafile

Use this script first to combine raw potentiostat, LabView and photodiode data
into a single, synchronised file that can be efficiently stored and read for future analysis.

In [None]:
import pandas as pd
from matplotlib import pyplot as plt
import numpy as np
import glob
from pickle_utils import read_ecf, read_emsi, optimise_emsi_start, read_photodiode, read_ecf_text
from config import filenames, props

I prefer the plots to be in their own separate window.
This way it's easier to enlarge and manipulate them.
The following cell tells `matplotlib` to use the windowed `qt` backend:

In [None]:
%matplotlib qt

For debugging convenience, all of the external functions are reimported at each invocation with `autoreload`.

In [None]:
%load_ext autoreload
%autoreload 2

## Loading potentiostat data

In [None]:
ecfile = sorted(glob.glob(filenames.data+'*.txt'))[-1] # last text file in the folder (glob itself returns files in arbitrary order)
ecfile

Some old measurements don't have the potentiostat data stored as text.
Old ECLab files can be read with the Python package `eclabfiles`.
To use it, find the `.mpr` file and call the `read_ecf` function
instead of `read_ecf_text`.

In [None]:
ecfdf, ecf_start = read_ecf_text(ecfile)

## Optional -- loading the photodiode data

In [None]:
photodiode_files = glob.glob(filenames.data+'*photodiode.csv')  # search the folder only once
if photodiode_files:
    phdf, ph_start = read_photodiode(photodiode_files[0])

Photodiode data is stored in a single file.
Therefore, manual synchronisation is simpler and more accurate.
Plot the photodiode and current data next to each other
and shift the photodiode time reference (`phdf.t`) until both datasets match.

In [None]:
plt.figure()
plt.plot(phdf.t, phdf.photodiode/100)
plt.plot(ecfdf.t, ecfdf.I/props.el_area*100)

In [None]:
phdf.t += 0.3

In [None]:
ecfdf['photodiode'] = np.interp(ecfdf.t, phdf.t, phdf.photodiode)
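
To illustrate what the shift and the interpolation above do, here is a minimal sketch on synthetic data. The arrays `ec_t`, `ph_t` and `ph_signal` are made-up stand-ins for `ecfdf.t`, `phdf.t` and `phdf.photodiode`; only `np.interp` and the 0.3 s shift are taken from the cells above.

```python
import numpy as np

# Synthetic stand-ins for the real data (all values are invented for illustration)
ec_t = np.linspace(0.0, 10.0, 101)        # potentiostat time base, 0.1 s steps
ph_t = np.linspace(0.0, 10.0, 41)         # photodiode sampled more coarsely
ph_signal = (ph_t >= 4.0).astype(float)   # light switches on at t = 4 s

# Shift the photodiode time reference, as done manually above
ph_t_shifted = ph_t + 0.3

# Resample the photodiode signal onto the potentiostat time base
resampled = np.interp(ec_t, ph_t_shifted, ph_signal)
```

Outside the range of the shifted photodiode times, `np.interp` holds the first or last photodiode value, so the edges of the resampled trace are flat rather than extrapolated.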

## Loading LabView `.dat` files

A list of all `.dat` files in the folder will appear below.
Don't worry that some of them are from a CV measurement --
this will be sorted out later.

In [None]:
datfiles = glob.glob(filenames.data+'*.dat')
datfiles

The next cell runs a loop that:

1. Reads the `.dat` file.
2. Skips the file if it finished before the PRC measurement started (it's most likely a CV).
3. Automatically synchronises the `.dat` file to the potentiostat data by fitting the recorded currents.
4. Plots both currents on one plot so the user can visually check the precision of the fit.

In [None]:
joined_emsi = pd.DataFrame([])
for filename in datfiles:
    emsidf, emsi_start, emsi_end = read_emsi(filename)
    if emsi_end < ecf_start:
        continue
    emsidf.loc[:, 'I'] = emsidf.I * 980
    emsi_shift = optimise_emsi_start(ecfdf, ecf_start, emsidf, emsi_start)
    emsidf.loc[:, 't'] = emsidf.t + emsi_start - ecf_start - emsi_shift
    joined_emsi = pd.concat([joined_emsi, emsidf], ignore_index=True)
plt.figure()  # open a new window so the traces are not drawn into an earlier figure
plt.plot(joined_emsi.t, joined_emsi.I)
# plt.plot(joined_emsi.t, joined_emsi.light)
plt.plot(ecfdf.t, ecfdf.I)
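
The automatic synchronisation in step 3 is done by `optimise_emsi_start`, whose implementation lives in `pickle_utils` and is not shown here. The sketch below only illustrates the general idea of fitting a time shift between two current traces; the function `estimate_time_shift`, the grid of candidate shifts and the synthetic data are all assumptions made for this example, not the actual method.

```python
import numpy as np

def estimate_time_shift(t_ref, i_ref, t_other, i_other, shifts):
    """Return the candidate shift that, subtracted from t_other, best matches i_ref."""
    errors = []
    for shift in shifts:
        # Resample the shifted trace onto the reference time base
        resampled = np.interp(t_ref, t_other - shift, i_other)
        # Mean squared difference between the two current traces
        errors.append(np.mean((resampled - i_ref) ** 2))
    return shifts[int(np.argmin(errors))]

# Synthetic current transient, recorded 0.4 s late by the second instrument
t_ref = np.linspace(0.0, 10.0, 1001)
i_ref = np.exp(-(t_ref - 5.0) ** 2)
t_other = np.linspace(0.0, 10.0, 501)
i_other = np.exp(-(t_other - 5.4) ** 2)

shifts = np.linspace(-1.0, 1.0, 201)  # candidate shifts in 0.01 s steps
best_shift = estimate_time_shift(t_ref, i_ref, t_other, i_other, shifts)
```

A brute-force grid search like this is slow but easy to check by eye; the shift is subtracted from the time axis, matching the sign convention of `emsi_shift` in the loop above.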

## Final adjustments and saving

So far we have worked with absolute current. Let's convert it to current density:

In [None]:
ecfdf.I = 100*ecfdf.I/props.el_area

Join all the files and interpolate everything to the potentiostat's time sampling:

In [None]:
full_data = ecfdf.copy()
full_data['emsi'] = np.interp(ecfdf.t, joined_emsi.t, joined_emsi.emsi)
full_data['light'] = np.interp(ecfdf.t, joined_emsi.t, joined_emsi.light)
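
One caveat worth checking: `np.interp` assumes that its x-coordinates are increasing and does not verify this. If the `.dat` files happen to be concatenated out of chronological order, the interpolated columns can be silently wrong. The sketch below shows one possible safeguard on a hypothetical stand-in for `joined_emsi`; the data frame `joined` and its values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for joined_emsi: two chunks concatenated out of time order
joined = pd.DataFrame({
    "t":    [5.0, 6.0, 7.0, 1.0, 2.0, 3.0],
    "emsi": [50.0, 60.0, 70.0, 10.0, 20.0, 30.0],
})

# Sort by time and drop repeated time stamps before interpolating
joined_sorted = joined.sort_values("t").drop_duplicates("t").reset_index(drop=True)
is_sorted = joined_sorted.t.is_monotonic_increasing

# Interpolate onto an arbitrary target time base
target_t = np.array([1.5, 4.0, 6.5])
resampled = np.interp(target_t, joined_sorted.t, joined_sorted.emsi)
```

Sorting once before the two `np.interp` calls is cheap and removes the dependence on the order in which the files were listed.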

Save as a pickle. Feel free to change the data storage format if you need something more portable.

In [None]:
full_data.to_pickle(filenames.data+'data.pkl')  # save the joined frame, which includes the emsi and light columns
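
For later analysis, the pickle can be read back with `pd.read_pickle`. If a more portable format is needed, CSV is one option among several. The sketch below shows both round trips on a small made-up data frame; the frame `df` and the temporary file names are assumptions for this example only.

```python
import os
import tempfile
import pandas as pd

# Hypothetical stand-in for the merged data frame
df = pd.DataFrame({"t": [0.0, 0.1, 0.2], "I": [1.0, 1.5, 2.0]})

with tempfile.TemporaryDirectory() as tmpdir:
    pkl_path = os.path.join(tmpdir, "data.pkl")
    csv_path = os.path.join(tmpdir, "data.csv")

    # Pickle: fast and keeps dtypes, but readable only from Python
    df.to_pickle(pkl_path)
    from_pickle = pd.read_pickle(pkl_path)

    # CSV: plain text, readable by almost any tool, but larger and slower
    df.to_csv(csv_path, index=False)
    from_csv = pd.read_csv(csv_path)
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.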