# <center>Workflow for the CRC1333 project B07 - Technical Chemistry</center>
# <center>2.2 Experimental notebook - Analysis</center>

---

This is the ``Experimental`` ``notebook``, in which the experiments are analyzed. It consists of three parts: ``Parsing``, ``analysis`` and ``DaRUS`` ``upload``. Each project comprises multiple experiments, so multiple analyses have to be performed. This workflow is executed once per experiment, and the results can be appended to the project's dataset.

---

---
## Section 0: Imports, Paths, and Logging
---

In this section, all necessary Python packages are imported, and the path to this notebook and its logger are set up.

Activate autoreload.

In [1]:
%reload_ext autoreload
%autoreload 2

Import the packages needed to set up the ``logger`` and to handle paths, together with ``pandas``, which is used later in the analysis.

In [2]:
import os
import json
import logging
import logging.config
import pandas as pd
from pathlib import Path

Get the path to the directory in which this notebook is located.

In [3]:
root = Path(os.path.abspath(''))
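
As a quick check of what the cell above does: `os.path.abspath('')` resolves the empty string against the current working directory, so the expression should be equivalent to `Path.cwd()`. This only matches the notebook's location if the kernel was started in that directory, which is assumed here. The variable names below are illustrative.

```python
import os
from pathlib import Path

# abspath('') resolves the empty string against the current working directory
root_via_os = Path(os.path.abspath(""))
root_via_pathlib = Path.cwd()

# Both expressions should point to the same directory
same_directory = root_via_os == root_via_pathlib
print(same_directory)
```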

Set the path to the configuration file for the logger.

In [4]:
logging_config_path = root / "datamodel_b07_tc/tools/logging/config_exp_2_1.json"

Set up the logger by reading the JSON configuration file.

In [5]:
with open(logging_config_path) as logging_config_json:
    logging_config = json.load(logging_config_json)
logging.config.dictConfig(logging_config)
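
For orientation, the following is a minimal sketch of the kind of dictionary that `logging.config.dictConfig` accepts. The contents of `example_config` are an assumption made for illustration; the project's actual `config_exp_2_1.json` is not shown here and may define different handlers, formatters and levels.

```python
import logging
import logging.config

# Hypothetical minimal configuration; the project's JSON file may differ.
example_config = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "simple": {"format": "%(asctime)s %(name)s %(levelname)s: %(message)s"}
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "simple",
            "level": "INFO",
        }
    },
    "root": {"handlers": ["console"], "level": "DEBUG"},
}

# Apply the configuration and emit one test message
logging.config.dictConfig(example_config)
example_logger = logging.getLogger("example")
example_logger.info("Logger configured from a dictionary.")
```

Loading the same structure from a JSON file with `json.load`, as done above, yields an equivalent dictionary.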

Create a child of the root logger named after the current module (``__name__``, which evaluates to ``'__main__'`` when the code runs inside a notebook).

In [6]:
logger = logging.getLogger(__name__)

Set the level of several third-party module loggers to avoid flooding the log file.
<div class="alert alert-block alert-info"><b>Info:</b> Some third-party modules use the same logging module and structure as this notebook. This is unproblematic unless the level of their loggers is too low. In that case, low-level messages such as 'DEBUG' and 'INFO' are propagated to the parent logger of this notebook and end up in its log.</div>

In [7]:
third_party_module_loggers = ['markdown_it', 'h5py', 'numexpr', 'git']
for logger_ in third_party_module_loggers:
    logging.getLogger(logger_).setLevel('WARNING')
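
The propagation described above can be reproduced with the standard library alone. In this sketch, `ListHandler` and the logger name `noisy_module` are hypothetical stand-ins for the notebook's handlers and a third-party module.

```python
import logging


class ListHandler(logging.Handler):
    """Collects log messages in a list instead of writing them to a file."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.name}:{record.levelname}:{record.getMessage()}")


root_logger = logging.getLogger()
root_logger.setLevel(logging.DEBUG)
collector = ListHandler()
root_logger.addHandler(collector)

noisy = logging.getLogger("noisy_module")  # hypothetical third-party logger
noisy.setLevel(logging.NOTSET)             # inherits the root level (DEBUG)
noisy.debug("before")                      # propagates to the root handler

noisy.setLevel("WARNING")
noisy.debug("after")                       # dropped by the logger itself
noisy.warning("still visible")             # warnings still propagate

root_logger.removeHandler(collector)
print(collector.messages)
```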

Import and instantiate the ``Librarian`` module for efficient and clean file and directory handling.

In [8]:
from datamodel_b07_tc.tools import Librarian
librarian = Librarian(root_directory=root)

Import the modified sdRDM object.

In [9]:
from datamodel_b07_tc.modified.dataset import Dataset

<div class="alert alert-block alert-info"><b>Info:</b> Python objects created by the sdRDM generator can be equipped with additional features, such as functions or classes, e.g. to parse data or perform internal calculations, which allows for a more modular approach of working with them.</div>
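
The idea of extending a generated object can be sketched with plain Python classes. `GeneratedMeasurement` and `Measurement` are hypothetical stand-ins, not sdRDM classes; the re-wrapping step mirrors the `Dataset(**dataset.__dict__)` call used in Section 1.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GeneratedMeasurement:
    """Stand-in for a class produced by a code generator (hypothetical)."""

    measurement_type: str
    values: List[float] = field(default_factory=list)


class Measurement(GeneratedMeasurement):
    """Modified version: same fields, plus a convenience method."""

    def mean(self) -> float:
        return sum(self.values) / len(self.values)


generated = GeneratedMeasurement("GC measurement", [1.0, 2.0, 3.0])

# Re-wrap the generated object in the modified class by passing its fields
modified = Measurement(**generated.__dict__)
print(modified.mean())
```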

Import the data model containing all the objects of sdRDM's python API.

In [10]:
# from sdRDM.generator import generate_python_api
from sdRDM import DataModel

Optionally regenerate the sdRDM Python objects manually (commented out by default).

In [11]:
# generate_python_api('specifications/datamodel_b07_tc.md', '', 'datamodel_b07_tc')

Import tools used for the analysis.

In [12]:
from datamodel_b07_tc.tools import FaradayEfficiencyCalculator
from datamodel_b07_tc.tools import PeakAssigner

Import the packages needed to display interactive widgets.

In [13]:
import ipywidgets as widgets
# import ipython_blocking
from IPython.display import display

---
## Section 1: Dataset and data model parsing
---

In this section, the data model and the dataset are parsed, along with all output files necessary for the analysis.

List all subdirectories of the root directory.

In [14]:
root_subdirectories = librarian.enumerate_subdirectories(directory=root)

Parent directory: 
 /mnt/c/Users/rscho/Documents/GitHub/datamodel_b07_tc 
Available subdirectories:
0: .../.git
1: .../.github
2: .../.vscode
3: .../data
4: .../datamodel_b07_tc
5: .../datasets
6: .../logging
7: .../specifications


List all available dataset JSON files in the 'datasets' directory.

In [15]:
json_dataset_files = librarian.enumerate_files(directory=root_subdirectories[5], filter='json')

Directory: 
 /mnt/c/Users/rscho/Documents/GitHub/datamodel_b07_tc/datasets 
Available files:
0: b07.json
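
The implementation of ``Librarian`` is not shown in this notebook. As a rough sketch of how such an indexed file listing could be built with `pathlib`, consider the following; `enumerate_files_sketch` and its return type (a dictionary mapping index to path) are assumptions and not the project's API.

```python
import tempfile
from pathlib import Path
from typing import Dict


def enumerate_files_sketch(directory: Path, suffix: str) -> Dict[int, Path]:
    """Hypothetical stand-in: index all files in `directory` with the given suffix."""
    matches = sorted(
        p for p in directory.iterdir() if p.is_file() and p.suffix == f".{suffix}"
    )
    indexed = dict(enumerate(matches))
    print(f"Directory:\n {directory}\nAvailable files:")
    for index, path in indexed.items():
        print(f"{index}: {path.name}")
    return indexed


# Demonstrate on a temporary directory with one JSON file and one other file
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    (tmp_path / "b07.json").write_text("{}")
    (tmp_path / "notes.txt").write_text("ignored")
    found = enumerate_files_sketch(tmp_path, "json")
    found_names = [p.name for p in found.values()]
```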


In [16]:
json_dataset = json_dataset_files[0]
dataset, lib = DataModel.parse(json_dataset)
dataset = Dataset(**dataset.__dict__)

Visualize the data model.

In [17]:
# lib.Dataset.meta_tree()

Print current status of the dataset.

In [18]:
# print(dataset.json())

---
## Section 2: Analysis
---

List all experiment objects of the dataset.

In [19]:
experiments_dict = dataset.enumerate('experiments')

0: experiment0


Select the experiment object to be analyzed.

In [20]:
experiment = experiments_dict[0]

### Peak assignment

<div class="alert alert-block alert-info"><b>Info:</b> The peak areas recorded by the GC have to be matched with the correct species. The individual <i>Area</i> is selected by its corresponding <i>Peak_Number</i>. The same species may be responsible for multiple peaks, i.e. multiple peaks can be assigned to the same species.</div>
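
The resulting data structure can be illustrated without the interactive widget. In this sketch, the peak table and the mapping from peak number to species are hypothetical values chosen for illustration; in the notebook, the mapping is selected interactively through ``PeakAssigner``.

```python
from collections import defaultdict

# Hypothetical GC peak table: (peak number, area); values are illustrative only
peaks = [(1, 10.0), (2, 20.0), (3, 30.0), (4, 5.0)]

# Hypothetical manual assignment; two peaks are assigned to carbon dioxide
peak_to_species = {
    1: "Hydrogen",
    2: "Carbon monoxide",
    3: "Carbon dioxide",
    4: "Carbon dioxide",
}

# Collect the areas of all peaks assigned to the same species
assignment = defaultdict(list)
for peak_number, area in peaks:
    species_name = peak_to_species.get(peak_number)
    if species_name is not None:  # unassigned peaks are skipped
        assignment[species_name].append(area)
assignment = dict(assignment)
print(assignment)
```

Each species maps to a list of areas, which matches the shape of the assignment dictionaries printed further below.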

Get a list of all three GC measurements.

In [21]:
gc_measurements = [gc for gc in experiment.measurements if gc.measurement_type == 'GC measurement']

In [22]:
species = ['Hydrogen', 'Carbon monoxide', 'Carbon dioxide', 'Methane', 'Ethene', 'Ethane']

In [23]:
peak_assignment = PeakAssigner.from_gc_measurement(gc_measurements, species)
peak_assignment.assign_peaks()

HBox(children=(VBox(children=(Label(value='Measurement number 0', layout=Layout(height='30px', justify_content…

HBox(children=(Button(description='Save Assignments', layout=Layout(align_items='center', width='30%'), style=…

Output()

In [24]:
stop

NameError: name 'stop' is not defined

In [30]:
print(peak_assignment._assignment_dicts)

[{'Hydrogen': [69.17], 'Carbon dioxide': [65492.75], 'Methane': [164.16], 'Carbon monoxide': [2876.95]}, {'Hydrogen': [104.63], 'Carbon dioxide': [70813.52], 'Methane': [317.44], 'Carbon monoxide': [3685.7]}, {'Hydrogen': [97.26], 'Carbon dioxide': [71603.85], 'Methane': [317.31], 'Carbon monoxide': [3433.34]}]


Print current state of the experiment object.

In [31]:
# print(experiment.json())

### Calculation

Set averaging radius.
<div class="alert alert-block alert-info"><b>Info:</b> The mass flow at the time of a GC measurement is determined by matching the time of the GC measurement with the corresponding times of the mass flow measurements. To reduce errors caused by strong fluctuations, the mass flow is averaged over a certain number (the 'radius') of measuring points before and after the time of the GC measurement. The radius should be chosen according to the strength of the fluctuations.</div>

In [32]:
mean_radius = 10
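
The averaging described above could look roughly like the following sketch. The function `mean_flow_around` and the example data are hypothetical; the calculator's actual implementation is not shown in this notebook and may differ, for example in how it handles the edges of the time series. The sketch assumes a non-empty list of timestamps sorted in ascending order.

```python
from bisect import bisect_left
from statistics import mean
from typing import Sequence


def mean_flow_around(
    times: Sequence[float], flows: Sequence[float], gc_time: float, radius: int
) -> float:
    """Average the flow over `radius` points before and after the sample closest to gc_time."""
    # Index of the first timestamp >= gc_time; step back if the previous one is closer
    idx = bisect_left(times, gc_time)
    if idx == len(times) or (
        idx > 0 and gc_time - times[idx - 1] <= times[idx] - gc_time
    ):
        idx -= 1
    # Clip the averaging window at the edges of the series
    start = max(0, idx - radius)
    stop = min(len(flows), idx + radius + 1)
    return mean(flows[start:stop])


# Illustrative data: a short spike in the mass flow around t = 3
example_times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
example_flows = [10.0, 10.0, 11.0, 14.0, 11.0, 10.0, 10.0]

narrow = mean_flow_around(example_times, example_flows, gc_time=3.2, radius=1)
wide = mean_flow_around(example_times, example_flows, gc_time=3.2, radius=2)
print(narrow, wide)
```

A larger radius smooths the spike more strongly, which is why the radius has to be chosen according to the fluctuations in the data.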

Set up the ``FaradayEfficiencyCalculator``.

In [33]:
fe_calculator = FaradayEfficiencyCalculator(
    experiment=experiment,
    electrode_surface_area=1.0,
    assignment_dicts=peak_assignment.get_assignment_dicts
)
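
For context, the Faraday efficiency of a product is commonly defined as the share of the total current that goes into forming that product. The sketch below implements this textbook definition; it is not confirmed to be what ``FaradayEfficiencyCalculator`` does internally, and the function name, parameter names and example numbers are assumptions.

```python
FARADAY_CONSTANT = 96485.33212  # C/mol


def faraday_efficiency_percent(
    molar_flow_mol_per_s: float, electrons_per_molecule: int, current_a: float
) -> float:
    """Textbook definition: partial current of one product relative to the total current."""
    partial_current = electrons_per_molecule * FARADAY_CONSTANT * molar_flow_mol_per_s
    return 100.0 * partial_current / abs(current_a)


# Illustrative numbers: a two-electron product that consumes half of a 0.1 A current
example_current = 0.1
example_molar_flow = 0.5 * example_current / (2 * FARADAY_CONSTANT)
example_fe = faraday_efficiency_percent(example_molar_flow, 2, example_current)
print(round(example_fe, 6))
```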

Calculate the Faraday efficiencies.

In [38]:
mean_faraday_efficiencies_df = fe_calculator.calculate_faraday_efficiencies()
mean_faraday_efficiencies_df

TypeError: FaradayEfficiencyCalculator.calculate_faraday_efficiencies() missing 3 required positional arguments: 'gc_measurement', 'mean_radius', and 'assigned_peak_areas_dict'

In [None]:
faraday_efficiencies = []
# One result per GC measurement, paired with its assigned peak areas
for gc_measurement, assigned_peak_areas_dict in zip(gc_measurements, peak_assignment._assignment_dicts):
    # Method name and keyword arguments follow the signature reported in the TypeError above
    faraday_efficiency = fe_calculator.calculate_faraday_efficiencies(
        gc_measurement=gc_measurement,
        mean_radius=mean_radius,
        assigned_peak_areas_dict=assigned_peak_areas_dict
    )
    faraday_efficiencies.append(faraday_efficiency)


Calculate the mean Faraday efficiency per species across all GC measurements.

In [None]:
mean_faraday_efficiency = pd.concat(faraday_efficiencies).groupby(level=0).mean()
mean_faraday_efficiency
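
The `pd.concat(...).groupby(level=0).mean()` pattern stacks the per-measurement results and averages all rows that share the same index label. The sketch below uses hypothetical data frames indexed by species; the column name and the values are illustrative, and the structure of the calculator's actual return value is an assumption.

```python
import pandas as pd

# Hypothetical per-measurement results, indexed by species (illustrative values)
fe_measurement_0 = pd.DataFrame(
    {"faraday_efficiency": [10.0, 40.0]}, index=["Hydrogen", "Carbon monoxide"]
)
fe_measurement_1 = pd.DataFrame(
    {"faraday_efficiency": [20.0, 50.0]}, index=["Hydrogen", "Carbon monoxide"]
)

# Stack the frames; each species now appears once per measurement
stacked = pd.concat([fe_measurement_0, fe_measurement_1])

# Group rows by the index label (level 0) and average within each group
mean_per_species = stacked.groupby(level=0).mean()
print(mean_per_species)
```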

Store the mean Faraday efficiency of each species in the corresponding species data of the experiment.

In [None]:
for species_data in experiment.species_data:
    if species_data.species in mean_faraday_efficiency.index:
        faraday_efficiency = mean_faraday_efficiency.loc[species_data.species].values
        species_data.faraday_efficiency = lib.Data(
            quantity='Faraday efficiency',
            values=faraday_efficiency.tolist(),
            unit='%',
        )

Print dataset.

In [None]:
# print(dataset)

Append the analyzed experiment to the dataset.

In [None]:
dataset.experiments.append(experiment)

Write the updated dataset back to the JSON file.

In [None]:
with open(json_dataset, "w") as f:
    f.write(dataset.json())