# Construct the `experiment_info` Dictionary

This notebook is meant for developers who would like to contribute to the utilities of the [repository](https://github.com/MaCFP/matl-db). It also gives the maintainers of the repository a way to easily update the dictionary when new data comes in. 

It details how the README files are processed and how a dictionary is created that contains the information on the different experiments. The dictionary is automatically written to a text file.

In [1]:
import os
import sys
import importlib
import copy

import pandas as pd


# Absolute path to MaCFP matl repository:
# "D:\Git\MaCFP_matl_MyFork\matl-db"
macfp_matl_root = os.path.join("d:/", "Git", "MaCFP_matl_MyFork", "matl-db")


# Add the Utilities directory to the paths where Python looks for modules.
base_func_script = os.path.join(macfp_matl_root, "Utilities")
sys.path.insert(1, base_func_script)

# Import the basic_functions script.
import basic_functions as base_f
# Re-import the basic_functions script
# (necessary for changes to take effect without kernel restart).
importlib.reload(base_f)

<module 'basic_functions' from 'd:/Git\\MaCFP_matl_MyFork\\matl-db\\Utilities\\basic_functions.py'>

In [2]:
# Import general information needed for this notebook.

# Path to the PMMA data.
pmma_path = os.path.join(macfp_matl_root, "Non-charring", "PMMA")

First, all institute labels are taken from the directory names.

In [3]:
# Get all institute labels from the directory names.
institutes = [f.name for f in os.scandir(pmma_path) if f.is_dir()]

# Check results, remove ";".
institutes;

# TGA Experiment Information

## Get All README Files Containing TGA Data

All README files are read from the institute directories. Each file is checked if it contains information on TGA experiments. 

In [4]:
experiment_label = "TGA"
# Get all README files that contain information on the experiments.
tga_readme_files = base_f.get_exp_readme_files(institutes=institutes,
                                               base_path=pmma_path, 
                                               experiment_key=experiment_label)

# Check results, remove ";".
tga_readme_files.keys();

* Institutes that contributed TGA data:
  Aalto

  DBILund
  + True

  Edinburgh

  FM

  GIDAZE+
  + True

  HKPoly
  + True

  LCPP
  + True

  NIST
  + True

  Sandia
  + True

  TIFP
  + True

  UClan
  + True

  UDRI
  + True

  UMD
  + True

  UMET
  + True

  UQ
  + True
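
The collection step above can be illustrated with a minimal sketch. It is not the implementation of `base_f.get_exp_readme_files`; the function name `collect_readme_files`, the file name `README.md`, and the heading format used in the demo files are assumptions made for illustration only.

```python
import os
import tempfile


def collect_readme_files(institutes, base_path, experiment_key):
    """Return {institute: README lines} for READMEs that mention the key.

    Hypothetical stand-in for base_f.get_exp_readme_files; the real
    function may use different file names and matching rules.
    """
    readme_files = {}
    for institute in institutes:
        readme_path = os.path.join(base_path, institute, "README.md")
        if not os.path.isfile(readme_path):
            continue
        with open(readme_path, "r", encoding="utf8") as readme:
            lines = readme.readlines()
        # Keep the file only if a heading line names the experiment.
        if any(line.startswith("#") and experiment_key in line
               for line in lines):
            readme_files[institute] = lines
    return readme_files


# Small demonstration with a temporary directory tree (assumed layout).
with tempfile.TemporaryDirectory() as tmp_root:
    demo = {
        "InstA": "# InstA\n### Experimental Conditions: TGA\n",
        "InstB": "# InstB\n### Experimental Conditions: Cone Calorimeter\n",
    }
    for name, text in demo.items():
        os.makedirs(os.path.join(tmp_root, name))
        readme_path = os.path.join(tmp_root, name, "README.md")
        with open(readme_path, "w", encoding="utf8") as demo_file:
            demo_file.write(text)
    found = collect_readme_files(sorted(demo), tmp_root, "TGA")

# Only the institute whose README has a TGA heading is kept.
print(list(found.keys()))
```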



## Extract the Information on the Experiments
Now, the README files are parsed and the information on the respective experiments is extracted.

In [5]:
experiment_info = {experiment_label: dict()}

# for institute in ["UMET"]:
for institute in tga_readme_files.keys():
    # Get README file content.
    readme_lines = tga_readme_files[institute]
    
    # Extract institute label and name.
    institute_info = base_f.get_institute(readme_lines)
    institute_label = institute_info[0]
    print(institute_label)
    
    # Extract the TGA experiment information.
    experiment_lines = base_f.read_experiment_lines(readme_lines, 
                                                    start_marker_a=experiment_label, 
                                                    start_marker_b="###", 
                                                    end_marker="###")
    
    # Extract test conditions summary table.
    test_cond_df = base_f.read_test_condition_table(experiment_lines)
    
    # Combine the information from above to create a dictionary
    # for the experiment repetitions of a given institute.
    institute_test_info = base_f.fill_tga_dict(experiment_lines,
                                               institute_name_info=institute_info,
                                               exp_table_df=test_cond_df,
                                               tga_base_dict=base_f.experiment_template["TGA_base"],
                                               material_path=pmma_path)
    
    # Store the test information under the institute label.
    experiment_info[experiment_label][institute_label] = institute_test_info

DBILund
DBILund_TGA_N2_20K_1.csv
DBILund_TGA_N2_20K_2.csv
DBILund_TGA_N2_20K_3.csv
GIDAZE+
GIDAZE+_TGA_N2_10K_1.csv
GIDAZE+_TGA_N2_10K_2.csv
GIDAZE+_TGA_O2-10_10K_1.csv
GIDAZE+_TGA_O2-10_10K_2.csv
GIDAZE+_TGA_O2-21_10K_1.csv
GIDAZE+_TGA_O2-21_10K_2.csv
GIDAZE+_TGA_O2-21_10K_3.csv
GIDAZE+_TGA_O2-21_10K_4.csv
HKPoly
HKPolyU_TGA_N2_10K_1.csv
HKPolyU_TGA_N2_10K_2.csv
HKPolyU_TGA_O2-21_10K_1.csv
HKPolyU_TGA_O2-21_10K_2.csv
LCPP
LCPP_TGA_N2_2-5K_1.csv
LCPP_TGA_N2_2-5K_2.csv
LCPP_TGA_N2_2-5K_3.csv
LCPP_TGA_N2_5K_1.csv
LCPP_TGA_N2_5K_2.csv
LCPP_TGA_N2_5K_3.csv
LCPP_TGA_N2_10K_1.csv
LCPP_TGA_N2_10K_2.csv
LCPP_TGA_N2_10K_3.csv
LCPP_TGA_N2_15K_1.csv
LCPP_TGA_N2_15K_2.csv
LCPP_TGA_N2_15K_3.csv
LCPP_TGA_N2_20K_1.csv
LCPP_TGA_N2_20K_2.csv
LCPP_TGA_N2_20K_3.csv
NIST
NIST_TGA_N2_10K_1.csv
SANDIA
SANDIA_TGA_Ar_1K_1.csv
SANDIA_TGA_Ar_10K_1.csv
SANDIA_TGA_Ar_10K_2.csv
SANDIA_TGA_Ar_50K_1.csv
SANDIA_TGA_Ar_50K_2.csv
SANDIA_TGA_Ar_50K_3.csv
TIFP
TIFP_TGA_N2_10K_1.csv
TIFP_TGA_N2_10K_2.csv
UClan
* An except
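
To make the marker-based extraction more concrete, the following sketch shows one way a README section could be cut out between a start heading and the next heading. It only mirrors the parameter names of `base_f.read_experiment_lines`; the function `extract_section` and the demo README lines are hypothetical, and the real function may behave differently.

```python
def extract_section(readme_lines, start_marker_a, start_marker_b, end_marker):
    """Collect the lines of one README section (hypothetical helper).

    The section starts at the first line containing both start markers
    and ends just before the next line containing the end marker.
    """
    section = []
    inside = False
    for line in readme_lines:
        if not inside:
            if start_marker_a in line and start_marker_b in line:
                inside = True
                section.append(line)
        elif end_marker in line:
            # The next heading closes the section.
            break
        else:
            section.append(line)
    return section


# Assumed README layout, for illustration only.
demo_lines = [
    "# Demo institute",
    "### Experimental Conditions, TGA",
    "* Heating rate: 10 K/min",
    "* Atmosphere: N2",
    "### Experimental Conditions, DSC",
    "* Heating rate: 20 K/min",
]

tga_lines = extract_section(demo_lines, "TGA", "###", "###")
print(tga_lines)
```

Using the same string for `start_marker_b` and `end_marker` works here because the start heading is consumed before the end marker is checked.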

# DSC Experiment Information

## Get All README Files Containing DSC Data

All README files are read from the institute directories. Each file is checked if it contains information on DSC experiments. 

In [6]:
experiment_label = "DSC"
# Get all README files that contain information on the experiments.
dsc_readme_files = base_f.get_exp_readme_files(institutes=institutes,
                                               base_path=pmma_path, 
                                               experiment_key=experiment_label)

# Check results, remove ";".
dsc_readme_files.keys();

* Institutes that contributed DSC data:
  Aalto

  DBILund
  + True

  Edinburgh

  FM

  GIDAZE+
  + True

  HKPoly
  + True

  LCPP

  NIST

  Sandia
  + True

  TIFP
  + True

  UClan
  + True

  UDRI

  UMD
  + True

  UMET
  + True

  UQ



## Extract the Information on the Experiments
Now, the README files are parsed and the information on the respective experiments is extracted.

In [7]:
experiment_info[experiment_label] = dict()
exp_template = base_f.experiment_template["DSC_base"]

# for institute in ["UMET"]:
for institute in dsc_readme_files.keys():
    # Get README file content.
    readme_lines = dsc_readme_files[institute]
    
    # Extract institute label and name.
    institute_info = base_f.get_institute(readme_lines)
    institute_label = institute_info[0]
    print(institute_label)
    
    # Extract the experiment information.
    experiment_lines = base_f.read_experiment_lines(readme_lines, 
                                                    start_marker_a=experiment_label, 
                                                    start_marker_b="###", 
                                                    end_marker="###")
    
    # Extract test conditions summary table.
    test_cond_df = base_f.read_test_condition_table(experiment_lines)
    
    # Combine the information from above to create a dictionary
    # for the experiment repetitions of a given institute.
    institute_test_info = base_f.fill_dsc_dict(experiment_lines,
                                               institute_name_info=institute_info,
                                               exp_table_df=test_cond_df,
                                               dsc_base_dict=exp_template,
                                               material_path=pmma_path)
    
    # Store the test information under the institute label.
    experiment_info[experiment_label][institute_label] = institute_test_info

DBILund
DBILund_DSC_N2_20K_1.csv
DBILund_DSC_N2_20K_2.csv
DBILund_DSC_N2_20K_3.csv
GIDAZE+
GIDAZE+_TGA_N2_10K_1.csv
GIDAZE+_TGA_N2_10K_2.csv
GIDAZE+_TGA_O2-10_10K_1.csv
GIDAZE+_TGA_O2-10_10K_2.csv
GIDAZE+_TGA_O2-21_10K_1.csv
GIDAZE+_TGA_O2-21_10K_2.csv
GIDAZE+_TGA_O2-21_10K_3.csv
GIDAZE+_TGA_O2-21_10K_4.csv
HKPoly
HKPolyU_DSC_N2_10K_1.csv
HKPolyU_DSC_N2_10K_2.csv
HKPolyU_DSC_O2-21_10K_1.csv
HKPolyU_DSC_O2-21_10K_2.csv
SANDIA
SANDIA_DSC_Ar_1K_1.csv
SANDIA_DSC_Ar_10K_1.csv
SANDIA_DSC_Ar_10K_2.csv
SANDIA_DSC_Ar_50K_1.csv
SANDIA_DSC_Ar_50K_2.csv
SANDIA_DSC_Ar_50K_3.csv
TIFP
TIFP_DSC_N2_10K_1.csv
TIFP_DSC_N2_10K_2.csv
UClan
* An exception occurred: 'None' will be set to 'None'.

UClan_DSC_N2_10K_1.csv
UMD
* An exception occurred: 'None' will be set to 'None'.

UMD_TGA_N2_10K_1.csv
UMET
UMET_DSC_N2_3K_1.csv
UMET_DSC_N2_10K_1.csv
UMET_DSC_N2_10K_2.csv
UMET_DSC_N2_20K_1.csv
UMET_DSC_N2_20K_2.csv


In [8]:
# Check results.
experiment_info["DSC"].keys()

dict_keys(['DBILund', 'GIDAZE+', 'HKPoly', 'SANDIA', 'TIFP', 'UClan', 'UMD', 'UMET'])

# Cone Calorimeter Experiment Information

## Get All README Files Containing Cone Calorimeter Data

All README files are read from the institute directories. Each file is checked if it contains information on Cone Calorimeter experiments. 

In [9]:
experiment_label = "Cone Calorimeter"
# Get all README files that contain information on the experiments.
cone_readme_files = base_f.get_exp_readme_files(institutes=institutes,
                                                base_path=pmma_path,
                                                experiment_key=experiment_label)

# Check results, remove ";".
cone_readme_files.keys();

* Institutes that contributed Cone Calorimeter data:
  Aalto
  + True

  DBILund
  + True

  Edinburgh
  + True

  FM

  GIDAZE+
  + True

  HKPoly
  + True

  LCPP
  + True

  NIST
  + True

  Sandia

  TIFP
  + True

  UClan
  + True

  UDRI
  + True

  UMD

  UMET

  UQ
  + True



## Extract the Information on the Experiments
Now, the README files are parsed and the information on the respective experiments is extracted.

In [10]:
experiment_info[experiment_label] = dict()
exp_template = base_f.experiment_template["Cone_base"]
readme_files = cone_readme_files


# for institute in ["UMET"]:
for institute in readme_files.keys():
    # Get README file content.
    readme_lines = readme_files[institute]
    
    # Extract institute label and name.
    institute_info = base_f.get_institute(readme_lines)
    institute_label = institute_info[0]
    print(institute_label)
    
    # Extract the experiment information.
    experiment_lines = base_f.read_experiment_lines(readme_lines, 
                                                    start_marker_a=experiment_label, 
                                                    start_marker_b="###", 
                                                    end_marker="###")
    
    # Extract test conditions summary table.
    test_cond_df = base_f.read_test_condition_table(experiment_lines)
    
    # Combine the information from above to create a dictionary
    # for the experiment repetitions of a given institute.
    institute_test_info = base_f.fill_cone_dict(experiment_lines,
                                                institute_name_info=institute_info,
                                                exp_table_df=test_cond_df,
                                                base_dict=exp_template,
                                                material_path=pmma_path)
    
    # Store the test information under the institute label.
    experiment_info[experiment_label][institute_label] = institute_test_info

Aalto
* An exception occurred: 'None' will be set to 'None'.

* An exception occurred: 'None' will be set to 'None'.

* An exception occurred: 'None' will be set to 'None'.

Aalto_Cone_65kW_1.csv
Aalto_Cone_65kW_2.csv
Aalto_Cone_65kW_3.csv
DBILund
* An exception occurred: '[None]' will be set to 'None'.

* An exception occurred: '[None]' will be set to 'None'.

* An exception occurred: '[None]' will be set to 'None'.

* An exception occurred: '[None]' will be set to 'None'.

* An exception occurred: '[None]' will be set to 'None'.

* An exception occurred: '[None]' will be set to 'None'.

* An exception occurred: '[None]' will be set to 'None'.

* An exception occurred: '[None]' will be set to 'None'.

* An exception occurred: '[None]' will be set to 'None'.

* An exception occurred: '[None]' will be set to 'None'.

* An exception occurred: '[None]' will be set to 'None'.

* An exception occurred: '[None]' will be set to 'None'.

DBILund_Cone_25kW_1.csv
* Missing info:  Air [?]
DBILund

In [11]:
# Check results.
experiment_info["Cone Calorimeter"].keys()

dict_keys(['Aalto', 'DBILund', 'Edinburgh', 'GIDAZE+', 'HKPoly', 'LCPP', 'NIST', 'TIFP', 'UClan', 'UDRI', 'UQ'])

# Overview of All Collected Experiments

In [12]:
# Check content of the whole dictionary.
list(experiment_info.keys())

['TGA', 'DSC', 'Cone Calorimeter']
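
Beyond listing the top-level keys, a short summary of how many tests each institute contributed can help to spot parsing problems. The sketch below assumes the nesting `experiment_info[experiment type][institute][test label]`, which matches how the dictionary is accessed later in this notebook; the helper `count_tests` and the toy data are hypothetical.

```python
def count_tests(experiment_info):
    """Return {experiment type: {institute: number of tests}}."""
    return {
        exp_type: {inst: len(tests) for inst, tests in institutes.items()}
        for exp_type, institutes in experiment_info.items()
    }


# Toy dictionary with the assumed nesting, for illustration only.
toy_info = {
    "TGA": {
        "InstA": {"InstA_TGA_N2_10K_1": {}, "InstA_TGA_N2_10K_2": {}},
        "InstB": {"InstB_TGA_N2_20K_1": {}},
    },
    "DSC": {
        "InstA": {"InstA_DSC_N2_10K_1": {}},
    },
}

summary = count_tests(toy_info)
for exp_type, per_institute in summary.items():
    print(exp_type, per_institute)
```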

# Save Experiment Information Dictionary
Now, the extracted experiment information that was stored in the dictionary is saved as a text file (Python). For further work, the dictionary can then be imported from the Python file, which makes the information in it readily accessible. 

In [13]:
# A text file "ExperimentInformation.py" is created and the content of
# the dictionary is written to it.
with open('ExperimentInformation.py','w', encoding='utf8') as exp_info_file:
    exp_info_file.write("matl_db_info = " + str(experiment_info))
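
Writing `str(experiment_info)` yields an importable file only if the dictionary contains nothing but Python literals (strings, numbers, `None`, lists, dictionaries). The following sketch shows a possible round-trip check under that assumption, using `pprint.pformat` for a more readable file and `ast.literal_eval` to parse it back. The toy dictionary and its key names are hypothetical.

```python
import ast
import os
import pprint
import tempfile

# Toy dictionary with hypothetical content, literals only.
toy_info = {
    "TGA": {
        "InstA": {
            "InstA_TGA_N2_10K_1": {"path": "a/b.csv", "heating_rate": None}
        }
    }
}

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "ExperimentInformation.py")
    # pformat produces a valid, line-wrapped Python literal.
    with open(out_path, "w", encoding="utf8") as out_file:
        out_file.write("matl_db_info = " + pprint.pformat(toy_info) + "\n")
    with open(out_path, "r", encoding="utf8") as in_file:
        content = in_file.read()

# Parse everything after the first "=" back into a dictionary.
literal_text = content.split("=", 1)[1].strip()
restored = ast.literal_eval(literal_text)
print(restored == toy_info)
```

If the dictionary ever holds objects whose `str()` is not a literal, for example a `DataFrame`, the check fails early instead of producing a file that cannot be imported.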

# Read the Experiment Information Dictionary
The created file that stores the experiment information dictionary is read again and some brief checks are performed to ensure the process worked as intended.

In [14]:
# Demonstration of how to read the dictionary again.

# Import of the Python file.
import ExperimentInformation as exp_info
# Re-import ExperimentInformation script 
# (necessary for changes to take effect without kernel restart).
importlib.reload(exp_info)

# Get path of a CSV file containing data series of a TGA experiment.
file_name = exp_info.matl_db_info["TGA"]["UMET"]["UMET_TGA_N2_1K_1"]["path"]

# Read file as a Pandas DataFrame and show the first five lines.
pd.read_csv(os.path.join(macfp_matl_root, file_name)).head()

Unnamed: 0,Time,Temperature,Mass
0,[s],[K],[mg]
1,0,313.115,4.3361
2,30,313.615,4.3361
3,60,314.115,4.3361
4,90,314.615,4.3361
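
In the output above, the first data row holds the units (`[s]`, `[K]`, `[mg]`), so a plain `pd.read_csv` call leaves all columns as strings. A possible way to get numeric columns is sketched below, assuming the units row sits directly below the header; the in-memory CSV text is a shortened stand-in for a real data file.

```python
import io

import pandas as pd

# Shortened stand-in for a TGA data file (assumed layout).
csv_text = (
    "Time,Temperature,Mass\n"
    "[s],[K],[mg]\n"
    "0,313.115,4.3361\n"
    "30,313.615,4.3361\n"
    "60,314.115,4.3361\n"
)

# Read only the units row and keep it as {column: unit}.
units = pd.read_csv(io.StringIO(csv_text), nrows=1).iloc[0].to_dict()

# Skip the units row (line index 1 of the file) to get numeric columns.
data = pd.read_csv(io.StringIO(csv_text), skiprows=[1])

print(units)
print(data.dtypes)
```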


In [15]:
# Another example, where the "_STA_" part in the test label
# needs to be changed to "_DSC_" to read the *.csv file:

# Get path of a CSV file containing data series of a DSC experiment.
file_name = exp_info.matl_db_info["DSC"]["SANDIA"]["SANDIA_STA_Ar_1K_1"]["path"]

# Read file as a Pandas DataFrame and show the first five lines.
pd.read_csv(os.path.join(macfp_matl_root, file_name)).head()

Unnamed: 0,Time,Temperature,Heat Flow
0,[s],[K],[W/g]
1,1202.2998,307.031,0.00363
2,1434.8772,307.531,0.00841
3,1513.2828,308.031,0.0134
4,1573.4652,308.531,0.0146
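
The renaming from "_STA_" to "_DSC_" mentioned in the comment above could be wrapped in a small helper. This is only a sketch: the function `csv_name_from_label` is hypothetical, and it assumes that the second underscore-separated part of a test label names the apparatus.

```python
def csv_name_from_label(test_label, experiment_type):
    """Build a CSV file name from a test label (hypothetical helper)."""
    parts = test_label.split("_")
    # The second part of the label is assumed to name the apparatus.
    if len(parts) > 1 and parts[1] == "STA":
        parts[1] = experiment_type
    return "_".join(parts) + ".csv"


sta_name = csv_name_from_label("SANDIA_STA_Ar_1K_1", "DSC")
dsc_name = csv_name_from_label("UMET_DSC_N2_10K_1", "DSC")
print(sta_name)
print(dsc_name)
```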


In [16]:
# Another example, reading Cone Calorimeter data:

# Get path of a CSV file containing data series of a Cone Calorimeter experiment.
file_name = exp_info.matl_db_info['Cone Calorimeter']['NIST']['NIST_Cone_25KW_2']['path']

# Read file as a Pandas DataFrame and show the first five lines.
pd.read_csv(os.path.join(macfp_matl_root, file_name)).head()

Unnamed: 0,Time,Mass,HRR
0,[s],[g],[kW/m2]
1,0,154.9929772,-0.510215
2,1,155.2843183,0.70071
3,2,155.2847016,-0.317272
4,3,155.1189055,0.907264


## Build README from Dictionary
This section is intended to show, by example, how README files can also be built from a dictionary.

*Needs further development...*
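
As a possible starting point, the sketch below renders the test information of one institute as a Markdown table. It is not part of the repository utilities; the function `build_test_table` and the keys `heating_rate` and `atmosphere` are hypothetical and need to be adapted to the actual dictionary layout.

```python
def build_test_table(institute_tests, columns):
    """Render {test label: info dict} as a Markdown table (sketch)."""
    header = "| Test label | " + " | ".join(columns) + " |"
    divider = "|" + " --- |" * (len(columns) + 1)
    rows = [header, divider]
    for label, info in institute_tests.items():
        # Missing entries are written as "None".
        cells = [str(info.get(col, "None")) for col in columns]
        rows.append("| " + label + " | " + " | ".join(cells) + " |")
    return "\n".join(rows)


# Toy data with hypothetical keys, for illustration only.
toy_tests = {
    "InstA_TGA_N2_10K_1": {"heating_rate": 10, "atmosphere": "N2"},
    "InstA_TGA_N2_10K_2": {"heating_rate": 10},
}

table = build_test_table(toy_tests, ["heating_rate", "atmosphere"])
print(table)
```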