# Culture Data Viewer


## Input file structure
This section describes how each input file must be structured.

### DASGIP® reactor files
These are the zip files produced at the end of the fermentation. Culture Data Viewer (CDV) has been tested with files from versions 4 and 5 of the DASGIP® software.

### Offline measurements files
These are xlsx files containing a single sheet, with one column for each measured compound plus the following required columns:

    Condition, Time (h), OD
    
- The "Condition" column refers to the reactor number in the DASGIP® file.
- The "Time (h)" column indicates the time elapsed since inoculation, in hours.
- The "OD" column contains the optical density measurements.
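
As a quick sanity check before running the analysis, the layout described above can be verified with pandas. This is a minimal sketch: the helper `missing_offline_columns`, the compound column `Cmpd1` and all values are made up for illustration and are not part of CDV.

```python
import pandas as pd

# Columns the offline file must contain, as listed above.
REQUIRED_COLUMNS = ["Condition", "Time (h)", "OD"]


def missing_offline_columns(df):
    """Return the required columns that are absent from df (hypothetical helper)."""
    return [col for col in REQUIRED_COLUMNS if col not in df.columns]


# Illustrative table: two samples from reactor 1 and one from reactor 2.
# "Cmpd1" stands in for a measured compound column; all values are made up.
offline = pd.DataFrame(
    {
        "Condition": [1, 1, 2],
        "Time (h)": [0.0, 4.5, 0.0],
        "OD": [0.10, 1.35, 0.12],
        "Cmpd1": [20.0, 14.2, 20.0],
    }
)

missing = missing_offline_columns(offline)
```

For a real file, the table would typically be loaded with `pd.read_excel` instead of being built by hand; an empty `missing` list means all required columns are present.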


### Compounds.xlsx
This file describes the properties of each measured compound, which some of the calculations in the calc module require. It is an xlsx file with a single sheet containing the following columns, with one row per compound.

    Compound, Density [g/mL], Molar mass [g/mol], C Atoms [#]
    
- The "Compound" column contains the name of the compound.
- The "Density [g/mL]" column contains the density of the compound in g/mL.
- The "Molar mass [g/mol]" column contains the molar mass of the compound in g/mol.
- The "C Atoms [#]" column contains the number of carbon atoms in one molecule of the compound.
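
To illustrate why these properties are useful, the sketch below builds a table with the same columns and converts a mass concentration into a carbon-molar one. The helper `g_per_l_to_cmol_per_l` is hypothetical, and whether the calc module performs this exact conversion is an assumption; the glucose and ethanol values are rounded literature values.

```python
import pandas as pd

# Illustrative properties for two common compounds (rounded literature values).
compounds = pd.DataFrame(
    {
        "Compound": ["Glucose", "Ethanol"],
        "Density [g/mL]": [1.54, 0.789],
        "Molar mass [g/mol]": [180.16, 46.07],
        "C Atoms [#]": [6, 2],
    }
).set_index("Compound")


def g_per_l_to_cmol_per_l(compound, conc_g_per_l, properties):
    """Convert a mass concentration (g/L) to a carbon-molar one (C-mol/L).

    Hypothetical helper, not taken from the calc module.
    """
    row = properties.loc[compound]
    return conc_g_per_l / row["Molar mass [g/mol]"] * row["C Atoms [#]"]


glucose_cmol = g_per_l_to_cmol_per_l("Glucose", 18.016, compounds)
```

Here 18.016 g/L of glucose corresponds to 0.1 mol/L, that is, about 0.6 C-mol/L.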

## Usage

In [None]:
import pandas as pd

import das
import merge
import calc
import visualize

In [None]:
# Specify your input files:
reactor_file_dasgip = "DAS01.zip"
offline_file = "DAS01-offline.xlsx"
compounds_file = "Compounds.xlsx"

In [None]:
# Data necessary for the calculations in the CALC module

# cmpd_data_dict: dictionary of feeds. Each feed is a list of compounds, each described by a dictionary with the
# following keys: "cmpd" (compound name), "Concentration" (concentration of the compound in the feed, in g/L) and
# "Density" (density of the feed). These data are needed to calculate how much of each compound the feed adds.
cmpd_data_dict = {"Feed1": [{"cmpd": "Cmpd1", "Concentration": 600, "Density": 1100}]}

# cmpd_set: set of compound names to analyse. Only the columns of the offline file whose names are in this set
# are imported.
cmpd_set = {'Cmpd1', 'Cmpd2', 'Cmpd3', 'Cmpd4', 'Cmpd5'}

# cmpds_substrates: set of compound names that will be considered as substrates.
cmpds_substrates = {'Cmpd1'}
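
The feed dictionary above can be read as follows: for a given pumped volume, each compound in the feed contributes its concentration times that volume. The sketch below is only an illustration of this idea; `compound_mass_added` is a hypothetical helper, the calc module may compute the amounts differently, and the "Density" entry is left unused here because its unit is not stated.

```python
# Same structure as cmpd_data_dict in the cell above.
cmpd_data_dict = {"Feed1": [{"cmpd": "Cmpd1", "Concentration": 600, "Density": 1100}]}


def compound_mass_added(feed_name, feed_volume_ml, feeds):
    """Return {compound name: grams added} for a pumped feed volume given in mL.

    Hypothetical helper: relies on "Concentration" being in g/L, as documented above.
    """
    volume_l = feed_volume_ml / 1000.0
    return {
        entry["cmpd"]: entry["Concentration"] * volume_l
        for entry in feeds[feed_name]
    }


added = compound_mass_added("Feed1", 50.0, cmpd_data_dict)
```

With these numbers, pumping 50 mL of Feed1 adds roughly 30 g of Cmpd1.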

Culture Data Viewer uses pandas' *DataFrames (df)* to hold tabular *culture* data. 
For this reason the main object in CDV is called ***Culture DataFrame (cdf)***.

In [None]:
# Import the original files

# cdf_dict: each DASGIP® file contains multiple reactor tracks. Each track is extracted to a cdf and saved in the cdf
# dictionary with the reactor number as key.
cdf_dict = das.extract(reactor_file_dasgip)

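# offline_cdf_dict: offline measurements read from the xlsx file.
offline_cdf_dict = merge.open_offline_file(offline_file)

# cmpds_properties: compound properties, indexed by compound name.
cmpds_properties = pd.read_excel(compounds_file).set_index("Compound")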

In [None]:
# Merge the offline data into the reactor data
merged_cdf_dict = merge.merge_df_dicts(cdf_dict, offline_cdf_dict)

In [None]:
# Select one culture to work on (here, reactor 1)
cdf = merged_cdf_dict[1]

In [None]:
# Perform all the calculations specified in the CALC module 
cdf_calc = calc.complete_analysis(cdf, cmpds_substrates, cmpds_properties, cmpd_set, cmpd_data_dict)

In [None]:
# Define the plotting parameters
nax1 = {"cols":["OD", "Cmpd1"], "mode":"lines+markers", "title":"OD [A.U.], Cmpd1 [g/L]", "titlefont":{"color":"#000000"}, "tickfont":{"color":"#000000"}}
nax2 = {"cols":["Cmpd5"], "mode":"lines+markers", "title":"Cmpd5 [g/L]", "titlefont":{"color":"#000000"}, "tickfont":{"color":"#000000"}}
nax3 = {"cols":["CO2_g_prod_cumsum [g]"], "mode":"lines", "title":"CO2 produced [g]", "titlefont":{"color":"#000000"}, "tickfont":{"color":"#000000"}}
nax4 = {"cols":['Cmpd2', 'Cmpd3', 'Cmpd4'], "mode":"lines+markers", "title":"Cmpd2, Cmpd3, Cmpd4 [g/L]", "titlefont":{"color":"#000000"}, "tickfont":{"color":"#000000"}}

x_axes_list = [nax1, nax2, nax3, nax4]



# Use the time since inoculation, converted to hours, as the x axis
cdf_calc["InoculationTime []"] = cdf_calc["InoculationTime []"] / pd.Timedelta(hours=1)
cdf_calc = cdf_calc.set_index("InoculationTime []")

f = visualize.plot(cdf_calc, x_axes_list)
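
The x-axis step in the cell above relies on a pandas idiom: dividing a timedelta column by a one-hour `pd.Timedelta` gives plain floating-point hours. The standalone sketch below shows the idiom on made-up data; only the column name `InoculationTime []` is taken from the cell above.

```python
import pandas as pd

# Made-up elapsed times since inoculation, stored as timedeltas.
df = pd.DataFrame(
    {
        "InoculationTime []": pd.to_timedelta([0, 90, 720], unit="min"),
        "OD": [0.1, 0.4, 5.2],
    }
)

# Dividing by a one-hour Timedelta yields plain floats expressed in hours.
df["InoculationTime []"] = df["InoculationTime []"] / pd.Timedelta(hours=1)
df = df.set_index("InoculationTime []")

hours = df.index.tolist()
```

In this example, 90 minutes becomes 1.5 and 720 minutes becomes 12.0, so the plot axis reads directly in hours.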

In [None]:
f.show()
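
The axis dictionaries defined in the plotting cell repeat the same font settings four times. If that becomes tedious, a small helper can build them. This is a sketch only: `make_axis` is a hypothetical convenience function, not part of the visualize module, and it simply reproduces the keys used in that cell.

```python
def make_axis(cols, title, mode="lines+markers", color="#000000"):
    """Build one axis dictionary with the keys used in the plotting cell.

    Hypothetical convenience helper, not part of the visualize module.
    """
    return {
        "cols": list(cols),
        "mode": mode,
        "title": title,
        "titlefont": {"color": color},
        "tickfont": {"color": color},
    }


# Equivalent to nax1 ... nax4 in the plotting cell.
axes = [
    make_axis(["OD", "Cmpd1"], "OD [A.U.], Cmpd1 [g/L]"),
    make_axis(["Cmpd5"], "Cmpd5 [g/L]"),
    make_axis(["CO2_g_prod_cumsum [g]"], "CO2 produced [g]", mode="lines"),
    make_axis(["Cmpd2", "Cmpd3", "Cmpd4"], "Cmpd2, Cmpd3, Cmpd4 [g/L]"),
]
```

The resulting list has the same content as the hand-written one and could be passed to `visualize.plot` in its place.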