# *PyThermoML* - Exemplary Workflow
This template illustrates how *PyThermoML* can be used.

1. Creating a DataReport object directly
2. Writing a DataReport object to a ThermoML file and reading it back
3. Writing and reading .json files
4. Accessing data
5. Uploading to and downloading from DaRUS

Before use, make sure that you have installed pyThermoML as described in the README.md.

In [1]:
from pythermo.thermoml.core import PureOrMixtureData, DataReport, Compound, DataPoint

from pythermo.thermoml.vars.componentcomposition import ComponentCompositionBase
from pythermo.thermoml.vars.temperature import TemperatureBase
from pythermo.thermoml.props.transportproperties import TransportProperty

from pythermo.thermoml.props.volumetricproperties import VolumetricProperty
from pythermo.thermoml.tools.writeTools import ThermoMLWriter
from pythermo.thermoml.tools.readTools import ThermoMLReader

## 1. DataReport object creation
First, create a DataReport instance in which you declare the title and DOI of the referenced paper and the authors of the dataset.

In [2]:

# title, DOI, authors
authors = {
    "author 1": "authror1",
    "author 2": "author2",
    "author 3": "author3"
}

dataReport = DataReport(title="Title of referred paper",
                        DOI="Doi of referred paper", authors=authors)

## Compounds
After that the compounds of a chemical system can be declared. Giving an ID is mandatory.

In [3]:
comp1 = Compound(ID=1, standardInchI="inhi1", standardInchIKey="inchikey1", smiles="smiles1", commonName="water")
comp2 = Compound(ID=2, standardInchI="inchi2", standardInchIKey="inchikey2", smiles="smiles2", commonName="ethanol")

A Compound has an ID, an InChI code, an InChI key, a SMILES code, and a common name. Add the compounds to the dataReport object. addCompound() returns an ID with which you can refer to the compound later.

In [4]:
comp1_ID = dataReport.addCompound(comp1)
comp2_ID = dataReport.addCompound(comp2)

## Pure or mixture data
The actual data is stored in objects of the PureOrMixtureData class. Declare which compounds you have used by putting their IDs into a list, and give the PureOrMixtureData object an ID as well (it is addressed as "pom1" when the data is accessed later).

In [5]:
comps = [comp1_ID, comp2_ID]
experiment = PureOrMixtureData(ID=1, comps=comps, compiler="Matthias Gueltig")

## Declaration of Property and Variable
Declare the examined properties (e.g. the viscosity) and the variables (e.g. temperature, mole fraction) of the experiment/simulation.

For each property, enter an ID and indicate whether the value comes from a simulation or an experiment.
Each variable has an ID as well. Be aware that some variables/properties are compound-specific (e.g. mole fraction) and need the ID of the compound they belong to. Adding a property/variable to the experiment returns an ID with which you can refer to it.

In [6]:
dens = VolumetricProperty.massDensity(ID=1, method='simulation')
sdiffCoeff1 = TransportProperty.selfDiffusionCoefficient(
    ID=2, method='simulation', compoundID=comp1_ID)
sdiffCoeff2 = TransportProperty.selfDiffusionCoefficient(
    ID=3, method='simulation', compoundID=comp2_ID)

# Variable definitions
temp = TemperatureBase.temperature(ID=1)

frac1 = ComponentCompositionBase.moleFraction(2, comp1_ID)
frac2 = ComponentCompositionBase.moleFraction(3, comp2_ID)

This API provides the following properties:
* volumetric properties:
    * mass density (kg/m3)
* heat capacity properties:
    * molar heat capacity at constant pressure (J/K/mol)
    * molar heat capacity at constant volume (J/K/mol)
* transport properties:
    * viscosity (Pa*s)
    * kinematic viscosity (m2/s)
    * self diffusion coefficient (m2/s)
* bioproperties:
    * peakTemperature (K)
* other:
    * surface tension (N/m) 
    * speed of sound (m/s)

This API provides the following variables:
* component composition:
    * mole fraction (dimensionless)
* temperatures:
    * temperature (K)
    * upper temperature (K)
    * lower temperature (K)
* pressure:
    * pressure (kPa)

The units are fixed and cannot be changed!
To use the properties and variables, import the respective modules.
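
Since the units are fixed, values reported in other units have to be converted before they are stored. The following is a minimal sketch of such conversions in plain Python. The helper names are hypothetical and not part of pyThermoML, and the input numbers are only illustrative.

```python
# Conversion helpers into the fixed units listed above.
# All function names are hypothetical; they are not part of pyThermoML.


def celsius_to_kelvin(t_celsius: float) -> float:
    """Temperature: degrees Celsius -> K."""
    return t_celsius + 273.15


def bar_to_kilopascal(p_bar: float) -> float:
    """Pressure: bar -> kPa (1 bar = 100 kPa)."""
    return p_bar * 100.0


def g_per_cm3_to_kg_per_m3(rho: float) -> float:
    """Mass density: g/cm3 -> kg/m3 (factor 1000)."""
    return rho * 1000.0


def millipascal_second_to_pascal_second(eta: float) -> float:
    """Viscosity: mPa*s -> Pa*s (factor 1e-3)."""
    return eta * 1e-3


# Illustrative input values only.
temperature_K = celsius_to_kelvin(25.0)
pressure_kPa = bar_to_kilopascal(1.01325)
density_kg_m3 = g_per_cm3_to_kg_per_m3(0.997)
viscosity_Pa_s = millipascal_second_to_pascal_second(0.89)

print(temperature_K, pressure_kPa, density_kg_m3, viscosity_Pa_s)
```

The converted numbers are what would be passed as the value of a DataPoint.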

Add the initialized properties/variables to the experiment. The experiment is now ready to be filled with data.


In [7]:
densID = experiment.addProperty(dens)
dffCoeff1ID = experiment.addProperty(sdiffCoeff1)
dffCoeff2ID = experiment.addProperty(sdiffCoeff2)
tempID = experiment.addVariable(temp)
frac1ID = experiment.addVariable(frac1)
frac2ID = experiment.addVariable(frac2)

## Add data to experiment
Data is added to the experiment by creating DataPoints. Each DataPoint needs the ID of the measurement it belongs to, the measured value, and the ID of the property/variable it refers to.

In [8]:
measurementID = 1

viscDataPoint = DataPoint(
    measurementID=measurementID,
    value=10.0,
    propID=densID,
)

sdiff1DataPoint1 = DataPoint(
    measurementID=measurementID,
    value=10334,
    propID=dffCoeff1ID
)

sdiff2DataPoint1 = DataPoint(
    measurementID=measurementID,
    value=123123,
    propID=dffCoeff2ID
)

tempDataPoint = DataPoint(
    measurementID=measurementID,
    value=300.0,
    varID=tempID,
)

frac1DataPoint = DataPoint(
    measurementID=measurementID,
    value=0.2,
    varID=frac1ID,
)

frac2DataPoint = DataPoint(
    measurementID=measurementID,
    value=0.8,
    varID=frac2ID,
)

measurementID = 2
sdiff1DataPoint2 = DataPoint(
    measurementID=measurementID,
    value=10334,
    propID=dffCoeff1ID
)

sdiff2DataPoint2 = DataPoint(
    measurementID=measurementID,
    value=123123,
    propID=dffCoeff2ID
)
viscDataPoint2 = DataPoint(
    measurementID=measurementID,
    value=1000.0,
    propID=densID,
    uncertainty=0.1
)

tempDataPoint2 = DataPoint(
    measurementID=measurementID,
    value=1000.0,
    varID=tempID,
    uncertainty=10.0
)

frac1DataPoint2 = DataPoint(
    measurementID=measurementID,
    value=1000.0,
    varID=frac1ID,
    uncertainty=0.01
)

frac2DataPoint2 = DataPoint(
    measurementID=measurementID,
    value=1000,
    varID=frac2ID,
    uncertainty=0.02
)

datapoints = [viscDataPoint, sdiff1DataPoint1, sdiff2DataPoint1,
              tempDataPoint, frac1DataPoint, frac2DataPoint]

datapoints2 = [viscDataPoint2, sdiff1DataPoint2, sdiff2DataPoint2, tempDataPoint2,
               frac1DataPoint2, frac2DataPoint2]

Note: the chosen values don't have any chemical meaning.
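
For real data, a quick plausibility check before creating the DataPoints can catch such values. The sketch below checks that the mole fractions of one measurement lie in [0, 1] and sum to 1. The function name and the tolerance are assumptions for illustration. This is plain Python, not a pyThermoML feature.

```python
import math


def check_mole_fractions(fractions, tolerance=1e-6):
    """Raise ValueError unless all fractions lie in [0, 1] and sum to 1."""
    for x in fractions:
        if not 0.0 <= x <= 1.0:
            raise ValueError(f"mole fraction {x} is outside [0, 1]")
    total = math.fsum(fractions)
    if not math.isclose(total, 1.0, abs_tol=tolerance):
        raise ValueError(f"mole fractions sum to {total}, expected 1")


# Mole fractions of the first measurement above: passes silently.
check_mole_fractions([0.2, 0.8])

# Mole fractions of the second measurement above: rejected.
try:
    check_mole_fractions([1000.0, 1000.0])
except ValueError as error:
    print(error)
```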

An uncertainty can be declared for each variable and property. For both, pyThermoML stores the **standard deviation** of the given quantity. Combined uncertainties cannot be stored.
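
As a minimal sketch of where such a standard deviation could come from, the following computes the mean and the sample standard deviation of repeated results with Python's `statistics` module. The numbers are made up. Whether the sample (n - 1) or the population form is appropriate depends on your data and is not prescribed here.

```python
import statistics

# Hypothetical repeated results for one mass density (kg/m3), e.g. from
# independent simulation runs. The numbers are made up for illustration.
replicas = [996.8, 997.4, 997.1, 996.9, 997.3]

mean_value = statistics.mean(replicas)
# Sample standard deviation (n - 1 in the denominator).
std_dev = statistics.stdev(replicas)

print(f"value={mean_value:.2f}, uncertainty={std_dev:.2f}")
```

The two numbers correspond to the `value` and `uncertainty` arguments of a DataPoint as used in the cell above.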

In [9]:
# add Measurement to experiment
experiment.addMeasurement(dataPoints=datapoints)
experiment.addMeasurement(dataPoints=datapoints2)

# add experiment to dataReport
dataReport.addPureOrMixtureData(experiment)

1

In [10]:
# pretty print dataReport object with functionalities provided by pydantic
print(dataReport.to_string())

{
    "title": "Title of referred paper",
    "DOI": "Doi of referred paper",
    "authors": {
        "author 1": "authror1",
        "author 2": "author2",
        "author 3": "author3"
    },
    "compounds": {
        "1": {
            "ID": 1,
            "standardInchI": "inhi1",
            "standardInchIKey": "inchikey1",
            "smiles": "smiles1",
            "commonName": "water"
        },
        "2": {
            "ID": 2,
            "standardInchI": "inchi2",
            "standardInchIKey": "inchikey2",
            "smiles": "smiles2",
            "commonName": "ethanol"
        }
    },
    "pureOrMixtureData": {
        "1": {
            "ID": 1,
            "compiler": "Matthias Gueltig",
            "comps": [
                "1",
                "2"
            ],
            "properties": {
                "1": {
                    "propName": "Mass density",
                    "propGroup": "VolumetricProp",
                    "ID": 1,
                  

Congratulations, your DataReport object has been created successfully. You can now write it to a ThermoML file with the writeThermo() function. If you want to modify your ThermoML file, you can read it back with readFromThermoMLFile().

## 2. Interaction with ThermoML files

In [11]:
# write to 'testThermo.xml'
writer = ThermoMLWriter(folder_thermoML_files="files/", folder_json_files="files/")
writer.writeThermo(dataReport=dataReport, filename="testThermo.xml")

In [12]:
# read from 'testThermo.xml'
reader = ThermoMLReader(folder_thermoML_files="files/", folder_json_files="files/")
dataRepRead1 = reader.readFromThermoMLFile(filename="testThermo.xml")

If you have downloaded a ThermoML file from the NIST archive, you can read it as long as it only contains properties that pyThermoML supports. PyThermoML does not cover every property of the ThermoML schema definition!

In [13]:
dataRepNist = reader.readFromThermoMLFile(filename="NISTarchive_thermo.xml", NIST=True)
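
Before reading an archive file, it can help to list the property names it contains. The sketch below does this with the standard library only, assuming that property names are stored in `ePropName` elements as in the ThermoML schema. The sample document is heavily simplified, and the `SUPPORTED` set is a hypothetical allow-list for illustration, not pyThermoML's actual list.

```python
import xml.etree.ElementTree as ET

# Heavily simplified stand-in for a ThermoML document; real files contain
# many more elements. Only the ePropName elements matter for this check.
SAMPLE_XML = """<DataReport xmlns="http://www.iupac.org/namespaces/ThermoML">
  <PureOrMixtureData>
    <Property>
      <Property-MethodID>
        <PropertyGroup>
          <VolumetricProp>
            <ePropName>Mass density, kg/m3</ePropName>
          </VolumetricProp>
        </PropertyGroup>
      </Property-MethodID>
    </Property>
    <Property>
      <Property-MethodID>
        <PropertyGroup>
          <TransportProp>
            <ePropName>Thermal conductivity, W/m/K</ePropName>
          </TransportProp>
        </PropertyGroup>
      </Property-MethodID>
    </Property>
  </PureOrMixtureData>
</DataReport>
"""

# Hypothetical allow-list; replace it with the properties you actually need.
SUPPORTED = {"Mass density, kg/m3", "Viscosity, Pa*s"}


def property_names(xml_text):
    """Collect the text of all ePropName elements, ignoring namespaces."""
    root = ET.fromstring(xml_text)
    names = []
    for element in root.iter():
        # Strip the "{namespace}" prefix that ElementTree prepends to tags.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "ePropName" and element.text:
            names.append(element.text.strip())
    return names


found = property_names(SAMPLE_XML)
unsupported = [name for name in found if name not in SUPPORTED]
print(unsupported)
```

For a file on disk, `ET.parse(path).getroot()` can take the place of `ET.fromstring`.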

## 3. Interaction with .json files
Writing a .json file containing the dataset is possible too. This .json file can again be read in by pyThermoML.

In [14]:
writer.writeJSON(dataReport=dataReport, filename="testThermo.json")
dataReport = reader.readFromJSON(filename="testThermo.json")

<class 'str'>
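
Because the export is an ordinary .json file, it can also be inspected without pyThermoML. The following sketch uses only the standard library and a small stand-in dictionary. It assumes that the file layout resembles the printed DataReport output shown earlier, which may differ in detail from a real export.

```python
import json
import tempfile
from pathlib import Path

# Small stand-in for the exported dataset, shaped like the printed output
# of the DataReport above. A real export contains more keys.
report = {
    "title": "Title of referred paper",
    "DOI": "Doi of referred paper",
    "compounds": {
        "1": {"ID": 1, "commonName": "water"},
        "2": {"ID": 2, "commonName": "ethanol"},
    },
}

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "testThermo.json"
    path.write_text(json.dumps(report, indent=4), encoding="utf-8")

    # Read the file back as a plain dictionary.
    loaded = json.loads(path.read_text(encoding="utf-8"))

common_names = [c["commonName"] for c in loaded["compounds"].values()]
print(common_names)
```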


## 4. Accessing data with pyThermoML functionalities

In [15]:
# accessing data with softdata functionalities
x = dataReport.pureOrMixtureData["pom1"].measurements["meas1"].properties["p1"]
print(x)

measurementID='meas1' value=10.0 propID='p1' varID=None uncertainty=None numberOfDigits=None data_point_type='Property' elementID='p1'


In [16]:
# accessing data by variable ID and values
x = dataReport.pureOrMixtureData["pom1"].getMeasurementByValues(val1=("v1", 1000.0), val2=("v2",1000.0))
print(list(x.measurements))

At least one matching measurement could be found.
['meas2']
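
The idea behind such a lookup can be sketched in plain Python: keep every measurement whose variables match all given (variable ID, value) pairs. This is an illustration with a hard-coded stand-in data structure, not pyThermoML's implementation. The function name and the tolerance parameter are assumptions.

```python
# Stand-in measurements: measurement ID -> {variable ID: value}.
# The IDs and numbers mirror the example above but are hard-coded here.
measurements = {
    "meas1": {"v1": 300.0, "v2": 0.2, "v3": 0.8},
    "meas2": {"v1": 1000.0, "v2": 1000.0, "v3": 1000.0},
}


def measurements_matching(data, *conditions, tolerance=1e-9):
    """Return IDs of measurements matching every (variable ID, value) pair."""
    matches = []
    for meas_id, variables in data.items():
        if all(
            var_id in variables and abs(variables[var_id] - value) <= tolerance
            for var_id, value in conditions
        ):
            matches.append(meas_id)
    return matches


print(measurements_matching(measurements, ("v1", 1000.0), ("v2", 1000.0)))
```

Comparing with a small tolerance instead of `==` avoids missing matches because of floating-point rounding.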


## 5. Upload to DaRUS

In [17]:
from pythermo.thermoml.tools.uploadTools import ThermoMLDaRUSHandler
from pyDaRUS.metadatablocks.citation import SubjectEnum, Contact

handler = ThermoMLDaRUSHandler(folder_thermoML_files="files/")

matthias = Contact(name="Matthias Gueltig", email="matthias.gueltig@ibtb.uni-stuttgart.de")

p_id = handler.uploadToDaRUS(
    thermoML_filename="testThermo.xml",
    dv_path = "testThermo.xml",
    dv_name = "matthias_playground",
    title = "Title",
    subject = [SubjectEnum.chemistry],
    description = "Viscosity examination, extracted from the related publication and integrated into ThermoML.",
    authors = [matthias]
)

In [18]:
dataReport = handler.downloadFromDaRUS(p_id=p_id, filename="testThermo.xml")

Again, we can access the data with pyThermoML functionalities.

In [19]:
dataReport.title

'Title of referred paper'