Validation of EnzymeML documents
================================

EnzymeML is considered a container for data and performs no validation
beyond data type checks. Hence, a user is free to insert whatever the
application requires without any restrictions. However, once data is
published to databases, data compliance needs to be guaranteed.

For this, PyEnzyme allows EnzymeML documents to be validated against a
database standard before upload. Databases can host a specific YAML file,
which can be generated from a spreadsheet and against which documents are
checked for compliance. In addition, if the document is non-compliant, a
report states where and why the document failed validation.

The YAML validation file mirrors the complete EnzymeML data model and
allows content to be checked for the following attributes:

- **Mandatory**: Whether or not this field is required.
- **Value ranges**: An interval in which values should lie.
- **Controlled vocabularies**: For fields where only certain values are
  allowed. Intended for textual fields.
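The three kinds of checks listed above can be sketched for a single value
as follows. This is only an illustration: the helper ``check_field`` and
the rule keys ``mandatory``, ``range`` and ``vocabulary`` are hypothetical
and do not reflect PyEnzyme's API or the actual layout of the YAML file.

.. code:: python

    from typing import Any, Dict


    def check_field(value: Any, rule: Dict[str, Any]) -> Dict[str, str]:
        """Apply the three kinds of checks to one value and collect errors."""
        errors: Dict[str, str] = {}

        # Mandatory: a required field must not be missing (None)
        if value is None:
            if rule.get("mandatory", False):
                errors["mandatory_error"] = "Mandatory attribute is not given."
            return errors  # nothing else can be checked on a missing value

        # Value range: the value must lie within [lower, upper]
        if "range" in rule:
            lower, upper = rule["range"]
            if not lower <= value <= upper:
                errors["range_error"] = (
                    f"Value of '{value}' is out of range for [{lower}, {upper}]"
                )

        # Controlled vocabulary: the value must be one of the allowed terms
        if "vocabulary" in rule and value not in rule["vocabulary"]:
            errors["enum_error"] = (
                f"Value of '{value}' does not comply with vocabulary "
                f"{rule['vocabulary']}"
            )

        return errors


    # Hypothetical rules applied to hypothetical values
    print(check_field(20.0, {"range": [400.0, 600.0]}))
    print(check_field(None, {"mandatory": True}))
    print(check_field("Some Title", {"vocabulary": ["Specific Title"]}))

An optional field that is missing passes all checks in this sketch, since
range and vocabulary checks only make sense for values that are present.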

The following example demonstrates how to generate an EnzymeML
Validation Spreadsheet and convert it to a YAML file. Finally, an
example ``EnzymeMLDocument`` is loaded and validated against the
given YAML file. For the sake of demonstration, the validation fails on
purpose in order to display an example report.

.. code:: ipython3

    import pyenzyme as pe

Generation and conversion of a validation spreadsheet
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``EnzymeMLValidator`` class has methods to generate and convert an
EnzymeML validation spreadsheet. Note that the generated spreadsheet
always reflects the current state of the data model and is not
maintained manually: the file is generated by recursively inspecting
the ``EnzymeMLDocument`` class definition. This way, once the data model
is extended, the spreadsheet is updated too.

.. code:: ipython3

    from pyenzyme.enzymeml.tools import EnzymeMLValidator

.. code:: ipython3

    # Generation of a validation spreadsheet
    EnzymeMLValidator.generateValidationSpreadsheet(".")

    # ... for those who like to go directly to YAML
    yaml_string = EnzymeMLValidator.generateValidationYAML(".")

Using an example spreadsheet
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since the blank validation YAML won't demonstrate all types of checks, we
are going to use an example spreadsheet that has been provided in this
directory and convert it to YAML.

.. code:: ipython3

    # Convert an example spreadsheet to YAML
    yaml_string = EnzymeMLValidator.convertSheetToYAML(
        path="EnzymeML_Validation_Template_Example.xlsx",
        filename="EnzymeML_Validation_Template_Example"
    )

Performing validation
~~~~~~~~~~~~~~~~~~~~~

Once the YAML file is ready, validation can be performed on an example
``EnzymeMLDocument`` found in this directory. The validation for this
example fails intentionally and thus returns a report, which is
shown here. The report is returned as a ``Dict`` and can be inspected
either manually or programmatically, which allows automation
workflows to make use of validation.
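Programmatic inspection could look like the following sketch. It assumes
that the report is a nested dictionary whose innermost entries map an
error type to a message. The helper ``iter_errors`` and the small
``sample_report`` are hypothetical and only stand in for a real report.

.. code:: python

    from typing import Any, Dict, Iterator, Tuple


    def iter_errors(
        report: Dict[str, Any], path: Tuple[str, ...] = ()
    ) -> Iterator[Tuple[str, str, str]]:
        """Yield (location, error type, message) for every leaf of a report."""
        for key, value in report.items():
            if isinstance(value, dict):
                # Descend further; the key becomes part of the location
                yield from iter_errors(value, path + (key,))
            else:
                # Leaf: the key is the error type, the value is the message
                yield ".".join(path), key, value


    # Hypothetical sample shaped like a validation report
    sample_report = {
        "name": {
            "enum_error": "Value of 'EnzymeML_Lagerman' does not comply "
            "with vocabulary ['Specific Title']"
        },
        "reactant_dict": {
            "s0": {
                "init_conc": {
                    "range_error": "Value of '20.0' is out of range "
                    "for [400.0, 600.0]"
                }
            }
        },
    }

    for location, error_type, message in iter_errors(sample_report):
        print(f"{location} [{error_type}]: {message}")

Flattening the report this way makes it easy to filter by error type or to
fail an automated pipeline as soon as any entry is found.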

.. code:: ipython3

    # Load an example document
    enzmldoc = pe.EnzymeMLDocument.fromFile("Model_4.omex")

    # Perform validation against the previously generated YAML
    report, is_valid = enzmldoc.validate(yaml_path="EnzymeML_Validation_Template_Example.yaml")

    print(f">> Document is valid: {is_valid}")

.. parsed-literal::

    >> Document is valid: False

.. code:: ipython3

    # Let's inspect the report
    import json

    print(json.dumps(report, indent=4))

.. parsed-literal::

    {
        "name": {
            "enum_error": "Value of 'EnzymeML_Lagerman' does not comply with vocabulary ['Specific Title']"
        },
        "reactant_dict": {
            "s0": {
                "init_conc": {
                    "range_error": "Value of '20.0' is out of range for [400.0, 600.0]"
                }
            },
            "s1": {
                "init_conc": {
                    "range_error": "Value of '42.0' is out of range for [400.0, 600.0]"
                }
            },
            "s2": {
                "init_conc": {
                    "range_error": "Value of '0.0' is out of range for [400.0, 600.0]"
                }
            },
            "s3": {
                "init_conc": {
                    "range_error": "Value of '0.0' is out of range for [400.0, 600.0]"
                }
            }
        },
        "global_parameters": {
            "v_r": {
                "value": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "initial_value": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "upper": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "lower": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "stdev": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "ontology": {
                    "mandatory_error": "Mandatory attribute is not given."
                }
            },
            "K_si": {
                "value": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "initial_value": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "upper": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "lower": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "stdev": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "ontology": {
                    "mandatory_error": "Mandatory attribute is not given."
                }
            },
            "K_n": {
                "value": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "initial_value": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "upper": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "lower": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "stdev": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "ontology": {
                    "mandatory_error": "Mandatory attribute is not given."
                }
            }
        }
    }

--------------


Visualisation of an EnzymeML document
=====================================

PyEnzyme can visualize the experimental data present in an
EnzymeML document for inspection and publication. The method can be
configured to return either static or interactive visualisations. In
addition, the visualisation can be restricted to specific
measurements by using the ``measurement_ids`` argument, so that only
a subset of the given data is shown.

The following example demonstrates the usage of the
``visualize`` method as well as how to display only a single measurement
or a subset. Note that the method returns an object that can be used to
further adjust the plot to your needs.

.. code:: ipython3

    import pyenzyme as pe

.. code:: ipython3

    # Load the EnzymeML document first
    enzmldoc = pe.EnzymeMLDocument.fromFile("Model_4.omex")

Visualising all measurements
----------------------------

Static
~~~~~~

By default, the ``visualize`` method returns a static scatter plot. With
the ``use_names`` argument, the species IDs are converted
to their actual names for improved readability. Furthermore,
trendlines can be added by using the ``trendline`` argument.

.. code:: ipython3

    fig = enzmldoc.visualize(use_names=True, trendline=True)

.. image:: output_4_0.png

Interactive
~~~~~~~~~~~

Interactive visualisations behave the same way and are returned when
the ``interactive`` argument is set to ``True``. These plots are based
on ``Plotly`` and offer controls such as zoom and export, as well as a
selection of which species should be displayed.

.. code:: ipython3

    enzmldoc.visualize(interactive=True, trendline=True, use_names=True)

.. image:: output_6_1.png

-----------------------------------

It is not always desirable to visualize all data; often a single
measurement or a subset of the given data is sufficient. This can be done
by using the ``measurement_ids`` argument, where measurement IDs are given
as a list of strings. Since this applies to both interactive and static
visualisations, the following only displays the interactive case for
a single measurement.

.. code:: ipython3

    # FYI: Measurement IDs always start with an "m"
    enzmldoc.visualize(interactive=True, use_names=True, trendline=True, measurement_ids=["m1"])

.. image:: output_8_0.png

One last thing
~~~~~~~~~~~~~~

If you are not sure about the IDs of the measurements you’d like to
visualize, use the ``printMeasurements`` method of the
``EnzymeMLDocument`` object. It displays not only the IDs and names
of your measurements, but also the initial concentrations assigned to
each species.

.. code:: ipython3

    enzmldoc.printMeasurements()

.. parsed-literal::

    >>> Measurement m0: Cephalexin synthesis 1
        s0 | initial conc: 20.0 mmole / l | #replicates: 1
        s1 | initial conc: 10.0 mmole / l | #replicates: 1
        s2 | initial conc: 0.0 mmole / l | #replicates: 1
        s3 | initial conc: 2.0 mmole / l | #replicates: 1
        p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m1: Cephalexin synthesis 2
        s0 | initial conc: 20.0 mmole / l | #replicates: 1
        s1 | initial conc: 20.0 mmole / l | #replicates: 1
        s2 | initial conc: 0.0 mmole / l | #replicates: 1
        s3 | initial conc: 1.3 mmole / l | #replicates: 1
        p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m2: Cephalexin synthesis 3
        s0 | initial conc: 20.0 mmole / l | #replicates: 1
        s1 | initial conc: 40.0 mmole / l | #replicates: 1
        s2 | initial conc: 0.0 mmole / l | #replicates: 1
        s3 | initial conc: 5.1 mmole / l | #replicates: 1
        p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m3: Cephalexin synthesis 4
        s0 | initial conc: 20.0 mmole / l | #replicates: 1
        s1 | initial conc: 60.0 mmole / l | #replicates: 1
        s2 | initial conc: 0.0 mmole / l | #replicates: 1
        s3 | initial conc: 1.9 mmole / l | #replicates: 1
        p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m4: Cephalexin synthesis 5
        s0 | initial conc: 20.0 mmole / l | #replicates: 1
        s1 | initial conc: 42.0 mmole / l | #replicates: 1
        s2 | initial conc: 0.0 mmole / l | #replicates: 1
        s3 | initial conc: 1.5 mmole / l | #replicates: 1
        p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m5: Cephalexin synthesis 6
        s0 | initial conc: 40.0 mmole / l | #replicates: 1
        s1 | initial conc: 42.0 mmole / l | #replicates: 1
        s2 | initial conc: 0.0 mmole / l | #replicates: 1
        s3 | initial conc: 3.3 mmole / l | #replicates: 1
        p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m6: Cephalexin synthesis 7
        s0 | initial conc: 76.0 mmole / l | #replicates: 1
        s1 | initial conc: 40.0 mmole / l | #replicates: 1
        s2 | initial conc: 0.0 mmole / l | #replicates: 1
        s3 | initial conc: 5.7 mmole / l | #replicates: 1
        p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m7: Cephalexin synthesis 8
        s0 | initial conc: 140.0 mmole / l | #replicates: 1
        s1 | initial conc: 40.0 mmole / l | #replicates: 1
        s2 | initial conc: 0.0 mmole / l | #replicates: 1
        s3 | initial conc: 14.0 mmole / l | #replicates: 1
        p0 | initial conc: 0.0002 mmole / l | #replicates: 0

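If the IDs are needed in a script, the listing can be parsed as text. The
following is a minimal sketch under the assumption that the listing has
been captured as a string (for instance with
``contextlib.redirect_stdout``) and that header lines follow the pattern
shown above. The helper ``measurement_names`` is hypothetical and not part
of PyEnzyme.

.. code:: python

    import re
    from typing import Dict

    # Matches header lines such as ">>> Measurement m0: Cephalexin synthesis 1"
    HEADER = re.compile(r"^>>> Measurement (?P<id>m\d+): (?P<name>.+)$")


    def measurement_names(listing: str) -> Dict[str, str]:
        """Map measurement IDs to names, based on the header lines."""
        names: Dict[str, str] = {}
        for line in listing.splitlines():
            match = HEADER.match(line.strip())
            if match:
                names[match["id"]] = match["name"].strip()
        return names


    # Excerpt of the listing above, used as sample input
    listing = """\
    >>> Measurement m0: Cephalexin synthesis 1
        s0 | initial conc: 20.0 mmole / l | #replicates: 1
    >>> Measurement m1: Cephalexin synthesis 2
        s0 | initial conc: 20.0 mmole / l | #replicates: 1
    """

    names = measurement_names(listing)
    print(list(names))

The resulting keys are strings such as ``"m1"`` and can be passed as a list
to the ``measurement_ids`` argument of ``visualize``.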

Upload to Dataverse
===================

PyEnzyme can upload to any Dataverse installation that supports
the official `EnzymeML
metadatablock <https://doi.org/10.18419/darus-2105>`__. It uses the
Dataverse API library `PyDaRUS <https://github.com/JR-1991/pyDaRUS>`__ to map
all relevant fields and perform the upload. This example covers the
following steps:

- Convert an EnzymeML spreadsheet to an ``EnzymeMLDocument``
- Upload the dataset to Dataverse

.. code:: ipython3

    import pyenzyme as pe

.. code:: ipython3

    # Load the EnzymeMLDocument
    enzmldoc = pe.EnzymeMLDocument.fromTemplate("EnzymeML_Template_Example.xlsm")

.. code:: ipython3

    # Upload it to Dataverse (Dataset is private)
    enzmldoc.uploadToDataverse(dataverse_name="playground")
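An upload needs to know which installation to contact and how to
authenticate. The sketch below assumes that these settings are read from
the environment variables ``DATAVERSE_URL`` and ``DATAVERSE_API_TOKEN``.
Both names are an assumption that should be checked against the PyDaRUS
documentation, and the helper ``missing_settings`` is hypothetical.

.. code:: python

    import os
    from typing import List, Mapping

    # Assumed variable names; verify them for your PyDaRUS version
    REQUIRED_VARS = ("DATAVERSE_URL", "DATAVERSE_API_TOKEN")


    def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
        """Return the names of required settings that are unset or empty."""
        return [name for name in REQUIRED_VARS if not env.get(name)]


    missing = missing_settings()
    if missing:
        print(f"Please set before uploading: {', '.join(missing)}")

Running such a check before the upload call gives a clear message instead
of a failed request when a setting is absent.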

For reasons of data quality, the resulting dataset cannot be viewed on the
web. To see examples that have used this method, visit the
`EnzymeML at
Work <https://darus.uni-stuttgart.de/dataverse/enzymeml_at_work>`__
collection.

--------------