Exported examples to RST
JR-1991 committed Mar 20, 2022
1 parent 7e5afb3 commit b1582ca
Showing 9 changed files with 1,687 additions and 12 deletions.
588 changes: 588 additions & 0 deletions docs/_examples/01_KineticModeling_PySCeS.rst

Large diffs are not rendered by default.

722 changes: 722 additions & 0 deletions docs/_getstarted/01_BasicUsage.rst

Large diffs are not rendered by default.

199 changes: 199 additions & 0 deletions docs/_getstarted/02_Validation.rst
@@ -0,0 +1,199 @@
Validation of EnzymeML documents
================================

EnzymeML is considered a container for data and does not perform any
validation aside from data type checks. Hence, a user is free to insert
whatever is necessary for the application without any restrictions.
However, once data is to be published to a database, compliance with
that database's requirements needs to be guaranteed.

For this, PyEnzyme allows EnzymeML documents to be validated against a
database standard before upload. A database can host a dedicated YAML
file, which can be generated from a spreadsheet and against which a
document is checked for compliance. If the document is non-compliant, a
report states where and why validation failed.

The YAML validation file mirrors the complete EnzymeML data model and
allows content to be checked for the following attributes:

- **Mandatory**: Whether or not this field is required.
- **Value ranges**: An interval within which certain values should lie.
- **Controlled vocabularies**: For fields where only certain values are
  allowed. Intended for textual fields.
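
To make the three kinds of checks concrete, the following is a minimal
sketch of how they could be applied to a single value. It is not
PyEnzyme's implementation: the function ``check_field`` and its
parameters are hypothetical, the range is assumed to be inclusive, and
the messages merely imitate the wording of the report shown later in
this example.

```python
from typing import Any, Dict, List, Optional, Tuple


def check_field(
    value: Any,
    mandatory: bool = False,
    value_range: Optional[Tuple[float, float]] = None,
    vocabulary: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Return error messages for one field (empty dict if compliant).

    Hypothetical helper for illustration, not part of PyEnzyme.
    """
    errors: Dict[str, str] = {}

    if value is None:
        # A missing value is only a problem for mandatory fields
        if mandatory:
            errors["mandatory_error"] = "Mandatory attribute is not given."
        return errors

    if value_range is not None:
        lower, upper = value_range
        # Assumption: both interval bounds are inclusive
        if not lower <= value <= upper:
            errors["range_error"] = (
                f"Value of '{value}' is out of range for [{lower}, {upper}]"
            )

    if vocabulary is not None and value not in vocabulary:
        errors["enum_error"] = (
            f"Value of '{value}' does not comply with vocabulary {vocabulary}"
        )

    return errors


print(check_field(20.0, value_range=(400.0, 600.0)))
print(check_field(None, mandatory=True))
print(check_field("EnzymeML_Lagerman", vocabulary=["Specific Title"]))
```

An empty dictionary signals that the value passed every check that was
requested for it.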

The following example demonstrates how to generate an EnzymeML
Validation Spreadsheet and convert it to a YAML file. Finally, an
example ``EnzymeMLDocument`` is loaded and validated against the
given YAML file. For the sake of demonstration, validation fails on
purpose so that an example report can be displayed.

.. code:: ipython3

    import pyenzyme as pe

Generation and conversion of a validation spreadsheet
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``EnzymeMLValidator`` class has methods to generate and convert an
EnzymeML validation spreadsheet. Note that the generated spreadsheet
always reflects the current state of the data model and is not
maintained manually. The ``EnzymeMLDocument`` class definition is
recursively inferred to generate the file. This way, once the data model
is extended, the spreadsheet is updated too.
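
The idea of inferring a template from a class definition can be
illustrated with plain dataclasses. The classes ``Document`` and
``Parameter`` and the helper ``collect_fields`` below are hypothetical
stand-ins chosen for this sketch; PyEnzyme's own data model and
generator are not reproduced here.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import List, Optional, get_args


@dataclass
class Parameter:
    """Hypothetical nested model, not PyEnzyme's class."""

    value: Optional[float] = None
    ontology: Optional[str] = None


@dataclass
class Document:
    """Hypothetical top-level model, not PyEnzyme's class."""

    name: str = ""
    parameter: Optional[Parameter] = None


def collect_fields(cls: type, prefix: str = "") -> List[str]:
    """Recursively list the dotted paths of all fields of a dataclass."""
    paths: List[str] = []
    for field in fields(cls):
        path = prefix + field.name
        # Inspect the annotation and its type arguments so that
        # wrappers such as Optional[X] are unwrapped one level.
        candidates = [field.type, *get_args(field.type)]
        nested = [c for c in candidates if isinstance(c, type) and is_dataclass(c)]
        if nested:
            paths.extend(collect_fields(nested[0], prefix=path + "."))
        else:
            paths.append(path)
    return paths


print(collect_fields(Document))
```

Because the list is derived from the class definitions, adding a field
to ``Parameter`` would change the result without any manual upkeep,
which is the property the paragraph above describes.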

.. code:: ipython3

    from pyenzyme.enzymeml.tools import EnzymeMLValidator

.. code:: ipython3

    # Generation of a validation spreadsheet
    EnzymeMLValidator.generateValidationSpreadsheet(".")

    # ... for those who like to go directly to YAML
    yaml_string = EnzymeMLValidator.generateValidationYAML(".")

Using an example spreadsheet
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since the blank validation YAML won't demonstrate all types of checks, we
are going to use an example spreadsheet that has been provided in this
directory and convert it to YAML.

.. code:: ipython3

    # Convert an example spreadsheet to YAML
    yaml_string = EnzymeMLValidator.convertSheetToYAML(
        path="EnzymeML_Validation_Template_Example.xlsx",
        filename="EnzymeML_Validation_Template_Example"
    )

Performing validation
~~~~~~~~~~~~~~~~~~~~~

Once the YAML file is ready, an example ``EnzymeMLDocument`` found in
this directory can be validated. Validation of this example fails
intentionally and thus returns a report, which is shown here. The report
is returned as a ``Dict`` and can be inspected either manually or
programmatically, which allows automation workflows to make use of
validation.

.. code:: ipython3

    # Load an example document
    enzmldoc = pe.EnzymeMLDocument.fromFile("Model_4.omex")

    # Perform validation against the previously generated YAML
    report, is_valid = enzmldoc.validate(yaml_path="EnzymeML_Validation_Template_Example.yaml")

    print(f">> Document is valid: {is_valid}")

.. parsed-literal::

    >> Document is valid: False

.. code:: ipython3

    # Let's inspect the report
    import json

    print(json.dumps(report, indent=4))

.. parsed-literal::

    {
        "name": {
            "enum_error": "Value of 'EnzymeML_Lagerman' does not comply with vocabulary ['Specific Title']"
        },
        "reactant_dict": {
            "s0": {
                "init_conc": {
                    "range_error": "Value of '20.0' is out of range for [400.0, 600.0]"
                }
            },
            "s1": {
                "init_conc": {
                    "range_error": "Value of '42.0' is out of range for [400.0, 600.0]"
                }
            },
            "s2": {
                "init_conc": {
                    "range_error": "Value of '0.0' is out of range for [400.0, 600.0]"
                }
            },
            "s3": {
                "init_conc": {
                    "range_error": "Value of '0.0' is out of range for [400.0, 600.0]"
                }
            }
        },
        "global_parameters": {
            "v_r": {
                "value": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "initial_value": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "upper": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "lower": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "stdev": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "ontology": {
                    "mandatory_error": "Mandatory attribute is not given."
                }
            },
            "K_si": {
                "value": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "initial_value": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "upper": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "lower": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "stdev": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "ontology": {
                    "mandatory_error": "Mandatory attribute is not given."
                }
            },
            "K_n": {
                "value": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "initial_value": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "upper": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "lower": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "stdev": {
                    "mandatory_error": "Mandatory attribute is not given."
                },
                "ontology": {
                    "mandatory_error": "Mandatory attribute is not given."
                }
            }
        }
    }

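
For automated workflows, a nested report of this shape can be flattened
into one entry per error. The sketch below assumes only what the output
above shows, namely that every leaf maps an error type to a message; the
helper ``iter_errors`` is hypothetical and not part of PyEnzyme, and the
``report`` literal is an excerpt copied from the output above.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_errors(
    report: Dict[str, Any], path: Tuple[str, ...] = ()
) -> Iterator[Tuple[str, str, str]]:
    """Yield (field path, error type, message) for every leaf of the report."""
    for key, value in report.items():
        if isinstance(value, dict):
            # Descend into nested sections such as "reactant_dict" -> "s0"
            yield from iter_errors(value, path + (key,))
        else:
            # Leaves map an error type (e.g. "range_error") to its message
            yield ".".join(path), key, value


# Excerpt of the report printed above
report = {
    "name": {
        "enum_error": "Value of 'EnzymeML_Lagerman' does not comply with vocabulary ['Specific Title']"
    },
    "reactant_dict": {
        "s0": {
            "init_conc": {
                "range_error": "Value of '20.0' is out of range for [400.0, 600.0]"
            }
        }
    },
    "global_parameters": {
        "v_r": {
            "value": {"mandatory_error": "Mandatory attribute is not given."}
        }
    },
}

for field, kind, message in iter_errors(report):
    print(f"{field} [{kind}]: {message}")
```

A flat list like this is easy to count, filter by error type, or write
to a log, whereas the nested form is more convenient for manual reading.
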
--------------
139 changes: 139 additions & 0 deletions docs/_getstarted/03_Visualisation.rst
@@ -0,0 +1,139 @@
Visualisation of an EnzymeML document
=====================================

PyEnzyme offers the ability to visualize experimental data present in an
EnzymeML document for inspection and publication. The method can be
configured to return either static or interactive visualisations. In
addition, the ``measurement_ids`` argument restricts the visualisation
to a subset of the given measurements.

The following example demonstrates the usage of the
``visualize``-method as well as how to display only a single measurement
or a subset. Note that the method returns an object that can be used to
further adjust the plot to your needs.

.. code:: ipython3

    import pyenzyme as pe

.. code:: ipython3

    # Load the EnzymeML document first
    enzmldoc = pe.EnzymeMLDocument.fromFile("Model_4.omex")

Visualising all measurements
----------------------------

Static
~~~~~~

By default, the ``visualize``-method returns a static scatterplot.
Setting the ``use_names`` argument converts the species IDs to their
actual names for improved readability. Furthermore, trendlines can be
added by using the ``trendline`` argument.

.. code:: ipython3

    fig = enzmldoc.visualize(use_names=True, trendline=True)

.. image:: output_4_0.png


Interactive
~~~~~~~~~~~

Interactive visualisations behave the same way and are returned when
the ``interactive`` argument is set to ``True``. These plots are based
on ``Plotly`` and offer controls such as zoom and export, as well as a
selection of which species should be displayed.

.. code:: ipython3

    enzmldoc.visualize(interactive=True, trendline=True, use_names=True)

.. image:: output_6_1.png

-----------------------------------

It is not always desirable to visualize all data; often a single
measurement or a subset of the given data is sufficient. This can be
done by using the ``measurement_ids`` argument, where measurement IDs
are given as a list of strings. Since this applies to both interactive
and static visualisations, the following only shows the interactive case
for a single measurement.

.. code:: ipython3

    # FYI: Measurement IDs always start with an "m"
    enzmldoc.visualize(interactive=True, use_names=True, trendline=True, measurement_ids=["m1"])

.. image:: output_8_0.png

One last thing
~~~~~~~~~~~~~~~~~

If you are not sure about the ID of the measurements you’d like to
visualize, use the ``printMeasurements``-method of the
``EnzymeMLDocument`` object. It displays not only the IDs and names
of your measurements, but also the initial concentrations assigned to
each species.

.. code:: ipython3

    enzmldoc.printMeasurements()

.. parsed-literal::

    >>> Measurement m0: Cephalexin synthesis 1
    s0 | initial conc: 20.0 mmole / l | #replicates: 1
    s1 | initial conc: 10.0 mmole / l | #replicates: 1
    s2 | initial conc: 0.0 mmole / l | #replicates: 1
    s3 | initial conc: 2.0 mmole / l | #replicates: 1
    p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m1: Cephalexin synthesis 2
    s0 | initial conc: 20.0 mmole / l | #replicates: 1
    s1 | initial conc: 20.0 mmole / l | #replicates: 1
    s2 | initial conc: 0.0 mmole / l | #replicates: 1
    s3 | initial conc: 1.3 mmole / l | #replicates: 1
    p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m2: Cephalexin synthesis 3
    s0 | initial conc: 20.0 mmole / l | #replicates: 1
    s1 | initial conc: 40.0 mmole / l | #replicates: 1
    s2 | initial conc: 0.0 mmole / l | #replicates: 1
    s3 | initial conc: 5.1 mmole / l | #replicates: 1
    p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m3: Cephalexin synthesis 4
    s0 | initial conc: 20.0 mmole / l | #replicates: 1
    s1 | initial conc: 60.0 mmole / l | #replicates: 1
    s2 | initial conc: 0.0 mmole / l | #replicates: 1
    s3 | initial conc: 1.9 mmole / l | #replicates: 1
    p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m4: Cephalexin synthesis 5
    s0 | initial conc: 20.0 mmole / l | #replicates: 1
    s1 | initial conc: 42.0 mmole / l | #replicates: 1
    s2 | initial conc: 0.0 mmole / l | #replicates: 1
    s3 | initial conc: 1.5 mmole / l | #replicates: 1
    p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m5: Cephalexin synthesis 6
    s0 | initial conc: 40.0 mmole / l | #replicates: 1
    s1 | initial conc: 42.0 mmole / l | #replicates: 1
    s2 | initial conc: 0.0 mmole / l | #replicates: 1
    s3 | initial conc: 3.3 mmole / l | #replicates: 1
    p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m6: Cephalexin synthesis 7
    s0 | initial conc: 76.0 mmole / l | #replicates: 1
    s1 | initial conc: 40.0 mmole / l | #replicates: 1
    s2 | initial conc: 0.0 mmole / l | #replicates: 1
    s3 | initial conc: 5.7 mmole / l | #replicates: 1
    p0 | initial conc: 0.0002 mmole / l | #replicates: 0
    >>> Measurement m7: Cephalexin synthesis 8
    s0 | initial conc: 140.0 mmole / l | #replicates: 1
    s1 | initial conc: 40.0 mmole / l | #replicates: 1
    s2 | initial conc: 0.0 mmole / l | #replicates: 1
    s3 | initial conc: 14.0 mmole / l | #replicates: 1
    p0 | initial conc: 0.0002 mmole / l | #replicates: 0

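
If only the printed listing is at hand, the measurement IDs can be
recovered from it with a small amount of text processing. The sketch
below assumes the header format seen in the output above
(``>>> Measurement <id>: <name>``) and works on text that has already
been captured or copied; the helper ``measurement_names`` is
hypothetical and not part of PyEnzyme.

```python
import re
from typing import Dict

# Matches header lines such as ">>> Measurement m0: Cephalexin synthesis 1"
MEASUREMENT_LINE = re.compile(r"^>>> Measurement (?P<id>[^:\s]+): (?P<name>.+)$")


def measurement_names(listing: str) -> Dict[str, str]:
    """Map measurement IDs to their names, ignoring the species lines."""
    names: Dict[str, str] = {}
    for line in listing.splitlines():
        match = MEASUREMENT_LINE.match(line.strip())
        if match:
            names[match["id"]] = match["name"]
    return names


# Two entries copied from the listing above
listing = """\
>>> Measurement m0: Cephalexin synthesis 1
s0 | initial conc: 20.0 mmole / l | #replicates: 1
>>> Measurement m1: Cephalexin synthesis 2
s0 | initial conc: 20.0 mmole / l | #replicates: 1
"""

names = measurement_names(listing)
# Select IDs by name, e.g. to build a list for ``measurement_ids``
subset = [mid for mid, name in names.items() if name.endswith("2")]
print(names)
print(subset)
```

The resulting list of IDs has the same form as the value passed to the
``measurement_ids`` argument earlier in this example.
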
34 changes: 34 additions & 0 deletions docs/_getstarted/04_UploadToDataverse.rst
@@ -0,0 +1,34 @@
Upload to Dataverse
===================

PyEnzyme can upload documents to any Dataverse installation that
supports the official `EnzymeML
metadatablock <https://doi.org/10.18419/darus-2105>`__. It relies on the
Dataverse API package `PyDaRUS <https://github.com/JR-1991/pyDaRUS>`__
to map all relevant fields and perform the upload. The following steps
are carried out in this example:

- Convert an EnzymeML spreadsheet to an ``EnzymeMLDocument``
- Upload the dataset to Dataverse

.. code:: ipython3

    import pyenzyme as pe

.. code:: ipython3

    # Load the EnzymeMLDocument
    enzmldoc = pe.EnzymeMLDocument.fromTemplate("EnzymeML_Template_Example.xlsm")

.. code:: ipython3

    # Upload it to Dataverse (Dataset is private)
    enzmldoc.uploadToDataverse(dataverse_name="playground")

For reasons of data quality, the resulting dataset cannot be viewed on
the web. For examples that have used this method, see the
`EnzymeML at
Work <https://darus.uni-stuttgart.de/dataverse/enzymeml_at_work>`__
collection.

--------------
Binary file added docs/_getstarted/output_4_0.png
Binary file added docs/_getstarted/output_6_1.png
Binary file added docs/_getstarted/output_8_0.png
