Core elements

Ocelot workflow

Running Ocelot usually involves the following steps:

1. Get the path of an undefined database in ecospold2 format on your local computer.
2. Decide on a system model configuration to transform the undefined datasets to a linked database. This configuration could be a list of Python functions, or could be the default Ocelot system model.
3. Call the system_model function, either directly or through the command line application. system_model takes the directory path from step one and the configuration from step two as inputs.
4. Look through the HTML report generated by the system model function, and either accept the given linked database, or make changes in your configuration definitions or transformation functions.

Configuration & `system_model`

An Ocelot system model configuration is essentially a list of transformation functions which, when applied in order, produce a realization of a linked database. Configurations are currently specified in Python code; support for other formats, such as Excel, is planned.
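As a minimal sketch of this idea (not Ocelot's actual implementation), a configuration can be modelled as a plain list of functions that each take the datasets and return them. The names `drop_empty_datasets`, `uppercase_names`, and `apply_configuration`, as well as the dataset fields, are hypothetical and exist only for illustration.

```python
def drop_empty_datasets(data):
    """Hypothetical transformation: remove datasets without exchanges."""
    return [ds for ds in data if ds.get("exchanges")]


def uppercase_names(data):
    """Hypothetical transformation: normalize dataset names."""
    return [dict(ds, name=ds["name"].upper()) for ds in data]


# A configuration is an ordered list of transformation functions.
configuration = [drop_empty_datasets, uppercase_names]


def apply_configuration(data, configuration):
    """Apply each transformation in order, feeding each output to the next."""
    for function in configuration:
        data = function(data)
    return data


datasets = [
    {"name": "steel production", "exchanges": [{"amount": 1.0}]},
    {"name": "placeholder", "exchanges": []},
]
linked = apply_configuration(datasets, configuration)
```

Because each function does one thing, reordering or removing a step only means editing the list.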

We are actively exploring various ways of defining these configurations. The built-in configurations will be provided as a list of transformation functions already in Ocelot, perhaps wrapped in a configuration object. Another simple configuration format would be a text file, where each line is the name of a transformation function that can be imported from ocelot.transformations. However, this doesn't work well for user-defined functions, or for functions that must be prepared first, e.g. by currying them. We are also looking at several configuration libraries, but haven't found anything that seems to fit our mental models or use cases well:
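To illustrate why a plain list of function names falls short, the sketch below prepares a function with `functools.partial` (one way to curry in Python) before placing it in a configuration. `filter_by_location` and its `location` parameter are hypothetical and not part of Ocelot.

```python
from functools import partial


def filter_by_location(data, location):
    """Hypothetical transformation that needs an extra argument."""
    return [ds for ds in data if ds["location"] == location]


# Fix the extra argument up front so the result takes only `data`,
# like every other transformation function in the list.
only_swiss = partial(filter_by_location, location="CH")

configuration = [only_swiss]

data = [{"name": "a", "location": "CH"}, {"name": "b", "location": "DE"}]
for function in configuration:
    data = function(data)
```

A text file that lists only importable names has no place to record the `location="CH"` binding, which is the limitation described above.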

ConfigParser (in the Python standard library)
configure
PyStaticConfiguration
pymlconf

So far no final decisions have been made, and things here will evolve along with the Ocelot codebase.

Running Ocelot without specifying a configuration will use the default configuration, which is the cutoff system model.

A typical system model may have many transformation functions, as each function should make exactly one specific change. To make configurations more readable, you can use a Collection object to group transformation functions that are commonly used together, or that form one unit of work.

ocelot.Collection
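The grouping idea can be sketched as follows. `FunctionGroup` is a hypothetical stand-in written for this example; it is not the ocelot.Collection API, whose signature and behaviour may differ.

```python
class FunctionGroup:
    """Hypothetical stand-in: a named, ordered group of transformation functions."""

    def __init__(self, name, *functions):
        self.name = name
        self.functions = list(functions)

    def __iter__(self):
        return iter(self.functions)

    def __call__(self, data):
        # Calling the group applies its functions in order,
        # so a group can sit in a configuration like a single function.
        for function in self.functions:
            data = function(data)
        return data


def add_one(data):
    """Toy transformation used only for this example."""
    return [x + 1 for x in data]


def double(data):
    """Toy transformation used only for this example."""
    return [x * 2 for x in data]


cleanup = FunctionGroup("cleanup", add_one, double)
configuration = [cleanup, add_one]

data = [1, 2]
for step in configuration:
    data = step(data)
```

The configuration now reads as two steps instead of three, while the functions inside the group remain individually testable.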

The system_model function is actually quite simple:

ocelot.model.system_model

Transformation functions

Note

Logging

Ocelot uses Python's standard logging module, with a custom formatter that encodes log messages as JSON dictionaries. Because of this custom formatter, the ocelot logger must be retrieved in each file that uses logging:

import logging

logger = logging.getLogger('ocelot')

def my_transformation(data):
    logger.info({"message": "something", "count": len(data)})

Log messages are written when a run is started or finished, when transformation functions are started or finished, and whenever the transformation function wants to log something. The message format for the log written to disk (i.e. with each line JSON encoded) is documented in logging-format.
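As a rough illustration of JSON-encoded log lines, the sketch below attaches a custom `logging.Formatter` to a logger. `JsonFormatter` and the logger name `ocelot_sketch` are assumptions made for this example; Ocelot's own formatter and message fields may differ.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: encode each message as one JSON object per line."""

    def format(self, record):
        if isinstance(record.msg, dict):
            payload = dict(record.msg)
        else:
            payload = {"message": record.getMessage()}
        # Add the record's creation timestamp to every message.
        payload["time"] = record.created
        return json.dumps(payload)


stream = io.StringIO()  # stands in for the log file on disk
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("ocelot_sketch")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example's output out of the root logger

logger.info({"message": "something", "count": 3})

# Each line of the log can be decoded back into a dictionary.
entry = json.loads(stream.getvalue().splitlines()[0])
```

Writing one JSON object per line keeps the log easy to parse later, for example when building a report.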

Note

time is added automatically to each log message.

Reports

In the last step of the workflow, the model run log data is formatted into an HTML report.

ocelot.HTMLReport
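To make the log-to-report step concrete, here is a minimal sketch that turns JSON log lines into an HTML table. `render_report` and the `time` and `message` fields are hypothetical choices for this example and do not describe the ocelot.HTMLReport interface.

```python
import html
import json


def render_report(log_lines):
    """Hypothetical sketch: turn JSON log lines into a small HTML table."""
    rows = []
    for line in log_lines:
        entry = json.loads(line)
        cells = "".join(
            "<td>{}</td>".format(html.escape(str(entry.get(key, ""))))
            for key in ("time", "message")
        )
        rows.append("<tr>{}</tr>".format(cells))
    header = "<tr><th>time</th><th>message</th></tr>"
    return "<table>{}{}</table>".format(header, "".join(rows))


log_lines = [
    json.dumps({"time": 1.0, "message": "run started"}),
    json.dumps({"time": 2.0, "message": "applied <cleanup>"}),
]
report = render_report(log_lines)
```

Escaping each value with `html.escape` keeps log content from being interpreted as markup in the report.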
