Manuel Pastor edited this page Nov 4, 2020 · 3 revisions

Welcome to the Flame wiki!

Flame specifications

Date: February 14th, 2018

Author: Manuel Pastor

Scope

Develop a modeling framework supporting predictive modeling within eTRANSAFE and easy integration of models within the system. The focus is on model integration and on support for models developed with other modeling software, such as R and KNIME. We will also include a model building system based on standard tools, more efficient and better adapted to practical use.

eTOXlab concepts retained

  • OOP in Python, with RDKit, numpy, scipy and scikit-learn
  • Sharing of modules to ensure consistency in building/prediction
  • Storage of models in directories and subdirectories
  • Method overriding
  • GUI for model management

New features

  • Input will be processed in batch and not molecule by molecule, to speed up the calculations. The one-by-one approach will be kept as a fallback, to be used if an uncontrolled crash is detected.

  • Model prediction and model building will be fully decoupled, even though they will keep sharing modules. This means that models can be built at the modeler's workstation and the result can be sent as a tarball to a production environment located somewhere else.

  • Input normalization and output parsing will not be included in the model, so that these steps can be performed even if the model is run by external software.

  • Models will be made more generic, so that a model can be replaced by a call to KNIME, R or other systems. Calls to external software must support inter-container access, particularly for complex models making use of software not included in the Flame container.

  • Input normalization and output parsing will be carried out by other components/classes.

  • Input will not be restricted to chemical structures and will be designed to be generic.

  • Models can call other modules at the input normalization step.

  • GUI (web based) for model building, with integrated visualization tools. Model documentation can be downloaded in human-readable YAML format. An upload function allows this documentation to be updated (from YAML or JSON).

  • Power users can make use of the classes from Jupyter notebooks or other prototyping tools. This can replace the model building GUI at the initial stages.

  • Parallel processing by series splitting and multithreading.

  • More elaborate management of classes and method overriding (the model definition can include code overriding more than one class).

  • Models can be retrained automatically. The retrained model will not be exposed directly; instead, retraining will produce quality reports, so that the end user can decide whether to keep or discard the changes.

  • Software updates can be handled using separate (versioned) environments to maintain functionality of old models.

  • However, every model will be assigned a reasonable obsolescence deadline, beyond which no support will be provided.

  • Each step of the building workflow must produce persistent output (a serialized file) stamped with a hashed tag of the settings. This would allow the workflow to be restarted without repeating unnecessary computations.

  • GUI (web based) for model building now allows model parameters to be downloaded and uploaded again to build a new model, without modifying the original model parameters. An upload function allows the GUI parameters to be updated from a YAML file in the build window.

  • GUI (web based) for prediction, accessible from outside of the production environment
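
The batch processing item above, with its one-by-one fallback, could be sketched roughly as follows. This is only an illustration, not the Flame implementation: `predict_with_fallback` and `toy_model` are hypothetical names, and the sketch assumes that a failure surfaces as a Python exception and that a failed item is reported as `None`.

```python
from typing import Callable, List, Optional, Sequence


def predict_with_fallback(
    items: Sequence[object],
    batch_predict: Callable[[Sequence[object]], List[float]],
) -> List[Optional[float]]:
    """Try the whole batch first; if it fails, retry item by item."""
    try:
        # Fast path: one call for the complete series.
        return list(batch_predict(items))
    except Exception:
        # Fallback: isolate the offending items so the rest still get results.
        results: List[Optional[float]] = []
        for item in items:
            try:
                results.append(batch_predict([item])[0])
            except Exception:
                results.append(None)  # assumption: failures are marked as None
        return results


def toy_model(batch: Sequence[object]) -> List[float]:
    """Hypothetical stand-in for a model: fails on anything without a length."""
    return [float(len(entry)) for entry in batch]


print(predict_with_fallback(["CCO", None, "C"], toy_model))
```

With this design the cost of the slow path is paid only when the batch call actually fails, which matches the intent of keeping the one-by-one approach strictly as a fallback.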
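
Parallel processing by series splitting and multithreading, as listed above, might look like the following minimal sketch. The function names are hypothetical, and the choice of `concurrent.futures.ThreadPoolExecutor` is an assumption made for illustration. Threads only give a real speed-up when the per-chunk work releases the interpreter lock, as calls into native code or external tools typically do.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def split_series(items: List[object], n_chunks: int) -> List[List[object]]:
    """Split a series into at most n_chunks contiguous, near-equal chunks."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_predict(
    items: Sequence[object],
    batch_predict: Callable[[List[object]], List[float]],
    n_workers: int = 4,
) -> List[float]:
    """Run batch_predict on each chunk in its own thread, preserving order."""
    chunks = split_series(list(items), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map returns results in the order of the input chunks.
        partial = list(pool.map(batch_predict, chunks))
    return [value for part in partial for value in part]


squares = parallel_predict(range(10), lambda batch: [float(x * x) for x in batch], 3)
print(squares)
```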
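
The requirement that each workflow step writes a serialized file stamped with a hashed tag of its settings can be illustrated with the sketch below. It is a sketch under stated assumptions, not the actual Flame code: `settings_tag` and `run_step` are hypothetical helpers, the settings are assumed to be JSON-serializable, and SHA-256 with `pickle` is just one possible choice of hash and serialization format.

```python
import hashlib
import json
import pickle
from pathlib import Path
from typing import Any, Callable, Union


def settings_tag(settings: dict) -> str:
    """Short, deterministic hash of the settings (key order does not matter)."""
    canonical = json.dumps(settings, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def run_step(
    name: str,
    settings: dict,
    compute: Callable[[dict], Any],
    workdir: Union[str, Path],
) -> Any:
    """Reuse the stored output if one exists for these exact settings."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    target = workdir / f"{name}-{settings_tag(settings)}.pkl"
    if target.exists():
        # Same step, same settings: skip the computation.
        with target.open("rb") as handle:
            return pickle.load(handle)
    result = compute(settings)
    with target.open("wb") as handle:
        pickle.dump(result, handle)
    return result
```

Calling `run_step` twice with equal settings runs `compute` only once; changing any setting changes the tag, so the step is recomputed and stored under a new file name.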

Workplan

  • Detailed description of first prototype and target version
  • Decisions on versions
  • Building of development environment (VM)
  • Recycling of eTOXlab components and first 'hello world'
  • First prototype [May 2018]