<font size="+3">Hermes-workflow toolkit</font>

# Overview

The Hermes workflow is designed to automate the creation of configuration files and the execution of files for running general applications.

Each workflow is stored in JSON format. To execute a workflow, the JSON is translated into a Python program using the Hermes package.
The Hermes-workflow toolkit manages the workflows within a hera project.
This toolkit enables users to add workflows to the project, check for the existence of a workflow based on its parameters, and compare different
workflows.

Each workflow has a name, typically based on the name of the JSON workflow file, although this is not mandatory.
Additionally, each workflow belongs to a simulation group (simulationGroup). The toolkit allows users
to easily compare the parameters of the different workflows within the group.
Generally, the simulations are named [group name]_[group id]. However, users can choose names that do not follow this convention.
Groups are defined dynamically: a group exists as long as at least one workflow is assigned to it.
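
As an illustration of the naming convention, the following minimal sketch derives groups from a list of workflow names. It is not hera's implementation: the helper `groups_from_names` and the sample names are hypothetical, and hera keeps the group of each workflow in the project database rather than inferring it from names alone.

```python
from collections import defaultdict


def groups_from_names(workflow_names):
    """Map each group name to the workflows named [group name]_[group id].

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for name in workflow_names:
        group, sep, group_id = name.rpartition("_")
        # Only names that end in _<integer> follow the default convention.
        if sep and group and group_id.isdigit():
            groups[group].append(name)
    return dict(groups)


names = ["Flow_1", "Flow_2", "Dispersion_1", "myCustomRun"]
print(groups_from_names(names))
# {'Flow': ['Flow_1', 'Flow_2'], 'Dispersion': ['Dispersion_1']}
```

A group appears in the result only because at least one workflow name refers to it, which mirrors the dynamic definition of groups.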

The toolkit can be used as a library from code, or directly from a command-line interface (CLI). This CLI enables users to perform all operations conveniently.

# Hermes workflows

Before running simulations, we first build the workflow.
To do so, we need a workflow template.


A workflow template can be obtained by:

1. Building the workflow from scratch.
1. Adjusting an existing template manually.
1. Adjusting an existing template using the case configuration.

Building a workflow from scratch and adjusting it manually are documented in the hermes workflow documentation.

Adjusting an existing template using the case configuration is covered here.
We note that sometimes it is necessary to adjust the results manually (or using the GUI in the hermes workflow).

The specialization of the workflow is performed by setting the values of different parameters in the template
to reflect the needs of the specific simulation. For example, in a template for wind calculation using OpenFOAM,
the specialization sets up the topography and the blockmesh (and possibly the urban region, if needed).

The current (<i>planned</i>) specializations are:

* Flow field - indoors
* Flow field - outdoors 
    * Topography 
    * Urban
    * Topography and urban
* Stochastic Lagrangian Dispersion

(The explanation of the specialization for the different cases will be described in the future)


# Usage

For the code examples, we need to initialize a toolkit.
We can initialize a `SIMULATIONS_WORKFLOWS` or a `SIMULATIONS_OPENFOAM` toolkit.

In [3]:
from hera import toolkitHome
projectName = "documentation"
tk = toolkitHome.getToolkit(toolkitName=toolkitHome.SIMULATIONS_WORKFLOWS,projectName=projectName)


## Add/update workflows

Add the workflow with the file name **[group name]_[group id]** to the group in the project.
If the file name has another format, supply the group name explicitly.

The user can supply the group name, or let hera determine it from the file name.


If a simulation with that id already exists, its parameters are updated in the database (if the
overwrite flag is set).

A workflow that was added to the project belongs to a workflow group automatically.


When executing, the code automatically updates the python-workflow, removes the old dependency files and executes the workflow.

### Command Line Interface 
<div class="alert alert-success" role="alert">    
    >> hera-workflows add {workflow file}
                         [--projectName {projectName}]
                         [--groupName {groupName}]
                         [--overwrite]
                         [--force]
                         [--assignName]
                         [--execute]
</div>    

* If --projectName is not supplied, the tool tries to read it from the caseConfiguration.json file.

* If --groupName is supplied, it is used as the group name.

  Otherwise, the group name is deduced from the workflow file name.
  That is, we assume that the name of the workflow file is {groupname}_{id}.json

* If --overwrite is supplied, the DB document is overwritten with the contents
  of the file. This allows updating the workflow.

* If --force is supplied, a workflow that already exists in the DB under a different name can be added.

* If --assignName is supplied, the next available ID in the group is found and used.

* Use --execute to build and execute the workflow.
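
When the CLI is driven from a script, the flags above can be assembled programmatically. The sketch below only builds the argument list from the options documented here and does not run anything; the function `build_add_command` and its parameter names are hypothetical and not part of hera.

```python
def build_add_command(workflow_file, project_name=None, group_name=None,
                      overwrite=False, force=False, assign_name=False,
                      execute=False):
    """Return the argument list for `hera-workflows add` (nothing is executed)."""
    cmd = ["hera-workflows", "add", workflow_file]
    if project_name is not None:
        cmd += ["--projectName", project_name]
    if group_name is not None:
        cmd += ["--groupName", group_name]
    # Boolean switches are appended only when requested.
    switches = (("--overwrite", overwrite), ("--force", force),
                ("--assignName", assign_name), ("--execute", execute))
    for flag, enabled in switches:
        if enabled:
            cmd.append(flag)
    return cmd


cmd = build_add_command("Flow_1.json", project_name="documentation", overwrite=True)
print(" ".join(cmd))
# hera-workflows add Flow_1.json --projectName documentation --overwrite
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` in an environment where the `hera-workflows` command is installed.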
    
### Code  

Adding a workflow from code can perform the following stages:

1. Adds the workflow to the database in the requested group.
2. Builds the template (.json) and Python executor.
3. Runs the workflow.

The stages are executed according to the buildMode.

Notes:

* If the workflow is already in the DB under a different name, it is added to the DB only if **force** is True.

* If the workflowName already exists in the group, its definitions are overwritten
  only if **overwrite** is True.

* If the template and Python execution files exist on the disk, an error is raised unless overwrite is True.

* If the group is None, the file name is parsed to get the group. That is, we assume that the
  file name has the structure {groupname}_{id}. If the {id} is not an integer,
  the id in the database will be saved as None.
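
The file-name rule in the last note can be sketched as follows. This is a hypothetical helper (`parse_workflow_file_name`), not hera's code; in particular, the handling of a name without any underscore is an assumption.

```python
import os


def parse_workflow_file_name(workflow_file):
    """Split a workflow file name into (group name, group id).

    Hypothetical helper that mirrors the {groupname}_{id} rule.
    """
    base = os.path.splitext(os.path.basename(workflow_file))[0]
    group, sep, raw_id = base.rpartition("_")
    if not sep:
        # Assumption: without an underscore the whole name is the group.
        return base, None
    try:
        return group, int(raw_id)
    except ValueError:
        # The id part is not an integer, so the id is None.
        return group, None


print(parse_workflow_file_name("/tmp/Flow_1.json"))       # ('Flow', 1)
print(parse_workflow_file_name("Flow_test.json"))         # ('Flow', None)
print(parse_workflow_file_name("wind_tunnel_12.json"))    # ('wind_tunnel', 12)
```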


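In [None]:
import os

# Absolute path of the workflow file to add.
workflowJSON = os.path.abspath("Flow_1.json")
groupName = None      # None: deduce the group from the file name
assignName = False
overwrite = True
execute = False       # add to the project without running the workflow
force = False

tk.addCaseToGroup(workflowJSON=workflowJSON,
                  groupName=groupName,
                  assignName=assignName,
                  overwrite=overwrite,
                  execute=execute,
                  force=force)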


## Add workflow and execute it

The execute command is similar to the add command, but it also executes the workflow.

Remember that you can also execute a workflow using the hermes-workflow interface, and bypass the
hera mechanism with the projects. This could be useful to test a workflow, or as an alternative after
it was added.

When executing, the code automatically updates the python-workflow, removes the old dependency files and executes the workflow.

### Command Line Interface 

The syntax of the execute command is:

<div class="alert alert-success" role="alert">  
    >> hera-workflows execute {workflow file}
                         [--projectName {projectName}]
                         [--groupName {groupName}]
                         [--overwrite]
                         [--force]
                         [--assignName]
                         [--execute]
</div>


* If --projectName is not supplied, the tool tries to read it from the caseConfiguration.json file.

* If --groupName is supplied, it is used as the group name.

  Otherwise, the group name is deduced from the workflow file name.
  That is, we assume that the name of the workflow file is {groupname}_{id}.json

* If --overwrite is supplied, the DB document is overwritten with the contents
  of the file. This allows updating the workflow.

* If --force is supplied, a workflow that already exists in the DB under a different name can be added.

* If --assignName is supplied, the next available ID in the group is found and used.

* Use --execute to build and execute the workflow.
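
The text does not specify how the "next available ID" for --assignName is chosen. The sketch below shows one plausible policy, assuming ids start at 1 and gaps are reused; hera may use a different rule (for example, the maximum id plus one). The function `next_available_id` is hypothetical.

```python
def next_available_id(existing_ids):
    """Return the smallest positive integer not yet used in the group.

    Assumption: ids start at 1 and gaps are reused.
    """
    used = {i for i in existing_ids if i is not None}
    candidate = 1
    while candidate in used:
        candidate += 1
    return candidate


group_name = "Flow"
ids_in_group = [1, 2, 4, None]  # None: a workflow whose name has no integer id
print(f"{group_name}_{next_available_id(ids_in_group)}")  # Flow_3
```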

### Code    

1. Adds the workflow to the database in the requested group
2. Builds the template (.json) and Python executor
3. Runs the workflow.

The stages are executed according to the buildMode.

Notes:

* If the workflow is already in the DB under a different name, it is added to the DB only if **force** is True.

* If the workflowName already exists in the group, its definitions are overwritten
  only if **overwrite** is True.

* If the template and Python execution files exist on the disk, an error is raised unless overwrite is True.

* If the group is None, the file name is parsed to get the group. That is, we assume that the
  file name has the structure {groupname}_{id}. If the {id} is not an integer,
  the id in the database will be saved as None.


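In [None]:
import os

# Absolute path of the workflow file to add and execute.
workflowJSON = os.path.abspath("Flow_1.json")
groupName = None      # None: deduce the group from the file name
assignName = False
overwrite = True
execute = True        # build and run the workflow after adding it
force = True

tk.addCaseToGroup(workflowJSON=workflowJSON,
                  groupName=groupName,
                  assignName=assignName,
                  overwrite=overwrite,
                  execute=execute,
                  force=force)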

## List the workflow groups in a project


List all the workflow groups in the project.
A workflow group is defined when a simulation is added to it.
A group is deleted when all the simulations that belong to it are
deleted from the project.

### Command Line Interface 
<div class="alert alert-success" role="alert">
    >> hera-workflows list group {workflow file} [--projectName {projectName}]
</div>
    
### Code   

The code for getting all the simulation groups with their simulations from the project

In [None]:
simulationTable = tk.tableGroups()
simulationTable

## List workflows in a group

Listing all the workflows in the simulation group

### Command Line Interface 

<div class="alert alert-success" role="alert">
    >> hera-workflows list group {workflow file}
                         [--projectName {projectName}] [--nodes] or [--parameters]
</div>

* The --nodes flag prints a list of the nodes for each workflow.
* The --parameters flag also prints the list of parameters.

### Code      

The code for getting all the simulations of a group in the project

In [None]:
groupName = 'group'
simulationTable = tk.tableGroups(groupName=groupName)
simulationTable

## Comparing workflows


When comparing simulations, the tool lists the differing parameters along with their corresponding values. By default, the simulations are displayed as columns and the parameters are displayed as rows.
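
To make the idea concrete, here is a minimal sketch of such a comparison on plain dictionaries. It is not the toolkit's implementation: the functions `flatten` and `differing_parameters`, the dotted parameter paths and the sample parameter names are all assumptions chosen for illustration.

```python
def flatten(params, prefix=""):
    """Flatten nested dictionaries into {"a.b.c": value}."""
    flat = {}
    for key, value in params.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def differing_parameters(workflows):
    """Return {parameter: {simulation: value}} for parameters that differ."""
    flat = {name: flatten(params) for name, params in workflows.items()}
    all_keys = sorted({key for values in flat.values() for key in values})
    table = {}
    for key in all_keys:
        # A parameter missing from a simulation is reported as None.
        row = {name: values.get(key) for name, values in flat.items()}
        # Compare through repr so unhashable values such as lists work too.
        if len({repr(value) for value in row.values()}) > 1:
            table[key] = row
    return table


workflows = {
    "Flow_1": {"mesh": {"cells": 100}, "solver": "simpleFoam"},
    "Flow_2": {"mesh": {"cells": 200}, "solver": "simpleFoam"},
}
print(differing_parameters(workflows))
# {'mesh.cells': {'Flow_1': 100, 'Flow_2': 200}}
```

Each key of the result is a row (a parameter) and each inner key is a column (a simulation), which matches the default layout described above.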

### Command Line Interface 

<div class="alert alert-success" role="alert">
    >> hera-workflows compare {obj1} {obj2} ....
                         [--projectName {projectName}]
                         [--longTable]
                         [--transpose]
                         [--format pandas|json|latex]
                         [--file {outputfileName}]
</div>
    
The input obj can take various forms, such as a simulation name,
a directory path on the disk, a file name on the disk, or a workflow group name.
In the case of a workflow group name, all the simulations within that group will be compared to each other.

* If --projectName is not supplied, the tool tries to read it from the caseConfiguration.json file.

* If --longTable is supplied, the results are printed as a long table.
  That is, each parameter (that differs) in each simulation is shown in one line.

* If --transpose is supplied, the simulations are printed as rows and the parameters are printed as columns.

* The --format flag prints the comparison in different formats.
  Available formats are: pandas, latex, csv and json

* If --file is supplied, the output is also written to a file. If the outputfileName
  does not have an extension (i.e., it is just the name), an extension is appended to the file name.
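
The difference between the default, long and transposed layouts can be sketched on a small comparison table. The table structure ({parameter: {simulation: value}}) and the helpers `to_long_table` and `transpose` are assumptions for illustration and do not reflect the CLI's internal representation.

```python
def to_long_table(table):
    """One row per (simulation, parameter) pair."""
    return [
        {"simulation": sim, "parameter": param, "value": value}
        for param, row in table.items()
        for sim, value in row.items()
    ]


def transpose(table):
    """Swap rows and columns: {simulation: {parameter: value}}."""
    transposed = {}
    for param, row in table.items():
        for sim, value in row.items():
            transposed.setdefault(sim, {})[param] = value
    return transposed


# Default layout: parameters as rows, simulations as columns.
table = {"mesh.cells": {"Flow_1": 100, "Flow_2": 200}}

print(to_long_table(table))
print(transpose(table))
```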

### Code    

## Deleting workflow


When deleting a workflow from Hera, it's important to note that the deletion process only removes the workflow from the project itself. The files and execution directories associated with the workflow are not automatically deleted, requiring additional action from the user.

When a workflow is deleted from the project, it is exported to a file, and a Python script is generated. This script allows the user to remove all directories associated with the workflow's execution. However, the workflow will not be removed from the project if a file exists in its directory, unless the user explicitly requests overwriting.

It is necessary for the user to manually remove both the workflow file and the execution directories, as these actions need to be performed separately.
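
The contents of the generated removal script are not shown in this document. As a rough idea of what removing execution directories involves, the sketch below deletes a list of directories with `shutil.rmtree`; the function `remove_execution_directories` and the directory names are hypothetical, and the demonstration runs inside a temporary directory so nothing real is deleted.

```python
import shutil
import tempfile
from pathlib import Path


def remove_execution_directories(directories):
    """Delete each existing directory tree and return the removed paths."""
    removed = []
    for directory in map(Path, directories):
        if directory.is_dir():
            shutil.rmtree(directory)
            removed.append(str(directory))
    return removed


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    case_dir = Path(root) / "Flow_1"
    case_dir.mkdir()
    print(remove_execution_directories([case_dir, Path(root) / "missing"]))
```

Directories that do not exist are skipped silently, so the function can be run more than once.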

### Command Line Interface 

To remove the workflow(s) from the project, type:

<div class="alert alert-success" role="alert">
    >> hera-workflows delete {obj1} {obj2} .... [--no-export] [--overwrite]
</div>
    
Where obj{i} can be a simulation name or a workflow group.

* If the --no-export flag is supplied, the workflow will not be exported to the disk.

* If the --overwrite flag is supplied, the exported workflow will overwrite the currently
  existing workflow file on the disk.

Running this procedure creates a completeRemove.py script that will remove the execution directories.
To remove the execution directories, run:

<div class="alert alert-success" role="alert">
>> python completeRemove.py
</div>
    
### Code    

## Export workflow


Exporting a workflow saves the workflow stored in the DB to a file.
If a file name is not specified, the output file is named after the simulation.

### Command Line Interface 

<div class="alert alert-success" role="alert">
    >> hera-workflows export {obj1} {obj2} .... [--overwrite]
</div>
    
* If the --overwrite flag is supplied, the exported workflow will overwrite the currently
  existing workflow file on the disk.
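
The export behaviour (default file name and overwrite protection) can be sketched with the standard library. This is not the toolkit's code: `export_workflow` is a hypothetical function, the workflow is represented as a plain dictionary, and the `.json` suffix for the default name is an assumption.

```python
import json
import tempfile
from pathlib import Path


def export_workflow(workflow, simulation_name, file_name=None, overwrite=False):
    """Write a workflow dictionary to disk and return the path used."""
    # Default file name is the simulation name (".json" suffix assumed).
    target = Path(file_name if file_name is not None else f"{simulation_name}.json")
    if target.exists() and not overwrite:
        raise FileExistsError(f"{target} exists; pass overwrite=True to replace it")
    target.write_text(json.dumps(workflow, indent=4))
    return target


# Demonstration in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    path = export_workflow({"workflow": {}}, "Flow_1",
                           file_name=Path(folder) / "Flow_1.json")
    print(path.name)  # Flow_1.json
```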
    
### Code

# Workflow objects


To help the user handle the hermes workflows, this toolkit also includes
a python wrapper to the hermes workflow.

The basic wrapper is a general object that allows the user to access the different nodes.
Specialized wrappers for simpleFOAM or other solvers also exist.

To access the workflows, use the getHermesWorkflowFrom<JSON|DB> functions.

# Internals


## Stages in adding a workflow to the project


Adding a workflow to the project using the CLI has 3 stages.

1.  Determine the simulation and group names.
    The default behaviour assumes the workflow file name has the format
    [group name]_[group id].

    Then, the default is to use the workflow file name as the simulation name,
    and parse it to get the group name and id.

    However, when using the CLI the user can determine the group name
    and can set the simulation name to be of the default format with the
    next available ID in the group.

    Note: If the simulation name is not [group name]_[group id],
          then the group-id of the simulation will be None.

1. Add the simulation to the database.
   If the name exists, or if the workflow already exists in the DB (possibly
   with another name) then it will raise an error.

   If the name of the simulation exists,
   use --overwrite to update the value of the simulation with the given workflow.

   If the simulation data already exists in the DB, use --force
   to add it again with the new name.

1. Perform additional actions that the user requested (using the action flag).
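
The checks of the second stage can be summarized in a short decision function. This is a sketch of the rules as described above, not hera's code: `check_addition`, its arguments and the returned strings are hypothetical, and the order in which the two checks are applied is an assumption.

```python
def check_addition(name_exists, content_exists, overwrite=False, force=False):
    """Return the database action, or raise if the addition is not allowed.

    name_exists:    a simulation with this name is already in the project.
    content_exists: the same workflow is stored under another name.
    """
    if name_exists:
        if not overwrite:
            raise ValueError("simulation name exists; use --overwrite to update it")
        return "update"
    if content_exists and not force:
        raise ValueError("workflow already in the DB under another name; use --force")
    return "insert"


print(check_addition(name_exists=False, content_exists=False))              # insert
print(check_addition(name_exists=True, content_exists=False, overwrite=True))  # update
print(check_addition(name_exists=False, content_exists=True, force=True))   # insert
```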