Using the signac and row workflows provides the following benefits:
- The `signac` and `row` workflows provide contained and fully reproducible results, since all of the project steps and calculations are contained within a single `signac`/`row` project. However, to ensure total reproducibility, the project should be run from a container. Note: This involves building a container (Docker, Apptainer, Podman, etc.), using it to run the original calculations, and providing it to any future parties trying to reproduce the exact results.
- The `signac` and `row` workflows can simply track the progress of any project locally or on the HPC, providing as much or as little detail about the project status as the user programs into the `actions.py` file. Note: `row` tracks the progress and completion of a project step or section by determining if a file exists. Therefore, the user can generate this file after a verification step confirms successful completion, or after the commands run without error (Example: exit code 0).
- These `signac` and `row` workflows are designed to track the progress of all the project's parts or stages, only resubmitting the jobs locally or to the HPC if they are not completed and not already in the queue.
- These `signac` and `row` workflows also allow colleagues to quickly transfer their workflows to each other, and to easily add new state points to a project, without the fear of rerunning the original state points.
Please also see the signac website and row website, which outline some of the other major features.
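Because `row` keys a step's completion off a file's existence, a common pattern is to write the marker file only after the command verifiably succeeds (exit code 0). A minimal sketch of that pattern, assuming hypothetical command and marker-file names that are not part of this repository:

```python
import subprocess
from pathlib import Path


def run_and_mark(command, marker_path):
    """Run a shell command; write a marker file only if it exits with code 0.

    `row` can then treat the marker file's existence as proof that the
    step completed successfully.
    """
    result = subprocess.run(command, shell=True)
    if result.returncode == 0:
        Path(marker_path).write_text("completed\n")
    return result.returncode
```

A failing command leaves no marker behind, so `row` will consider the step incomplete and resubmit it on the next `row submit`.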
This is a signac workflow example/tutorial using Julia for a simple dot product calculation, which uses the following workflow steps:
- Part 1: For each individual job (set of state points), this code generates the `signac_job_document.json` file from the `signac_statepoint.json` data. The `signac_statepoint.json` file only stores the set of state points, or required variables, for the given job. The `signac_job_document.json` file can be used to store any other variables that the user wants to keep for later use or searching.
- Part 2: This uses the `Julia` programming language to calculate the dot product, which is then output to a file in each individual run (`workspace/YY...YY/dot_product_output_file.txt`). A random number generator produces a value from 0 to 1 that is used to scale the dot product, as we want to simulate the standard deviation between the different replicates of the same test. The seed for the random number generator is the `replicate_number_int`.
- Part 3: Obtain the average and standard deviation for each calculated dot product value across all the replicates, and print the analysis to a data file (`analysis/output_avg_std_of_replicates_txt_filename.txt`). Signac is set up to automatically loop through all the JSON files (`signac_statepoint.json`), calculating the average and standard deviation for the jobs whose state points differ only in their `replicate_number_int` values.
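The Part 2 calculation can be sketched as follows. This is a hypothetical Python stand-in for the repository's Julia code, showing how seeding the generator with `replicate_number_int` keeps each replicate reproducible while still varying between replicates:

```python
import random


def scaled_dot_product(vector_a, vector_b, replicate_number_int):
    """Dot product scaled by a seeded uniform random value in [0, 1].

    Seeding the generator with the replicate number makes each
    replicate deterministic, while different replicate numbers
    produce different scale factors.
    """
    rng = random.Random(replicate_number_int)  # replicate number is the seed
    dot = sum(a * b for a, b in zip(vector_a, vector_b))
    return rng.uniform(0.0, 1.0) * dot
```

Rerunning the same job (same state points, same replicate number) reproduces the same output value exactly.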
- `src` directory: This directory can be used to store any custom functions that are required for this workflow. This includes any developed `Python` functions or any template files used for the custom workflow (Example: a base template file that is used for a find-and-replace function, changing the variables with the differing state point inputs).
- The signac documentation, row documentation, signac GitHub, and row GitHub can be used for reference.
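The replicate averaging described in Part 3 can be sketched in plain Python. Signac itself provides similar grouping; this hypothetical helper only assumes the `signac_statepoint.json` and `dot_product_output_file.txt` file names described above, and groups jobs that differ only by `replicate_number_int`:

```python
import json
import math
from collections import defaultdict
from pathlib import Path


def average_over_replicates(workspace):
    """Group jobs whose state points differ only in `replicate_number_int`,
    then return each group's mean and sample standard deviation."""
    groups = defaultdict(list)
    for job_dir in Path(workspace).iterdir():
        statepoint = json.loads((job_dir / "signac_statepoint.json").read_text())
        value = float((job_dir / "dot_product_output_file.txt").read_text())
        # The grouping key is every state point except the replicate number.
        key = tuple(sorted(
            (k, v) for k, v in statepoint.items() if k != "replicate_number_int"
        ))
        groups[key].append(value)
    results = {}
    for key, values in groups.items():
        mean = sum(values) / len(values)
        if len(values) > 1:
            std = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
        else:
            std = 0.0
        results[key] = (mean, std)
    return results
```

Each returned entry maps one set of non-replicate state points to its (mean, standard deviation) pair, which is what Part 3 writes to the analysis data file.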
Please cite this GitHub repository and the following repositories:
The signac workflows for this project can be built using conda with the `environment.yml` file, which includes Julia in the conda environment. This is the standard build, which requires no other dependencies to run the entire workflow.
If you want to install and use Julia locally or load it on the HPC (example: `module load julia`), this project can be built using conda with the `environment_without_julia.yml` file, which omits Julia from the conda environment. If the project is built this way and run without installing Julia locally or loading it on the HPC, the workflow will fail when trying to run Julia.
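With the `environment_without_julia.yml` build, a quick preflight check can fail fast instead of partway through the workflow. A minimal sketch (the message text is illustrative):

```python
import shutil


def julia_available():
    """Return True when the `julia` executable is on the PATH."""
    return shutil.which("julia") is not None


# Example preflight before running the workflow with the
# environment_without_julia.yml build:
if not julia_available():
    print("julia not found: install it locally, `module load julia`, "
          "or rebuild from environment.yml")
```

Running this check before `row submit` avoids submitting Part 2 jobs that would fail when Julia is invoked.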
```bash
cd signac_julia_excel_analysis
```
Install with Julia included (see above for details on which `conda env create` command to use):
```bash
mamba env create -f environment.yml
```
Install without Julia included (see above for details on which `conda env create` command to use):
```bash
mamba env create -f environment_without_julia.yml
```
Activate the environment:
```bash
mamba activate signac_julia_excel_analysis
```
The `clusters.toml` file is used to specify the HPC environment. The specific HPC configuration will need to be set up for each HPC and identified in the `workflow.toml` file.
The following files are located here:
```bash
cd <your_local_path>/signac_julia_excel_analysis/signac_julia_excel_analysis/signac_julia_excel_analysis/project
```
- Modify the `clusters.toml` file to fit your HPC (Example: Replace the `<ADD_YOUR_HPC_NAME_STRING>` values with your custom values.)
- Add the cluster configuration file (`clusters.toml`) to the following location on the HPC under your account (`~/.config/row/clusters.toml`):
```bash
cp clusters.toml ~/.config/row/clusters.toml
```
- Modify the `workflow.toml` file to fit your HPC (Example: Replace the `<ADD_YOUR_HPC_NAME>` and `<ADD_YOUR_CHARGE_ACCOUNT_NAME>` values with your custom values.)
- To modify the Slurm submission script, or to set the cluster partitions that you want to use in the `workflow.toml` file, add the following to the `workflow.toml` file.
For parts 1 and 3, add the CPU partition(s) you want to use:
```bash
custom = ["","--partition=cpu-1,cpu-2,cpu-3"]
```
For part 2, add the GPU partition(s) you want to use:
```bash
custom = ["","--partition=gpu-1,gpu-2,gpu-3"]
```
Note: If needed, the cluster partitions in the `clusters.toml` file can be fake (placeholder) names. Specifying the fake or real partition in the `workflow.toml` file (i.e., `partition=fake_partition_name`) lets you override the selected partition and list many real partitions in the `workflow.toml` file (i.e., `custom = ["","--partition=cpu-1,cpu-2,cpu-3"]`), which is used to write the Slurm submission script.
- This can also be done if more than one partition is needed.
Build the test workspace:
```bash
python init.py
```
Run the following command as the test:
```bash
row submit --dry-run
```
If it is working, you should see an output that looks something like this, with `export ACTION_CLUSTER=<YOUR_HPC_NAME>` in the output:
```bash
...
directories=(
be31aae200171ac52a9e48260b7ba5b1
)
export ACTION_WORKSPACE_PATH=workspace
export ACTION_CLUSTER=<YOUR_HPC_NAME>
...
```
Clean up row and delete the test workspace:
```bash
row clean
rm -r workspace
```
- If `row submit` is run locally like this, then you must remove the HPC parts in the `workflow.toml` file (see the notes in the `workflow.toml`).
- Change the GPU parts to run only on CPU, if the local hardware supports only CPU workflows (see the notes in the `workflow.toml`).
Build the test workspace:
```bash
python init.py
```
Run the following command as the test:
```bash
row submit --dry-run
```
If it is working, you should see an output that looks something like this, with `export ACTION_CLUSTER=none` in the output:
```bash
...
directories=(
be31aae200171ac52a9e48260b7ba5b1
)
export ACTION_WORKSPACE_PATH=workspace
export ACTION_CLUSTER=none
...
```
Clean up row and delete the test workspace:
```bash
row clean
rm -r workspace
```