99 Development Guide

Matin Nuhamunada edited this page Nov 22, 2023 · 5 revisions

Adding new features/workflows

We encourage you to add new features or workflows through a pull request. You can find more about this here. Steps:

  1. Make a fork or branch from the main branch of BGCFlow
  2. Add changes in your personal fork or branch
  3. Submit a Pull Request

Adding new pipelines in the main workflow

Follow these steps to add a new pipeline to the main workflow:

  1. Define the final output in the pipeline metadata yaml
  2. Add the new rules in workflow/rules/my_new_feature.smk
  3. Define the conda environment in workflow/envs/my_new_feature.yaml, if possible using the same name as the rules file
  4. [Optional]: Add a post-deploy bash script to modify the environment. It needs to have the same name as the conda yaml file, with the .post-deploy.sh extension that Snakemake expects, e.g. workflow/envs/my_new_feature.post-deploy.sh
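The naming convention in the steps above can be checked with a short script. This is a minimal sketch, not part of BGCFlow: the function name `check_feature_files` is hypothetical, and the post-deploy script is assumed to use Snakemake's `.post-deploy.sh` suffix next to the environment file.

```python
from pathlib import Path


def check_feature_files(repo_root: Path, feature: str) -> dict[str, bool]:
    """Report which of the conventionally named files exist for a feature."""
    workflow = repo_root / "workflow"
    expected = {
        "rules": workflow / "rules" / f"{feature}.smk",
        "env": workflow / "envs" / f"{feature}.yaml",
        # Optional file: assumed to share the env file name, ending in .post-deploy.sh
        "post_deploy": workflow / "envs" / f"{feature}.post-deploy.sh",
    }
    return {name: path.is_file() for name, path in expected.items()}
```

Calling `check_feature_files(Path("."), "my_new_feature")` from the repository root returns a dictionary of booleans, so a missing or misnamed file shows up before the workflow is run.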

Creating an additional workflow

This guide describes how to create a separate workflow (a sub-workflow) that is detached from the main workflow. The existing sub-workflows are:

Note that these are already integrated in the wrapper, and can be run using bgcflow run --workflow <workflow name or Snakefile path>

For an example of custom workflow development, see this PR: https://github.com/NBChub/bgcflow/pull/292. The process is similar to adding a pipeline to the main workflow, but instead of using the existing Snakefile, you build your own. The general steps are:

  1. Create a new Snakefile and give it a name relevant to your workflow, e.g. https://github.com/NBChub/bgcflow/blob/main/workflow/lsabgc
  2. Create new rules and re-use existing ones. New rules should be located in workflow/rules/my_new_feature.smk
  3. Define the conda environment in workflow/envs/my_new_feature.yaml, if possible using the same name as the rules file
  4. [Optional]: Add a post-deploy bash script to modify the environment. It needs to have the same name as the conda yaml file, with the .post-deploy.sh extension that Snakemake expects, e.g. workflow/envs/my_new_feature.post-deploy.sh
  5. Test it by running bgcflow run --workflow workflow/my_new_workflow
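When testing a sub-workflow repeatedly, the run command from the last step can be wrapped in a small script. The sketch below is illustrative only: `build_run_command` and `run_workflow` are hypothetical helper names, and only the `bgcflow run --workflow` invocation shown above is assumed to exist.

```python
import shlex
import subprocess


def build_run_command(workflow: str) -> list[str]:
    """Build the `bgcflow run --workflow ...` command as an argument list."""
    return ["bgcflow", "run", "--workflow", workflow]


def run_workflow(workflow: str) -> int:
    """Run the sub-workflow and return the exit code of the bgcflow process."""
    command = build_run_command(workflow)
    print("Running:", shlex.join(command))
    # check=False so that a failing workflow is reported through the exit code
    return subprocess.run(command, check=False).returncode
```

For example, `run_workflow("workflow/my_new_workflow")` runs the placeholder Snakefile from step 5 and returns a non-zero code if the run fails.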

Pre-commits

For continuous integration, enable pre-commit by:

pip install pre-commit
pre-commit

Unit test

To run the unit tests:

pip install pytest-cov
pip install alive-progress
pytest --cov=.tests/unit .tests/unit/
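For orientation, a test that pytest would collect looks roughly like the sketch below. pytest collects files named `test_*.py` and functions named `test_*`; the file name and the `format_link` helper are hypothetical, and the helper is defined inline here only to keep the example self-contained. In a real test it would be imported from the code under test.

```python
# Hypothetical file: .tests/unit/test_format_link.py


def format_link(url: str) -> str:
    """Render a URL as a Markdown link labelled with its last path segment."""
    label = url.rstrip("/").split("/")[-1]
    return f"[{label}]({url})"


def test_format_link_uses_last_path_segment():
    url = "https://github.com/NBChub/bgcflow"
    assert format_link(url) == "[bgcflow](https://github.com/NBChub/bgcflow)"


def test_format_link_ignores_trailing_slash():
    url = "https://github.com/NBChub/bgcflow/"
    assert format_link(url) == "[bgcflow](https://github.com/NBChub/bgcflow/)"
```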

Updating the documentation

Automatically generate table of pipelines

# pip install tabulate # required for markdown conversion
import pandas as pd
import yaml

# Load the pipeline metadata (path is relative to the current working directory)
with open("../workflow/rules.yaml", "r") as file:
    pipelines = yaml.safe_load(file)

references = []
table = {}
for keyword, value in pipelines.items():
    # Render every URL as a markdown link labelled with its last path segment
    links = [f"[{link.split('/')[-1]}]({link})" for link in value["link"]]
    table[keyword] = {
        "Keyword": keyword,
        "Description": value["description"],
        "Links": ", ".join(links),
    }
    references.extend(value["references"])

# One row per pipeline
df = pd.DataFrame.from_dict(table).T.reset_index(drop=True)

# Print the references as a quoted markdown list
for reference in references:
    print(f"> - *{reference}*")

# Print the pipeline table as markdown
print(df.to_markdown())
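The script above reads the `description`, `link` and `references` fields of every entry. A missing field raises a `KeyError`, and a `link` given as a single string instead of a list would be iterated character by character. A check along the following lines could be run on the loaded metadata first. It is a minimal sketch: `find_metadata_problems` is a hypothetical helper, and the entries in the usage note are made up for illustration.

```python
REQUIRED_FIELDS = ("description", "link", "references")
LIST_FIELDS = ("link", "references")


def find_metadata_problems(pipelines: dict) -> list[str]:
    """Return human-readable problems found in the loaded pipeline metadata."""
    problems = []
    for keyword, entry in pipelines.items():
        if not isinstance(entry, dict):
            problems.append(f"{keyword}: entry should be a mapping")
            continue
        for field in REQUIRED_FIELDS:
            if field not in entry:
                problems.append(f"{keyword}: missing '{field}'")
        for field in LIST_FIELDS:
            if field in entry and not isinstance(entry[field], list):
                problems.append(f"{keyword}: '{field}' should be a list")
    return problems
```

For example, an entry `{"description": "Missing its links.", "references": "not a list"}` stored under the hypothetical key `broken_feature` is reported as missing `link` and as having a non-list `references`, while a complete entry produces no messages.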