
presentation and resources for NorPreM bioinformatics workshop in March 2024


Bioinformatics session

A two-day workshop for bioinformaticians and molecular biologists with a focus on the TSO500 pipeline in InPreD



  1. Setup
  2. Development & Collaboration
  3. Nextflow
  4. tso500_nxf_workflow
  5. Python

1. Setup

1.1. Create a GitHub account

  • enter your email
  • set a password
  • choose a username
  • choose email preferences
  • solve the puzzle
  • create your account
  • find the activation code in the email you received
  • select the desired options
  • choose the free plan

1.2. Get added to the InPreD organisation on GitHub

1.3. Resources

2. Development & Collaboration

2.1. Short git introduction

  • distributed version control system
  • tracks the history of changes committed by different contributors
  • every developer has full copy of project and its history

2.1.1 git config

git config --global user.name "<your name>"
git config --global user.email "<your email>"

2.1.2. Basic git commands

git init: initialises new git repository

git clone <repository url>: creates local copy of remote repository

git add <file/s>: stage new or changed files (anything that should be committed to the repository)

git commit -m "feat: my new feature": commit changes to the repository

commit message conventions

<type>[optional scope]: <description>

  • feat: new feature
  • fix: patching a bug
  • refactor: code change that is neither feat nor fix
  • build: build system related changes
  • perf: improving performance
  • chore: code unrelated changes, e.g. dependencies
  • style: code change that does not change meaning
  • test: changes to tests
  • docs: adding/updating documentation
  • ci: continuous integration, e.g. GitHub Actions
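
The convention above can be checked mechanically. The following is a minimal sketch, not part of the workshop material: the function name `parse_commit_message` is hypothetical, and writing the optional scope in parentheses (e.g. `fix(parser): ...`) is an assumption taken from the Conventional Commits style rather than from the slides.

```python
import re

# Commit types taken from the list above.
COMMIT_TYPES = ("feat", "fix", "refactor", "build", "perf",
                "chore", "style", "test", "docs", "ci")

# <type>[optional scope]: <description>
# Assumption: the scope, when present, is written in parentheses.
COMMIT_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r": (?P<description>\S.*)$"
)


def parse_commit_message(message):
    """Return (type, scope, description) for the first line, or None if it does not match."""
    lines = message.splitlines()
    if not lines:
        return None
    match = COMMIT_PATTERN.match(lines[0])
    if match is None:
        return None
    return match.group("type"), match.group("scope"), match.group("description")
```

Only the first line of the message is inspected, so a longer commit body does not affect the result. A check like this could run in a git hook or a CI job, but where to wire it in is left open here.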

2.1.2. Basic git commands

git status: overview of untracked, modified and staged changes

git branch: show local branches

git merge: merge branches

git pull: load changes from remote counterpart

git push: upload changes to remote counterpart

2.2. Branching model: simplified Gitflow workflow

  • start with two branches to record project history: main and develop
  • each new feature resides in its own branch (feature branch)
  • feature branch is generally created off latest develop commit
  • upon feature completion, feature branch is merged into develop
  • whenever you are ready to release, merge develop into main and tag it


2.3. GitHub Actions

  • continuous integration (CI) and continuous deployment (CD)
  • building, testing and deploying directly from GitHub
  • set up by adding yaml instructions to .github/workflows
name: GitHub Actions Demo
on: [push]
jobs:
  hello-world:  # job id is a placeholder
    runs-on: ubuntu-latest
    steps:
      - run: echo "Hello world!"

2.3. GitHub Actions

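name: Docker Build
# assumed trigger layout: pushes to the listed branches and version tags
on:
  push:
    branches:
      - main
      - develop
    tags:
      - '*.*.*'

jobs:
  test:
    name: Run unit tests
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repo
        uses: actions/checkout@v4
      - name: Unit testing
        uses: fylein/python-pytest-github-action@v2
        with:
          args: pip3 install -r requirements.txt && pytest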

2.3. GitHub Actions

  build:  # job id is a placeholder
    name: Build Image
    runs-on: ubuntu-latest
    needs: test
    steps:
      - name: Check out the repo
        uses: actions/checkout@v4
      - name: Lint Dockerfile
        uses: hadolint/hadolint-action@v3.1.0
      - name: Docker Meta
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: |
          tags: |
      - name: Login to Dockerhub
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - name: Build and push image to Docker Hub
        uses: docker/build-push-action@v5
        with:
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}


2.4. GitHub workflow

  • go to issues and create a New issue
  • give the issue a descriptive title and a description, then Submit new issue
  • if you decide to work on the issue (own repository), Create a branch via the issue
  • change the branch source to develop and Create branch


  • load the new branch to your local repository, check it out and start working
  • push your changes back to the remote
$ git pull
$ git checkout 4-new-fancy-feature
$ git add <file/s>
$ git commit -m "docs: updating docs"
$ git push

  • for repositories you don't have write access to, create a fork
  • once you have a fork, git clone your forked repository
  • create a new branch and work on that
  • git push your changes back to the forked remote

  • when you are done, go to pull requests and create a New pull request
  • choose develop as base and your new feature branch (same repo or forked) for compare
  • assign yourself, add at least one reviewer (cog icon), provide some context and Create pull request
  • if you still want to work on the pull request, you can Convert to draft to let the reviewers know that it is not done yet
  • otherwise you can just wait for them to review your changes


  • as a reviewer, make sure you check your email notifications to see if there are pull requests waiting for you
  • open the pull request and start the review in the Files changed tab
  • you can leave comments and suggestions in the code by hovering over the line with the changes and clicking on +
  • you can type your comment
  • or you can leave a suggestion; ideally, you click Start a review to initialise the reviewing process
  • when you are done with reviewing, Finish your review
  • again, leave a comment if you like, and choose whether you want to Comment, Approve or Request changes
  • you can add a general comment to the pull request under Conversation
  • after the reviewers have left their comments and suggestions, you can address them one by one by replying or applying the suggested changes
  • whenever a comment/suggestion is handled (the discussion has come to a conclusion or the suggestion was applied), you can resolve it
  • as soon as the reviewers have approved, you can finally Merge pull request

2.4.1 Hands-on pull request

  • go to
  • create fork to your own account
  • open an issue "test pull request" or similar and create a branch
  • go to the branch and add a markdown file with your first name and favorite emoji to the participants folder, ideally the file is named <your first name>.md
  • open a pull request in the original repository and add someone else in the group to review your pull request
  • review someone else's pull request, give feedback and approve if correct

2.5. Release

  • releases should be made from the main branch
  • good practice is to open a pull request for develop into main when you are done with the desired features
  • whenever you are ready for a new release, create a new release
  • add a title and a description for your release and Choose a tag
  • ideally, you choose a tag according to semantic versioning

2.5.1. Semantic versioning

  • version tag should be MAJOR.MINOR.PATCH
  • you increment one of the three depending on the change
    • MAJOR: version when you make incompatible API changes
    • MINOR: version when you add functionality in a backward compatible manner
    • PATCH: version when you make backward compatible bug fixes
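
To make the increment rules concrete, here is a small sketch of a version bump. The function name `bump_version` and the `part` argument are hypothetical. It assumes a plain `MAJOR.MINOR.PATCH` tag without pre-release or build suffixes, and it resets the lower-order numbers to zero, as semantic versioning prescribes.

```python
def bump_version(version, part):
    """Return the next MAJOR.MINOR.PATCH version after incrementing `part`."""
    major, minor, patch = (int(number) for number in version.split("."))
    if part == "major":
        # incompatible API change: reset MINOR and PATCH
        return f"{major + 1}.0.0"
    if part == "minor":
        # backward compatible functionality: reset PATCH
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # backward compatible bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")
```

For example, a bug fix on top of `1.4.2` would be tagged `bump_version("1.4.2", "patch")`, while a breaking change would use `"major"`.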

2.5. Release

  • when you are satisfied with your release, Publish release


2.6. Licensing

  • let's discuss

2.7. Resources

3. Nextflow

3.1. Short introduction

  • workflow manager that enables scalable and reproducible scientific workflows using software containers
  • a domain-specific language that extends Groovy, an object-oriented programming language for the Java platform
  • can be used with an array of executors, such as SLURM, k8s, AWS, Azure, Google Cloud and many more
  • nf-core: project/community that develops framework for nextflow including guidelines, tools, modules, subworkflows, pipelines and test data

3.2. Requirements

  • POSIX compatible system (e.g. Linux, macOS)
  • Bash
  • Java ≥ 11 and ≤ 21
  • Docker/Singularity
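
The Java requirement can be checked before installing. The sketch below is illustrative only: the function names are hypothetical, and it assumes the usual `version "..."` line that `java -version` prints, including the legacy `1.x` numbering used by Java 8 and older.

```python
import re


def java_major_version(version_output):
    """Extract the major Java version from the first quoted version string."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and older report themselves as 1.x (e.g. "1.8.0_292").
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def is_supported_java(version_output, minimum=11, maximum=21):
    """Check the major version against the range listed in the requirements."""
    major = java_major_version(version_output)
    return major is not None and minimum <= major <= maximum
```

Note that `java -version` writes to stderr, so the text to pass in would come from something like `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr`.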

3.3. Installation

$ curl -s | bash
$ chmod +x nextflow


$ wget -O nextflow

or via browser at

3.4. Best practice: nf-core template

├── assets
│   ├── mock.genome.fasta
│   ├── samplesheet.csv
│   └── schema_input.json
├── bin
│   └──
├── conf
│   ├── base.config
│   ├── modules.config
│   └── test_stub.config
├── lib
│   ├── NfcoreSchema.groovy
│   ├── NfcoreTemplate.groovy
│   ├── WorkflowMain.groovy
│   └── nfcore_external_java_deps.jar
├── modules
│   ├── local
│   │   ├──
│   │   └──
│   └── nf-core
│       ├── module_1
│       │   └── arg_1
│       │       ├──
│       │       └── meta.yml
│       └── custom
│           └── dumpsoftwareversions
│               ├──
│               ├── meta.yml
│               └── templates
│                   └──
├── modules.json
├── nextflow.config
├── nextflow_schema.json
└── workflows

3.6 Resources

4. tso500_nxf_workflow

4.1. Status update

  • modified nf-core template (removed unnecessary functionality, config and metadata files)
  • added devcontainer to have controlled environment (dind and sind available)
  • stubbing data available
  • containing three modules so far (localapp_prepper, LocalApp, dumpsoftwareversions)
  • using nf-validation plugin

4.2. Overview


4.3. Demonstration

4.4. Outlook

  • samplesheet_generator
  • tsoppi (requires some restructuring)
  • include configuration files for each node
  • Documentation

4.5. Resources

5. Python project

5.1. Repository structure

  • consistency/standard
  • keep main script short and sweet - functionality in modules

from my_module import main

if __name__ == "__main__":
    main()  # assumes my_module exposes a callable named main

5.1. Repository structure

  • module folder should contain
  • keep functions short and try to refactor big functions
  • leave descriptive comments in code
  • use libraries to make your life easier
    • pandas: csv/tsv files
    • click or argparse: define cli input flags
  • introduce proper exception handling
  • logging with log levels
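
The points on cli flags, exception handling and logging can be combined in a few lines. This is a minimal sketch using `argparse` and the standard `logging` module. The flags `--input` and `--log-level`, the helper `count_rows` and the logger name `my_module` are made up for illustration and do not come from the workshop repository.

```python
import argparse
import logging

logger = logging.getLogger("my_module")


def parse_args(argv=None):
    """Define the cli input flags."""
    parser = argparse.ArgumentParser(description="Count data rows in a samplesheet.")
    parser.add_argument("--input", required=True, help="path to a CSV samplesheet")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"])
    return parser.parse_args(argv)


def count_rows(path):
    """Count non-empty lines after the header line."""
    with open(path, encoding="utf-8") as handle:
        lines = [line for line in handle if line.strip()]
    return max(len(lines) - 1, 0)


def main(argv=None):
    args = parse_args(argv)
    logging.basicConfig(level=args.log_level, format="%(levelname)s %(message)s")
    try:
        rows = count_rows(args.input)
    except OSError as error:
        # expected failure (missing or unreadable file): log it and return an error code
        logger.error("cannot read %s: %s", args.input, error)
        return 1
    logger.info("%s contains %d data rows", args.input, rows)
    return 0
```

The short main script would then only call `sys.exit(main())`. Returning an exit code instead of letting the traceback escape keeps the output readable for users while the log level still allows more detail when debugging.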

5.1.1. Unit testing

  • pytest for testing
  • include unit tests for functions, preferably table-driven
# function under test
def addition(x, y):
    return x + y

# table-driven test: one parameter tuple per case
import pytest

@pytest.mark.parametrize("x, y, z", [(1, 1, 2), (1, -1, 0)])
def test_eval(x, y, z):
    assert addition(x, y) == z

$ pytest

5.1. Repository structure

  • include test data for unit testing if necessary
  • create container image from project, preferably docker
  • include all necessary dependencies in requirements.txt (locked versions)
  • add GitHub Actions for testing, linting, building, etc.
  • preferably include a devcontainer definition
  • and other docs
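
Whether requirements.txt really contains locked versions is easy to verify. The following sketch treats a requirement as locked only if it is pinned with `==`. The function name `unpinned_requirements` is hypothetical, and the parsing is deliberately simple: it ignores comments, blank lines and pip options such as `-r`, and it does not handle URLs that contain `#`.

```python
def unpinned_requirements(text):
    """Return requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options such as -r
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned
```

A unit test or CI step could read requirements.txt, pass its content to this function and fail if the returned list is not empty.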

5.1. Repository structure

|-- .devcontainer
|   `-- devcontainer.json
|-- .github
|   `-- workflows
|       `-- main.yml
|-- .gitignore
|-- Dockerfile
|-- docs
|-- my_module
|   |--
|   |--
|   `-- tests
|       |--
|       `--
|-- requirements.txt
`-- test

5.2. Resources

