This lesson teaches the basics of modern workflow engines through Snakemake.
The example workflow performs a frequency analysis of several public domain books sourced from Project Gutenberg, testing how closely each book conforms to Zipf's Law. All code and data are provided. This example has been chosen over a more complex scientific workflow as the goal is to appeal to a wide audience and to focus on building the workflow without distraction from the underlying processing.
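As a rough illustration of the kind of analysis the workflow automates, the sketch below counts word frequencies in a short string and compares each count with an idealised Zipf's Law prediction, where the word at rank r occurs about 1/r times as often as the most frequent word. It is not the lesson's own code: the function names `word_frequencies` and `zipf_table`, the tokenisation rule, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count lowercase word occurrences in a string (simple tokenisation, an assumption)."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def zipf_table(counts, top=5):
    """Return (rank, word, count, expected) rows for the most common words.

    Under an idealised Zipf's Law, the word at rank r occurs about
    (count of the top word) / r times.
    """
    ranked = counts.most_common(top)
    if not ranked:
        return []
    top_count = ranked[0][1]
    return [
        (rank, word, count, top_count / rank)
        for rank, (word, count) in enumerate(ranked, start=1)
    ]


# Hypothetical sample text standing in for a full book.
sample = "the cat and the dog and the bird"
rows = zipf_table(word_frequencies(sample), top=3)
for rank, word, count, expected in rows:
    print(rank, word, count, round(expected, 2))
```

In a real workflow, a step like this would run once per book, which is the kind of repetition a workflow engine such as Snakemake is designed to manage.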
At the end of this lesson, you will:
- Understand the benefits of workflow engines.
- Be able to create reproducible analysis pipelines with Snakemake.
The lesson outline and rough breakdown of topics is in lesson-outline.md.
This is a fast overview of the Software Carpentry lesson template.
For a full guide to the lesson template, see the Software Carpentry example lesson.
Software Carpentry lessons are generally episodic, with one clear concept for each episode (example).
An episode is just a markdown file that lives under the `_episodes` folder.
Here is a link to a markdown cheatsheet with most markdown syntax.
Additionally, the Software Carpentry lesson template uses several extra bits of formatting - see here for a full guide.
The most significant changes are a YAML header that adds metadata (key questions, lesson teaching times, etc.) and special syntax for code blocks, exercises, and the like.
Prefix each episode name with its section number followed by its episode number within that section. This matters because the Software Carpentry lesson template publishes episodes in the order in which their names sort. As long as your episodes sort into the correct order, they will appear in the correct order on the website.
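A minimal sketch of why the numeric prefix matters, assuming the template orders episodes by plain lexicographic sorting of file names. The file names and the zero-padded `SS-EE-` prefix pattern below are hypothetical, chosen only for illustration.

```python
# Hypothetical episode file names: section number, then episode number.
episodes = [
    "02-01-wildcards.md",
    "01-02-first-rule.md",
    "01-01-introduction.md",
    "02-02-patterns.md",
]

# Plain string sorting, assumed here to match the order used on the website.
ordered = sorted(episodes)
for name in ordered:
    print(name)

# Without zero padding, string sorting puts "10" before "2".
unpadded = sorted(["2-intro.md", "10-end.md"])
print(unpadded)
```

Zero-padding the numbers keeps string order and numeric order in agreement once a section or episode number reaches two digits.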
The lesson website is viewable at https://csiro-data-school.github.io/workflows/.
The lesson website itself is auto-generated from the `gh-pages` branch of this repository.
GitHub Pages will rebuild the website as soon as you push to the `gh-pages` branch on GitHub.
Because of this, `gh-pages` is considered the "master" branch.
Having to push to GitHub every time you want to view your changes to the website isn't very convenient.
To preview the lesson locally, run `make serve`.
You can then view the website at `localhost:4000` in your browser.
Pages will be automatically regenerated every time you save changes to them.
Note that the autogenerated website lives under the `_site` directory (and doesn't get pushed to GitHub).
This process requires Ruby, Make, and Jekyll. You can find setup instructions here.
A couple of links to example SWC workshop lessons for reference:
- Example Bash lesson
- Example Python lesson
- Example R lesson (uses R markdown files instead of markdown)