*This post introduces a new training course and explains some of the rationale and technology choices we made to create, publish and deliver the course online.*

---

Part of our mission is to make malaria genomics as accessible as possible. Over the past year my [team](/team) and I have been working hard to develop a training course that introduces scientists to the elements of mosquito genomic data analysis. The goal was to create a course that can help someone get as quickly as possible from little or no prior experience of genomic data analysis to a point where they can independently plan and run a suite of common analysis methods required for surveillance of mosquito populations.

Along the way, we have been delivering the course as a series of 8 one-day online workshops to a cohort of 50 African scientists in both French and English. It's been a challenge because we've run one workshop every 6 weeks or so, developing the content for each workshop as we go. But we've finally come to the end and have a complete course of 8 workshops. All content from the course is open access and published via the [course website](https://anopheles-genomic-surveillance.github.io/). 

This post gives an introduction to the scope and structure of the course, along with some discussion of the choices we made to develop, publish and deliver it.

%%html
<style type="text/css">
img[alt=screenshot] {
    display: block;
    margin: 0 auto;
    max-width: 80%;
    box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.2);
}
</style>

## Scope and structure

The course is structured into 8 workshops, each of which requires around 6-8 hours of learning time and so can be delivered in a day. Each workshop is focused around a single topic, such as population structure or detecting genes under recent positive selection. The final workshop then focuses on how to plan, execute and present a complete analysis. Here's the full workshop programme:

![screenshot](attachment:0528d504-9406-4197-b77a-f1d529288516.png)

The topics we cover include basic population genetics, such as analysing population structure, identifying cryptic species and quantifying genetic diversity, and evolutionary threats to malaria vector control, such as analysing insecticide resistance mutations and how they spread between mosquito populations. These topics have a very applied focus, because most members of our community are interested in using genomics to help monitor and control malaria vectors better.

Each workshop is divided into 4 modules which cover the topic from different angles. The modules have different themes. Everything builds towards the **Analysis** module, where we learn how to run one or more commonly needed analysis methods on some real data from the [Malaria Vector Genome Observatory](https://www.malariagen.net/vobs/). However, I wanted to also include modules on the underlying **Tools & Technology**, **Biology** and **Data** so that trainees get some insight into everything that goes into that analysis. E.g., here are the modules from Workshop 2:

![screenshot](attachment:a532767f-4b53-4417-83a1-6ca57c68d4c0.png)

Some workshops also have a **Journal Club** module, where an existing paper relevant to the topic is presented by one of the authors, so trainees see how the ideas and methods have been applied.

Each module comprises a lecture video in French or English plus a lecture notebook which includes fully worked and executable code examples. After watching the lecture video, trainees launch the notebook in [Google Colab](https://colab.research.google.com/) and execute all the code examples for themselves using real genomic data, attempting some additional exercises along the way.

## Executable notebooks

The lecture notebooks from all workshops and modules have been used to build the [training course website](https://anopheles-genomic-surveillance.github.io/). This website was built using [Jupyter Book](https://jupyterbook.org/en/stable/intro.html), a relatively new technology that fortuitously arrived just before we started this project, and has been great. The [source code for the training website](https://github.com/anopheles-genomic-surveillance/anopheles-genomic-surveillance.github.io) is hosted on GitHub and the website is built and deployed to GitHub Pages. If you browse into the [`docs`](https://github.com/anopheles-genomic-surveillance/anopheles-genomic-surveillance.github.io/tree/master/docs) folder you'll see a file called [`_toc.yml`](https://github.com/anopheles-genomic-surveillance/anopheles-genomic-surveillance.github.io/blob/master/docs/_toc.yml) which defines the table of contents for the site, and [`_config.yml`](https://github.com/anopheles-genomic-surveillance/anopheles-genomic-surveillance.github.io/blob/master/docs/_config.yml) which provides the site configuration. There is then one folder for each workshop (e.g., [`workshop-2`](https://github.com/anopheles-genomic-surveillance/anopheles-genomic-surveillance.github.io/tree/master/docs/workshop-2)), and inside each folder are some Jupyter notebooks, one for each workshop module. Jupyter Book renders each notebook to an HTML file and organises all the files into a static website according to the table of contents you define. As course developers, all we needed to do was create the notebooks and update the table of contents, and everything would be rendered and published automatically via a [GitHub action](https://github.com/anopheles-genomic-surveillance/anopheles-genomic-surveillance.github.io/blob/master/.github/workflows/gh-pages.yml).
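
To make the layout described above concrete, here is a minimal sketch in Python that generates the text of a Jupyter Book table of contents with one part per workshop and one chapter per module. The `jb-book` format with `parts` and `chapters` is standard Jupyter Book syntax, but the `intro` root page and the `module-N` notebook names are hypothetical placeholders, and the course's real `_toc.yml` may be organised differently.

```python
def build_toc(n_workshops: int = 8, n_modules: int = 4) -> str:
    """Return the text of a Jupyter Book `_toc.yml`, one part per workshop."""
    lines = [
        "format: jb-book",
        "root: intro",  # hypothetical landing page, not taken from the course repo
        "parts:",
    ]
    for w in range(1, n_workshops + 1):
        lines.append(f"  - caption: Workshop {w}")
        lines.append("    chapters:")
        for m in range(1, n_modules + 1):
            # Folder naming follows `workshop-2`; the notebook name is hypothetical.
            lines.append(f"      - file: workshop-{w}/module-{m}")
    return "\n".join(lines) + "\n"


toc_text = build_toc()
print(toc_text)
```

With 8 workshops of 4 modules each, the generated file lists 32 chapters. In practice a file like this is usually edited by hand, since adding a module only means adding one `file` entry.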

A super nice feature of Jupyter Book is that it adds a button to each page to launch the notebook in Google Colab:

![screenshot](attachment:8e34613a-ebe3-4cf0-80fb-9514b83a50e9.png)

In one click, anyone following the training course can have the code examples opened and ready to run in Google Colab. This makes the transition from passive learning to active learning via practical, hands-on activities about as easy as you can make it.
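
The launch button works because Colab can open any notebook in a public GitHub repository from a URL of a fixed shape. The sketch below builds such a URL, assuming the `master` branch used in the links above; the helper name `colab_launch_url` and the notebook path `docs/workshop-2/module-1.ipynb` are hypothetical, chosen only for illustration.

```python
COLAB_BASE = "https://colab.research.google.com/github"


def colab_launch_url(org: str, repo: str, branch: str, notebook_path: str) -> str:
    """Build a URL that opens a GitHub-hosted notebook in Google Colab."""
    return f"{COLAB_BASE}/{org}/{repo}/blob/{branch}/{notebook_path}"


url = colab_launch_url(
    org="anopheles-genomic-surveillance",
    repo="anopheles-genomic-surveillance.github.io",
    branch="master",
    notebook_path="docs/workshop-2/module-1.ipynb",  # hypothetical notebook name
)
print(url)
```

Jupyter Book generates the equivalent link for every page, so course developers never need to construct these URLs themselves.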


## English and French

A substantial fraction of our community is based in francophone Africa, and we really wanted to do something to make the course accessible to different language groups. I'm extremely fortunate to have a native French speaker, [Jon Brenas](https://www.linkedin.com/in/brenas-jon-6bb79670/), in my team, as well as support from [Eric Lucas](https://www.lstmed.ac.uk/about/people/dr-eric-lucas) at LSTM, who is also bilingual. So we decided to translate and rerecord all of the lecture videos from English into French. There are 32 modules in total, each with a lecture of between 30 and 50 minutes, often with very technical language. Jon was also often working to tight deadlines, as we typically only managed to finish the English content for each workshop about a week before we were due to deliver the workshop to the first cohort of trainees. It was a heroic effort, and Jon is a legend.

![screenshot](attachment:57c63c20-4fbb-4a56-8beb-be2950d75145.png)

We decided not to translate all of the text in the lecture notebooks as well, as this would have meant maintaining two different versions of each notebook, which would have been too much. However, we did translate all of the practical exercises into French within each notebook. At some point it would be amazing if Jupyter Book had some kind of support for internationalisation, so you could have text for different languages within the same notebook/page.

## Running the course online

All of the lecture videos and notebooks are available via the course website, so anyone can follow the course in their own time via self-directed learning. But we also wanted to run the course as a series of online workshops, to provide a more structured experience with support from experienced teaching assistants. 

TODO