Teaching-EMBL-Plant-Path-Genomics


This repository contains material for the 2016 presentation of "Pathogen Genome Data" in the "Bioinformatics of Plants and Plant Pathogens" course (website).

This slot is 1 hour long and takes the form of a slide presentation with worked examples. Each worked example should take 15-20 minutes, so the session may overrun slightly if all three are attempted.

The worked examples are located in the examples subdirectory, and cover the following activities:

  • Exercise 01: Whole genome comparisons of bacterial plant pathogens
  • Exercise 02: CDS feature comparisons of bacterial plant pathogens
  • Exercise 03: Training/building an HMM profile with bacterial pathogen effector sequences, and using it to find new members of the family

Additional worksheets cover topics from, or related to, the presentation that could not practically be addressed in the session. These are located in the worksheets subdirectory:

  • Worksheet 01: Downloading (many) genomes from NCBI using Biopython
  • Worksheet 02: Using Prokka and Roary to annotate pathogen genomes and calculate a pangenome
  • Worksheet 03: Interpreting effector prediction with the base rate fallacy and Bayes' Theorem
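
As a rough illustration of the idea behind Worksheet 03, the sketch below applies Bayes' Theorem to a hypothetical effector classifier. The function name `posterior_positive` and all three input values are assumptions chosen for illustration; they are not taken from the worksheet itself.

```python
def posterior_positive(sensitivity, false_positive_rate, base_rate):
    """Return P(effector | positive prediction) using Bayes' Theorem.

    sensitivity         = P(positive prediction | effector)
    false_positive_rate = P(positive prediction | not effector)
    base_rate           = P(effector), the prior fraction of effectors
    """
    for value in (sensitivity, false_positive_rate, base_rate):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be probabilities in [0, 1]")
    true_pos = sensitivity * base_rate
    false_pos = false_positive_rate * (1.0 - base_rate)
    if true_pos + false_pos == 0.0:
        raise ValueError("the classifier never predicts positive")
    return true_pos / (true_pos + false_pos)


# Hypothetical values: a classifier with 95% sensitivity and a 5%
# false positive rate, applied to a genome where 3% of proteins
# are effectors.
ppv = posterior_positive(sensitivity=0.95,
                         false_positive_rate=0.05,
                         base_rate=0.03)
print(f"P(effector | positive prediction) = {ppv:.2f}")
```

With these assumed values, only about 37% of positive predictions are true effectors, even though the classifier looks accurate on paper. Ignoring the low base rate is what leads to the overoptimistic reading.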

Run these notebooks

You can run the exercise and worksheet notebooks in an interactive environment in your browser, using MyBinder. To do so, simply click on the launch binder button, below.


Obtaining materials

This repository can be downloaded in its entirety using git:

git clone
cd Teaching-EMBL-Plant-Path-Genomics