Scripts, utilities and programs for genomic bioinformatics.



Utilities for NGS and genomics-related bioinformatics, developed by the Bioinformatics Core Facility (BCF) in the Faculty of Life Sciences (FLS) at the University of Manchester (UoM).

Full documentation is available at


The utilities are divided into broad categories:

  • Handling data from SOLiD and Illumina sequencers (solid2cluster, illumina2cluster)
  • Performing QC and manipulation of NGS data (QC-pipeline)
  • Setting up reference data (build-indexes)
  • Supporting analysis of ChIP-seq, RNA-seq and microarray data (ChIP-seq, RNA-seq, microarray, NGS-general)
  • General non-bioinformatics utilities (utils)

There is also a Python package called bcftbx which is used by many of the programs, and which provides a wide range of utility functions.


It is recommended to use:

pip install .

from within the top-level source directory to install the package.

To use the package without installing it first, you will need to add the directory to your PYTHONPATH environment variable.
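
As an alternative for a single script, the same effect can be had from inside Python by extending sys.path before importing the package. The following is a minimal sketch only: the helper name add_source_dir and the location /path/to/genomics are hypothetical placeholders, not part of this repository.

```python
import os
import sys


def add_source_dir(source_dir):
    """Prepend source_dir to sys.path unless it is already present.

    Returns the absolute path that was added (or already there).
    """
    source_dir = os.path.abspath(source_dir)
    if source_dir not in sys.path:
        sys.path.insert(0, source_dir)
    return source_dir


# Hypothetical location of the checked-out source tree; adjust as needed
added = add_source_dir("/path/to/genomics")

# After this point 'import bcftbx' would search the added directory first
```

Unlike setting PYTHONPATH, this only affects the running interpreter and is not inherited by any scripts it launches.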

To install directly from GitHub using pip:

pip install git+


Many of the scripts should run directly after installation without additional setup. The exceptions are the QC scripts, which require a file to be created and edited to point to the locations of the fastq_screen configuration files.


Documentation based on sphinx is available under the docs directory.

To build, do either:

python sphinx_build

or

cd docs
make html

both of which create the documentation in the docs/build subdirectory.

Running Tests

The Python unit tests can be run using:

python test

Note that this requires the nose package.

There are also some test scripts in the QC-pipeline/tests directory; these can be run individually or via a 'runner' script:

(Note that this requires that the QC scripts have already been set up after installing the package.)

In addition the tests are run via TravisCI whenever this GitHub repository is updated:

Current status of TravisCI build for master branch

Developmental version

The developmental branch of the code on GitHub is devel; this can be installed using:

pip install git+

Use the -e option to install an 'editable' version (see the section on "Editable" installs in the pip documentation).

The tests are run on TravisCI whenever the developmental version is updated:

Current status of TravisCI build for devel branch


The package consists predominantly of code written in Python, which has been used extensively with Python 2.6 and 2.7.

In addition there are scripts requiring:

  • bash
  • Perl
  • R

The following packages are required for subsets of the code:

  • perl: Statistics::Descriptive and BioPerl
  • python: xlwt, xlrd and xlutils
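
A quick way to see which of the Python packages listed above are importable in the current environment is sketched below. The helper missing_modules is a hypothetical name introduced for illustration; it relies on importlib, which is in the standard library from Python 2.7 onwards (so not in 2.6).

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Module names taken from the list of required Python packages
required = ["xlwt", "xlrd", "xlutils"]
not_found = missing_modules(required)
if not_found:
    print("Missing Python packages: %s" % ", ".join(not_found))
else:
    print("All required Python packages are importable")
```

Only the programs that read or write spreadsheets are expected to need these packages, so a missing module does not necessarily prevent the other utilities from running.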

Some of the scripts also use third party software, including:

  • bowtie
  • bowtie2
  • bfast
  • fastq_screen
  • fastqc
  • convert (from ImageMagick)
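
Whether these third-party programs are available can be checked by looking for them on the PATH. The sketch below assumes that each program is installed under the executable name given in the list above, and uses shutil.which, which requires Python 3.3 or later; the helper find_tools is a hypothetical name.

```python
import shutil

# Executable names assumed to match the list of third-party software
TOOLS = ["bowtie", "bowtie2", "bfast", "fastq_screen", "fastqc", "convert"]


def find_tools(names):
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in sorted(find_tools(TOOLS).items()):
    print("%-14s %s" % (name, path or "NOT FOUND"))
```

A program reported as not found is only a problem for the scripts that call it.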

There are also a couple of Java-based programs.