Incubator for useful bioinformatics code, primarily in Python and R
Latest commit a4b7edb @chapmanb HLA/hg38 status update
abstracts Azure for Research grant application
align Avoid converting to and from phred format for speed
biopython Coding region coordinate remapping for SNPs
biosql SQLAlchemy definitions for BioSQL; partial implementation
biosql_ontologies Finalized initial version
biostar Supporting code and configuration for BioStar NGS challenge from Pierre
classify Move Dan's scripts to separate project directory in mgh_projects
cv CV and research statement: clean up typos and improve phrasing
distblast Remove num_alignments, since it's incompatible (and redundant) with m…
galaxy Initial move of next gen automated analysis into git revision control
gff Explicitly added LICENSE file to bcbio-gff.
keyval_testing Update couchdb test with bulk loading fixed from Paul and Chris
nextgen Add update notice about new bcbio-nextgen repository
papers/bcbio-nextgen Update chapman_bcbio.tex
posters Poster from Luca on bcbio-nextgen for cancer calling
posts Talk for Iowa State Bioinformatics Orientation class
qualbin Add plot for variant cause changes
rest_apis Move Dan's scripts to separate project directory in mgh_projects
semantic Update query example to match paper text
stats Updated script for bootstrapping R
svplot Generalize SV circos-style plots with multi depth
talks HLA/hg38 status update
validation Update scalpel results with improved filtering and include suggestion…
visualize Move Dan's scripts to separate project directory in mgh_projects
.gitignore Presentation for Intel Life Sciences tutorial Add update notice about new bcbio-nextgen repository

Collection of useful code related to biological analysis. Much of this is discussed with examples at Blue collar bioinformatics.

Some projects which may be especially interesting:

  • CloudBioLinux -- An automated environment for installing useful biological software and libraries. It bootstraps blank machines, such as those you'd find on cloud providers like Amazon, into ready-to-go analysis workstations. See the CloudBioLinux effort for more details. This project moved to its own repository at
  • gff -- A GFF parsing library in Python, intended for inclusion in Biopython.
  • nextgen -- A Python toolkit providing best-practice pipelines for fully automated high-throughput sequencing analysis. This project has moved into its own repository:
  • distblast -- A distributed BLAST analysis for identifying best hits in a wide variety of organisms for downstream phylogenetic analyses. The code is generalized to run on local multi-processor machines and distributed Hadoop clusters.
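
As a rough illustration of the "best hits" idea behind distblast, here is a minimal sketch that keeps the top-scoring hit per query from BLAST tabular output. It is not the distblast code. It assumes the standard 12-column tabular layout (`-outfmt 6`) with the bit score in the last column, and the function name `best_hits` and the sample identifiers are hypothetical. Distributing the work across processors or a Hadoop cluster is left out.

```python
import csv
import io

# Column positions in the standard 12-column BLAST tabular layout (assumed).
QUERY, SUBJECT, BITSCORE = 0, 1, 11


def best_hits(tabular_text):
    """Return {query_id: (subject_id, bitscore)}, keeping the top bit score per query."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        query, subject, score = row[QUERY], row[SUBJECT], float(row[BITSCORE])
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return best


# Hypothetical hits for one organism; the identifiers are made up for illustration.
example = "\n".join([
    "\t".join(["q1", "orgA_p1", "98.0", "100", "2", "0", "1", "100", "1", "100", "1e-50", "180.0"]),
    "\t".join(["q1", "orgA_p2", "99.0", "100", "1", "0", "1", "100", "1", "100", "1e-55", "200.0"]),
    "\t".join(["q2", "orgA_p9", "60.0", "80", "32", "0", "1", "80", "5", "84", "1e-10", "75.5"]),
])

hits = best_hits(example)
for query_id, (subject_id, bitscore) in sorted(hits.items()):
    print(query_id, subject_id, bitscore)
```

Running the same function once per organism and merging the resulting dictionaries gives one best hit per query per organism, which is the kind of table a downstream phylogenetic analysis could start from.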