Incubator for useful bioinformatics code, primarily in Python and R
Type Name Latest commit message Commit time
abstracts Fix typos in abstract Mar 20, 2018
align Avoid converting to and from phred format for speed Dec 16, 2010
biopython Glimmer: ensure have Bio.Seq for reverse complement Mar 14, 2019
biosql SQLAlchemy definitions for BioSQL; partial implementation Aug 21, 2009
biosql_ontologies Finalized initial version Dec 15, 2008
biostar Supporting code and configuration for BioStar NGS challenge from Pierre Dec 19, 2011
classify Move Dan's scripts to separate project directory in mgh_projects May 17, 2011
cv Talk for Park lab bcbio discussion Jan 26, 2018
distblast Remove num_alignments, since it's incompatible (and redundant) with m… Dec 9, 2013
galaxy Initial move of next gen automated analysis into git revision control Sep 1, 2010
gff v0.6.6: fix for LICENSE file install bioconda/bioconda-recipes#13873 Mar 17, 2019
hg38alt Annotate BED files with hg38 alts Feb 26, 2018
keyval_testing Update couchdb test with bulk loading fixed from Paul and Chris May 11, 2009
nextgen Add update notice about new bcbio-nextgen repository Feb 6, 2013
papers/bcbio-nextgen Update chapman_bcbio.tex Dec 10, 2013
posters Codefest poster for CHI 2018 Hackathon workshop Apr 16, 2018
qualbin Add plot for variant cause changes Feb 12, 2013
rest_apis Move Dan's scripts to separate project directory in mgh_projects May 17, 2011
stats Updated script for bootstrapping R Nov 22, 2010
svplot Generalize SV circos-style plots with multi depth Jan 8, 2016
talks Recommendations for variant calling and bcbio CWL runs Mar 1, 2019
validation Update scalpel results with improved filtering and include suggestion… Jan 5, 2015
visualize Move Dan's scripts to separate project directory in mgh_projects May 17, 2011
.gitignore Presentation for Intel Life Sciences tutorial Aug 7, 2014 Add license for code and docs. Fixes #119 Sep 11, 2018

Collection of useful code related to biological analysis. Much of this is discussed with examples at Blue collar bioinformatics.

All code, images and documents in this repository are freely available for all uses. Code is available under the MIT license; images, documentation and talks are available under the Creative Commons No Rights Reserved (CC0) license.

Some projects which may be especially interesting:

  • CloudBioLinux -- An automated environment to install useful biological software and libraries. This is used to bootstrap blank machines, such as those you'd find on cloud providers like Amazon, into ready-to-go analysis workstations. See the CloudBioLinux effort for more details. This project moved to its own repository at
  • gff -- A GFF parsing library in Python, aimed for inclusion into Biopython.
  • nextgen -- A Python toolkit providing best-practice pipelines for fully automated high-throughput sequencing analysis. This project has moved into its own repository:
  • distblast -- A distributed BLAST analysis for identifying best hits in a wide variety of organisms for downstream phylogenetic analyses. The code is generalized to run on local multi-processor machines and distributed Hadoop clusters.
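
To give a feel for what the gff project above deals with, here is a minimal sketch of parsing a single GFF3 feature line using only the Python standard library. It is not the gff library's API: the function name `parse_gff3_line`, the returned dictionary layout, and the example line are all assumptions made for illustration, and a real parser handles much more (nested features, directives, embedded FASTA).

```python
from urllib.parse import unquote


def parse_gff3_line(line):
    """Parse one tab-delimited GFF3 feature line into a dict.

    Hypothetical helper for illustration only. Returns None for
    blank lines and comment/directive lines starting with '#'.
    """
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    parts = line.split("\t")
    if len(parts) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(parts))
    seqid, source, ftype, start, end, score, strand, phase, attrs = parts

    # Column 9 holds 'key=value' pairs separated by ';'.
    # A value may itself be a comma-separated list.
    attributes = {}
    for pair in attrs.split(";"):
        if not pair or pair == ".":
            continue
        key, _, value = pair.partition("=")
        attributes[unquote(key)] = [unquote(v) for v in value.split(",")]

    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),  # GFF coordinates are 1-based, inclusive
        "end": int(end),
        "score": None if score == "." else float(score),
        "strand": strand,
        "phase": None if phase == "." else int(phase),
        "attributes": attributes,
    }


# Made-up example line, not taken from any real annotation file.
example = "chr1\tsketch\tgene\t100\t200\t.\t+\t.\tID=gene1;Name=demo"
feature = parse_gff3_line(example)
```

Attribute values are kept as lists because the GFF3 format allows several comma-separated values per key; callers that expect a single value can take the first element.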