Improving genome bins through the combination of different binning programs

Change Log:

Version 1.2 (2017-11-30):

  • Binning_refiner has been simplified to keep only its core functions, which makes it much easier to install and use. Hope you enjoy it :)

Important notice!

  • In the original version of Binning_refiner, a blast approach (as described in its publication) was used to identify the same contig among input bin sets. Because Binning_refiner was designed to refine bins derived from the same set of assemblies, and the blast step is time-consuming (especially for big datasets), the same contig among different bin sets is now identified by its ID rather than by blastn. This makes Binning_refiner much faster to run and easier to install.
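The ID-based matching described above can be illustrated with a minimal sketch. This is not Binning_refiner's actual code: the function names `contig_ids_from_fasta` and `refine_by_contig_id` are hypothetical, and the sketch assumes that a contig's ID is the first whitespace-delimited token of its FASTA header and that each contig belongs to at most one bin per bin set.

```python
from collections import defaultdict


def contig_ids_from_fasta(fasta_text):
    """Return the set of contig IDs found in FASTA-formatted text.

    Assumption: the ID is the first whitespace-delimited token after '>'.
    """
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def refine_by_contig_id(bin_set_1, bin_set_2):
    """Group contigs by the pair of bins (one per bin set) that contain them.

    Each argument maps a bin name to the set of contig IDs in that bin.
    Contigs that appear in only one of the two bin sets are dropped.
    """
    # Look up which bin of the second set holds each contig ID.
    bin_of_contig_2 = {}
    for bin_name, contig_ids in bin_set_2.items():
        for contig_id in contig_ids:
            bin_of_contig_2[contig_id] = bin_name

    # A refined bin is the set of contigs shared by one bin from each set.
    refined = defaultdict(set)
    for bin_name_1, contig_ids in bin_set_1.items():
        for contig_id in contig_ids:
            bin_name_2 = bin_of_contig_2.get(contig_id)
            if bin_name_2 is not None:
                refined[(bin_name_1, bin_name_2)].add(contig_id)
    return dict(refined)
```

For example, with bins `{"metabat_1": {"c1", "c2", "c3"}}` and `{"mycc_1": {"c1", "c2"}, "mycc_2": {"c3"}}`, the sketch splits `metabat_1` into two refined bins, one holding `c1` and `c2` and one holding `c3`. No sequence comparison is needed, which is why this is faster than a blastn step.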

How to install:

  1. Install Python and Biopython.

     # for Katana users from UNSW, simply run
     $ module load python/3.5.2
  2. Download the script to wherever you want; it is then ready to run.

     $ python path/to/ -h
  3. If you want to see the correlations between your input bin sets (figure below), you also need R with the following two packages installed: optparse and googleVis.
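To confirm that the prerequisites above are in place before running, something like the following may help. It is only a sketch for Python 3: `check_dependencies` is a hypothetical helper, not part of Binning_refiner, and it only checks that Biopython (import name `Bio`) can be found and that an `Rscript` executable is on the `PATH`. It does not verify that the R packages optparse and googleVis are installed.

```python
import importlib.util
import shutil


def check_dependencies():
    """Report whether Biopython and Rscript appear to be available."""
    return {
        # Biopython is imported under the package name "Bio".
        "Biopython": importlib.util.find_spec("Bio") is not None,
        # Rscript is only needed for the optional Sankey plot.
        "Rscript": shutil.which("Rscript") is not None,
    }


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print("{}: {}".format(name, "found" if found else "missing"))
```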

Help information:

    python -h
      -h, --help      show this help message and exit
      -1              first bin folder name
      -2              second bin folder name
      -3              third bin folder name
      -x1             file extension for bin set 1, default: fasta
      -x2             file extension for bin set 2, default: fasta
      -x3             file extension for bin set 3, default: fasta
      -prefix         prefix of refined bins, default: Refined
      -ms             minimal size for refined bins, default: 524288 (0.5Mbp)
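The default of `-ms` corresponds to 0.5 × 1024 × 1024 = 524288 bp. As a rough illustration of how such a cutoff could be applied, the sketch below sums the contig lengths of each refined bin and keeps the bins that reach the minimum size. The names `bin_size` and `filter_by_size` are hypothetical, and treating the cutoff as inclusive (`>=`) is an assumption; the tool itself may compare differently.

```python
MIN_BIN_SIZE = 524288  # default of -ms, i.e. 0.5 * 1024 * 1024 bp


def bin_size(contig_lengths, contig_ids):
    """Total length in bp of the given contigs."""
    return sum(contig_lengths[contig_id] for contig_id in contig_ids)


def filter_by_size(refined_bins, contig_lengths, min_size=MIN_BIN_SIZE):
    """Keep refined bins whose total size reaches min_size.

    refined_bins maps a bin name to a set of contig IDs;
    contig_lengths maps a contig ID to its length in bp.
    Assumption: the cutoff is inclusive.
    """
    return {
        name: contig_ids
        for name, contig_ids in refined_bins.items()
        if bin_size(contig_lengths, contig_ids) >= min_size
    }
```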

How to run:

  1. All bins in one folder must have the same file extension.

  2. Binning_refiner is now compatible with both Python 2 and Python 3.

     # For two binning programs (e.g. MetaBAT and MyCC)
     python -1 MetaBAT -2 MyCC -x1 fa -prefix Refined
     # For three binning programs (e.g. MetaBAT, MyCC and CONCOCT)
     python -1 MetaBAT -2 MyCC -3 CONCOCT -x1 fa -x3 fa -prefix Refined
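Because every bin in a folder must share one extension, it can be worth checking a folder before passing it to `-1`, `-2` or `-3`. The following is a small sketch with hypothetical helper names (`extensions_in_folder`, `bin_files`); it is not part of Binning_refiner and it ignores hidden files and subfolders.

```python
from pathlib import Path


def extensions_in_folder(bin_folder):
    """Return the set of file extensions (without the dot) in bin_folder."""
    return {
        path.suffix.lstrip(".")
        for path in Path(bin_folder).iterdir()
        if path.is_file() and not path.name.startswith(".")
    }


def bin_files(bin_folder, extension="fasta"):
    """Return the sorted bin files in bin_folder that end with the extension."""
    return sorted(
        path
        for path in Path(bin_folder).iterdir()
        if path.is_file() and path.suffix == "." + extension
    )
```

If `extensions_in_folder("MetaBAT")` returns more than one extension, the folder mixes file types, and the value given to `-x1` would match only some of the bins.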

Output files:

  1. All refined bins larger than the defined bin size cutoff.

  2. The IDs of the contigs in each refined bin.

  3. The size of each refined bin and where its contigs come from.

  4. You may want to run get_sankey_plot.R to visualize the correlations between your input bin sets (figure below). To run it, you need R with the following two packages installed: optparse and googleVis.

     # Example command
     Rscript get_sankey_plot.R -f GoogleVis_Sankey_0.5Mbp.csv -x 800 -y 1000