Skip to content
No description, website, or topics provided.
Branch: master
Clone or download
Latest commit 3271277 Jul 7, 2017
Type Name Latest commit message Commit time
Failed to load latest commit information.
MANIFEST Update Jul 7, 2017 improved parsing files, counts and logs on STDOUT, verbose option Jun 9, 2017

MetaMetaMerge: merge output from metagenomic taxonomic profiling and binning tools

Vitor C. Piro (

install with bioconda

This tool is part of the MetaMeta Pipeline ( but can also be used as a standalone tool.


MetaMetaMerge integrates profiling and binning tools. MetaMetaMerge accepts BioBoxes format directly ( or a .tsv file in the following format:

  • Profiling: rank, taxon name or taxid, abundance


genus   Methanospirillum        0.0029
genus   Thermus 0.0029
genus   568394      0.0029
species Arthrobacter sp. FB24   0.0835
species 195      0.0582
species Mycoplasma gallisepticum        0.0536
  • Binning: readid, taxon name or taxid, lenght of sequence assigned


M2|S1|R140      354     201
M2|S1|R142      195     201
M2|S1|R145      457425  201
M2|S1|R146      562     201
M2|S1|R147      1245471 201
M2|S1|R150      354     201

Database profiles:

Database profiles can be generated based on a list of accession codes and two scripts (acc2tab.bash and tab2count.bash):

cat accession_list.txt | xargs --max-procs=12 -I '{}' bash acc2tab.bash '{}' > output_tax
tab2count.bash output_tax > output_dbprofile


./ -i binning_out.tsv profile1.tsv profile2.out -d dbprofile1.out dbprofile2.out dbprofile3.out -t 'tool1,tool2,tool3' -c 'b,p,p' -n names.dmp -e nodes.dmp -m merged.dmp -o output_profile.out


    usage: [-h] -i [<input_files> [<input_files> ...]] -d
                            [<database_profiles> [<database_profiles> ...]] -t
                            <tool_identifier> -c <tool_method> -n <names_file> -e
                            <nodes_file> -m <merged_file> [-b <bins>]
                            [-r <cutoff>] [-f <mode>] [-s <ranks>] -o
                            <output_file> [-p <output_type>]
                            [--output-parsed-profiles] [--detailed] [--verbose]

    MetaMetaMerge by Vitor C. Piro (,

    optional arguments:
      -h, --help            show this help message and exit
      -i [<input_files> [<input_files> ...]], --input-files [<input_files> [<input_files> ...]]
                            Input (binning or profiling) files. Bioboxes or tsv
                            format (see README)
      -d [<database_profiles> [<database_profiles> ...]], --database-profiles [<database_profiles> [<database_profiles> ...]]
                            Database profiles on the same order of the input files
                            (see README)
      -t <tool_identifier>, --tool-identifier <tool_identifier>
                            Comma-separated identifiers on the same order of the
                            input files
      -c <tool_method>, --tool-method <tool_method>
                            Comma-separated methods on the same order of the input
                            files (p -> profiling / b -> binning)
      -n <names_file>, --names-file <names_file>
                            names.dmp from the NCBI Taxonomy database
      -e <nodes_file>, --nodes-file <nodes_file>
                            nodes.dmp from the NCBI Taxonomy database
      -m <merged_file>, --merged-file <merged_file>
                            merged.dmp from the NCBI Taxonomy database
      -b <bins>, --bins <bins>
                            Number of bins. Default: 4
      -r <cutoff>, --cutoff <cutoff>
                            Minimum abundance/Maximum results for each taxonomic
                            level (0: off / 0-1: minimum relative abundance / >=1:
                            maximum number of identifications). Default: 0.0001
      -f <mode>, --mode <mode>
                            Result mode (precise, very-precise, linear, sensitive,
                            very-sensitive, no-cutoff). Default: linear
      -s <ranks>, --ranks <ranks>
                            Comma-separated list of ranks to be independently
                            merged (superkingdom,phylum,class,order,family,genus,s
                            pecies,all). Default: species
      -o <output_file>, --output-file <output_file>
                            Output file
      -p <output_type>, --output-type <output_type>
                            Output type (tsv, bioboxes). Default: bioboxes
                            Output parsed and converted profiles for all input
                            files (without cutoff)
      --detailed            Generate an additional detailed output with individual
                            normalized abundances for each tool, where: 0 -> not
                            identified but present in the database, -1 not present
                            in the database.
      --verbose             Verbose output log
      -v, --version         show program's version number and exit
You can’t perform that action at this time.