Bioinformatics tools for dealing with Multiple Alignment Format (MAF) files.

mafTools

mafTools is a collection of tools that operate on Multiple Alignment Format (maf) files.

Authors

Dent Earl, Benedict Paten, Mark Diekhans

Dependencies

With the exception of the Python dependencies, a component that is missing a dependency will not be built, tested, or cleaned by the Makefile. If the Python dependencies are missing, some of the modules will fail to function and all of the modules' tests will fail. The sonLib and pinchesAndCacti dependencies should be built and placed in the same parent directory as mafTools.

Installation

  1. Install dependencies.
  2. Download or clone the mafTools package. Consider making it a sibling directory to sonLib/ and pinchesAndCacti/.
  3. cd into the mafTools directory.
  4. Type make.

Components

  • mafComparator A program to compare two maf files by sampling. Useful when testing predicted alignments against known true alignments.
  • mafCoverage A program to calculate the amount of alignment coverage between a target sequence and all other sequences in a maf file.
  • mafDuplicateFilter A program to filter alignment blocks to remove duplicate species. One sequence per species is allowed to remain, chosen by comparing the sequence to the consensus for the block and computing a similarity bit score between the IUPAC formatted consensus and the sequence. The highest scoring duplicate stays, or in the case of ties, the sequence closest to the start of the file stays.
  • mafExtractor A program to extract all alignment blocks that contain a region in a particular sequence. Useful for isolating regions of interest in large maf files.
  • mafFilter A program to filter a maf based on sequence names. Can be used to include or exclude sequence names. Useful for removing extraneous sequences from maf files.
  • mafPairCoverage A program to compare the number of aligned positions between any pair of sequences within a maf file. Can use the * wildcard character to specify a species name. Can use a BED file to limit the region of inspection to just the intervals specified in the BED file. Outputs total lengths of sequences, number of aligned positions, percent coverage and, when a BED file was specified, the number of bases within and outside of the region.
  • mafPositionFinder A program to search for a position in a particular sequence. Useful for determining where in a maf file a particular part of the alignment resides.
  • mafRowOrderer A program to order maf lines within blocks. Useful for moving a reference species to the top of all blocks. Species not specified in the ordering are automatically trimmed from the results.
  • mafSorter A program to sort all of the blocks in a MAF based on the (absolute) start position of one of the sequences. Blocks without the sequence are placed at the start of the output in their original order.
  • mafStats A program to read a maf file and report back summary statistics about the file contents.
  • mafStrander A program to enforce, when possible, a particular strandedness for blocks for a given species and strand orientation.
  • mafToFastaStitcher A program to convert a reference-based MAF file to a multiple sequence fasta. Requires both a .maf and a fasta containing complete sequences for all entries in the maf.
  • mafTransitiveClosure A program to perform the transitive closure on an alignment. That is, it checks every column of the alignment and looks for situations where a position A is aligned to B in one part of a file and B is aligned to C in another part of the file. The transitive closure of this relationship would be a single column with A, B and C all present. Useful when you have pairwise alignments and wish to turn them into something more closely resembling a multiple alignment.
  • mafValidator A program to assess whether or not a given maf file's formatting is valid.
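
The ordering rule described for mafSorter can be illustrated with a short Python sketch. This is not the tool's implementation. The function names are hypothetical, the sequence name is matched exactly, header lines before the first block are ignored, and "absolute" start is assumed to mean the positive-strand coordinate, which for a "-" strand line is the source size minus (start + size).

```python
def absolute_start(start, size, strand, src_size):
    """Return the start coordinate on the positive strand (assumed meaning of 'absolute')."""
    if strand == "+":
        return start
    # On the negative strand, a MAF start is measured on the reverse-complemented source.
    return src_size - (start + size)


def parse_blocks(maf_text):
    """Split MAF text into blocks; each block is the list of its non-blank lines."""
    blocks, current = [], []
    for line in maf_text.splitlines():
        if line.startswith("a"):
            if current:
                blocks.append(current)
            current = [line]
        elif line.strip() and current:
            current.append(line)
    if current:
        blocks.append(current)
    return blocks


def block_key(block, seq_name):
    """Return the absolute start of seq_name in the block, or None if it is absent."""
    for line in block:
        fields = line.split()
        if len(fields) >= 7 and fields[0] == "s" and fields[1] == seq_name:
            start, size, src_size = int(fields[2]), int(fields[3]), int(fields[5])
            return absolute_start(start, size, fields[4], src_size)
    return None


def sort_blocks(blocks, seq_name):
    """Blocks lacking seq_name come first in original order, then the rest by start."""
    keyed = [(block_key(block, seq_name), block) for block in blocks]
    missing = [block for key, block in keyed if key is None]
    present = sorted((kb for kb in keyed if kb[0] is not None), key=lambda kb: kb[0])
    return missing + [block for _, block in present]


EXAMPLE_MAF = """\
a score=1
s hg.chr1 50 3 + 100 ACG
s mm.chr2 5 3 + 100 ACG

a score=2
s mm.chr2 9 3 + 100 ACG

a score=3
s hg.chr1 80 10 - 100 ACGTACGTAC
"""

sorted_blocks = sort_blocks(parse_blocks(EXAMPLE_MAF), "hg.chr1")
```

In the example, the block without hg.chr1 is placed first, and the negative-strand block sorts ahead of the positive-strand one because its positive-strand start is 100 - (80 + 10) = 10.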
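
The row ordering and trimming behaviour described for mafRowOrderer can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, only "s" lines are treated as sequence rows, and the species is assumed to be the part of the source name before the first ".".

```python
def order_rows(block_lines, order):
    """Reorder the 's' lines of one block to follow `order`; drop unlisted species."""
    rank = {species: i for i, species in enumerate(order)}
    kept, other = [], []
    for line in block_lines:
        fields = line.split()
        if fields and fields[0] == "s":
            # Assumption: source names look like "species.chromosome".
            species = fields[1].split(".", 1)[0]
            if species in rank:
                kept.append((rank[species], line))
        else:
            other.append(line)
    # list.sort is stable, so repeated species keep their original relative order.
    kept.sort(key=lambda pair: pair[0])
    return other + [line for _, line in kept]


block = [
    "a score=0",
    "s mm.chr2 5 3 + 100 ACG",
    "s hg.chr1 50 3 + 100 ACG",
    "s rn.chr3 1 3 + 100 ACG",
]
reordered = order_rows(block, ["hg", "mm"])
```

Here the hg row moves to the top of the block and the rn row is trimmed because it is not named in the ordering.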
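
The idea behind the transitive closure can be shown with a small union-find sketch in Python. It is not how mafTransitiveClosure itself is implemented. The function name and the (sequence, position) tuples are assumptions made for illustration, and the input is a list of already-extracted aligned position pairs rather than a maf file.

```python
def transitive_closure(aligned_pairs):
    """Group positions into columns: positions linked by any chain of pairs share a column."""
    parent = {}

    def find(x):
        # Locate the representative of x's group, compressing the path on the way.
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for a, b in aligned_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    columns = {}
    for position in list(parent):
        columns.setdefault(find(position), set()).add(position)
    return sorted(sorted(column) for column in columns.values())


pairs = [
    (("A", 5), ("B", 12)),   # A aligned to B in one part of the file
    (("B", 12), ("C", 7)),   # B aligned to C in another part
    (("A", 6), ("B", 13)),   # an unrelated pairwise column
]
closed_columns = transitive_closure(pairs)
```

The first two pairs share position ("B", 12), so they merge into a single column containing A, B and C, while the third pair remains a separate two-member column.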

External tools

  • mafTools internal tests use Asim Jalis' CuTest C unit testing framework (included in external/). The license for CuTest is spelled out in external/license.txt.
  • mafTools internal tests will use valgrind if installed on your system.

How to Cite:

Alignathon: a competitive assessment of whole-genome alignment methods. Genome Res. 2014 Dec;24(12):2077-89. doi: 10.1101/gr.174920.114. Epub 2014 Oct 1.