vcfdistances

Calculate SMD and Hamming and Jaccard distances between each pair of samples in a set of variant files.

Build/Runtime Requirements

  • libbio (provided as a Git submodule)

On Linux, the following libraries are also required:

Build Requirements

Building

Short version

  1. git clone https://github.com/tsnorri/vcfdistances.git
  2. cd vcfdistances
  3. git submodule update --init --recursive
  4. Edit local.mk
  5. make -j4

Long version

  1. Clone the repository with git clone https://github.com/tsnorri/vcfdistances.git.
  2. Change the working directory with cd vcfdistances.
  3. Run git submodule update --init --recursive. This clones the missing submodules and updates their working tree.
  4. Edit local.mk in the repository root to override build variables. Useful variables include CC, CXX, RAGEL and GENGETOPT for the C and C++ compilers, Ragel and gengetopt, respectively. BOOST_INCLUDE is used as preprocessor flags when Boost is required. BOOST_LIBS and LIBDISPATCH_LIBS are passed to the linker. See common.mk for additional variables.
  5. Run make with a suitable number of parallel jobs, e.g. make -j4.
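
As an illustration of step 4, a local.mk could look roughly like the sketch below. The variable names are the ones listed above; every value (compiler names, paths, linker flags) is an assumption about a typical Linux setup, not something prescribed by the project, and needs to be adapted.

```make
# local.mk (hypothetical example; all values are assumptions, adjust for your system)

# C and C++ compilers
CC        = gcc
CXX       = g++

# Code generators used during the build
RAGEL     = ragel
GENGETOPT = gengetopt

# Preprocessor flags used when Boost is required (assumed install location)
BOOST_INCLUDE = -I/usr/include

# Passed to the linker (assumed values; add the -l flags your build needs)
BOOST_LIBS       = -L/usr/lib
LIBDISPATCH_LIBS = -ldispatch
```

Variables not set here keep the defaults from common.mk.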

Useful make targets include:

all
    Build everything.
clean
    Remove build products except for dependencies (in the lib folder).
clean-all
    Remove all build products.

Running

The tool takes one or more Variant Call Format (VCF) files as its input. It reads the variants and compares every pair of samples. Any remaining files are processed in the same way. The requested distances are output as triangular matrices.

Please see src/vcfdistances --help for command line options.
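
To make the three distances concrete, here is a minimal Python sketch. It is not the tool's implementation. It assumes that each sample is reduced to a 0/1 vector with one entry per variant (1 = alternative allele present), which may differ from how vcfdistances encodes genotypes. The function names pairwise_distances and format_triangular are made up for this example. Hamming distance counts mismatching positions, SMD (simple matching distance) is that count divided by the number of variants, and Jaccard distance is one minus the ratio of shared to total present variants.

```python
from itertools import combinations


def pairwise_distances(samples):
    """Compute Hamming, SMD and Jaccard distances for every pair of samples.

    samples: dict mapping sample name -> list of 0/1 values, one per variant
    (assumed encoding, see the text above).
    Returns a dict keyed by (name_a, name_b) in input order.
    """
    result = {}
    for (name_a, a), (name_b, b) in combinations(samples.items(), 2):
        if len(a) != len(b):
            raise ValueError("all samples must cover the same variants")
        mismatches = sum(1 for x, y in zip(a, b) if x != y)
        both = sum(1 for x, y in zip(a, b) if x and y)
        either = sum(1 for x, y in zip(a, b) if x or y)
        result[(name_a, name_b)] = {
            # Number of variants at which the two samples differ.
            "hamming": mismatches,
            # Fraction of variants at which the two samples differ.
            "smd": mismatches / len(a) if a else 0.0,
            # 1 - |intersection| / |union| of the present variants.
            "jaccard": 1.0 - both / either if either else 0.0,
        }
    return result


def format_triangular(names, distances, key):
    """Render one distance as a tab-separated lower triangular matrix."""
    lines = []
    for i, row in enumerate(names):
        cells = [f"{distances[(names[j], row)][key]:.3f}" for j in range(i)]
        lines.append("\t".join([row] + cells))
    return "\n".join(lines)


if __name__ == "__main__":
    example = {
        "S1": [1, 0, 1, 1],
        "S2": [1, 1, 0, 1],
        "S3": [0, 0, 0, 1],
    }
    d = pairwise_distances(example)
    print(format_triangular(list(example), d, "jaccard"))
```

The lower triangular layout stores each unordered pair once, since all three distances are symmetric and the diagonal is always zero. The exact output format of vcfdistances may differ from this sketch.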
