Move-Annotate-Merge: Preparing tab-separated values (.tsv) files from sequencing data analysis for manual review
Preparing sequencing data for manual review can be a tedious, time-consuming, and error-prone process. This script, implemented in Haskell, transforms user-defined .tsv files containing variant analysis output into a single merged, sample-annotated file that is easy to search and ready for downstream filtration.
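To make the idea of a "merged, sample-annotated file" concrete, here is a minimal sketch of the annotate and merge steps on in-memory rows. It is not the code from MoveAnnotateMerge.hs: the function names (`annotate`, `mergeTables`, `render`), the "Sample" column name, and the assumption that every input table starts with the same header row are all illustrative.

```haskell
import Data.List (intercalate)

-- One table row, already split into fields.
type Row = [String]

-- Prepend the sample identifier to every data row of one table.
-- The first row is treated as a header and receives a "Sample" column
-- (an assumed name, chosen only for this illustration).
annotate :: String -> [Row] -> [Row]
annotate _ [] = []
annotate sample (header : rows) = ("Sample" : header) : map (sample :) rows

-- Merge annotated tables: keep the header of the first non-empty table
-- and drop the header row of every later table.
-- Assumes all tables share the same columns.
mergeTables :: [[Row]] -> [Row]
mergeTables tables = case filter (not . null) tables of
  []       -> []
  (t : ts) -> t ++ concatMap (drop 1) ts

-- Render rows back to tab-separated text, one row per line.
render :: [Row] -> String
render = unlines . map (intercalate "\t")
```

For example, in GHCi, `putStr (render (mergeTables [annotate "S1" t1, annotate "S2" t2]))` would print one header line followed by the rows of both tables, each tagged with its sample identifier.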
MoveAnnotateMerge.hs assumes you have the GHC compiler and the packages it imports installed. The easiest way to get these is to download the Haskell Platform.
To install the additional packages that MoveAnnotateMerge.hs requires, run the following command. This assumes you have cabal, a package manager and build system for Haskell, installed on your system (it comes with the Haskell Platform).
$ cabal install [packagename]
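Required modules (with the package that provides each in parentheses)
- Data.List (base)
- Data.List.Split (split)
- System.Process (process)
- System.Environment (base)
- System.IO (base)
- Text.PrettyPrint.Boxes (boxes)
- Text.Regex (regex-compat)
The base and process packages ship with GHC, so normally only split, boxes, and regex-compat need to be installed with cabal.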
To get useful output from this Haskell script, you must first set up an input .tsv file in the format it expects.
Your input .tsv file should have the following structure:
[/Path/To/Tsv/File/example_variants.annotated.tsv]\t[Corresponding_sample_identifier]\t[/Path/To/Final/Directory]
This file should contain one line for each variant .tsv file you want to process.
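As an illustration of the three-column layout described above, the following sketch parses such an input file into records using only base. The `Entry` type, its field names, and the functions `splitTabs`, `parseEntry`, and `parseManifest` are hypothetical and are not taken from MoveAnnotateMerge.hs; skipping blank lines and rejecting empty fields are assumptions made for the example.

```haskell
-- Hypothetical record for one line of the input file; the field names
-- are illustrative only.
data Entry = Entry
  { tsvPath  :: FilePath  -- path to the variant .tsv file
  , sampleId :: String    -- corresponding sample identifier
  , finalDir :: FilePath  -- final directory for the file
  } deriving (Eq, Show)

-- Split a string on tab characters, keeping empty fields.
splitTabs :: String -> [String]
splitTabs s = case break (== '\t') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitTabs rest

-- Parse one line, returning an error message for malformed lines
-- instead of crashing.
parseEntry :: String -> Either String Entry
parseEntry line = case splitTabs line of
  [p, s, d] | all (not . null) [p, s, d] -> Right (Entry p s d)
  fields -> Left ("expected 3 non-empty tab-separated fields, got "
                  ++ show (length fields) ++ ": " ++ show line)

-- Parse the whole file contents, skipping blank lines (an assumption).
-- Stops at the first malformed line.
parseManifest :: String -> Either String [Entry]
parseManifest = mapM parseEntry . filter (not . null) . lines
```

In GHCi, `parseManifest <$> readFile "inputfile.tsv"` would yield either the list of entries or a message describing the first malformed line, which can help catch formatting mistakes before running the script.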
MoveAnnotateMerge.hs is easy to use.
You can call it using the runghc command provided by the GHC compiler as follows:
$ runghc MoveAnnotateMerge.hs inputfile.tsv
For maximum performance, please compile and run the source code as follows:
$ ghc -O2 -o mam MoveAnnotateMerge.hs
$ ./mam inputfile.tsv
A Docker-based solution (Dockerfile) is available in the corresponding repository. Currently, this Dockerfile assumes that you run Docker interactively.
Documentation was added February 2019.
Author : Matthew Mosior