ekg/wgatools

 
 


Whole Genome Alignment Tools

A Rust library and command-line tools for working with whole genome alignment files

TOOLS

WHAT HAS BEEN DONE

  • PAF file reader
  • MAF file reader
  • Chain file reader
  • CIGAR string parser
  • MAF2PAF
  • MAF2Chain
  • PAF2Chain
  • PAF2Blocks
  • PAF2MAF
  • Chain2MAF
  • Chain2PAF
  • Call Variants from MAF
  • Visualize MAF file in terminal
  • Extract regions from MAF file
  • Build MAF index
  • Statistics of MAF/PAF file

WHAT WILL BE DONE IN THE FUTURE

  • SAM converter [really needed?]
  • Local improvement of alignments by re-alignment
  • MAF -> GAF -> HAL
  • Optimize for big MAF files
  • Split and chop MAF files

Install

git clone https://github.com/wjwei-handsome/wgatools.git
cd wgatools
cargo build --release

Or install directly from git:

cargo install --git https://github.com/wjwei-handsome/wgatools.git

Usage

> wgatools
wgatools -- a cross-platform and ultrafast toolkit for Whole Genome Alignment Files manipulation

Version: 0.1.0

Authors: Wenjie Wei <wjwei9908@gmail.com>

Usage: wgatools [OPTIONS] <COMMAND>

Commands:
  maf2paf    Convert MAF format to PAF format [aliases: m2p]
  maf2chain  Convert MAF format to Chain format [aliases: m2c]
  paf2maf    Convert PAF format to MAF format [aliases: p2m]
  paf2chain  Convert PAF format to Chain format [aliases: p2c]
  chain2maf  Convert Chain format to MAF format [aliases: c2m]
  chain2paf  Convert Chain format to PAF format [aliases: c2p]
  maf-ext    Extract specific region from MAF file with index [aliases: me]
  call       Call Variants from MAF file [aliases: c]
  maf2sam    TEST: maf2sam [aliases: m2s]
  maf-index  Build index for MAF file [aliases: mi]
  tview      View MAF file in terminal [aliases: tv]
  stat       Statistics for Alignment file [aliases: st]
  help       Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help (see more with '--help')
  -V, --version  Print version

GLOBAL:
  -o, --outfile <OUTFILE>  Output file ("-" for stdout) [default: -]
  -r, --rewrite            Bool, if rewrite output file [default: false]
  -t, --threads <THREADS>  Threads, default 1 [default: 1]
  -v, --verbose...         Logging level [-v: Info, -vv: Debug, -vvv: Trace, defalut: Warn]

NOTE: To convert into MAF format, you must provide the target and query genome sequence files in FASTA format (.fa or .fa.gz).
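
The reason sequences are required is that PAF and Chain records store only coordinates and gap structure, while MAF records store the aligned bases themselves. The following is a minimal sketch of that idea, not wgatools' actual implementation: it rebuilds the two gapped alignment rows from ungapped sequences and a list of CIGAR operations. The function name `gapped_rows` is hypothetical, and the sketch assumes forward-strand, ASCII sequences that are already cut to the aligned region.

```rust
/// Build the two gapped alignment rows (the form MAF stores) from ungapped
/// target/query sequences and CIGAR operations given as (length, op) pairs.
/// Returns None if an operation runs past a sequence or is not M, =, X, I or D.
/// `gapped_rows` is a hypothetical helper, not part of wgatools.
fn gapped_rows(target: &str, query: &str, cigar: &[(usize, char)]) -> Option<(String, String)> {
    let (mut t_pos, mut q_pos) = (0usize, 0usize);
    let (mut t_row, mut q_row) = (String::new(), String::new());
    for &(len, op) in cigar {
        match op {
            // Match/mismatch columns consume bases from both sequences.
            'M' | '=' | 'X' => {
                t_row.push_str(target.get(t_pos..t_pos + len)?);
                q_row.push_str(query.get(q_pos..q_pos + len)?);
                t_pos += len;
                q_pos += len;
            }
            // Insertion: bases exist only in the query, so the target row gets gaps.
            'I' => {
                t_row.push_str(&"-".repeat(len));
                q_row.push_str(query.get(q_pos..q_pos + len)?);
                q_pos += len;
            }
            // Deletion: bases exist only in the target, so the query row gets gaps.
            'D' => {
                t_row.push_str(target.get(t_pos..t_pos + len)?);
                q_row.push_str(&"-".repeat(len));
                t_pos += len;
            }
            // Other operations (clipping, skips) are out of scope for this sketch.
            _ => return None,
        }
    }
    Some((t_row, q_row))
}

fn main() {
    let cigar = [(3, 'M'), (2, 'D'), (3, 'M'), (2, 'I')];
    if let Some((t, q)) = gapped_rows("ACGTTGCA", "ACGGCATT", &cigar) {
        println!("{}\n{}", t, q);
    }
}
```

Without the sequence files there is nothing to slice the bases from, which is why conversions in the other direction (MAF to PAF or Chain) need no extra input.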

Examples

Visualize a MAF file in the terminal


Library

Simple readers and iterators for PAF, MAF, and Chain files:

use wgatools::parser::chain::ChainReader;
use wgatools::parser::maf::MAFReader;
use wgatools::parser::paf::PafReader;

fn main() {
    // Open a MAF file; unwrap() panics if the file cannot be opened.
    let mut mafreader = MAFReader::from_path("test.maf").unwrap();
    // Iterate over the records; each item is a Result that may hold a parse error.
    for record in mafreader.records() {
        let record = record.unwrap();
        println!("{:?}", record);
    }
    // ...
}
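
For readers unfamiliar with what a MAF record holds, here is a std-only sketch that parses a single MAF "s" (sequence) line into its six fields. It does not use wgatools' own types: `SLine` and `parse_s_line` are hypothetical names chosen for illustration, and the field layout follows the public MAF format description.

```rust
/// Fields of one MAF "s" line: `s src start size strand srcSize text`.
/// `SLine` is a hypothetical type for illustration, not a wgatools type.
#[derive(Debug, PartialEq)]
struct SLine {
    src: String,   // source sequence name, e.g. "hg16.chr7"
    start: u64,    // zero-based start of the aligned region
    size: u64,     // number of non-gap bases in `text`
    strand: char,  // '+' or '-'
    src_size: u64, // length of the whole source sequence
    text: String,  // aligned bases, with '-' for gaps
}

/// Parse one whitespace-separated "s" line; returns None on any malformed field.
fn parse_s_line(line: &str) -> Option<SLine> {
    let mut fields = line.split_whitespace();
    if fields.next()? != "s" {
        return None;
    }
    let src = fields.next()?.to_string();
    let start = fields.next()?.parse().ok()?;
    let size = fields.next()?.parse().ok()?;
    let strand = match fields.next()? {
        "+" => '+',
        "-" => '-',
        _ => return None,
    };
    let src_size = fields.next()?.parse().ok()?;
    let text = fields.next()?.to_string();
    Some(SLine { src, start, size, strand, src_size, text })
}

fn main() {
    let line = "s hg16.chr7    27707221 13 + 158545518 gcagctgaaaaca";
    println!("{:?}", parse_s_line(line));
}
```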

TODO for library

  • Error detection and handling
  • Test cases
  • Documentation

Features

  • Uses nom to parse CIGAR strings
  • Uses rayon to parallelize and speed up conversions
  • Uses ratatui to visualize MAF files in the terminal
  • ...
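
To illustrate what CIGAR parsing involves, here is a minimal hand-rolled sketch that uses only the standard library. It is not the nom-based parser that wgatools ships: the `CigarOp` type and `parse_cigar` function are hypothetical names, and the accepted operation set (`MIDNSHP=X`) is taken from the SAM specification.

```rust
/// One CIGAR operation: a run length plus an operation code such as 'M', 'I' or 'D'.
/// `CigarOp` is a hypothetical type for illustration, not a wgatools type.
#[derive(Debug, PartialEq, Clone, Copy)]
struct CigarOp {
    len: u64,
    op: char,
}

/// Parse a CIGAR string such as "10M2I5D" into a list of operations.
/// Returns an error message for malformed input.
fn parse_cigar(cigar: &str) -> Result<Vec<CigarOp>, String> {
    let mut ops = Vec::new();
    // Run length collected so far; None means no digit has been seen yet.
    let mut len: Option<u64> = None;
    for c in cigar.chars() {
        if let Some(d) = c.to_digit(10) {
            // Accumulate the run length digit by digit, guarding against overflow.
            let next = len
                .unwrap_or(0)
                .checked_mul(10)
                .and_then(|v| v.checked_add(u64::from(d)))
                .ok_or("run length overflow")?;
            len = Some(next);
        } else if "MIDNSHP=X".contains(c) {
            // An operation code closes the current run; it must follow a length.
            let n = len
                .take()
                .ok_or_else(|| format!("operation '{}' has no length", c))?;
            ops.push(CigarOp { len: n, op: c });
        } else {
            return Err(format!("unexpected character '{}'", c));
        }
    }
    if len.is_some() {
        return Err("trailing length without an operation".to_string());
    }
    Ok(ops)
}

fn main() {
    match parse_cigar("10M2I5D") {
        Ok(ops) => println!("{:?}", ops),
        Err(e) => eprintln!("invalid CIGAR: {}", e),
    }
}
```

A parser-combinator library such as nom expresses the same grammar (one or more digits followed by one operation character, repeated) declaratively instead of with an explicit state variable.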

Contributing

Feel free to dive in! Open an issue or submit PRs.

License

MIT License © WenjieWei
