LightStringGraph

Lightweight String Graph Construction

This software is at an early stage of development. For questions, please contact Marco Previtali; to report bugs, use the issue tracker.

Building and running LightStringGraph

The software is composed of three different tools: lsg (light (string) overlap graph), redbuild (string graph build), and graph2asqg (native graph format to ASQG).

To build all the tools, change to the root directory of the project and run make all.

To try out the software on some (simulated or real) reads, follow these steps:

  1. given a FASTA file a.fa containing n reads, produce another FASTA file b.fa containing 2n reads such that the reads in positions 1 to n are the same as those in a.fa and the reads in positions n+1 to 2n are their reverse complements (for 1 <= i <= n, the read in position n+i must be the reverse complement of the read in position i)
  2. download and compile BEETL (please note that this is not the original repository)
  3. compile LightStringGraph: cd <LSGPATH> && make all
  4. build the BWT of b.fa: beetl-bwt -i b.fa -o <BWTPrefix> --output-format=ASCII --generate-lcp --generate-end-pos-file
  5. run lsg: <LSGPATH>/bin/lsg -B <BWTPrefix> -T <Tau> -C <CycNum> where <Tau> is the minimum overlap between reads and <CycNum> >= <reads length> - <Tau>
  6. run redbuild: <LSGPATH>/bin/redbuild -b <BWTPrefix> -r b.fa -m <CycNum>+1
  7. optionally, run graph2asqg: <LSGPATH>/bin/graph2asqg -b <BWTPrefix> -r b.fa -l <readsLength> and redirect STDOUT (the string graph in the ASQG format) to a file (you can compress it on the fly).
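
Step 1 is left to the user, so here is a minimal Python sketch of one way to produce b.fa from a.fa. It is not part of LightStringGraph. The function names, the "_rc" header suffix for the reverse-complemented reads, and the script name in the usage line are all assumptions made for illustration. The small min_cycles helper only restates the constraint on <CycNum> from step 5.

```python
import sys

# Complement table for DNA bases (including N); other characters are left unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(lines):
    """Yield (header, sequence) pairs; multi-line sequences are joined."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def double_with_reverse_complements(records):
    """Return the n input records followed by their n reverse complements,
    so that record n+i is the reverse complement of record i."""
    records = list(records)
    # The "_rc" header suffix is a naming assumption, not an LSG requirement.
    rc = [(header + "_rc", reverse_complement(seq)) for header, seq in records]
    return records + rc


def min_cycles(read_length, tau):
    """Smallest <CycNum> satisfying <CycNum> >= <reads length> - <Tau>."""
    return read_length - tau


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python make_b_fa.py a.fa b.fa
    with open(sys.argv[1]) as src:
        doubled = double_with_reverse_complements(read_fasta(src))
    with open(sys.argv[2], "w") as dst:
        for header, seq in doubled:
            dst.write(">" + header + "\n" + seq + "\n")
```

For example, with reads of length 100 and <Tau> = 65, min_cycles(100, 65) gives 35, so any <CycNum> of at least 35 satisfies step 5. The sketch holds all reads in memory, which is fine for a trial run but may need a streaming variant for large read sets.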

If lsg crashes with a logic error, try raising the limit on the maximum number of open file descriptors for the user running the command (for example, with the bash built-in ulimit -n), then delete all the *.tmplsg.* files before running lsg again.
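
If lsg is launched from a Python wrapper rather than directly from bash, the same recovery can be sketched as follows. This is an illustrative sketch, not part of LightStringGraph: both function names are hypothetical, and the assumption that the *.tmplsg.* files sit in the directory passed in (by default the current working directory) should be checked against your own run. The resource module is available on Unix only.

```python
import glob
import os
import resource  # Unix-only standard library module


def raise_open_file_limit(target):
    """Raise the soft limit on open file descriptors towards target.

    The value is capped at the hard limit and never lowered.
    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if target <= soft:
        return soft  # nothing to raise
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


def remove_lsg_temp_files(directory="."):
    """Delete files matching *.tmplsg.* in directory; return the removed paths."""
    pattern = os.path.join(glob.escape(directory), "*.tmplsg.*")
    removed = []
    for path in glob.glob(pattern):
        os.remove(path)
        removed.append(path)
    return sorted(removed)
```

Resource limits are inherited by child processes, so calling raise_open_file_limit before starting lsg (for instance with subprocess.run) from the same Python process applies the raised limit to lsg as well.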
