-
Notifications
You must be signed in to change notification settings - Fork 0
Tool for mapping regions between genomes.
License
hzzv/maptool
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
Maptool Version 1.0 05/31/2014
-------------------
Author: Petra Kubincova, Faculty of Mathematics, Physics and Informatics,
Comenius University in Bratislava, Slovakia
This tool is also available at: https://github.com/kpetra/maptool
DEPENDENCIES
-------------------
For preprocessing:
* Biopython including the module bgzf.
Available at http://biopython.org and also through Linux package management
system.
(If you have an older version of Biopython, you might run into the following
bug when trying to run Maptool preprocessing:
"AttributeError: 'BgzfWriter' object has no attribute 'handle'"
To fix it, simpy change in the file 'bgzf.py' the line:
return make_virtual_offset(self.handle.tell(), len(self._buffer))
To the following line:
return make_virtual_offset(self._handle.tell(), len(self._buffer))
)
For mapping:
* Python 2.x, x >= 6
* ocaml-bgzf library
Available at: https://github.com/mlin/ocaml-bgzf
The directory containing ocaml-bgzf needs to be in the same directory
as 'maptool'.
ocaml-bgzf depends on:
- zlib (http://www.zlib.net/),
- findlib (http://projects.camlcity.org/projects/findlib.html)
INPUT AND OUTPUT FILES
-------------------
Preprocessing:
* Input file format is MAF (http://genome.ucsc.edu/FAQ/FAQformat.html#format5)
* Output files are:
- Binary header file
- BGZF compressed file (dx.doi.org/10.1093/bioinformatics/btp352)
Mapping:
* Input files are the binary header file and the BGZF compressed file created
in the preprocessing phase. The standard input should be in BED format
(http://genome.ucsc.edu/FAQ/FAQformat.html#format1).
* Standard output is in BED format.
* Standart error output is text describing errors encountered during
the mapping.
HOW TO RUN MAPTOOL?
-------------------
Preprocessing:
Simpy run "python maptool.py" in the directory "preprocessing".
(Usage:
python maptool.py preprocess <alignment.maf> <header.bin> <compressed.bgzf>
)
Mapping:
1. Run "make" in directory "mapping".
2. Run "./maptool" in the same directory.
(Usage:
./maptool bed <header.bin> <compressed.bgzf> <informant> [--maxgap N] [--outer] [--alwaysmap]
- This will map regions on the standart input in BED format
from the reference sequence to the <informant>.
- Maxgap is the size of maximal allowed gap in the alignment.
Default is 10.
- Outer means that the mapping will be enlarged, not shrinked,
if needed.
- Alwaysmap means that the input region will be mapped even if thick
region or exons do not map correctly.
- Best to run with redirecting stdin, stdout and stderr:
- < regiones_to_be_mapped.bed
- > mapped_regions.bed
- 2> error_log.txt
./maptool info <header.bin>
- This will display identificators of informants and identificators
of reference chromosomes.
About
Tool for mapping regions between genomes.
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published