forked from HadoopGenomics/SeqPig
SeqPig is a library for Apache Pig for the distributed analysis of large sequencing datasets. It provides import and export functions for file formats commonly used for sequencing data, as well as a collection of Pig user-defined functions (UDFs) to help process aligned and unaligned sequence data.
Bambooie/SeqPig
SeqPig is a library for Apache Pig that provides import and export functions for file formats commonly used in bioinformatics. It also provides a collection of Pig user-defined functions (UDFs) for processing aligned and unaligned sequence data. SeqPig currently supports BAM/SAM, FastQ and Qseq input and output, and FASTA input. It is built on top of the Hadoop-BAM library. For more information, see http://seqpig.sourceforge.net/ and the documentation that comes with the release. Releases of SeqPig come bundled with Picard/Samtools, which were developed at the Wellcome Trust Sanger Institute, and Biodoop/Seal, which were developed at the Center for Advanced Studies, Research and Development in Sardinia. See http://samtools.sourceforge.net/ and http://biodoop-seal.sourceforge.net/
Languages
- Java 64.8%
- TeX 20.9%
- PigLatin 5.8%
- Shell 5.2%
- Perl 2.2%
- Python 1.1%