ngs_HTSEQ

Stephen Fisher edited this page Jan 29, 2015 · 7 revisions

Module: HTSEQ

This module will run HTSeq on the uniquely aligned output from either RUM or STAR (STAR by default).

Usage:
	ngs.sh htseq [-i inputDir] [-f inputFile] [-stranded] [-introns] [-id idAttr] -s species sampleID
Input:
	sampleID/inputDir/inputFile
Output:
	sampleID/htseq/sampleID.htseq.cnts.txt
	sampleID/htseq/sampleID.htseq.log.txt
	sampleID/htseq/sampleID.htseq.err.txt
Requires:
	HTSeq version 0.6 or later ( http://www-huber.embl.de/users/anders/HTSeq/ ). Note that HTSeq requires NumPy.
	Pysam ( https://pypi.python.org/pypi/pysam )
	dynamicRange.py ( https://github.com/safisher/ngs )
Options:
	-i inputDir - location of source file (default: star).
	-f inputFile - source file (default: sampleID.star.unique.bam).
	-stranded - use strand information (default: no).
	-introns - also compute intron counts (default: no). A single GTF file is expected to contain both intron and exon features. HTSeq will be run first with type=exon and a second time with type=intron.
	-id idAttr - the 'idattr' option passed to htseq-count, naming the GTF attribute that contains the feature ID (default: gene_id).
	-s species - species from repository: /lab/repo/resources/htseq.

HTSeq is run using the htseq-count script. This requires a BAM file as generated by either RUMALIGN or STAR (STAR by default).
The following HTSeq parameter values are used for exon counting:
 	--mode=intersection-strict --type=exon
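
As a rough illustration of how the options above could map onto an htseq-count command line, here is a minimal Python sketch. It is not taken from ngs.sh: the helper name build_htseq_cmd, the placeholder GTF path annotation.gtf, and the use of --format=bam are assumptions. Only the --mode, --type, --stranded and --idattr values come from this page.

```python
def build_htseq_cmd(bam, gtf, stranded=False, id_attr="gene_id",
                    mode="intersection-strict", feature_type="exon"):
    """Return an htseq-count argument list.

    Hypothetical helper for illustration only; the real module may
    assemble its command differently.
    """
    return [
        "htseq-count",
        "--format=bam",                                 # assumption: BAM input is read directly
        "--mode=" + mode,                               # exon default from this page
        "--type=" + feature_type,                       # exon default from this page
        "--stranded=" + ("yes" if stranded else "no"),  # mirrors the -stranded option
        "--idattr=" + id_attr,                          # mirrors the -id option
        bam,
        gtf,
    ]


# Default exon run on the default STAR output; the GTF path is a placeholder.
cmd = build_htseq_cmd("sampleID/star/sampleID.star.unique.bam", "annotation.gtf")
print(" ".join(cmd))
```

The list form can be handed to subprocess.run without shell quoting concerns. The second run made for -introns would differ only in the mode and feature_type arguments.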

INTRON COUNTING (-introns option):
When intron counting is enabled (-introns), introns are counted with --mode=intersection-nonempty. In this case three counts files will be generated:
	sampleID.htseq.exons.cnts: exon counts
	sampleID.htseq.introns.cnts: intron counts
	sampleID.htseq.cnts.txt: combined counts, tab delimited (gene, exons, introns, total)
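
The combined file layout (gene, exons, introns, total) can be sketched as follows. This is an illustrative Python sketch, not the module's actual code. It assumes that total is the sum of the exon and intron counts, that htseq-count's summary rows (those whose ID starts with "__", such as __no_feature) are left out, and that a gene missing from one file counts as 0 there. The function names and sample data are hypothetical.

```python
def read_counts(lines):
    """Parse htseq-count output lines into {feature_id: count}."""
    counts = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        feature_id, value = line.split("\t")
        if feature_id.startswith("__"):
            # Assumption: summary rows such as __no_feature are skipped.
            continue
        counts[feature_id] = int(value)
    return counts


def combine_counts(exon_lines, intron_lines):
    """Return (gene, exons, introns, total) rows sorted by gene ID."""
    exons = read_counts(exon_lines)
    introns = read_counts(intron_lines)
    rows = []
    for gene in sorted(set(exons) | set(introns)):
        e = exons.get(gene, 0)
        i = introns.get(gene, 0)
        rows.append((gene, e, i, e + i))  # assumption: total = exons + introns
    return rows


# Made-up example data in htseq-count's two-column, tab-delimited format.
exon_lines = ["geneA\t10\n", "geneB\t0\n", "__no_feature\t5\n"]
intron_lines = ["geneA\t3\n", "geneB\t7\n", "__no_feature\t2\n"]
for row in combine_counts(exon_lines, intron_lines):
    print("\t".join(str(x) for x in row))
```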

For a description of the HTSeq parameters see http://www-huber.embl.de/users/anders/HTSeq/doc/count.html#count