Implemented --genomeFileSizes option to supply sizes of the genome index files. This allows for streaming of index files.
Implemented input of extra references into the SAM/BAM header from a user-created "extraReferences.txt" file in the genome directory.
Implemented --chimOutType HardClip OR SoftClip options to output hard (default) / soft clipping in the BAM CIGAR for supplementary chimeric alignments.
Implemented --chimMainSegmentMultNmax parameter, which may be used to prohibit chimeric alignments with multimapping main segments to reduce false positive chimeras.
Implemented new SAM attribute 'ch' to mark chimeric alignments in the BAM file for --chimOutType WithinBAM option.
Implemented --bamRemoveDuplicatesType UniqueIdenticalNotMulti option, which (unlike the UniqueIdentical option) will NOT mark multi-mappers as duplicates.
For --bamRemoveDuplicatesType UniqueIdentical, the unmapped reads are no longer marked as duplicates.
Fixed occasional seg-faults after the completion of the mapping runs with shared memory.
Fixed a problem with RNEXT field in the Chimeric.out.sam file: RNEXT now always points to the other mate start.
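The difference between the hard and soft clipping mentioned for --chimOutType can be illustrated with a small sketch. It is not STAR code: the function name soft_to_hard_clip is hypothetical, and the sketch only follows the general SAM convention that soft-clipped bases (S) stay in SEQ while hard-clipped bases (H) are removed from it. It assumes the input CIGAR has no hard clips already.

```python
import re
from typing import Tuple

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_to_hard_clip(cigar: str, seq: str) -> Tuple[str, str]:
    """Turn soft clips at either end of a CIGAR into hard clips and trim SEQ.

    Hypothetical helper for illustration only; assumes the CIGAR
    contains no existing H operations.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    start, end = 0, len(seq)
    # Leading soft clip: drop those bases from the start of SEQ.
    if ops and ops[0][1] == "S":
        start = ops[0][0]
        ops[0] = (ops[0][0], "H")
    # Trailing soft clip: drop those bases from the end of SEQ.
    if len(ops) > 1 and ops[-1][1] == "S":
        end -= ops[-1][0]
        ops[-1] = (ops[-1][0], "H")
    new_cigar = "".join(f"{n}{op}" for n, op in ops)
    return new_cigar, seq[start:end]
```

With soft clipping the full read sequence is kept in the record; with hard clipping only the aligned part remains, which makes supplementary records smaller.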
- Fixed a problem with --outSAMmultNmax 1 not working for transcriptomic output.
- Fixed a bug with chimeric BAM output for --chimOutType WithinBAM option.
- Fixed a bug that could cause non-stable BAM sorting if the gcc qsort is unstable.
- Fixed a bug causing seg-faults when combining --twopassMode Basic with --outSAMorder PairedKeepInputOrder .
- Fixed a problem with SAM header in cases where reference sequences are added at the mapping stage.
- Fixed the "GstrandBit" problem.
- Fixed a bug introduced in 2.5.1a that caused problems with single-end alignments output in some cases.
- Fixed a bug that can cause STARlong seg-faults in rare cases.
- Fixed a bug that caused output of unmapped mates for single end alignments even with --outSAMunmapped None .
- Implemented --winReadCoverageRelativeMin and --winReadCoverageBasesMin to control coverage of the alignment windows for STARlong.
- Implemented --outSAMfilter KeepAllAddedReferences option which will keep all alignments to the added references.
- Implemented --alignEndsProtrude option to control output of alignments with protruding ends.
- Implemented --outTmpKeep All option to keep the temporary files.
- Implemented --alignEndsType Extend5pOfReads12 option for full extension of 5' ends of both mates.
- Fixed a bug in --quantMode TranscriptomeSAM that prevented output to Aligned.toTranscriptome.out.bam of the reads mapped to the very last annotated transcript.
- Cleaned up the code to remove compilation warnings (thanks to github.com/yhoogstrate).
- Implemented --outSAMunmapped Within KeepPairs option to record unmapped mate adjacent to the mapped one, in case single-end alignments are allowed.
For multi-mappers, the unmapped mate will be recorded multiple times, adjacent to the mapped mate of each alignment.
- Implemented --genomeSuffixLengthMax option to control max suffix length at the genome generation step.
- Fixed a bug that caused genome generation stalling in some cases.
- In Aligned.toTranscriptome.out.bam (--quantMode TranscriptomeSAM), the non-primary SAM flag is now assigned to all but one randomly selected alignment.
- Fixed a bug that filtered out some chimeric junctions.
- Fixed a bug that prevented chimeric output for some of the "circular" configurations.
- Fixed a problem with non-primary alignment flags with --outSAMmultNmax option.
- Added counting of chimeric reads into Log.final.out .
- Fixed a bug in --outSAMfilter KeepOnlyAddedReferences.
- Fixed a minor bug that caused rare seg-faults.
- Fixed a minor bug in STARlong extension at the ends of the read.
- Fixed a seg-fault that occurred when non-default value of --genomeChrBinNbits was used.
- Fixed a seg-fault that occurred when junctions were inserted after inserting reference sequences.
STAR now uses essential c++11 features and requires gcc 4.7.0 or later.
Major new features:
- Implemented on the fly insertion of the extra sequences into the genome indexes.
- Implemented --outSAMmultNmax parameter to limit the number of output alignments for multimappers.
- Implemented --outMultimapperOrder Random option to output multiple alignments in random order.
This also randomizes the choice of the primary alignment. Parameter --runRNGseed can be used to set the random generator seed.
With this option, the ordering of multi-mapping alignments of each read, and the choice of the primary alignment will vary from run to run, unless only one thread is used and the seed is kept constant.
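The role of the seed in --outMultimapperOrder Random can be sketched as follows. This is an illustration under stated assumptions, not STAR's implementation: the function order_multimappers is hypothetical, and Python's random module stands in for whatever generator STAR uses internally. It only shows that a single random stream with a fixed seed yields the same order, and hence the same primary alignment, every time.

```python
import random

def order_multimappers(alignments, seed):
    """Shuffle a read's alignments with a seeded generator.

    Hypothetical helper: returns (primary, shuffled_list), where the
    primary alignment is simply the first one after shuffling.
    """
    rng = random.Random(seed)      # private generator, fixed by the seed
    shuffled = list(alignments)    # copy so the input is left untouched
    rng.shuffle(shuffled)
    return shuffled[0], shuffled

# Same seed, same input: the order and the primary choice repeat.
first = order_multimappers(["aln1", "aln2", "aln3", "aln4"], seed=777)
second = order_multimappers(["aln1", "aln2", "aln3", "aln4"], seed=777)
print(first == second)
```

The sketch covers only the single-stream case; it does not model how several threads would consume random numbers.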
Minor new features:
- Implemented --outSAMattrIHstart parameter. Setting it to 0 may be required for compatibility with downstream software such as Cufflinks or StringTie.
- Implemented --outSAMfilter KeepOnlyAddedReferences option.
- Implemented --help option - thanks to @yhoogstrate for the code.
- Implemented --alignEndsType Extend3pOfRead1 option for full extension of the 3' end of read 1.
- Implemented --alignSJstitchMismatchNmax option to allow for mismatches around non-canonical junctions.
- Implemented --chimSegmentReadGapMax parameter which defines the maximum gap in the read sequence between chimeric segments. By default it is set to 0 to replicate the behavior of the previous STAR versions.
- Implemented --chimFilter banGenomicN | None options to prohibit or allow the N characters in the vicinity of the chimeric junctions. By default, they are prohibited - the same behavior as in the previous versions.
- For STARlong, increased compilation-time max read length to 500000 and max number of exons to 1000.
- Fixed a bug which caused problems in some cases of genome generation without annotations.
- Fixed a bug in the --alignEndsType Extend5pOfRead1 option.
- Improved compilation flags handling in Makefile - thanks to Christian Krause for the code.
- Improved treatment of the streams and files - thanks to Alex Finkel for the code.
- Merged pull request from Nathan S. Watson-Haigh: Makefile for manual; Travis-CI automated build; update of STAR-Fusion submodule to v0.3.1.
- Merged pull request from Alex Finkel to allow 'parameter=value' option formatting, e.g. --runThreadN=8.
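The 'parameter=value' formatting can be thought of as a normalization step applied before regular option parsing. The sketch below is an assumption about how such a step could look, not the code from the pull request; the function name split_option_tokens is hypothetical.

```python
def split_option_tokens(argv):
    """Expand '--name=value' tokens into separate '--name' and 'value' tokens.

    Hypothetical helper for illustration; tokens without '=' or without
    a leading '--' are passed through unchanged.
    """
    out = []
    for token in argv:
        if token.startswith("--") and "=" in token:
            # Split only on the first '=' so values may contain '=' themselves.
            name, value = token.split("=", 1)
            out.extend([name, value])
        else:
            out.append(token)
    return out

print(split_option_tokens(["--runThreadN=8", "--outSAMmultNmax", "1"]))
```

After this step both spellings, --runThreadN=8 and --runThreadN 8, produce the same token list, so the rest of the parser does not need to know about the '=' form.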
Major new feature:
- --quantMode GeneCounts option for counting number of reads per gene, similar to htseq-count.
- STARlong: fixed --outFilterIntronMotifs and --outSAMstrandField options.
- Yet another fix for --sjdbOverhang logic.
- Error message when shared memory and on the fly junction insertion are used together.
- Fixed a bug causing unnecessary 1 base soft-clipping in rare cases with sparse suffix array.
- Fixed a bug that caused problems with junction motifs in rare cases. Very few alignments affected, <1 per million.
- Fixed problems with --sjdbOverhang default and user-defined values.
- Fixed problems with occasional non-adjacent output of multiple alignments into the unsorted BAM file and transcriptome BAM file.
- Fixed a bug causing seg-faults when shared memory options in --genomeLoad are used with --outStd SAM.
- Fixed a bug causing seg-faults for small values of --limitIObufferSize.
- Added STARlong pre-compiled executables.
- Fixed very minor issues with filtering into SJ.out.tab .
- Fixed some bugs in STARlong mapping algorithm.
- Fixed --outFilter BySJout filtering for STARlong.
- Fixed XS attributes in STARlong.
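For the --quantMode GeneCounts feature listed above, the general idea of counting reads per gene can be sketched in a few lines. This is a simplified illustration, not STAR's algorithm: the function count_reads_per_gene, the interval representation, and the "no_feature" / "ambiguous" category names are assumptions, and strandedness, splicing and multimappers are ignored.

```python
from collections import Counter

def count_reads_per_gene(reads, genes):
    """Count reads that overlap exactly one gene.

    Hypothetical helper for illustration.
    reads: iterable of (chrom, start, end), 1-based inclusive.
    genes: dict mapping gene_id -> (chrom, start, end), 1-based inclusive.
    """
    counts = Counter()
    for chrom, start, end in reads:
        # Collect every gene on the same chromosome whose interval overlaps the read.
        hits = [
            gene_id
            for gene_id, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and start <= g_end and end >= g_start
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif not hits:
            counts["no_feature"] += 1
        else:
            counts["ambiguous"] += 1
    return counts
```

A read is assigned to a gene only when the assignment is unambiguous; reads overlapping no gene or several genes are tallied separately so that the totals still add up to the number of input reads.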