-uses samtools v1.3, jellyfish 2.2.6, and trimmomatic 0.36
-seqtk now used for fastq->fasta conversions
-updated compute resource monitoring routines that leverage collectl
-included utilities to examine strand-specificity of the rna-seq reads:
-Trinity carefully checks formatting of 'samples_file' to ensure it uses all the samples (dealing w/ non-unix text-editor formatting issues that can bundle the entire text file into what looks like a single line of text)
-Docker image updated to include salmon for abundance estimation
-refined use of bowtie2 in chrysalis clustering and various updates to improve long-read support.
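The samples_file formatting issue noted above (a file saved with non-unix line endings looking like one long line) can be illustrated with a minimal sketch. This is not Trinity's own check. The function names and the assumption of 3 or 4 tab-separated fields per row (condition, replicate, left reads, optional right reads) are for illustration only.

```python
def parse_samples_text(text):
    """Split samples-file text into rows of tab-separated fields.

    str.splitlines() treats "\n", "\r\n" and a bare "\r" as line
    boundaries, so a file saved with old Mac or Windows line endings
    is not read as a single long line.
    """
    rows = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.rstrip().split("\t")
        # Assumed layout: condition, replicate, left reads, optional right reads.
        if len(fields) not in (3, 4):
            raise ValueError(
                f"line {line_number}: expected 3 or 4 tab-separated fields, "
                f"got {len(fields)}"
            )
        rows.append(fields)
    return rows


def read_samples_file(path):
    """Read a samples file from disk without newline translation."""
    # newline="" hands the raw line endings to parse_samples_text().
    with open(path, newline="") as handle:
        return parse_samples_text(handle.read())
```

A file whose rows are separated only by "\r" still yields one row per sample, so no sample is silently dropped.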
Release v2.3.2 Nov 20, 2016
-for submitting parallel computes on a computing grid, use the new --grid_exec parameter with your own script that handles grid submissions and performs job management.
-a '--samples_file' option is now available for Trinity and abundance estimation to simplify the use of many RNA-Seq data sets across different samples and biological replicates.
-in silico normalization now happens by default. Use --no_normalize_reads to turn it off.
-bowtie2 is used instead of bowtie1
-Butterfly has improved support for longer reads and is more efficient. Also, isoform clustering and reconciling overly similar sequence paths were refined.
-DE analysis reports include the names of the samples A vs. B in the output table and fold changes as A/B
-GOseq is now provided with the list of expressed genes to use as the background for functional enrichment testing. Also, expression-weighted gene lengths are used. Finally, the list of genes identified as functionally enriched in a GO category is provided in the output file. Basic support for GOplot integration is included.
-added support for Glimma interactive volcano and MA plots (thanks Ken Field!!)
-overhauled long read support. Currently, by default, long reads are used for iworm clustering and graph threading but not used for de Bruijn graph construction itself; only iworm contigs are used there.
-now requires Java-v1.8
-Developer notes:
  -chrysalis: separated the iworm graph from the iworm clustering step into separate utilities, easier to track and debug.
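To make the "fold changes as A/B" convention above concrete, here is a small sketch. It assumes the reported value is a log2 ratio of mean expression in sample A over sample B. The function name and the optional pseudocount are illustrative and are not taken from the DE scripts themselves.

```python
import math


def log2_fold_change(mean_a, mean_b, pseudocount=0.0):
    """Return log2(A / B): positive when A is higher, negative when B is higher.

    The pseudocount is an illustrative guard against division by zero;
    it is an assumption of this sketch, not a documented Trinity parameter.
    """
    return math.log2((mean_a + pseudocount) / (mean_b + pseudocount))
```

With this convention, a feature at 8 in sample A and 2 in sample B gives a positive value, and swapping A and B flips the sign.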
This pre-release includes a number of important changes from earlier versions:
In silico normalization now happens by default.
Improved support for long read integration.
Faster Butterfly and improved grouping of isoforms to genes.
** Best assembly stats recorded for Trinity to date **
and beyond assembly:
consistent representation of sample A vs. B in DE analyses with reporting of sample names in the DE_results files.
no more relying on metadata in filenames to determine sample comparisons (silly thing, but caused a lot of trouble).
GOseq support: reports the identity of the genes that are enriched in different GO categories, with integration into GOplot. All expressed transcripts are used as the background data in the GOseq enrichment testing instead of all features with GO assignments.
-Butterfly update: bugfix related to polynucleotide runs.
-util/SAM_nameSorted_to_uniq_count_stats.pl: count fragments instead of reads.
-util/abundance_estimates_to_matrix.pl: will output a matrix even if only a single sample is specified. Also, now can take a --samples_file containing a list of the target files to build the matrix from.
-util/align_and_estimate_abundance.pl: added support for salmon
-sample_data/test_align_and_estimate_abundance/: added examples and tests for single-end and paired-end abundance estimation
A few minor fixes:
Memory is divided among the samtools threads.
The Trinity contig identifiers for genome-guided assemblies are now formatted correctly (as compared to v2.1.0).
As a sanity check, we now verify that the number of fasta records written by fastools matches the number of fastq records being converted.
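The fastq-to-fasta record count check above can be sketched roughly as follows. This is an illustration, not the code Trinity runs. It assumes strict 4-line fastq records and unwrapped or wrapped fasta with one '>' header per record, and all function names are hypothetical.

```python
def count_fastq_records(lines):
    """Count records in 4-line fastq input (header, sequence, '+', qualities)."""
    n_lines = sum(1 for _ in lines)
    if n_lines % 4 != 0:
        raise ValueError(f"fastq line count {n_lines} is not a multiple of 4")
    return n_lines // 4


def count_fasta_records(lines):
    """Count fasta records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def check_conversion(fastq_lines, fasta_lines):
    """Raise if the number of fasta records differs from the fastq input."""
    n_fastq = count_fastq_records(fastq_lines)
    n_fasta = count_fasta_records(fasta_lines)
    if n_fastq != n_fasta:
        raise RuntimeError(
            f"record count mismatch: {n_fastq} fastq vs {n_fasta} fasta"
        )
    return n_fastq
```

Counting fastq lines rather than '@' characters avoids miscounting quality strings that happen to begin with '@'.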
Abundance estimation: added support for kallisto and using TPMs now instead of FPKMs for downstream analyses.
DE analysis: added support for Limma/Voom and ROTS, dropped support for DESeq(1) while keeping DESeq2. For edgeR w/o bio reps, user must define dispersion parameter.
Minimal changes to the assembler, minor bug fixes, tackled most github 'issues' from last release.
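For readers comparing the TPM values mentioned above with older FPKM-based results, the standard relationship is that each TPM is the feature's FPKM divided by the sum of all FPKMs in the sample, scaled to one million. A minimal sketch follows. The function name is illustrative, and this is not how the abundance estimators themselves compute TPM.

```python
def fpkm_to_tpm(fpkm_values):
    """Rescale one sample's FPKM values so that they sum to one million."""
    total = sum(fpkm_values)
    if total <= 0:
        raise ValueError("FPKM values must sum to a positive number")
    return [value / total * 1e6 for value in fpkm_values]
```

Because TPM values always sum to the same total within a sample, they are easier to compare across samples than FPKMs.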
Trinity documentation was reorganized, revised, and moved to the wiki format.
Files containing reads to assemble are now properly being fanned out across a number of directories and files, instead of inadvertently co-localizing them all in a single directory. Performance improvements should be observed in the context of large data sets.
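The idea of fanning files out across directories can be sketched as below. The directory and file names ("bucket_N", "reads_N.fa") and the files_per_dir value are hypothetical and do not describe Trinity's actual on-disk layout.

```python
from pathlib import PurePosixPath


def fanned_out_path(base_dir, file_index, files_per_dir=1000):
    """Map a file index to a path inside a numbered subdirectory.

    Consecutive blocks of files_per_dir files share one subdirectory,
    so no single directory grows without bound.
    """
    if file_index < 0 or files_per_dir < 1:
        raise ValueError("file_index must be >= 0 and files_per_dir >= 1")
    bucket = file_index // files_per_dir
    return PurePosixPath(base_dir) / f"bucket_{bucket}" / f"reads_{file_index}.fa"
```

Many filesystems slow down when a single directory holds a very large number of entries, which is why spreading files out tends to help on large data sets.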