Releases: GuyTeichman/RNAlysis

Stable release V3.12.0

15 May 12:03

3.12.0 (2024-05-14)

I'm happy to announce the latest release of RNAlysis, which brings a variety of new features and improvements.

One of the key additions in this version is expanded support for advanced differential expression analysis using DESeq2 and Limma-Voom.
You can now test continuous covariates, perform likelihood ratio tests for factors, interactions, and polynomials, and take advantage of sample quality weights in Limma-Voom analysis.
We've also added several new visualization options, such as a p-value histogram plot and more customization choices for gene expression plots.

This release also includes various usability enhancements and bug fixes to provide a smoother analysis experience. We've clarified error messages, improved parameter annotations, and resolved issues that could lead to crashes or incorrect results in certain scenarios. RNAlysis now integrates with the latest versions of the OrthoInspector and ShortStack APIs as well.

As always, your support and feedback are greatly appreciated.
If you have any questions, encounter issues, or have suggestions for future development, please don't hesitate to reach out.
Happy analysis!
Guy

Added

  • Added support for advanced differential expression analysis with DESeq2/Limma-Voom, including testing continuous covariates, as well as likelihood ratio tests for factors, interactions, and polynomials.
  • Added support for sample quality weights in Limma-Voom differential expression analysis.
  • Added support for the 'cooksCutoff' parameter in DESeq2 differential expression analysis.
  • Added support for user-provided scaling factors in DESeq2 differential expression analysis.
  • bowtie2 and kallisto paired-end modes now support a new file naming method, 'smart', which attempts to automatically determine the common name of each pair of paired-end fastq files.
  • Added a p-value histogram plot (DESeqFilter.pval_histogram) that displays the distribution of p-values in a differential expression table.
  • Added new visualization options for 'plot expression of specific genes' function (CountFilter.plot_expression).
  • Added new function 'Histogram of a column' (Filter.histogram) that generates a histogram of a column in a table.
  • Added new function 'Concatenate two tables' (Filter.concatenate) that concatenates two tables along their rows.
  • The filtering function 'Filter with a number filter' now supports filtering based on absolute values.
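
As a rough illustration of the 'smart' naming idea for paired-end files, the sketch below derives a shared name from the common prefix of two file names. It is not the RNAlysis implementation; the helper name `common_pair_name` and the trimming rules are assumptions made for this example.

```python
import os


def common_pair_name(file1: str, file2: str) -> str:
    """Hypothetical helper: guess a shared sample name for a pair of FASTQ files."""
    # Compare base names only, ignoring directories
    name1, name2 = os.path.basename(file1), os.path.basename(file2)
    # Longest shared leading substring (character-wise)
    prefix = os.path.commonprefix([name1, name2])
    # Drop trailing separators, then a dangling read marker such as "_R"
    prefix = prefix.rstrip("._-")
    if prefix.endswith(("_R", "_r", ".R", ".r", "-R", "-r")):
        prefix = prefix[:-2]
    # Fall back to the first file name if nothing is shared
    return prefix.rstrip("._-") or name1
```

With this sketch, a pair such as `sampleA_R1.fastq.gz` and `sampleA_R2.fastq.gz` would be given the shared name `sampleA`.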

Changed

  • When running differential expression analysis, RNAlysis will automatically ensure that the order of samples in your design matrix matches the order of samples in your count matrix, avoiding erroneous results.
  • The 'Filter genes with low expression in all columns' function (CountFilter.filter_low_reads) now supports the 'n_samples' parameter, allowing users to filter genes with a minimal expression threshold in a specific number of samples.
  • The 'Plot expression of specific genes' function (CountFilter.plot_expression) now supports the 'jitter', 'group_names' and 'log_scale' parameters, allowing users to further customize the plot.
  • The 'Scatter plot - sample VS sample' function (CountFilter.scatter_sample_vs_sample) now always displays highlighted points on top of the plot, making it easier to see which points are highlighted.
  • Kallisto quantification (kallisto_quantify_single_end and kallisto_quantify_paired_end) now supports the 'summation_method' parameter, allowing users to choose between 'raw' and 'scaled_tpm' transcript summation methods. The default behavior of the functions did not change (it corresponds to 'scaled_tpm').
  • Enrichment bar plots now have optional parameters that control font sizes for titles and labels.
  • Moved the enrichment analysis and enrichment graphs actions to the "Enrichment" menu in the graphical interface to make the actions easier to find.
  • Improved the clarity of error messages when attempting to read an invalid GTF file.
  • RNAlysis now supports the latest version of OrthoInspector API.
  • RNAlysis now supports the latest version of Ensembl Ortholog API.
  • Improved annotation for the 'metric' parameter of the 'Hierarchical clustergram plot' function (CountFilter.clustergram).
  • Improved performance of RNAlysis when generating automatic session reports.
  • RNAlysis now offers default values for differential expression tables' column names.
  • Functions that average replicates now display clearer group names by default.
  • The RNAlysis interface to ShortStack now uses the most recent API (replaced 'knownRNAs' with 'known_miRNAs').
  • When running differential expression, RNAlysis session reports will automatically include the auto-generated R script, as well as the sanitized design matrix used.
  • Added optional parameters to all differential expression functions, allowing users to return the paths to the auto-generated R script and the sanitized design matrix used.
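
To show why sample order matters in differential expression, here is a minimal sketch of reordering design-matrix rows to follow the count-matrix columns. The function name and the dict-based data layout are assumptions made for this example, not the RNAlysis API.

```python
def align_design_to_counts(count_samples, design_rows):
    """Hypothetical helper: reorder design-matrix rows to match count-matrix columns.

    count_samples: list of sample names, in count-matrix column order.
    design_rows: dict mapping sample name -> dict of covariates.
    """
    missing = [s for s in count_samples if s not in design_rows]
    extra = [s for s in design_rows if s not in count_samples]
    if missing or extra:
        # Refuse to guess when the two tables describe different samples
        raise ValueError(f"sample mismatch: missing={missing}, extra={extra}")
    # Rebuild the design matrix in count-matrix order
    return {sample: design_rows[sample] for sample in count_samples}
```

Raising an error on mismatched sample names, rather than silently dropping rows, is a design choice of this sketch; the release notes do not say how RNAlysis handles that case.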

Fixed

  • Fixed bug where some FASTQ/SAM functions could not be added to a FASTQ pipeline.
  • Fixed bug where bowtie2 could not be run in/from directories with spaces in their names.
  • Fixed bug where RNAlysis would crash when launched without an internet connection.
  • Fixed bug that caused ID-mapping functions to raise an error when called from the MacOS stand-alone app (thanks to Mitchzw in #34).
  • Fixed bug that caused R package installations (DESeq2, limma, etc) to fail on some computers (thanks to Celine-075 in #35).
  • Fixed bug where Limma-Voom differential expression would fit numerical covariates as categorical factors.
  • Fixed bug where generating enrichment bar plots with ylim='auto' would cause bars with 100% depletion (log2FC=-inf) to disappear.
  • Fixed bug where defining 10 or more sample groups in the graphical interface would cause the groups to be ordered incorrectly in graphs.
  • Fixed bug where the 'return_scaling_factors' argument would not return the normalization scaling factors on the graphical interface.
  • Fixed various visual issues in the graphical interface.
  • Fixed bug where the 'filter_by_row_sum' function would raise an error when the table contains non-numerical columns.
  • Fixed bug where running enrichment on an empty gene set would raise an error.
  • Fixed bug where RNAlysis would suggest resuming an auto-report from loaded session even when auto-report is turned off.
  • Fixed bug where disabling auto-report in the middle of the session would raise errors when trying to create new graphs.
  • Fixed bug where generating multiple gene expression plots (split_plots=True) with auto-generated report would only add the last graph to the session report.
  • Fixed bug where the function 'normalize_to_quantile' generated unclear table names.
  • Fixed bug where sometimes the first operation performed in a session would not display correctly in the automatic session report.
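
The enrichment bar plot fix above concerns bars with log2FC=-inf under ylim='auto'. One possible way to keep such bars visible is to derive the axis limits from the finite values only and clip infinite values to the axis edge. The sketch below illustrates that idea; the helper name and the `padding` parameter are hypothetical, and this is not necessarily how RNAlysis solves it.

```python
import math


def auto_ylim_and_heights(log2_fold_changes, padding=1.0):
    """Hypothetical helper: pick finite y-limits and drawable bar heights."""
    finite = [v for v in log2_fold_changes if math.isfinite(v)]
    # Fall back to a default bound when no finite value exists
    bound = max((abs(v) for v in finite), default=1.0) + padding
    ylim = (-bound, bound)
    # Clip infinite values to the axis edge so the bar is still drawn
    heights = [max(-bound, min(bound, v)) for v in log2_fold_changes]
    return ylim, heights
```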

New Contributors

  • [Mitchzw] in [#34]
  • [Celine-075] in [#35]

Stable release V3.11

05 Jan 16:47

3.11.0 (2024-01-05)

This release brings several exciting new features.
Notably, these include the ability to run functions from the Picardtools suite, and the ability to automatically save interactive session reports and later resume them from any saved session.
In addition, this release includes several visual upgrades, bug fixes, and quality-of-life improvements.
Happy analysis!

Added

  • RNAlysis can now run functions from the Picardtools suite, including conversion functions (BAM to SAM, SAM to FASTQ, FASTQ to SAM, etc), quality control (validate SAM), and post-processing functions (remove PCR duplicates, sort SAM, create BAM index).
  • RNAlysis interactive session reports can now be resumed from any saved session, instead of having to start a new report from scratch. When loading a session created by RNAlysis 3.11 and beyond, you will have the option to resume the interactive report from the last saved state.

Changed

  • CutAdapt adapter trimming functions can now receive an optional "new_filenames" parameter, which allows users to specify the names of the output files.
  • Hierarchical clustergram plot (CountFilter.clustergram) now supports the 'colormap' parameter, which allows users to specify a custom color map for the plot.
  • Hierarchical clustergram plot (CountFilter.clustergram) now displays continuous values on the color bar, instead of discrete values.
  • Generally improved the looks of Hierarchical clustergram plot (CountFilter.clustergram).
  • Previously-added functions (such as ortholog/paralog mapping, biotype summary by GTF file, etc) can now be applied to gene sets. Previously, some of these functions could only be applied to data tables.

Fixed

  • Function "Summarize feature biotypes (based on a reference table)" (biotypes_from_ref_table) now treats rows with missing values as "_missing_from_biotype_reference" instead of ignoring them entirely.
  • Fixed bug where the Ensembl paralog-finding function would appear under the wrong tab in the graphical interface.
  • Fixed bug where the description of the MA Plot function and parameters would not display correctly in the graphical interface.
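
The change to biotypes_from_ref_table can be pictured with a small sketch: genes that have no biotype in the reference are counted under a dedicated label instead of being dropped. The function below is a hypothetical stand-in written for this example (a plain dict plays the role of the reference table); only the label string comes from the release notes.

```python
from collections import Counter

MISSING_LABEL = "_missing_from_biotype_reference"


def summarize_biotypes(gene_ids, biotype_reference):
    """Hypothetical helper: count biotypes, labelling genes absent from the reference."""
    counts = Counter()
    for gene in gene_ids:
        biotype = biotype_reference.get(gene)
        # Genes with no entry (or an empty entry) are counted, not dropped
        counts[biotype if biotype else MISSING_LABEL] += 1
    return dict(counts)
```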

Stable release V3.10.1

22 Nov 22:12

3.10.1 (2023-11-22)

Version 3.10.1 introduces several bug fixes, as well as support for random effect analysis in Limma-Voom differential expression.

Added

  • Limma-Voom differential expression can now fit mixed linear models containing a random effect (e.g. nested design).

Fixed

  • Fixed bug where trying to load sessions created with RNAlysis version

Stable release V3.10.0

31 Oct 18:48

3.10.0 (2023-10-31)

I'm thrilled to introduce RNAlysis version 3.10.0.
This version includes features that were requested by users for a while, alongside quality-of-life improvements and bug fixes.
Here is a brief highlight of the most important additions:

Ortholog Mapping: RNAlysis can now map genes to their closest orthologs in different organisms.
You can map genes to their orthologs using four different databases - Ensembl, Panther, PhylomeDB, and OrthoInspector - extracting both one-to-one and one-to-many ortholog relationships and filtering them based on their reliability.

Discovering Paralogs: In the same vein, RNAlysis now facilitates the discovery of paralogs within a specific organism, using either the Ensembl or Panther databases.

New visualization and analysis options for Principal Component Analysis (PCA): We've introduced new functions and parameters to allow you to get more out of your principal component analysis.

I would also like to extend my personal apology for the delay in bringing you this update.
Due to personal reasons, this release, originally scheduled for the end of August, took longer than expected.
Your patience and support have been invaluable, and I'm eager to share these exciting additions with you.
Thank you for being a part of the RNAlysis community, and stay tuned for more updates in the near future!

Added

  • Added new functions to the filtering module that map genes to their closest orthologs in a different organism, using four different databases: Ensembl, Panther, PhylomeDB, and OrthoInspector.
  • Added new functions to the filtering module that find paralogs of genes in a given organism, using two different databases: Ensembl and Panther.
  • Added new function 'Sort table by contribution to a Principal Component (PCA)' (CountFilter.sort_by_principal_component), which allows sorting of genes in a count matrix by their contribution (gene loadings) to a principal component.
  • Added a new parameter called 'legend' to 'Principal Component Analysis (PCA) plot' (CountFilter.pca), which allows users to display a legend on the PCA plot with a name for each sample group/color.

Changed

  • RNAlysis now issues a warning when users run PCA or PCA-based functions on an unnormalized count matrix.
  • The 'seek_fusion_genes' and 'learn_bias' arguments for kallisto quantification (fastq.kallisto_quant_single_end and fastq.kallisto_quant_paired_end), which were deprecated in kallisto versions >0.48, are no longer displayed on the graphical interface. Old Pipelines that contain these arguments will still run, but new Pipelines will not contain them.
  • Long-running functions now run in the background even when the 'inplace' parameter is set to True, instead of freezing the entire graphical interface.

Fixed

  • Fixed bug where functions would sometimes fail to run without displaying an error message.
  • Fixed bug where progress bars in the graphical interface would sometimes not disappear after reaching 100% completion.
  • RNAlysis should no longer display warning messages about graph layout when graphs are scaled down.
  • Fixed bug where the clustergram function (CountFilter.clustergram) would raise an error with specific sets of dependency versions.
  • Loading tables no longer raises a deprecation warning when using newer versions of Pandas.

Stable release V3.9.2

23 Jun 15:44

3.9.2 (2023-06-23)

This patch contains bug fixes and improved functionality for enrichment lollipop plots,
as well as bug fixes for issues with the stand-alone version.

Changed

  • Single-set enrichment result tables now contain observed/expected values based on the XL-mHG test cutoff.
  • When loading a differential expression design matrix, RNAlysis now issues an error if the design matrix column names contain invalid characters.
  • Updated the scaling of enrichment lollipop plots to make small 'observed' values easier to discern.

Fixed

  • Fixed bug where an error message would sometimes appear after RNAlysis finishes generating an automatic session report on the stand-alone app.
  • Fixed bug where enrichment lollipop plots in horizontal mode would display the observed/expected values in reverse order.
  • Fixed bug where enrichment lollipop plots and the 'show_exp' parameter would not work on single-set enrichment data.
  • Fixed bug where sometimes the tab label of clustering/differential expression output tables would not match the name of the generated table.

Stable release V3.9.1

19 Jun 13:21

3.9.1 (2023-06-19)

Version 3.9.1 of RNAlysis introduces several improvements and fixes to further improve your analysis experience.
The release includes new optional parameters for single-set enrichment functions, compatibility improvements with newer Python versions,
improved error messaging for R scripts, and addresses minor issues related to enrichment analysis, documentation, plotting parameters, and Pipeline saving.

Added

  • Added new optional parameters to single-set enrichment functions, allowing users to determine the top and bottom cutoffs for the XL-mHG test ("X" and "L").
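
For readers unfamiliar with the "X" and "L" parameters, the sketch below computes an XL-mHG-style test statistic on a ranked list of hits, under the commonly described reading that X is the minimum number of hits required above a cutoff and L is the lowest cutoff considered. This is an illustrative assumption, not the code RNAlysis runs, and it computes only the statistic, not the corrected p-value. The function names are hypothetical.

```python
from math import comb


def hypergeom_tail(N, K, n, k):
    """P(at least k hits among the top n of N ranked genes, K of which are hits)."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)


def xlmhg_statistic(ranked_hits, X=1, L=None):
    """Hypothetical sketch of the XL-mHG test statistic (not the p-value)."""
    N, K = len(ranked_hits), sum(ranked_hits)
    L = N if L is None else L
    best, k = 1.0, 0
    for n in range(1, min(L, N) + 1):
        k += ranked_hits[n - 1]
        # The minimum can only occur right after a hit; require at least X hits
        if ranked_hits[n - 1] and k >= X:
            best = min(best, hypergeom_tail(N, K, n, k))
    return best
```

Here `ranked_hits` is a list of 0/1 flags ordered by gene rank, where 1 marks a gene carrying the annotation being tested. Raising X or lowering L restricts which cutoffs are considered, so the statistic can only stay the same or get larger.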

Changed

  • RNAlysis single-set enrichment analysis using the XL-mHG test now supports Python versions >= 3.8.
  • RNAlysis stand-alone app is now built on Python 3.11, improving overall performance.
  • Error messages caused by running R tools such as DESeq2, Limma-Voom, and FeatureCounts will now clearly state the reason the script failed, making it easier to understand what went wrong.

Fixed

  • Fixed bug where enrichment analysis would raise an error when running enrichment analysis on a gene set with no relevant annotations, or a gene set that does not intersect at all with the background gene set.
  • Added missing documentation for plotting parameters in some enrichment functions.
  • Deprecation Warning should no longer appear when generating a box-plot or enhanced box-plot with scatter=True (CountFilter.box_plot, CountFilter.enhanced_box_plot).
  • Fixed bug in featureCounts single-end mode where the 'output_folder' parameter could appear as disabled.
  • Fixed bug where RNAlysis would raise an error message after saving a Pipeline, even when the Pipeline was saved successfully.

Stable release V3.9.0

09 Jun 17:31

3.9.0 (2023-06-09)

Version 3.9.0 of RNAlysis introduces several enhancements and fixes to improve your experience.
The release includes additional enrichment plot styles, a new option for PCA plots,
the ability to load and save data tables in Parquet format, and new actions in the Help menu for reporting issues and suggesting improvements.
The update also improves the performance of various functions, ensures consistency in font and theme settings,
and addresses multiple bug fixes, including issues with automatic session reports and visualization functions.

Added

  • Added additional parameters to enrichment bar plots (enrichment.enrichment_bar_plot), including a new plot style ('lollipop') and observed/expected labels on the graph.
  • Added a new parameter to Principal Component Analysis plots (CountFilter.pca) 'plot_grid', which can enable or disable adding a grid to PCA plots.
  • RNAlysis can now load and save data tables in Parquet format (.parquet)
  • Added new actions to the Help menu, allowing users to report issues, suggest improvements, or open discussions.

Changed

  • Functions in the FASTQ model are now added to automatic session reports.
  • Many of the functions in RNAlysis should now run faster.
  • Font type, size, and color for help tooltips should now match the global font settings.
  • True/False toggle switches now scale with font size.
  • When loading data tables into RNAlysis, you will now see only supported file formats by default.
  • Clustering PCA plots are now plotted in proportion to the % variance explained by each PC.
  • The legend in clustering PCA plots is now draggable.

Fixed

  • Fixed bug where data tables generated through the FASTQ model would not display properly in automatic session reports.
  • Fixed bug where graphs generated through the Visualize Gene Sets window would not be added to automatic analysis report.
  • When saving a file through the graphical interface, automatically-suggested filenames no longer contain illegal characters.
  • Improved clarity of error message when R installation folder is not found.
  • Fixed bug where some input parameter widgets in the RNAlysis graphical interface would not display properly.
  • RNAlysis now provides a clearer warning message when attempting to run HDBSCAN clustering, if the hdbscan package is not installed.
  • Label text in PCA plots and hierarchical clustergrams should no longer be cropped outside of the visible region of the plot.
  • Fixed bug where some visualization functions, such as pair-plot (CountFilter.pairplot) would not display properly due to version mismatches between pandas and seaborn.
  • Improved clarity of error messages when external apps' (kallisto, bowtie2, etc) installation folders are not found.
  • Fixed bug where running the RNAlysis graphical interface on a new computer would sometimes raise an error (thanks to NeuroRookie in #25).
  • Fixed a bug where the 'min_samples' parameter in HDBSCAN clustering could not be disabled.
  • Fixed a bug where applying a function to a gene set with inplace=False would cause the new gene set to be called 'New Table'.
  • Fixed a bug where RNAlysis would display the message "Pipeline saved successfully", even when the user cancels the save operation.

New Contributors

  • [NeuroRookie] in [#25]

Stable release V3.8.0

06 May 20:10

3.8.0 (2023-05-07)

Version 3.8.0 of RNAlysis comes with several exciting new features, including the ability to generate interactive analysis reports automatically.
This feature allows users to create an interactive graph of all the datasets they loaded into RNAlysis, the functions applied, and the outputs generated during the analysis.
You can read more about this feature in the Tutorial chapter and the User Guide chapter.

The new release also includes Pipelines for FASTQ functions, the ability to export normalization scaling factors, and other changes to improve the software's performance.
RNAlysis now supports Python 3.11, and many functions should now run faster. The software's graphic interface has also improved significantly, and users will now see a clearer error message when attempting to load unsupported file formats.
Lastly, the release also fixes several bugs.

Please note that, since Python 3.7 will be reaching end of life as of June 2023, new versions of RNAlysis will no longer support it.

Added

  • You can now generate interactive analysis reports automatically using the RNAlysis graphical interface. Read more about this feature here.
  • Added Pipelines for the FASTQ module (SingleEndPipeline and PairedEndPipeline), allowing you to apply a series of functions (adapter trimming, alignment, quantification, etc) to a batch of sequence/alignment files.
  • Added new parameter 'return_scaling_factors' to normalization functions, that allows you to access the scaling factors calculated by RNAlysis.
  • Added new parameter 'gzip_output' to CutAdapt adapter trimming (fastq.trim_adapters_single_end and fastq.trim_adapters_paired_end), allowing users to determine whether or not the output files will be gzipped.

Changed

  • RNAlysis now supports Python 3.11, and no longer supports Python 3.7.
  • Many of the functions in RNAlysis should now run faster.
  • The RNAlysis graphic interface should now boot up significantly faster.
  • RNAlysis now shows an easier to understand error message when users attempt to load a table in an unsupported format (e.g. Excel files).
  • CutAdapt adapter trimming (fastq.trim_adapters_single_end and fastq.trim_adapters_paired_end) now outputs non-gzipped files by default.
  • Standardized all plotting functions in the filtering module to return a matplotlib Figure object, which can be further modified by users.

Fixed

  • RNAlysis failing to map gene IDs during GO enrichment analysis should no longer raise an error (thanks to clockgene in #16).
  • Fixed bug where the Command History window would not display history of the current tab immediately after clearing the current session.
  • Fixed bug where adapter trimming would fail to run when using CutAdapt version >= 4.4.0.
  • Fixed bug where 'Filter rows with duplicate names/IDs' (Filter.filter_duplicate_ids) would raise an error when applied to some tables.

New Contributors

  • [clockgene] in [#16]

Stable release V3.7.0

07 Apr 11:01

3.7.0 (2023-04-07)

This version introduces small RNA read alignment using ShortStack, new filtering functions, a new optional parameter for Principal Component Analysis, improvements to the graphical interface, and bug fixes.

Added

  • Added small RNA read alignment using ShortStack (fastq.shortstack_align_smallrna).
  • Added new filtering function 'Filter specific rows by name' (Filter.filter_by_row_name).
  • Added new filtering function 'Filter rows with duplicate names/IDs' (Filter.filter_duplicate_ids).
  • Function parameters in pop-up windows in the graphical interface can now be imported and exported.
  • Added new parameter to Principal Component Analysis (CountFilter.pca) 'proportional_axes', that allows you to make the PCA projection axes proportional to the percentage of variance each PC explains.
  • Improved clarity of error messages in the graphical interface.

Changed

  • Tables loaded into RNAlysis that use integer-indices will now be converted to use string-indices.
  • Refactored CountFilter.from_folder into CountFilter.from_folder_htseqcount, and added a new CountFilter.from_folder method that accepts a folder of count files in any format.
  • In the RNAlysis graphical interface, optional parameters that can be disabled will now display the hint "disable this parameter?" instead of "disable input?".
  • Added optional parameter 'ylim' to 'create enrichment bar-plot' function (enrichment.enrichment_bar_plot), allowing users to determine the Y-axis limits of the bar plot.
  • Updated function signatures of 'Visualize gene ontology' and 'Visualize KEGG pathway' (enrichment.gene_ontology_graph and enrichment.kegg_pathway_graph) to make more sense.
  • Removed parameter 'report_read_assignment_path' from featureCounts feature counting (fastq.featurecounts_single_end and fastq.featurecounts_paired_end).
  • The RNAlysis graphical interface should now load more quickly.
  • Progress bars in the graphical interface should now reflect elapsed/remaining time for tasks more accurately.

Fixed

  • Fixed bug in the function 'Split into Highly and Lowly expressed genes' (Filter.split_by_reads) where the two resulting tables would be named incorrectly (highly-expressed genes would be labeled 'belowXreads' and vice-versa).
  • Fixed bug where the 'column' parameter of some functions ('Filter by percentile', 'Split by percentile', 'Filter with a number filter', 'Filter with a text filter') would not automatically detect column names in the graphical interface.
  • Fixed bug where the 'numerator' and 'denominator' parameters of the function 'Calculate fold change' would not automatically detect column names in the graphical interface.
  • Fixed bug where tables with integer-indices could not be visualized properly through the graphical interface.
  • Fixed bug in the function 'featureCounts single-end' (fastq.featurecounts_single_end) where setting parameter 'report_read_assignment' to any value other than None would raise an error.
  • Functions that take column name/names as input (transform, filter_missing_values, filter_percentile, split_percentile) can now be applied to fold change tables (FoldChangeFilter objects).
  • Fixed bug where the description for the 'n_bars' parameter of the 'create enrichment bar-plot' function (enrichment.enrichment_bar_plot) would not display properly.
  • 'Visualize gene ontology' and 'Visualize KEGG pathway' (enrichment.gene_ontology_graph and enrichment.kegg_pathway_graph) now have proper parameter descriptions.
  • Fixed bug where in-place intersection and difference in the filtering module would fail when using recent versions of pandas.
  • Fixed bug where graphs generated through the Visualize Gene Sets window would not immediately display when using the RNAlysis stand-alone app.
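
The naming fix for 'Split into Highly and Lowly expressed genes' can be illustrated with a small sketch that ties each output name to the side of the threshold it holds. It assumes the split is made on each gene's highest read count across samples, and the 'aboveXreads' label is inferred by analogy with 'belowXreads'; both are assumptions of this example, not confirmed RNAlysis behavior.

```python
def split_by_reads(counts, threshold=5):
    """Hypothetical sketch: split genes by their highest read count in any sample."""
    high, low = {}, {}
    for gene, reads in counts.items():
        # A gene is 'high' if at least one sample reaches the threshold
        (high if max(reads) >= threshold else low)[gene] = reads
    # Name each table after the side of the threshold it actually holds
    return {f"above{threshold}reads": high, f"below{threshold}reads": low}
```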

Stable release V3.6.2

25 Mar 17:37

3.6.2 (2023-03-25)

This version introduces quality-of-life changes to the graphical interface, as well as bug fixes.

Added

  • Sample groupings in functions such as PCA, Pair-plot, etc., can now be imported and exported through the graphical interface.

Fixed

  • Fixed bug where the stand-alone Mac version of RNAlysis would sometimes fail to map gene IDs (directly or in enrichment analysis).