ISAnalytics

ISAnalytics is an R package for analyzing gene therapy vector insertion site data, identified from genomic next-generation sequencing reads, for clonal tracking studies.

In gene therapy, stem cells are modified using viral vectors to deliver the therapeutic transgene and replace functional properties; the genetic modification is stable and inherited by all cell progeny. Retrieving and mapping the sequences flanking the virus-host DNA junctions allows the identification of insertion sites (IS), which are essential for monitoring the evolution of genetically modified cells in vivo. A comprehensive toolkit for IS analysis is required to foster clonal tracking studies and to support the assessment of safety and long-term efficacy in vivo. This package aims to (1) support the automation of IS workflows, (2) perform basic and advanced analyses for IS tracking (clonal abundance, clonal expansions, statistics for insertional mutagenesis, etc.), and (3) provide basic biological insights into transduced stem cells in vivo.

The paper is available at https://academic.oup.com/bib/article/24/1/bbac551/6955274?login=false

Visit the package website

You can visit the package website to view documentation, vignettes and more.

Installation and options

ISAnalytics can be installed in two ways:

  • You can install it via Bioconductor
  • You can install it via GitHub using the package devtools

Two versions of the package are always active:

  • RELEASE is the latest stable version
  • DEVEL is the development version, the most up-to-date one, where all new features are introduced

Installation from Bioconductor

RELEASE version:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("ISAnalytics")

DEVEL version:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

# The following initializes usage of Bioc devel
BiocManager::install(version='devel')

BiocManager::install("ISAnalytics")

Installation from GitHub

RELEASE:

# Check for devtools without attaching it; install it if missing
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("calabrialab/ISAnalytics",
                         ref = "RELEASE_3_17",
                         dependencies = TRUE,
                         build_vignettes = TRUE)

DEVEL:

# Check for devtools without attaching it; install it if missing
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("calabrialab/ISAnalytics",
                         ref = "devel",
                         dependencies = TRUE,
                         build_vignettes = TRUE)
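
After installing, it can be useful to confirm which version was actually picked up. The following is a minimal sketch that relies only on base R utilities; it assumes nothing about ISAnalytics beyond the package name.

```r
# Check whether ISAnalytics is installed and, if so, report its version.
# Only base R / utils functions are used; nothing here is package-specific API.
pkg <- "ISAnalytics"
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  message(pkg, " version: ", as.character(utils::packageVersion(pkg)))
} else {
  message(pkg, " is not installed in the current library paths")
}
```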

Setting options

ISAnalytics has a verbose option that allows some functions to print additional information to the console while they’re executing. To disable or enable this feature:

# DISABLE
options("ISAnalytics.verbose" = FALSE)

# ENABLE
options("ISAnalytics.verbose" = TRUE)

Some functions also produce reports in a user-friendly HTML format. To disable or enable this feature:

# DISABLE HTML REPORTS
options("ISAnalytics.reports" = FALSE)

# ENABLE HTML REPORTS
options("ISAnalytics.reports" = TRUE)
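
Because these are ordinary R options, they can also be changed temporarily and restored afterwards. The sketch below uses only base R (`options()` and `getOption()`); the fallback value `TRUE` passed to `getOption()` is an assumption made for this example, not a documented package default.

```r
# Temporarily switch both options off, remembering the previous state.
old <- options(ISAnalytics.verbose = FALSE, ISAnalytics.reports = FALSE)

# Read the current values; `default` is only used when the option is unset.
# TRUE as a fallback is an assumption made for this example.
verbose_now <- getOption("ISAnalytics.verbose", default = TRUE)
reports_now <- getOption("ISAnalytics.reports", default = TRUE)

# ... run code that should stay quiet and skip reports here ...

# Restore whatever was set before (including the unset state).
options(old)
```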

NEWS

ISAnalytics 1.11.2 (2023-07-26)

FIXES

  • Fixed issues in HTML report for outlier filtering - reported incorrect numbers due to missing conversion to percentage
  • Fixed warnings for bslib::nav deprecation
  • Fixed minor issue in default_af_transform(), transformation failed if NAs were present in the columns

ISAnalytics 1.11.1 (2023-05-09)

FIXES

  • Fixed broken tests with new updates in underlying packages

ISAnalytics 1.9.3 (2023-04-04)

ENHANCEMENTS AND REFACTORING

  • The package no longer depends on magrittr
  • All functionality associated with data.table is now completely optional and will be used internally only if the package is available
  • Several packages were moved from Imports to Suggests - functions will notify when additional packages are required for the specific functionality
  • All known deprecated or superseded functions from other packages have been removed or substituted

FIXES AND GENERAL UPDATES

  • Added a new tag “barcode_mux” in available_tags()
  • The function HSC_population_size_estimate() now better supports the computation of estimates from different groups of cell types and tissues at the same time. The tabular output now contains an additional column “Timepoints_included” that specifies how many time points the estimate contains
  • Function is_sharing() can now better handle edge cases and can optionally be parallelised, provided appropriate packages are available (better performance)

DEPRECATIONS & BREAKING CHANGES

  • Functions import_parallel_Vispa2Matrices_auto() and import_parallel_Vispa2Matrices_interactive() are officially defunct and will not be exported anymore starting from the next release cycle
  • The argument mode of import_parallel_Vispa2Matrices() no longer accepts INTERACTIVE as a valid option and the interactive mode is now considered defunct, since its use is restrictive and limited
  • The argument association_file of import_parallel_Vispa2Matrices() no longer accepts a string representing a path. Association file import is delegated solely to its dedicated function from now on.
  • The function threshold_filter() is deprecated, since it is more complicated to use than standard filtering with dplyr or similar tools
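
As an illustration of the standard filtering mentioned for threshold_filter(), here is a minimal sketch with dplyr on a toy data frame. The column names, values and the threshold of 20 are hypothetical and chosen only for the example; they are not taken from the package documentation.

```r
# Toy integration matrix; column names and values are illustrative only.
toy_matrix <- data.frame(
  chr = c("1", "2", "3", "X"),
  integration_locus = c(1000L, 2500L, 4200L, 9100L),
  strand = c("+", "-", "+", "-"),
  seqCount = c(5, 50, 500, 12),
  stringsAsFactors = FALSE
)

# Keep rows whose sequence count reaches a (hypothetical) threshold of 20.
filtered <- dplyr::filter(toy_matrix, seqCount >= 20)

# Several conditions can be combined in one call (logical AND).
filtered_plus <- dplyr::filter(toy_matrix, seqCount >= 20, strand == "+")
```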

MINOR CHANGES

  • default_af_transform() now pads time points based on the maximum number of characters + 1 in the column
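
The padding rule can be sketched in base R as follows. This only illustrates "maximum number of characters + 1" with zero padding on made-up time points; it is not the package's implementation, and the choice of "0" as the padding character is an assumption.

```r
# Hypothetical time points (e.g. days after treatment).
time_points <- c(30L, 180L, 1080L)

# Target width: longest value in the column plus one character.
width <- max(nchar(as.character(time_points))) + 1L

# Left-pad with zeros up to the target width.
padded <- sprintf(paste0("%0", width, "d"), time_points)
padded
# "00030" "00180" "01080"
```

With a common width, sorting the padded strings alphabetically gives the same order as sorting the original numbers.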

ISAnalytics 1.9.2 (2023-01-26)

FIXES AND GENERAL UPDATES

  • Fixed an issue with ifelse function in top_abund_tableGrob() - now the function has a new argument transform_by which is useful for controlling ordering of columns
  • Updated CITATION file
  • Package DT has been moved (likely temporarily) to Imports - linked to issue #2
  • Fixed other typos and minor issues

ISAnalytics 1.9.1 (2022-12-01)

  • Fixed all tidyselect warnings (internal use of .data$ in selection context)
  • Added Bonferroni correction in gene_frequency_fisher
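
For readers unfamiliar with the correction, the general idea can be sketched with base R's `fisher.test()` and `p.adjust()`. The contingency tables below are invented for illustration, and the sketch does not reproduce how gene_frequency_fisher builds its tables internally.

```r
# Hypothetical 2x2 contingency tables, one per gene:
# rows = IS in gene / IS elsewhere, columns = group 1 / group 2.
tables <- list(
  GENE_A = matrix(c(12, 88, 3, 97), nrow = 2),
  GENE_B = matrix(c(5, 95, 6, 94), nrow = 2),
  GENE_C = matrix(c(20, 80, 2, 98), nrow = 2)
)

# One Fisher's exact test per gene.
p_values <- vapply(
  tables,
  function(m) stats::fisher.test(m)$p.value,
  numeric(1)
)

# Bonferroni: multiply each p-value by the number of tests, capped at 1.
p_bonferroni <- stats::p.adjust(p_values, method = "bonferroni")
```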

ISAnalytics 1.7.7 (2022-10-26)

  • Added new section in workflow vignette + sample files

ISAnalytics 1.7.6 (2022-10-25)

  • Fixed minor bugs and typos

ISAnalytics 1.7.5 (2022-10-05)

  • Fixed build issues

ISAnalytics 1.7.4 (2022-10-04)

VISIBLE USER CHANGES

  • Progress bars for long processing functions are now implemented via the package progressr; added a wrapper function, enable_progress_bars(), for quickly enabling progress bars
  • Introduced logging for issues in HSC_population_size_estimate() - signals possible problems in computing estimates and why

BUG FIXES AND MINOR CHANGES

  • Fixed minor bugs and typos

ISAnalytics 1.7.3 (2022-06-17)

BUG FIXES AND MINOR CHANGES

  • All functions that check for options now have a default value if option is not set
  • CIS_grubbs function is now faster (removed dependency on psych::describe)

NEW

  • New functions CIS_grubbs_overtime() and associated plotting function top_cis_overtime_heatmap() to compute CIS_grubbs test over time

ISAnalytics 1.7.2 (2022-05-23)

BUG FIXES AND MINOR CHANGES

  • Fixed minor issues in import_association_file() - function had minor issues when importing *.xlsx files and missing optional columns threw errors
  • Fixed bug in as_sparse_matrix() - function failed when trying to process an aggregated matrix

NEW

  • Added 2 new utility functions export_ISA_settings() and import_ISA_settings() that allow a faster workflow setup

ISAnalytics 1.7.1 (2022-05-04)

BUG FIXES AND MINOR CHANGES

  • Fixed minor issue in compute_near_integrations() - function errored when report_path argument was set to NULL
  • Fixed dplyr warning in integration_alluvial_plot() internals
  • Fixed issue with report of VISPA2 stats - report failed due to minor error in rmd fragment
  • remove_collisions() again uses dplyr internally for joining and grouping operations - needed because of performance issues with data.table
  • fisher_scatterplot() has 2 new arguments that allow the disabling of highlighting for some genes even if their p-value is under the threshold

ISAnalytics 1.5.4 (2022-04-20)

MAJOR CHANGES

  • ISAnalytics now has a new “dynamic vars system” to allow more flexibility on user inputs; see the dedicated vignette with vignette("workflow_start", package="ISAnalytics")
  • All package functions were reviewed to work properly with this system

NEW FEATURES

  • gene_frequency_fisher() is a new function of the analysis family that allows the computation of Fisher’s exact test p-values on gene frequency - fisher_scatterplot() is the associated plotting function
  • top_targeted_genes() is a new function of the analysis family that produces the top n targeted genes based on the number of IS
  • NGSdataExplorer() is a newly implemented Shiny interface that allows the exploration and plotting of data
  • Zipped examples were removed from the package to limit its size. To compensate, the new function generate_default_folder_structure() generates the standard folder structure with package-included data on demand
  • transform_columns() is a new utility function, also used internally by other exported functions, that allows arbitrary transformations on data frame columns

MINOR CHANGES

  • remove_collisions() now has a dedicated parameter to specify how independent samples are identified
  • compute_near_integrations() now has a parameter called additional_agg_lambda to allow aggregation of additional columns
  • CIS_grubbs() now signals if there are missing genes in the refgenes table and optionally returns them as a data frame
  • outlier_filter() is now able to take multiple tests as input and combine them with a given logic. It now also produces an HTML report.
  • Several functions now use data.table under the hood
  • Color of the strata containing IS below threshold can now be set in integration_alluvial_plot()

BUG FIXES

  • Fixed a minor bug in import_Vispa2_stats() - function failed when passing report_path = NULL
  • Fixed minor issue in circos_genomic_density() when trying to use a pdf device

DEPRECATED FUNCTIONS

  • unzip_file_system() was made defunct in favor of generate_default_folder_structure()
  • cumulative_count_union() was deprecated and its functionality was moved to cumulative_is()

ISAnalytics 1.5.3 (2022-01-13)

MINOR CHANGES

  • Added arguments fragmentEstimate_column and fragmentEstimate_threshold in HSC_population_size_estimate(). Slightly revised filtering logic.
  • Updated package logo and website

ISAnalytics 1.5.2 (2021-12-14)

NEW (MINOR)

  • Added function to check for annotation problems in IS matrices

MINOR CHANGES

  • Added argument max_workers in function remove_collisions()
  • Updated default functions for aggregate_metadata()
  • Added annotation issues section in import matrices report

FIXES

  • Fixed minor issue in internals for file system alignment checks
  • Fixed minor issue in internal call to import_Vispa2_stats() from import_association_file()
  • Added safe computation of sharing in remove_collisions(): if process fails function doesn’t stop

ISAnalytics 1.5.1 (2021-10-28)

FIXES

  • Attempt to fix issues with parallel computation on Windows for some plotting functions

ISAnalytics 1.3.9 (2021-10-25)

FIXES

  • Fixed issues with functions that make use of BiocParallel, which sometimes failed on Windows

ISAnalytics 1.3.7 (2021-10-20)

NEW

  • Added new feature iss_source()

FIXES

  • Fixed minor issues in data files refGenes_mm9 and function compute_near_integrations()

ISAnalytics 1.3.6 (2021-10-05)

NEW

  • Added new feature purity_filter()

FIXES

  • Fixed small issue in printing information in reports

ISAnalytics 1.3.5 (2021-09-21)

MAJOR CHANGES

  • Reworked is_sharing() function, detailed usage in vignette vignette("sharing_analyses", package = "ISAnalytics")

NEW

  • New function cumulative_is()
  • New function for plotting sharing as venn/euler diagrams sharing_venn()

ISAnalytics 1.3.4 (2021-08-03)

FIXES/MINOR UPDATES

  • Fixed issue in tests that led to a broken build
  • Slightly modified included data set for better examples

ISAnalytics 1.3.3 (2021-07-30)

MAJOR CHANGES

  • Completely reworked interactive HTML report system; for details, see the new vignette vignette("report_system", package = "ISAnalytics")
  • Old ISAnalytics.widgets option has been replaced by ISAnalytics.reports
  • In remove_collisions(), removed arguments seq_count_col, max_rows_reports and save_widget_path, added arguments quant_cols and report_path (see documentation for details)

MINOR CHANGES

  • import_single_Vispa2Matrix() now allows keeping additional non-standard columns
  • compute_near_integrations() is now faster on bigger data sets
  • Changed default values for arguments columns and key in compute_abundance()
  • compute_near_integrations() now produces only re-calibration map in *.tsv format
  • CIS_grubbs() now supports calculations for each group specified in argument by
  • sample_statistics() now has the option to include the calculation of distinct integration sites for each group (if mandatory vars are present)

NEW FUNCTIONALITY

  • Added new plotting function circos_genomic_density()

FIXES

  • Fixed minor issue with NA values in alluvial plots

DEPRECATIONS

  • import_parallel_Vispa2Matrices_interactive() and import_parallel_Vispa2Matrices_auto() are officially deprecated in favor of import_parallel_Vispa2Matrices()

OTHER

  • The package has now a more complete and functional example data set for executable examples
  • Reworked documentation

ISAnalytics 1.3.2 (2021-06-28)

FIXES

  • Corrected issues in man pages

ISAnalytics 1.3.1 (2021-06-24)

NEW FUNCTIONALITY

  • is_sharing computes the sharing of IS between groups
  • sharing_heatmap allows visualization of sharing data through heatmaps
  • integration_alluvial_plot allows visualization of integration sites distribution in groups over time.
  • top_abund_tableGrob can be used in combination with the previous function or by itself to obtain a summary of top abundant integrations as an R graphic (tableGrob) object that can be combined with plots.
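
To make the notion of sharing concrete, here is a minimal base R sketch that counts the integration sites two groups have in common. It assumes an IS is identified by chromosome, locus and strand; the data, column names and helper `is_key()` are hypothetical, and this is not how is_sharing is implemented.

```r
# Two hypothetical groups of integration sites.
group_a <- data.frame(
  chr = c("1", "2", "3"),
  integration_locus = c(100L, 200L, 300L),
  strand = c("+", "-", "+"),
  stringsAsFactors = FALSE
)
group_b <- data.frame(
  chr = c("2", "3", "4"),
  integration_locus = c(200L, 300L, 400L),
  strand = c("-", "+", "+"),
  stringsAsFactors = FALSE
)

# Helper (hypothetical): build one key per IS from its coordinates.
is_key <- function(df) {
  paste(df$chr, df$integration_locus, df$strand, sep = "_")
}

# Shared IS are those whose key appears in both groups.
shared <- intersect(is_key(group_a), is_key(group_b))
shared_count <- length(shared)

# Shared IS as a percentage of the IS observed in group A.
pct_of_a <- 100 * shared_count / nrow(group_a)
```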

MINOR UPDATES

  • Added more default stats functions to default_stats
  • Added optional automatic conversion of time points in months and years when importing association file
  • Minor fixes in generate_Vispa2_launch_AF

ISAnalytics 1.1.11 (2021-05-11)

NEW FUNCTIONALITY

  • HSC_population_size_estimate and HSC_population_plot allow estimates on hematopoietic stem cell population size
  • Importing of Vispa2 stats per pool now has a dedicated function, import_Vispa2_stats
  • outlier_filter and outliers_by_pool_fragments offer a means to filter poorly represented samples based on custom outlier tests

VISIBLE USER CHANGES

  • The argument import_stats of aggregate_metadata is officially deprecated in favor of import_Vispa2_stats
  • aggregate_metadata is now a lot more flexible on what operations can be performed on columns via the new argument aggregating_functions
  • import_association_file allows directly for the import of Vispa2 stats and converts time points to months and years where not already present
  • File system alignment of import_association_file now produces 3 separate columns for paths
  • separate_quant_matrices and comparison_matrix now do not require mandatory columns other than the quantifications - this allows for separation or joining also for aggregated matrices

FIXES

  • Fixed a minor issue in CIS_volcano_plot that caused duplication of some labels if highlighted genes were provided in input

ISAnalytics 1.1.10 (2021-04-08)

FIXES

  • Fixed issue in compute_near_integrations: when the recalibration map export path is provided as a folder, the function now works correctly and produces an automatically generated file name
  • Fixed issue in aggregate_metadata: the path to the folder that contains Vispa2 stats is now looked up correctly. Also, VISPA2 stats columns are aggregated if found in the input data frame, independently of the parameter import_stats.

IMPROVEMENTS

  • compute_abundance can now take as input aggregated matrices and has additional parameters to offer more flexibility to the user. Major updates and improvements also on documentation and reproducible examples.
  • Major improvements in function import_single_Vispa2Matrix: import is now preferentially carried out using data.table::fread greatly speeding up the process - where not possible readr::read_delim is used instead
  • Major improvements in function import_association_file: greatly improved parsing precision (each column has a dedicated type), import report now signals parsing problems and their location and signals also problems in parsing dates. Report also includes potential problems in column names and signals missing data in important columns. Added also the possibility to give various file formats in input including *.xls(x) formats.
  • Function top_integrations can now take additional parameters to compute top n genes for each specified group
  • Removed faceting parameters in CIS_volcano_plot due to poor precision (easier to add faceting manually) and added parameters to return the data frame that generated the plot as an additional result. Also, it is now possible to specify a vector of gene names to highlight even if they’re not above the annotation threshold.

MINOR

  • ISAnalytics website has improved graphic theme and has an additional button on the right that leads to the devel (or release) version of the website
  • Updated vignettes

FOR DEVS ONLY

  • Complete rework of the test suite to be compliant with testthat v.3

ISAnalytics 1.1.9 (2021-02-17)

FIXES

  • Fixed minor issues in internal functions with absolute file paths & corrected typos

ISAnalytics 1.1.8 (2021-02-15)

FIXES

  • Fixed minor issues in internal functions to optimize file system alignment

ISAnalytics 1.1.7 (2021-02-10)

FIXES

  • Fixed minor issues in import_association_file when checking parameters

ISAnalytics 1.1.6 (2021-02-06)

UPGRADES

  • It is now possible to save html reports to file from import_parallel_Vispa2Matrices_auto and import_parallel_Vispa2Matrices_interactive, remove_collisions and compute_near_integrations

FIXES

  • Fixed sample_statistics: now functions that have data frame output do not produce nested tables. Flat tables are ready to be saved to file or can be nested.
  • Simplified association file check logic in remove_collisions: the function now stops only if the association file doesn’t contain the needed columns

ISAnalytics 1.1.5 (2021-02-03)

UPGRADES

  • Upgraded import_association_file function: file alignment is no longer mandatory and it is possible to save the HTML report to file
  • Updated vignettes and documentation

ISAnalytics 1.1.4 (2020-11-16)

UPGRADES

  • Greatly improved reports for collision removal function
  • General improvements for all widget reports

ISAnalytics 1.1.3 (2020-11-10)

FIXES

  • Further fixes for printing reports when widgets not available
  • Added progress bar to collision processing in remove_collisions
  • Updated vignettes

NEW

  • Added vignette “Using ISAnalytics without RStudio support”

ISAnalytics 1.1.2 (2020-11-05)

FIXES

  • Fixed missing restarts for non-blocking widgets

ISAnalytics 1.1.1 (2020-11-04)

FIXES

  • Functions that make use of widgets do not interrupt execution anymore if errors are thrown while producing or printing the widgets
  • Optimized widget printing for importing functions
  • If widgets can’t be printed and verbose option is active, reports are now displayed on console instead (needed for usage in environments that do not have access to a browser)
  • Other minor fixes (typos)
  • Bug fixes: fixed a few bugs in importing and recalibration functions
  • Minor fix in import_association_file function: added multiple strings to be translated as NA

IMPORTANT NOTES

  • Vignette building might fail due to the fact that package “knitcitations” is temporarily unavailable through CRAN
  • ISAnalytics is finally in release on Bioconductor!

ISAnalytics 0.99.14 (2020-10-21)

  • Minor fixes in tests

ISAnalytics 0.99.13 (2020-10-19)

NEW FEATURES

  • Added analysis functions CIS_grubbs and cumulative_count_union
  • Added plotting functions CIS_volcano_plot

ISAnalytics 0.99.12 (2020-10-04)

NEW FEATURES

  • Added analysis function sample_statistics

SIGNIFICANT USER-VISIBLE CHANGES

  • aggregate_values_by_key has a simplified interface and supports multi-quantification matrices

MINOR CHANGES

  • Updated vignettes
  • import_parallel_Vispa2Matrices_interactive and import_parallel_Vispa2Matrices_auto now have an option to return a multi-quantification matrix directly after import instead of a list

ISAnalytics 0.99.11 (2020-09-21)

NEW FEATURES

  • Added analysis functions threshold_filter, top_integrations
  • Added support for multi-quantification matrices in compute_abundance

MINOR FIXES

  • Fixed bug in comparison_matrix that ignored custom column names
  • Fixed issues in some documentation pages

ISAnalytics 0.99.10 (2020-09-14)

ISAnalytics is officially on Bioconductor!

NEW FEATURES

  • Added analysis functions comparison_matrix and separate_quant_matrices
  • Added utility function as_sparse_matrix
  • Added package logo

SIGNIFICANT USER-VISIBLE CHANGES

  • Changed algorithm for compute_near_integrations
  • Added support for multi-quantification matrices to remove_collisions
  • Added usage of lifecycle badges in documentation: users can now see if a feature is experimental/maturing/stable etc

MINOR FIXES

  • Added fix for import_single_Vispa2Matrix to remove non-significant 0 values

ISAnalytics 0.99.9 (2020-09-01)

NEW FEATURES

  • Added functionality: aggregate functions
  • Added vignette on aggregate functions
  • Added recalibration functions
  • Added first analysis function (compute_abundance)
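
As a rough illustration of what an abundance computation involves, the sketch below derives each IS's share of the total quantification within its sample using base R. It assumes that abundance means "value divided by the per-sample total"; the data and column names are hypothetical, and the sketch is not taken from the compute_abundance source.

```r
# Toy quantification table: one row per (sample, integration site).
toy <- data.frame(
  sample = c("S1", "S1", "S1", "S2", "S2"),
  is_id = c("IS1", "IS2", "IS3", "IS1", "IS4"),
  seqCount = c(10, 30, 60, 5, 15),
  stringsAsFactors = FALSE
)

# Per-sample totals, repeated on every row of the corresponding sample.
totals <- ave(toy$seqCount, toy$sample, FUN = sum)

# Relative abundance (0-1) and percentage abundance (0-100).
toy$rel_abundance <- toy$seqCount / totals
toy$perc_abundance <- 100 * toy$rel_abundance
```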

SIGNIFICANT USER-VISIBLE CHANGES

  • Dropped structure ISADataFrame: now the package only uses standard tibbles
  • Modified package documentation

ISAnalytics 0.99.8 (2020-08-12)

  • Submitted to Bioconductor

Getting help

For help, please contact the maintainer of the package or open an issue on GitHub.
