Small clarifications to the report

fmicompbio · May 4, 2024 · fbe7a21
1 parent 3359934
Showing 2 changed files with 39 additions and 18 deletions.
diff --git a/R/textSnippets.R b/R/textSnippets.R
@@ -74,7 +74,8 @@ testText <- function(testType, minlFC = 0, samSignificance = TRUE) {
                "feature, see section 13.2 in the [limma user guide]",
                "(https://www.bioconductor.org/packages/devel/bioc/vignettes",
                "/limma/inst/doc/usersguide.pdf). ",
-               "In addition to the feature-wise tests, we apply the camera ",
+               "If requested, in addition to the feature-wise tests, we ",
+               "apply the camera ",
                "method [@Wu2012camera] to test for significance of each ",
                "included feature collection. These tests are based on the ",
                "t-statistics returned from limma.")
@@ -91,7 +92,8 @@ testText <- function(testType, minlFC = 0, samSignificance = TRUE) {
                "feature, see section 13.2 in the [limma user guide]",
                "(https://www.bioconductor.org/packages/devel/bioc/vignettes",
                "/limma/inst/doc/usersguide.pdf). ",
-               "In addition to the feature-wise tests, we apply the camera ",
+               "If requested, in addition to the feature-wise tests, we ",
+               "apply the camera ",
                "method [@Wu2012camera] to test for significance of each ",
                "included feature collection. These tests are based on the ",
                "t-statistics returned from limma.")
@@ -103,7 +105,8 @@ testText <- function(testType, minlFC = 0, samSignificance = TRUE) {
                "statistic [@Tusher2001sam], and estimate the false ",
                "discovery rate at different thresholds using permutations, ",
                "mimicking the approach used by Perseus [@Tyanova2016perseus]. ",
-               "In addition to the feature-wise tests, we apply the camera ",
+               "If requested, in addition to the feature-wise tests, we ",
+               "apply the camera ",
                "method [@Wu2012camera] to test for significance of each ",
                "included feature collection. These tests are based on the ",
                "SAM statistics calculated from the t-statistics and the ",
@@ -115,7 +118,8 @@ testText <- function(testType, minlFC = 0, samSignificance = TRUE) {
                "features show significant changes, we calculate ",
                "adjusted p-values using the Benjamini-Hochberg method ",
                "[@BenjaminiHochberg1995fdr]. ",
-               "In addition to the feature-wise tests, we apply the camera ",
+               "If requested, in addition to the feature-wise tests, we ",
+               "apply the camera ",
                "method [@Wu2012camera] to test for significance of each ",
                "included feature collection. These tests are based on the ",
                "t-statistics.")
@@ -125,7 +129,8 @@ testText <- function(testType, minlFC = 0, samSignificance = TRUE) {
                "For this, we use the ",
                "[proDA](https://bioconductor.org/packages/proDA/) ",
                "R/Bioconductor package [@AhlmannEltze2020proda]. ",
-               "In addition to the feature-wise tests, we apply the camera ",
+               "If requested, in addition to the feature-wise tests, we ",
+               "apply the camera ",
                "method [@Wu2012camera] to test for significance of each ",
                "included feature collection. These tests are based on the ",
                "t-statistics returned from proDA.")
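
Review note: the same camera sentence is edited in all five hunks above, differing only in which statistics the tests are based on. A minimal sketch of how that sentence could be assembled once is shown below. The helper `cameraSentence()` and its argument `statDescription` are hypothetical and not part of `einprot`; the sketch also assumes the fragments are joined without separators, as their trailing spaces suggest.

```r
## Hypothetical helper (NOT part of einprot): builds the shared camera
## sentence so that each testText() branch only supplies its own ending.
cameraSentence <- function(statDescription) {
    paste0(
        "If requested, in addition to the feature-wise tests, we ",
        "apply the camera ",
        "method [@Wu2012camera] to test for significance of each ",
        "included feature collection. These tests are based on the ",
        statDescription, "."
    )
}

## Example: the ending used in the limma branches of the diff above
limmaText <- cameraSentence("t-statistics returned from limma")
cat(limmaText, "\n")
```

The wording is copied from the added lines in this commit; only the wrapping function is an assumption.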

diff --git a/inst/extdata/process_basic_template.Rmd b/inst/extdata/process_basic_template.Rmd
@@ -171,6 +171,9 @@ cat("\n````\n\n")
 
 # Settings {#settings-table}
 
+The table below provides a summary of the settings that were specified when 
+running `einprot`. 
+
 ```{r settings-table}
 settingsList <- list(
     "Include only samples (if applicable)" = paste(includeOnlySamples, 
@@ -356,7 +359,9 @@ DT::datatable(as.data.frame(colData(sce)),
 # Overview of the workflow
 
 We already now define the names of the assays we will be generating 
-and using later in the workflow. 
+and using later in the workflow. The first column in the table below 
+contains generic names representing the 'stage' that each assay 
+corresponds to. The second column contains the actual assay names. 
 
 ```{r define-assaynames}
 aNames <- defineAssayNames(aName = aName, normMethod = normMethod, 
@@ -633,7 +638,7 @@ plotImputationDistribution(sce, assayToPlot = aNames$assayImputed,
 
 # Overall distribution of log2 feature intensities
 
-Next we consider the overall distribution of log2-intensities among the 
+The boxplots below show the overall distribution of log2-intensities among the 
 samples (after imputation).
 
 ```{r intensity-distribution-imputed, fig.width = min(14, max(7, 0.5 * ncol(sce))), fig.height = 5/7 * min(14, max(7, 0.5 * ncol(sce)))}
@@ -892,12 +897,12 @@ to generate STRING networks [@Szklarczyk2021string] (separately for the
 up- and downregulated ones), which are included in the pdf file. Any features 
 explicitly requested (see the [table above](#settings-table)) are also labeled
 in the volcano plots. 
-In addition to these pdf files, if "complexes" is specified to be included in 
-the feature collections (and tested for significance using camera), we also 
+In addition to these pdf files, if any feature collection is specified 
+(and tested for significance using camera), we also 
 generate a multi-page pdf file showing the position of the features of each 
-significantly differentially abundant complex in the volcano plot, as well 
+significantly differentially abundant collection in the volcano plot, as well 
 as bar plots of the features' abundance values in the compared samples. This pdf 
-file is only generated if there is at least one significant complex (with 
+file is only generated if there is at least one significant collection (with 
 adjusted p-value below the specified complexFDRThr=`r complexFDRThr`). 
 
 
@@ -1089,9 +1094,13 @@ for (nm in names(testres$topsets)) {
 
 # Table with direct database links to sequences, functional information and predicted structures {#linktable}
 
-The table below provides autogenerated links to the UniProt and 
-AlphaFold pages (as well as selected organism-specific databases) for the 
-majority protein IDs corresponding to each feature in the data set. 
+The table below provides autogenerated links to the 
+[UniProt](https://www.uniprot.org/), 
+[AlphaFold](https://alphafold.ebi.ac.uk/), 
+[Complex Portal](https://www.ebi.ac.uk/complexportal/home) and 
+[BioGRID pages](https://thebiogrid.org/) (as well as selected 
+organism-specific databases) for the 
+protein IDs corresponding to each feature in the data set. 
 The 'pid' column represents the unique feature ID used by `einprot`, and 
 the `einprotLabel` column contains the user-defined feature labels.
 UniProt is a resource of protein sequence and functional information 
@@ -1101,7 +1110,8 @@ predictions for the human proteome and other key proteins of interest.
 Note that (depending on the species) many proteins are not yet covered in 
 AlphaFold (in this case, the link below will lead to a non-existent page), and 
 that numeric values are rounded to four significant digits to increase 
-readability. 
+readability. The table can be filtered and searched, and exported to either 
+csv or Excel format. 
 
 ```{r linktable, warning = FALSE}
 linkTable <- makeDbLinkTable(
@@ -1227,11 +1237,11 @@ interactivePCAs
 # Heatmap with hierarchical clustering
 
 For another birds-eye view of the data, we represent it using a heatmap of 
-the (imputed and normalized) log intensities, and cluster the samples and 
+the log intensities (imputed and normalized using the methods defined above), 
+and cluster the samples and 
 proteins using hierarchical clustering. In the first heatmap below, the values 
 represent the normalized log intensities directly. In the second heatmap, 
-the values for each protein have been centered to mean 0. The latter is also 
-exported to a pdf file with row labels for further exploration. 
+the values for each protein have been centered to mean 0.  
 
 ```{r heatmap, message=FALSE, fig.width = min(14, max(7, 0.5 * ncol(sce))), fig.height = 8/7 * min(14, max(7, 0.5 * ncol(sce)))}
 if (addHeatmaps) {
@@ -1246,6 +1256,12 @@ if (addHeatmaps) {
 }
 ```
 
+A mean-centered heatmap is also exported to a pdf file with row labels for 
+further exploration. In this heatmap, no row dendrogram is displayed, and 
+samples are split by the `group` annotation and clustered within each such 
+group. In addition, an annotation column with the fraction of missing values for 
+each feature is added. 
+
 ```{r save-heatmap, results="hide"}
 ## Save to pdf (show row names, but no row dendrogram, order samples by group)
 if (addHeatmaps) {