downsize vignette
arnesmits committed Aug 23, 2017
1 parent 0b8d4c9 commit 1cccb1c
Showing 1 changed file with 16 additions and 30 deletions.
46 changes: 16 additions & 30 deletions vignettes/DEP.Rmd
@@ -26,7 +26,7 @@ It requires tabular input (e.g. txt files) as generated by quantitative analysis
Functions are provided for data preparation, filtering, variance normalization and imputation of missing values, as well as statistical testing of differentially enriched / expressed proteins.
It also includes tools to check intermediate steps in the workflow, such as normalization and missing values imputation.
Finally, visualization tools are provided to explore the results, including heatmap, volcano plot and barplot representations.
For scientists with limited experience in R, the package also contains wrapper functions that entail the complete analysis workflow and generate a report (see the chapter on [Wrapper functions](#wrapper-functions-for-the-entire-workflow)).
For scientists with limited experience in R, the package also contains wrapper functions that entail the complete analysis workflow and generate a report (see the chapter on [Workflow functions](#workflow-functions-for-the-entire-analysis)).
Even easier to use are the interactive Shiny apps that are provided by the package (see the chapter on [Shiny apps](#interactive-analysis-using-the-dep-shiny-apps)).

# Installation
@@ -296,16 +296,16 @@ These visualizations assist in the determination of the optimal cutoffs to be us
The PCA plot can be used to get a high-level overview of the data.
This can be very useful to observe batch effects, such as clear differences between replicates.

``` {r pca, fig.height = 4, fig.width = 5}
``` {r pca, fig.height = 3, fig.width = 4}
# Plot the first and second principal components
plot_pca(dep, x = 1, y = 2, n = 500)
plot_pca(dep, x = 1, y = 2, n = 500, point_size = 4)
```

### Correlation matrix

A correlation matrix can be plotted as a heatmap, to visualize the Pearson correlations between the different samples.

``` {r corr, fig.height = 4, fig.width = 6}
``` {r corr, fig.height = 3, fig.width = 5}
# Plot the Pearson correlation matrix
plot_cor(dep, significant = TRUE, lower = 0, upper = 1, pal = "Reds")
```
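
As a rough illustration of the correlation step in this hunk, the following base-R sketch computes a Pearson correlation matrix between sample columns and draws it as a heatmap. It does not call DEP. The simulated matrix `mat` and the sample names are assumptions made up for the example, and `plot_cor` may compute or render things differently internally.

```r
# Toy data (assumption): 200 proteins (rows) x 6 samples (columns), simulated values
set.seed(1)
mat <- matrix(rnorm(200 * 6), nrow = 200,
              dimnames = list(NULL, paste0("sample_", 1:6)))

# Pearson correlation between every pair of sample columns
cor_mat <- cor(mat, method = "pearson")

# Draw the matrix as a heatmap, without reordering or rescaling
heatmap(cor_mat, Rowv = NA, Colv = NA, symm = TRUE, scale = "none")
```

The result is a symmetric samples-by-samples matrix with ones on the diagonal, which is the kind of object the correlation heatmap visualizes.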
@@ -316,10 +316,10 @@ The heatmap representation gives an overview of all significant proteins (rows)
This allows to see general trends, for example if one sample or replicate is really different compared to the others.
Additionally, the clustering of samples (columns) can indicate closer related samples and clustering of proteins (rows) indicates similarly behaving proteins. The proteins are clustered by k-means clustering and the number of clusters can be defined by argument _k_.

``` {r heatmap, fig.height = 6, fig.width = 4}
``` {r heatmap, fig.height = 5, fig.width = 3}
# Plot a heatmap of all significant proteins with the data centered per protein
plot_heatmap(dep, type = "centered", kmeans = TRUE,
k = 6, col_limit = 4, row_font_size = 2)
k = 6, col_limit = 4, show_row_names = FALSE)
```

The heatmap shows a clustering of replicates and indicates that 4Ubi and 6Ubi enrich a similar repertoire of proteins. The k-means clustering of proteins (general clusters of rows) nicely separates protein classes with different binding behaviors.
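
The idea behind `type = "centered"` combined with `kmeans = TRUE` and `k = 6` can be sketched in base R as follows. This is only an illustration on simulated data: the matrix `mat`, its dimensions and the `nstart` value are assumptions, and the actual implementation inside `plot_heatmap` may differ.

```r
# Toy data (assumption): 120 proteins (rows) x 6 samples (columns), simulated values
set.seed(42)
mat <- matrix(rnorm(120 * 6), nrow = 120,
              dimnames = list(paste0("protein_", 1:120),
                              paste0("sample_", 1:6)))

# Center each protein (row) on its own mean across samples
centered <- mat - rowMeans(mat)

# Group proteins into k clusters with k-means on the centered profiles
k <- 6
km <- kmeans(centered, centers = k, nstart = 10)

# Reorder rows so that proteins in the same cluster sit together,
# as they would in a clustered heatmap
ordered <- centered[order(km$cluster), ]

# Number of proteins per cluster
table(km$cluster)
```

Because k-means starts from random centers, cluster labels and sizes can change between runs unless a seed is set.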
@@ -328,10 +328,10 @@ The heatmap shows a clustering of replicates and indicates that 4Ubi and 6Ubi en

Alternatively, a heatmap can be plotted using the contrasts, ie. the direct sample comparisons, as columns. Here, this emphasises the enrichment of ubiquitin interactors compared to the control sample.

``` {r heatmap2, fig.height = 6, fig.width = 4}
``` {r heatmap2, fig.height = 5, fig.width = 3}
# Plot a heatmap of all significant proteins (rows) and the tested contrasts (columns)
plot_heatmap(dep, type = "contrast", kmeans = TRUE,
k = 6, col_limit = 10, row_font_size = 2)
k = 6, col_limit = 10, show_row_names = FALSE)
```

### Volcano plots of specific contrasts
@@ -413,15 +413,15 @@ save(data_se, data_norm, data_imp, data_diff, dep, file = "data.RData")
load("data.RData")
```

# Wrapper functions for the entire workflow
# Workflow functions for the entire analysis

The package contains wrapper functions that entail the complete analysis workflow and generate a report.
The package contains workflow functions that entail the complete analysis and generate a report.

## LFQ-based DEP analysis

Differential enrichment analysis of label-free proteomics data can be performed using the _LFQ_ wrapper function.
Differential enrichment analysis of label-free proteomics data can be performed using the _LFQ_ workflow function.

``` {r LFQ}
``` {r LFQ, message = FALSE}
# The data is provided with the package
data <- UbiLength
experimental_design <- UbiLength_ExpDesign
@@ -434,28 +434,19 @@ data_results <- LFQ(data, experimental_design, fun = "MinProb",
This wrapper produces a list of objects, which can be used to create a report and/or for further analysis.
The _report_ function produces two reports (pdf and html), a results table and a RData object, which are saved in a generated "Report" folder.

``` {r report2, results = "hide"}
``` {r report2, eval = FALSE}
# Make a markdown report and save the results
report(data_results)
```

The results generated by the LFQ function contain, among others, the _results_ object (data.frame object) and the _sign_ object (SummarizedExperiment object).
The results generated by the LFQ function contain, among others, the _results_ object (data.frame object) and the _dep_ object (SummarizedExperiment object).

``` {r report1}
# See all objects saved within the results object
names(data_results)
```

The _results_ object contains the essential results and the _sign_ object contains the full SummarizedExperiment object which can be used for further visualization.

``` {r LFQ_results}
# Colnames of the results object
colnames(data_results$results)
# the dep object
data_results$dep
```

The _results_ object contains the essential results and the _dep_ object contains the full SummarizedExperiment object.
The results table can be explored by selecting the _$results_ object

``` {r LFQ_results3}
@@ -466,7 +457,7 @@ results_table <- data_results$results
results_table %>% filter(significant) %>% nrow()
```

The full data (_sign_ object) can be used for the plotting functions as described in the chapter ["Visualization of the results"](#visualization-of-the-results).
The full data (_dep_ object) can be used for the plotting functions as described in the chapter ["Visualization of the results"](#visualization-of-the-results), for example a volcano plot.

``` {r LFQ_results4, fig.height = 5, fig.width = 5}
# Extract the sign object
@@ -476,11 +467,6 @@ full_data <- data_results$dep
plot_volcano(full_data, "Ubi4_vs_Ctrl", label_size = 2, add_names = TRUE)
```

``` {r LFQ_results5, fig.height = 6, fig.width = 3}
# Use the full data to generate a heatmap
plot_heatmap(full_data, "contrast", row_font_size = 2)
```

## TMT-based DEP analysis

Differential enrichment analysis of Tandem-Mass-Tag labeled proteomics data is also supported.