---
title: "MSstatsPTM TMT Workflow"
author: "Devon Kohler (<kohler.d@northeastern.edu>)"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{MSstatsPTM TMT Workflow}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 8,
  fig.height = 8
)
```
```{r, message=FALSE, warning=FALSE}
library(MSstatsPTM)
```
This vignette provides an example workflow for analyzing a TMT dataset with the
MSstatsPTM package.
## Installation
To install this package, start R (version 4.0 or later) and enter:
``` {r, eval = FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("MSstatsPTM")
```
## 1. Workflow
### 1.1 Raw Data Format
**Note: We are actively developing dedicated converters for MSstatsPTM. If your
data come from a processing tool that does not have a dedicated converter in
MSstatsPTM, please open a GitHub issue at
`https://github.com/Vitek-Lab/MSstatsPTM/issues` and we will add the
converter.**
All converters included in this package are covered in depth in the
`MSstatsPTM_LabelFree_Workflow` vignette. For more information about data
conversion, please review the relevant sections there.
### 1.2 Summarization - dataSummarizationPTM_TMT
After loading the input data, the next step is to use the
`dataSummarizationPTM_TMT` function. This provides the summarized dataset needed
to model protein/PTM abundance. The function summarizes the Protein dataset up
to the protein level and the PTM dataset up to the peptide level. There are
multiple options for normalization and missing value imputation; these options
should be reviewed in the package documentation.
```{r summarize, echo=FALSE, message=FALSE, warning=FALSE}
MSstatsPTM.summary <- dataSummarizationPTM_TMT(raw.input.tmt,
                                               use_log_file = FALSE,
                                               append = FALSE, verbose = FALSE)
```
```{r show_summ}
head(MSstatsPTM.summary$PTM$ProteinLevelData)
head(MSstatsPTM.summary$PROTEIN$ProteinLevelData)
```
The summarization function returns a list with two elements, `PTM` and
`PROTEIN`, each holding its summarized data in `ProteinLevelData`.
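
To make the idea of summarization concrete, the sketch below collapses a toy
matrix of feature-level log2 intensities into one value per channel using
Tukey's median polish from base R (`stats::medpolish`). It is only an
illustration under stated assumptions: the matrix `toy_features` and its values
are hypothetical, and the method actually applied by `dataSummarizationPTM_TMT`
depends on the options described in the package documentation.

```r
# Hypothetical log2 intensities: 3 features (rows) x 4 TMT channels (columns).
# The values are made up for illustration and do not come from raw.input.tmt.
feature_effect <- c(0, 1, 2)
channel_level <- c(10, 11, 12, 13)
toy_features <- outer(feature_effect, channel_level, "+")
dimnames(toy_features) <- list(paste0("feature_", 1:3),
                               c("126", "127N", "127C", "128N"))

# Median polish decomposes the matrix into overall + row + column effects.
fit <- medpolish(toy_features, trace.iter = FALSE)

# One summarized abundance per channel: overall level plus channel effect.
toy_summary <- fit$overall + fit$col
toy_summary
```

The feature (row) effects are absorbed by the fit, so the per-channel values
reflect the channel differences rather than which features happened to be
measured.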
### 1.2.1 QCPlot
Once the data are summarized, MSstatsPTM provides multiple plots for examining
the experiment. Here we show the quality control boxplot. The first plot shows
the modified (PTM) data and the second plot shows the global protein dataset.
```{r qcplot, message=FALSE, warning=FALSE}
dataProcessPlotsPTM(MSstatsPTM.summary,
                    type = 'QCPLOT',
                    which.PTM = "allonly",
                    address = FALSE)
```
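
A quality control boxplot is mainly read by comparing the medians of the boxes
across channels or runs. The sketch below shows that check on simulated data
using base R only. The data frame `toy`, its column names, and the simple
median alignment are assumptions made for illustration; they are not the
normalization that MSstatsPTM applies.

```r
# Simulated log2 abundances for three hypothetical TMT channels.
set.seed(1)
toy <- data.frame(
  Channel = rep(c("126", "127N", "127C"), each = 20),
  Abundance = c(rnorm(20, mean = 10), rnorm(20, mean = 10.5),
                rnorm(20, mean = 9.8))
)

# Median per channel: the center line of each box in a QC boxplot.
channel_median <- tapply(toy$Abundance, toy$Channel, median)

# Shift every channel so that its median equals the global median.
toy$Aligned <- toy$Abundance -
  ave(toy$Abundance, toy$Channel, FUN = median) +
  median(toy$Abundance)

# After alignment all channel medians coincide.
aligned_median <- tapply(toy$Aligned, toy$Channel, median)
aligned_median
```

If the boxes in the QC plot are already centered at a similar level, the
channels are comparable; large offsets suggest revisiting the normalization
options.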
### 1.2.2 Profile Plot
Here we show a profile plot. Again, the top plot shows the modified peptide and
the bottom plot shows the overall protein.
```{r profileplot, message=FALSE, warning=FALSE}
dataProcessPlotsPTM(MSstatsPTM.summary,
                    type = 'PROFILEPLOT',
                    which.Protein = c("Protein_12"),
                    address = FALSE)
```
### 1.3 Modeling - groupComparisonPTM
After summarization, the summarized datasets can be modeled using the
`groupComparisonPTM` function. This function models the PTM and Protein
summarized datasets, and then adjusts the PTM model for changes in overall
protein abundance. The output is a list containing the three resulting models,
named `PTM.Model`, `PROTEIN.Model`, and `ADJUSTED.Model`.
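
The adjustment step can be pictured with a small worked example. The sketch
below assumes that the adjusted estimate is the PTM log2 fold change minus the
protein log2 fold change, that the two standard errors combine as for
independent estimates, and that the degrees of freedom follow a Satterthwaite
approximation. The helper `adjust_ptm` and the input numbers are hypothetical;
the package documentation remains the authority on the exact procedure.

```r
# Hypothetical per-comparison estimates for one PTM site and its protein.
ptm  <- list(log2FC = 1.5, SE = 0.3, DF = 10)
prot <- list(log2FC = 0.5, SE = 0.4, DF = 12)

# Illustrative helper (not part of MSstatsPTM): adjust a PTM estimate
# for the change in abundance of the unmodified protein.
adjust_ptm <- function(ptm, prot) {
  log2FC <- ptm$log2FC - prot$log2FC
  SE <- sqrt(ptm$SE^2 + prot$SE^2)
  # Satterthwaite approximation for the combined degrees of freedom.
  DF <- (ptm$SE^2 + prot$SE^2)^2 /
    (ptm$SE^4 / ptm$DF + prot$SE^4 / prot$DF)
  Tvalue <- log2FC / SE
  pvalue <- 2 * pt(abs(Tvalue), df = DF, lower.tail = FALSE)
  data.frame(log2FC = log2FC, SE = SE, DF = DF,
             Tvalue = Tvalue, pvalue = pvalue)
}

toy_adjusted <- adjust_ptm(ptm, prot)
toy_adjusted
```

A PTM whose fold change merely mirrors its protein would end up with an
adjusted log2 fold change near zero, which is the point of the adjustment.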
```{r model, message=FALSE, warning=FALSE}
# Specify the contrast matrix: one row per comparison, one column per condition
comparison <- matrix(c(1, 0, 0, -1, 0, 0,
                       0, 1, 0, 0, -1, 0,
                       0, 0, 1, 0, 0, -1,
                       1, 0, -1, 0, 0, 0,
                       0, 1, -1, 0, 0, 0,
                       0, 0, 0, 1, 0, -1,
                       0, 0, 0, 0, 1, -1), nrow = 7, ncol = 6, byrow = TRUE)
# Set the names of each row (comparison) and column (condition)
row.names(comparison) <- c('1-4', '2-5', '3-6', '1-3',
                           '2-3', '4-6', '5-6')
colnames(comparison) <- c('Condition_1', 'Condition_2', 'Condition_3',
                          'Condition_4', 'Condition_5', 'Condition_6')
# Fit the PTM and Protein models, then adjust the PTM model
MSstatsPTM.model <- groupComparisonPTM(MSstatsPTM.summary,
                                       data.type = "TMT",
                                       contrast.matrix = comparison,
                                       use_log_file = FALSE, append = FALSE)
# Inspect the first rows of each model
head(MSstatsPTM.model$PTM.Model)
head(MSstatsPTM.model$PROTEIN.Model)
head(MSstatsPTM.model$ADJUSTED.Model)
```
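
Typing a contrast matrix by hand is error-prone, so it can help to build it
from pairs of condition indices and check it before modeling. The sketch below
uses base R only; the helper `make_contrast` and the object `toy_contrast` are
hypothetical names introduced for illustration. Each row carries `1` for the
first condition of a pair and `-1` for the second, so every row sums to zero.

```r
# Condition names, matching the column names used for the contrast matrix.
conditions <- paste0("Condition_", 1:6)

# Each comparison is a pair of condition indices: c(numerator, denominator).
pairs <- list("1-4" = c(1, 4), "2-5" = c(2, 5), "1-3" = c(1, 3))

# Illustrative helper (not part of MSstatsPTM).
make_contrast <- function(pairs, conditions) {
  m <- matrix(0, nrow = length(pairs), ncol = length(conditions),
              dimnames = list(names(pairs), conditions))
  for (i in seq_along(pairs)) {
    m[i, pairs[[i]][1]] <- 1
    m[i, pairs[[i]][2]] <- -1
  }
  m
}

toy_contrast <- make_contrast(pairs, conditions)

# Sanity check: every pairwise contrast must sum to zero.
stopifnot(all(rowSums(toy_contrast) == 0))
toy_contrast
```

The row names become the comparison labels in the model output, so the same
labels are what `which.Comparison` refers to when plotting.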
### 1.3.1 Volcano Plot
The models from the `groupComparisonPTM` function can be passed to the model
visualization function, `groupComparisonPlotsPTM`. Here we show volcano plots
for the models.
``` {r volcano, message=FALSE, warning=FALSE}
groupComparisonPlotsPTM(data = MSstatsPTM.model,
                        type = "VolcanoPlot",
                        which.Comparison = c('1-4'),
                        which.PTM = 1:50,
                        address = FALSE)
```
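
A volcano plot places each PTM at its log2 fold change on the x-axis and
`-log10` of its adjusted p-value on the y-axis, and colors it by significance.
The sketch below reproduces that classification on a toy table in base R. The
data frame `toy_model`, its values, and the 0.05 cutoff are assumptions chosen
for illustration; the column names `log2FC` and `adj.pvalue` are likewise
assumed here and should be checked against the actual model output.

```r
# Hypothetical model results for five PTM sites in one comparison.
toy_model <- data.frame(
  Protein = paste0("Protein_", 1:5, "_S", c(10, 22, 35, 48, 51)),
  log2FC = c(1.8, -2.1, 0.2, 0.9, -0.4),
  adj.pvalue = c(0.001, 0.03, 0.60, 0.20, 0.04)
)

# Assumed significance cutoff on the adjusted p-value.
sig <- 0.05

# Coordinates used by a volcano plot.
toy_model$x <- toy_model$log2FC
toy_model$y <- -log10(toy_model$adj.pvalue)

# Classify each PTM by significance and direction of change.
toy_model$status <- ifelse(
  toy_model$adj.pvalue >= sig, "Not significant",
  ifelse(toy_model$log2FC > 0, "Up-regulated", "Down-regulated")
)

table(toy_model$status)
```

Points above the horizontal line at `-log10(sig)` are the significant ones, and
their side of zero on the x-axis gives the direction of change.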
### 1.3.2 Heatmap Plot
Here we show a heatmap for the models.
``` {r heatmap, message=FALSE, warning=FALSE}
groupComparisonPlotsPTM(data = MSstatsPTM.model,
                        type = "Heatmap",
                        which.PTM = 1:49,
                        address = FALSE)
```
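
A heatmap of model results is a PTM-by-comparison matrix of log2 fold changes.
The sketch below shows how a long table with one row per PTM and comparison can
be reshaped into that matrix with base R. The data frame `toy_long` and its
values are hypothetical, and the column names `Protein`, `Label`, and `log2FC`
are assumed for illustration.

```r
# Hypothetical long-format results: one row per PTM site and comparison.
toy_long <- data.frame(
  Protein = rep(c("Protein_1_S10", "Protein_2_S22", "Protein_3_S35"),
                times = 2),
  Label = rep(c("1-4", "2-5"), each = 3),
  log2FC = c(1.2, -0.8, 0.1, 0.4, -1.5, 0.0)
)

# Reshape to a matrix: rows are PTM sites, columns are comparisons.
toy_wide <- tapply(toy_long$log2FC,
                   list(toy_long$Protein, toy_long$Label),
                   mean)
toy_wide

# Uncomment to draw a basic heatmap of the toy matrix with base graphics:
# heatmap(toy_wide, Rowv = NA, Colv = NA, scale = "none")
```

Each cell holds a single value here, so `mean` simply passes it through; it is
only needed because `tapply` requires a summary function.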