/
CytoExploreR-Transformations.Rmd
432 lines (354 loc) · 20.4 KB
/
CytoExploreR-Transformations.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
---
title: "Data Transformations"
author: Dillon Hammill
date: "`r Sys.Date()`"
output:
rmarkdown::html_vignette:
includes:
in_header: logo.html
vignette: >
%\VignetteIndexEntry{Data Transformations}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
# Introduction
<div style="line-height: 1.8em;"> Data transformations are essential for appropriately visualising cytometry data. This requirement for data transformations is due to the large dynamic range of cytometry data makes it difficult visualise both positive and negative events on a linear scale. As you can see below, the values for flow cytometry extend from 0 to 262144 which is an enormous range for fluorescent intensity values. Clearly, visualising the data on this scale is not going to allow us to identify discrete negative and positive populations. </div>
<br>
```{r, eval = FALSE, echo = FALSE}
# Activation GatingSet
library(CytoExploreRData)
gs_linear <- GatingSet(Activation)
```
```{r, eval = FALSE, echo = FALSE}
cyto_plot_save("Transformations-1.png")
cyto_plot(gs_linear[[1]],
parent = "root",
channels = "CD8",
title = "Linear Data",
density_fill = "orange")
cyto_plot_complete()
```
```{r echo = FALSE, fig.align="center", out.width = '60%'}
knitr::include_graphics('Transformations/Transformations-1.png')
```
<br>
<div style="line-height: 1.8em;"> To overcome this visualisation problem, traditionally cytometry data has been transformed using log transformations to improve visualisation and separation of negative and positive events into discrete populations. As you can see below, log transformations handle high fluorescent intensity values quite well, but struggle with values approaching zero. This results in these lower values being dumped on the lower end of the scale, which makes it appear as though the data contains three distinct populations based on fluorescent intensity. This problem is further highlighted when applying compensation to the data as it introduces more values that approach zero and below. </div>
<br>
```{r, eval = FALSE, echo = FALSE}
gs_log <- cyto_transform(cyto_copy(gs_linear),
channels = "CD8",
type = "log",
copy = TRUE)
cyto_plot_save("Transformations-2.png")
cyto_plot_custom(c(1,1))
cyto_plot(gs_log[[1]],
parent = "root",
channels = "CD8",
title = "Log Transformation",
density_fill = "deepskyblue")
text(0.3, 60, "values \n approaching \n zero dumped \n in first bin")
arrows(0.15, 60, 0.05, 40)
rect(-0.08,-14,0.5,-2, border = "red", lwd = 2, xpd = NA)
cyto_plot_complete()
```
```{r echo = FALSE, fig.align="center", out.width = '60%'}
knitr::include_graphics('Transformations/Transformations-2.png')
```
<br>
<div style="line-height: 1.8em;"> Recently, more sophisticated transformations such as the hyperbolic arcsine, biexponential and logicle transformations have been used to overcome the limitations of the log transformation. The result of applying these transformations using the default parameters to the Alexa Fluor 488 channel is shown below for easy comparison. As you can see, all the transformations seem to handle the higher fluorescent intensity values in a similar manner, but differ in their approach to dealing with values nearing zero. For flow cytometry data, the biexponential and logicle transformations seem to provide the best visualisation of the data as discrete negative and positive populations are clearly visible. It is important to note that the optimal transformation for each parameter is data-dependant and users should explore the different transformation types to obtain the best visualisation of the data. This process will be documented later in this vignette. </div>
<br>
```{r, eval = FALSE, echo = FALSE}
gs_arc <- cyto_transform(cyto_copy(gs_linear),
channels = "CD8",
type = "arcsinh",
copy = TRUE)
gs_biex <- cyto_transform(cyto_copy(gs_linear),
channels = "CD8",
type = "biex",
copy = TRUE)
gs_logicle <- cyto_transform(cyto_copy(gs_linear),
channels = "CD8",
type = "logicle",
copy = TRUE)
cyto_plot_save("Transformations-3.png",
width = 10,
height = 10)
cyto_plot_custom(c(2,2))
# LOG
cyto_plot(gs_log[[1]],
parent = "root",
channels = "CD8",
title = "Log Transformation",
density_fill = "deepskyblue")
rect(-0.08,-14,0.5,-2, border = "red", lwd = 2, xpd = NA)
# ARCSINH
cyto_plot(gs_arc[[1]],
parent = "root",
channels = "CD8",
title = "Arcsinh Transformation",
density_fill = "deeppink")
rect(-0.31,-14,0.3,-2, border = "red", lwd = 2, xpd = NA)
# BIEX
cyto_plot(gs_biex[[1]],
parent = "root",
channels = "CD8",
title = "Biexponential Transformation",
density_fill = "green")
rect(-165,-14,1050,-2, border = "red", lwd = 2, xpd = NA)
# LOGICLE
cyto_plot(gs_logicle[[1]],
parent = "root",
channels = "CD8",
title = "Logicle Transformation",
density_fill = "yellow")
rect(0.1,-14,1.15,-2, border = "red", lwd = 2, xpd = NA)
cyto_plot_complete()
```
```{r echo = FALSE, fig.align="center", out.width = '100%'}
knitr::include_graphics('Transformations/Transformations-3.png')
```
<br>
<div style="line-height: 1.8em;"> **CytoExploreR** has full support for log, arcsinh, biexponential and logicle transformations implemented in the flowWorkspace package. In this vignette we will demonstrate how **CytoExploreR** facilitates fine tuning of transformation parameters and how these optimised transformations can be applied cytometry data. For consistency with the RGLab suite of cytometry data analysis packages, **CytoExploreR** uses a series of wrapper functions to add support for these transformations with improved support for customisation. Below is a list of the key functions that you will encounter in this vignette:
- `cyto_transformer_log` implements the `flowjo_log_trans` version of the log transformation.
- `cyto_transformer_arcsinh` implements the `asing_Gml2` version of the arcsinh transformation.
- `cyto_transformer_biex` implements the `flowjo_biexp` version of the biexponential transformation.
- `cyto_transformer_logicle` implements the `estimateLogicle` version of the logicle transformation.
- `cyto_transformer_combine` combines individual transformation definitions into a single list of transformers that can be applied to the data using `cyto_transform`.
- `cyto_transform` is capable of automatically computing transformers and is used to apply a set of transformers to the data. </div>
# Demonstration
<div style="line-height: 1.8em;"> To demonstrate the use of the transformation functions, we will need to download the Activation FCS files shipped with **CytoExploreRData**. If you have not already done so, these FCS files can be easily downloaded to a folder in your current working directory by following these steps: </div>
```{r, eval = FALSE}
# Load required packages
library(CytoExploreR)
library(CytoExploreRData)
# Activation dataset
Activation
# Save Activation dataset FCS files to Activation-Samples folder
cyto_save(Activation, save_as = "Activation-Samples")
```
<div style="line-height: 1.8em;"> Now hat we have the FCS files stored locally, let's setup up the Activation GatingSet using `cyto_setup`. It is recommended that experiment details be filled in at this point as this information can be used to easily select samples to use when customising transformations. If your computer struggles with applying the transformations, it is recommended that you set the **restrict** argument to TRUE in `cyto_setup` to restrict the data to only the parameters for which markers have been assigned. Although not required for this vignette, it is always a good idea to ensure that the data has been compensated prior to data transformations. The spillover matrices attached to each of the files can be applied using `cyto_compensate`. </div>
```{r, eval = FALSE}
# Activation GatingSet
gs <- cyto_setup("Activation-Samples")
# Apply compensation
gs <- cyto_compensate(gs)
```
<div style="line-height: 1.8em;"> Next we will demonstrate the use of the different `cyto_transformer` functions to obtain optimised transformation definitions for each of the fluorescent channels. These transformer functions will automatically pool the supplied data, apply the transformation using the specified parameters to the specified channels and plot the resultant transformed data using `cyto_plot`. The definition of the transformation can be altered by changing the arguments of the underlying flowWorkspace transformer function. It is important to note that these `cyto_transformer` functions do not apply these transformers to the data, they are purely used to optimise the transformers prior to applying the combined transformers to the data using `cyto_transform`. As a result, these `cyto_transformer` functions do not return the transformed data, but instead the transformer definitions that can be applied to the data using `cyto_transform`. </div>
## Batch Transform Fluorescent Channels
<div style="line-height: 1.8em;"> To make things a little easier, all these `cyto_transformer` functions will automatically generate transformers for all fluorescent channels. To obtain transformers for a specific set of parameters. simply supply the names of these channels/markers to the **channels** argument. To use a specific subset of the data to visualise and customise the transformations, users can supply specific slection criteria to the **select** argument. Next we will explore each of these `cyto_transformer` functions to obtain transformation definitions for all the fluorescent channels. This provides a quick overview of all possible transformer types to identify those that provide the best visualisation of the data. </div>
## Log Transformation
```{r, eval = FALSE}
# Default log transformer
trans_log <- cyto_transformer_log(gs)
```
```{r, eval = FALSE, echo = FALSE}
gs_log <- cyto_transform(cyto_copy(gs_linear),
type = "log",
copy = TRUE)
cyto_plot_save("Transformations-4.png",
height = 10,
width = 12)
cyto_plot_profile(gs_log[[1]],
parent = "root",
channels = cyto_fluor_channels(gs_log),
header = "Log Transformers",
density_fill = "deepskyblue")
cyto_plot_complete()
gs_arc <- cyto_transform(cyto_copy(gs_linear),
type = "arcsinh",
copy = TRUE)
cyto_plot_save("Transformations-5.png",
height = 10,
width = 12)
cyto_plot_profile(gs_arc[[1]],
parent = "root",
channels = cyto_fluor_channels(gs_arc),
header = "Arcsinh Transformers",
density_fill = "deeppink")
cyto_plot_complete()
gs_biex <- cyto_transform(cyto_copy(gs_linear),
type = "biex",
copy = TRUE)
cyto_plot_save("Transformations-6.png",
height = 10,
width = 12)
cyto_plot_profile(gs_biex[[1]],
parent = "root",
channels = cyto_fluor_channels(gs_biex),
header = "Biexponential Transformers",
density_fill = "green")
cyto_plot_complete()
gs_logicle <- cyto_transform(cyto_copy(gs_linear),
type = "logicle",
copy = TRUE)
cyto_plot_save("Transformations-7.png",
height = 10,
width = 12)
cyto_plot_profile(gs_logicle[[1]],
parent = "root",
channels = cyto_fluor_channels(gs_logicle),
header = "Logicle Transformers",
density_fill = "yellow")
cyto_plot_complete()
```
```{r echo = FALSE, fig.align="center", out.width = '75%'}
knitr::include_graphics('Transformations/Transformations-4.png')
```
## Arcsinh Transformation
```{r, eval = FALSE}
# Default arcsinh transformer
trans_arcsinh <- cyto_transformer_arcsinh(gs)
```
```{r echo = FALSE, fig.align="center", out.width = '75%'}
knitr::include_graphics('Transformations/Transformations-5.png')
```
## Biexponential Transformation
```{r, eval = FALSE}
# Default biexponentail transformer
trans_biex <- cyto_transformer_biex(gs)
```
```{r echo = FALSE, fig.align="center", out.width = '75%'}
knitr::include_graphics('Transformations/Transformations-6.png')
```
## Logicle Transformation
```{r, eval = FALSE}
# Default biexponentail transformer
trans_logicle <- cyto_transformer_logicle(gs)
```
```{r echo = FALSE, fig.align="center", out.width = '75%'}
knitr::include_graphics('Transformations/Transformations-7.png')
```
## Optimisation of Transformation Parameters
<div style="line-height: 1.8em;"> In the examples above, we showed how we can quickly apply different types of transformations to all the fluorescent channels to identify which transformation type to use for each fluorescent channel. Based on these plots, it seems as though the logicle transformation provides the best overall result when using the default settings. In my experience, I find the logicle transformation to perform consistently the best for flow cytometry data, and it often does not even require much optimisation. For the purpose of demonstration, I will therefore switch over to the biexponential transformations to show you how we can optimise these transformation parameters to better visualise the data. Based on the plots above (green) it seems as if we can improve the transformer definitions for the PE-A and 7-AAD-A parameters. Firstly, let's remove the unwanted transformer definitions from `trans_biex` which contains our transformation definitions for all the fluorescent channels. </div>
```{r, eval = FALSE}
# Remove PE-A & 7-AAD-A transformers
trans_biex <- trans_biex[-match(c("PE-A", "7-AAD-A"), names(trans_biex))]
# Check transformers have been removed
trans_biex
```
<div style="line-height: 1.8em;"> Let's spend some time optimising the biexponential transformation for the PE channel. Since the parameters for the transformers are defined in the **flowWorkspace** package, we will need to refer to that documentation to optimise the transformers. Running `?cyto_transformer_biex` will open the help documentation that contains links to the relevant documents in **flowWorkspace**. After consulting these documents, it looks like we can use the `widthBasis` argument to fine-tune the transformation definition. The default value is set to -10, so let's try changing this to -100 and -1000 to see if it improves the visualisation of the data. </div>
```{r, eval = FALSE, echo = FALSE}
cyto_plot_save("Transformations-8.png",
height = 6,
width = 6)
gs_biex <- cyto_transform(cyto_copy(gs_linear[1]),
channels = "Va2",
type = "biex",
widthBasis = -10,
copy = TRUE,
plot = FALSE)
cyto_plot(gs_biex[[1]],
parent = "root",
channels = "Va2",
title = "widthBasis = -10")
cyto_plot_complete()
```
```{r, eval = FALSE}
# default PE transformer
PE_biex <- cyto_transformer_biex(gs,
channels = "PE-A",
widthBasis = -10)
```
```{r echo = FALSE, fig.align="center", out.width = '60%'}
knitr::include_graphics('Transformations/Transformations-8.png')
```
```{r, eval = FALSE, echo = FALSE}
cyto_plot_save("Transformations-9.png",
height = 6,
width = 6)
```
```{r, eval = FALSE, echo = FALSE}
cyto_plot_save("Transformations-9.png",
height = 6,
width = 6)
gs_biex <- cyto_transform(cyto_copy(gs_linear[1]),
channels = "Va2",
type = "biex",
widthBasis = -100,
copy = TRUE,
plot = FALSE)
cyto_plot(gs_biex[[1]],
parent = "root",
channels = "Va2",
title = "widthBasis = -100")
cyto_plot_complete()
```
```{r, eval = FALSE}
# PE transformer
PE_biex <- cyto_transformer_biex(gs,
channels = "PE-A",.
widthBasis = -100)
```
```{r echo = FALSE, fig.align="center", out.width = '60%'}
knitr::include_graphics('Transformations/Transformations-9.png')
```
```{r, eval = FALSE, echo = FALSE}
cyto_plot_save("Transformations-10.png",
height = 6,
width = 6)
```
```{r, eval = FALSE, echo = FALSE}
cyto_plot_save("Transformations-10.png",
height = 6,
width = 6)
gs_biex <- cyto_transform(cyto_copy(gs_linear[1]),
channels = "Va2",
type = "biex",
widthBasis = -1000,
copy = TRUE,
plot = FALSE)
cyto_plot(gs_biex[[1]],
parent = "root",
channels = "Va2",
title = "widthBasis = -1000")
cyto_plot_complete()
```
```{r, eval = FALSE}
# PE transformer
PE_biex <- cyto_transformer_biex(gs,
channels = "PE-A",.
widthBasis = -1000)
```
```{r echo = FALSE, fig.align="center", out.width = '60%'}
knitr::include_graphics('Transformations/Transformations-10.png')
```
<div style="line-height: 1.8em;"> Based on these plots and our understanding of the underlying biology, it looks like a widthBasis of -100 will provide a decent visualisation of the data. Before we can apply our transformers to the data. We will need to combine our transformers into a single list of transformers using `cyto_transformer_combine`. </div>
```{r, eval = FALSE}
# PE transformer
PE_biex <- cyto_transformer_biex(gs,
channels = "PE-A",.
widthBasis = -100)
# Combine transformer definitions
trans <- cyto_transformer_combine(trans_biex,
PE_biex,
trans_logicle["7-AAD-A"])
# All transformer definitions
trans
```
## Apply Manually-Optimised Transformers to Data
<div style="line-height: 1.8em;"> The complete list of transformers can be applied to the data using `cyto_transform`. It is important to optimise the transformers before using `cyto_transform` as it is not currently possible to inverse the transformations applied to a GatingSet. </div>
```{r, eval = FALSE}
# Apply transformers to data
gs <- cyto_transform(gs,
trans = trans)
```
## Apply Default Transformers to Data
<div style="line-height: 1.8em;"> `cyto_transform` can also apply batch transformers to the data using the flowWorkspace defaults. Simply indicate the type of transformation to apply through the `type` argument and `cyto_transform` will make a call to the relevant `cyto_transformer` function to obtain the transformer definitions using the defaults in flowWorkspace. These transformers will then be applied to the data within `cyto_transform` and the transformed data will be returned. It is important to note that transformer definitions cannot be customised on a per channel basis using this method. For this reason, it is recommended that the transformer definitions be optimised using the `cyto_transformer` functions prior to apply them to the data using `cyto_transform`. </div>
```{r, eval= FALSE}
# Activation GatingSet
gs <- cyto_setup("Activation-Samples")
# Apply compensation
gs <- cyto_compensate(gs)
# Apply transformers
gs <- cyto_transform(gs,
type = "logicle")
```
# Summary
<div style="line-height: 1.8em;"> Data transformations are essential to appropriate visualisation of cytometry data. **CytoExploreR** inherits support for the log, arcsinh, biexponential and logicle from the flowWorkspace package. Transformation parameters can be optimised using a `cyto_transformer` function, by altering the parameters in the underlying flowWorkspace function. Transformers can be applied to the data using `cyto_transform`. `cyto_transform` is also capable of applying batch transformers using the default flowWorkspace parameters. Flow cytometry users must remember to always apply the data transformations post compensation. </div>
<br>