/
gc10_using_make_design.Rmd
305 lines (241 loc) · 11.5 KB
/
gc10_using_make_design.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
---
title: "Using make_design to generate experimental designs"
author: "Mike Blazanin"
output:
rmarkdown::html_vignette:
toc: true
toc_depth: 4
vignette: >
%\VignetteIndexEntry{Using make_design to generate experimental designs}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r global options, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
knitr::opts_knit$set(root.dir = tempdir())
```
# Where are we so far?
1. Introduction: `vignette("gc01_gcplyr")`
2. Importing and reshaping data: `vignette("gc02_import_reshape")`
3. Incorporating experimental designs:** `vignette("gc03_incorporate_designs")`
4. Pre-processing and plotting your data: `vignette("gc04_preprocess_plot")`
5. Processing your data: `vignette("gc05_process")`
6. Analyzing your data: `vignette("gc06_analyze")`
7. Dealing with noise: `vignette("gc07_noise")`
8. Best practices and other tips: `vignette("gc08_conclusion")`
9. Working with multiple plates: `vignette("gc09_multiple_plates")`
10. **Using make_design to generate experimental designs: `vignette("gc10_using_make_design")`**
In `vignette("gc03_incorporate_designs")`, we focused on importing designs from files, since that's the most common way of creating designs. Here, we're going to show how designs can alternatively be generated within `R` using the `gcplyr` function `make_design`.
If you haven't already, load the necessary packages.
```{r setup}
library(gcplyr)
```
# Including design elements
As a reminder, `gcplyr` enables incorporation of design elements in two ways:
1. Designs can be imported from files
2. Designs can be generated in `R` using `make_design`
For generating designs in `R`, `make_design` can create:
* block-shaped data.frames with your design information (for saving to files)
* tidy-shaped data.frames with your design information (for saving to files and merging with tidy-shaped data)
## An example with a single design
Let's start with a simple design.
Imagine you have a 96 well plate (12 columns and 8 rows) with a different bacterial strain in each row, leaving the first and last rows and columns empty.
Row names | Column 1 | Column 2 | Column 3 | ... | Column 11 | Column 12
--------- | -------- | -------- | -------- | --- | --------- | --------
Row A | Blank | Blank | Blank | ... | Blank | Blank
Row B | Blank | Strain #1 | Strain #1 | ... | Strain #1 | Blank
Row B | Blank | Strain #2 | Strain #2 | ... | Strain #2 | Blank
... |... | ... | ... | ... | ... | ...
Row G | Blank | Strain #5 | Strain #5 | ... | Strain #5 | Blank
Row G | Blank | Strain #6 | Strain #6 | ... | Strain #6 | Blank
Row H | Blank | Blank | Blank | ... | Blank | Blank
Typing a design like this manually into a spreadsheet can be tedious. But generating it with `make_design` is easier.
`make_design` first needs some general information, like the `nrows` and `ncols` in the plate, and the `output_format` you'd like (typically `blocks` or `tidy`).
Then, for each different design component, `make_design` needs five different pieces of information:
* a vector containing the possible values
* a vector specifying which rows these values should be applied to
* a vector specifying which columns these values should be applied to
* a string or vector of the pattern of these values
* a Boolean for whether this pattern should be filled byrow (defaults to TRUE)
```{r}
my_design_blk <- make_design(
output_format = "blocks",
nrows = 8, ncols = 12,
Bacteria = list(c("Str1", "Str2", "Str3", "Str4", "Str5", "Str6"),
2:7,
2:11,
"123456",
FALSE)
)
```
So for our example above, we can see:
* the possible values are `c("Strain 1", "Strain 2", "Strain 3", "Strain 4", "Strain 5", "Strain 6")`
* the rows these values should be applied to are `2:7`
* the columns these values should be applied to are `2:11`
* the pattern these values should be filled in by is `"123456"`
* and these values should *not* be filled by row (they should be filled by column)
```{r}
my_design_blk
```
This produces a `data.frame` with `Bacteria` as the `block_name` in the metadata. If we save this design to a file or transform it to tidy-shaped, this `block_name` metadata will come in handy.
## A few notes on the pattern
The pattern in `make_design` is flexible to make it easy to input designs.
**The "0" character is reserved for `NA` values**, and can be put into your pattern anywhere you'd like to have the value be `NA`
```{r}
my_design_blk <- make_design(
output_format = "blocks",
nrows = 8, ncols = 12,
Bacteria = list(c("Str1", "Str2", "Str3",
"Str4", "Str5", "Str6"),
2:7,
2:11,
"123056",
FALSE)
)
my_design_blk
```
In the previous examples, I used the numbers 1 through 6 to correspond to our values. If you have more than 9 values, you can use letters too. By default, the order is numbers first, then uppercase letters, then lowercase letters (so "A" is the 10th index). However, if you'd like to only use letters, you can simply specify a different `lookup_tbl_start` so that `make_design` knows what letter you're using as the `1` index.
```{r}
my_design_blk <- make_design(
output_format = "blocks",
nrows = 8, ncols = 12, lookup_tbl_start = "A",
Bacteria = list(
c("Str1", "Str2", "Str3", "Str4", "Str5", "Str6"),
2:7,
2:11,
"ABCDEF",
FALSE)
)
```
You can also specify the pattern as a vector rather than a string.
```{r}
my_design_blk <- make_design(
output_format = "blocks",
nrows = 8, ncols = 12,
Bacteria = list(
c("Str1", "Str2", "Str3", "Str4", "Str5", "Str6"),
2:7,
2:11,
c(1,2,3,4,5,6),
FALSE)
)
```
## Continuing with the example: multiple designs
Now let's return to our example growth curve experiment. *In addition* to having a different bacterial strain in each row, we now also have a different media in each column of the plate.
Row names | Column 1 | Column 2 | Column 3 | ... | Column 11 | Column 12
--------- | -------- | -------- | -------- | --- | --------- | --------
Row A | Blank | Blank | Blank | ... | Blank | Blank
Row B | Blank | Media #1 | Media #2 | ... | Media #10 | Blank
... |... | ... | ... | ... | ... | ...
Row G | Blank | Media #1 | Media #2 | ... | Media #10 | Blank
Row H | Blank | Blank | Blank | ... | Blank | Blank
We can generate both designs with `make_design`:
```{r}
my_design_blk <- make_design(
output_format = "blocks",
nrows = 8, ncols = 12, lookup_tbl_start = "a",
Bacteria = list(c("Str1", "Str2", "Str3",
"Str4", "Str5", "Str6"),
2:7,
2:11,
"abcdef",
FALSE),
Media = list(c("Med1", "Med2", "Med3",
"Med4", "Med5", "Med6",
"Med7", "Med8", "Med9",
"Med10", "Med11", "Med12"),
2:7,
2:11,
"abcdefghij")
)
my_design_blk
```
However, the real strength of `make_design` is that it is not limited to simple alternating patterns. `make_design` can use irregular patterns too, replicating them as needed to fill all the wells.
```{r}
my_design_blk <- make_design(
output_format = "blocks",
nrows = 8, ncols = 12, lookup_tbl_start = "a",
Bacteria = list(c("Str1", "Str2"),
2:7,
2:11,
"abaaabbbab",
FALSE),
Media = list(c("Med1", "Med2", "Med3"),
2:7,
2:11,
"aabbbc000abc"))
my_design_blk
```
There is also an optional helper function called `make_designpattern`, or `mdp` for short. `make_designpattern` just reminds us what arguments are necessary for each design. For example:
```{r}
my_design_blk <- make_design(
output_format = "blocks",
nrows = 8, ncols = 12, lookup_tbl_start = "a",
Bacteria = mdp(
values = c("Str1", "Str2", "Str3",
"Str4", "Str5", "Str6"),
rows = 2:7, cols = 2:11, pattern = "abc0ef",
byrow = FALSE),
Media = mdp(
values = c("Med1", "Med2", "Med3",
"Med4", "Med5", "Med6",
"Med7", "Med8", "Med9",
"Med10", "Med11", "Med12"),
rows = 2:7, cols = 2:11, pattern = "abcde0ghij"))
my_design_blk
```
**For merging our designs with plate reader data, we need it tidy-shaped**, so we just need to change the `output_format` to `tidy`.
```{r}
my_design_tdy <- make_design(
output_format = "tidy",
nrows = 8, ncols = 12, lookup_tbl_start = "a",
Bacteria = mdp(
values = c("Str1", "Str2", "Str3",
"Str4", "Str5", "Str6"),
rows = 2:7, cols = 2:11, pattern = "abc0ef",
byrow = FALSE),
Media = mdp(
values = c("Med1", "Med2", "Med3",
"Med4", "Med5", "Med6",
"Med7", "Med8", "Med9",
"Med10", "Med11", "Med12"),
rows = 2:7, cols = 2:11, pattern = "abcde0ghij"))
head(my_design_tdy, 20)
```
## Saving designs to files
If you'd like to save the designs you've created with `make_design` to files, you just need to decide if you'd like them tidy-shaped or block-shaped. Both formats can easily be read back into `R` by `gcplyr`.
### Saving tidy-shaped designs
These design files will be less human-readable, but easier to import and merge. Additionally, tidy-shaped files are often better for data repositories, like Dryad. To save tidy-shaped designs, simply use the built-in `write.csv` function.
```{r}
#See the previous section where we created my_design_tdy
write.csv(x = my_design_tdy, file = "tidy_design.csv",
row.names = FALSE)
```
### Saving block-shaped designs
These design files will be more human-readable but slightly more computationally involved to import and merge. For these, use the `gcplyr` function `write_blocks`. Typically, you'll use `write_blocks` to save files in one of two formats:
* `multiple` - each block will be saved to its own `.csv` file
* `single` - all the blocks will be saved to a single `.csv` file, with an empty row in between them
#### Saving block-shaped designs to multiple files
The default setting for `write_blocks` is `output_format = 'multiple'`. This creates one `csv` file for each block. If we set `file = NULL`, the default is to name the files according to the `block_names` in the metadata.
```{r}
# See the previous section where we created my_design_blk
write_blocks(my_design_blk, file = NULL)
# Let's see what the files look like
print_df(read.csv("Bacteria.csv", header = FALSE, colClasses = "character"))
print_df(read.csv("Media.csv", header = FALSE, colClasses = "character"))
```
#### Saving block-shaped designs to a single file
The other setting for `write_blocks` is `output_format = 'single'`. This creates a single `csv` file that contains all the blocks, putting metadata like `block_names` in rows that precede each block.
Let's take a look what the `single` output format looks like:
```{r}
# See the previous section where we created my_design_blk
write_blocks(my_design_blk, file = "Design.csv", output_format = "single")
# Let's see what the file looks like
print_df(read.csv("Design.csv", header = FALSE, colClasses = "character"))
```
Here we can see all our design information has been saved to a single file, and the metadata has been added in rows before each block.
# Merging growth curve data with designs
Once we have both our design and data in `R` and tidy-shaped, we can merge them just the same way as described in `vignette("gc03_incorporate_designs")`