-
Notifications
You must be signed in to change notification settings - Fork 1
/
NMsim-basics.Rmd
358 lines (280 loc) · 14.9 KB
/
NMsim-basics.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
---
title: "Simulation of New Subjects"
output:
rmarkdown::html_vignette:
toc: true
code_folding: show
Suggests: markdown
VignetteBuilder: knitr
vignette: >
%\VignetteIndexEntry{01BASIC}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
header-includes:
- \usepackage{ae}
---
```{r,include = FALSE}
##knitr::opts_chunk$set(dev = "cairo_pdf")
knitr::opts_chunk$set(
collapse = TRUE
,comment = "#>"
,fig.width=7
,cache=FALSE
)
## this changes data.table syntax. I think we can do without.
## knitr::opts_chunk$set(tidy.opts=list(width.cutoff=60), tidy=TRUE)
```
```{r,setup,include=F}
## library(devtools)
## unloadNamespace("NMsim")
## unloadNamespace("NMdata")
## load_all("~/wdirs/NMdata")
## load_all()
library(NMsim)
library(data.table)
library(NMdata)
## library(dplyr)
## library(tibble)
library(ggplot2)
library(patchwork)
library(tracee)
## library(tidyr)
library(fst)
library(knitr)
## NMdataConf(path.nonmem="/opt/NONMEM/nm75/run/nmfe75")
## NMdataConf(path.nonmem="/opt/nonmem/nm751/run/nmfe75")
## NMdataConf(dir.psn=NULL)
theme_set(theme_bw())
this.script <- "NMsim-simulate.Rmd"
writeOutput <- TRUE
file.project <- function(...)file.path(system.file("examples",package="NMsim"),...)
## file.project <- function(...)file.path("../inst/examples",...)
## file.project <- function(...)file.path("~/wdirs/NMsim/inst/examples",...)
found.files <- list.files(file.project("nonmem/NMsim"),pattern="noname\\.(lst|xml|ext|cov|cor|coi|phi|msf|msfi|msfo|tab)",full.names=TRUE)
unlink(found.files)
run.simuls <- FALSE
NMdataConf(as.fun="data.table")
NMdataConf(dir.sims="~/NMsim_vignette",
dir.res="simulate-results",
allow.unknown=TRUE ## necessary for dir.sims and dir.res
## until NMdata 0.1.5
)
```
Built `r Sys.Date()` using NMsim `r packageVersion("NMsim")`.
## Objectives
This vignettes aims at enabling you to
* Use `NMsim` to simulate Nonmem models with a given input data set
* Distinguish between and perform the most common types of simulations:
- new subjects,
* Simulate multiple new subjects and derive prediction intervals
* Simulate more than one Nonmem model in one `NMsim()` function call
* Important arguments
* Speed up NMsim by avoiding large table statements
* Add residual variability to an already performed model simulation
using `NMsim`.
## Prerequisites
You should have configured `NMsim` with the path to the Nonmem
installation and maybe also PSN. See
[`NMsim-config.html`](https://philipdelff.github.io/NMsim/articles/NMsim-config.html). Don't
worry - it is very easy.
## Estimation based on single dose, simulation of multiple doses
The situation is like this: We collected PK and PD data from a single ascending dose
trial on a drug candidate. A PK model was estimated using Nonmem. We have on file
the model input and output control streams (here with extensions `.mod`
and `.lst` respetively), parameter estimates (`.ext`).
<!-- The following plot shows the data for no other reason than pointing out clearly that the data informing the model was single dose, and we will be simulating a different dosing regimen. -->
```{r,include=FALSE,echo=FALSE}
file.mod <- file.project("nonmem/xgxr021.mod")
res <- NMscanInput(file.mod,quiet=TRUE)
ggplot(res[!is.na(DV)],aes(TIME,DV))+
geom_point()+
facet_wrap(~trtact)
```
We want to predict concentrations in a multiple dose regimen. This is a regimen that we have not
studied in clinical trials so far, and we have decided to use
population PK simulations for this purpose.
### The Simulation data set
You can create a Nonmem-compatible simulation data set however you want. We will keep that as a separate topic and read one that was already created in [`NMsim-DataCreate.html`](https://philipdelff.github.io/NMsim/articles/NMsim-DataCreate.html) using tools provided with NMsim to make that task simpler and faster to do:
```{r}
dat.sim <- read_fst(path="simulate-results/dat_sim.fst")
```
## Simulation of a new subject
This is the first time we are using `NMsim`, and we just want to try
the simplest thing we can think of. Simulate a new subject on the
considerd multiple dose regimen with our estimated PK model from the
single dose study.
```{r,sim-simplest,eval=FALSE}
file.mod <- file.project("nonmem/xgxr021.mod")
simres <- NMsim(file.mod=file.mod,
data=dat.sim)
```
```{r,include=FALSE,eval=TRUE}
simres <- NMreadSim("simulate-results/NMsim_xgxr021_noname_paths.rds")
```
We plot population and individual predictions as the simulations of
(in this case) the typical subject and one simulated subject. The variable called `Y` is the individual prediction plus residual variability.
paper. The code is included to show that the results from `NMsim` are ready to be plotted. The main reason data is transformed to long format (`melt`) is to get ggplot2 to generate the legend automatically.
```{r,eval=TRUE}
datl <- as.data.table(simres) |>
melt(measure.vars=cc(PRED,IPRED,Y))
ggplot(datl,aes(TIME,value,colour=variable))+
geom_line(data=function(x)x[variable!="Y"])+
geom_point(data=function(x)x[variable=="Y"])+
labs(x="Hours since first dose",y="Concentration (ng/mL)")
```
The reason we can plot a simulation with residual variability is that the control stream includes a variable `Y` defined with residual variability in `$ERROR`:
```
Y=F+F*ERR(1)+ERR(2)
```
More on residual variability in case you don't have such a line later
in this paper.
## What happened?
`NMsim` uses automation tools from `NMdata` to
* Save the data in a Nonmem-friendly format
* Create a simulation control stream based on the referenced input control stream
* Update the initial values based with estimated values
* Modify input data-related sections for reading input siulation data
* Modify output table file names and paths to generate simulation output tables
* Run Nonmem
* Read output tables and combine them with input data into one data object
The generated files have all been stored in a folder called `NMsim` next to the estimation control stream. More on that shortly in the section "A few basic additional arguments to `NMsim`".
Let's see the first few lines of the returned object:
```{r}
simres[1:3]
```
Notice a few things about the returned data:
* All columns from the output tables defined in the input control stream are there. We will soon learn how to modify this
* Input data columns are there too (like `trt`).
* Additional columns (`model` and `nmout`) may be familiar to `NMdata` users.
* We will soon learn where the "_noname" in the `model` column comes from.
## How to (re-)read the simulation results
`NMsim()` creates an `.rds` file with information about where all the results are stored. It is named based on the model name and the `name.sim` argument. NMsim provides the path to it in the console at every run. You can also specify exactly where it should be stored and the file name using the `file.res` argument. Run the function `NMreadSim` on this file to (re-)read all the simulation results.
`NMreadSim` also supports the `wait` argument making it wait for all the simulation results to be available in case you submitted a large simulation to a cluster and want to continue your execution only when it's all done.
## Multiple models
Before we continue with that model, we want to compare a simulation based on this model to another model we are considering. `NMsim` can do this and collect the data into one object:
```{r,sim-twomodels,eval=FALSE}
files.2.mod <- file.project(c("nonmem/xgxr021.mod","nonmem/xgxr114.mod"))
simres.2models <- NMsim(file.mod=files.2.mod,
data=dat.sim,
file.res="simulate-results/simres_2models_paths.rds"
)
```
We included the `file.res` argument to specify one single `rds` file because if not NMsim would create one for each of the two models simulated. We can re-read the results like this:
```{r,include=TRUE,eval=TRUE}
simres.2models <- NMreadSim("simulate-results/simres_2models_paths.rds")
```
In case multiple models are provided, `NMsim` simply loops over them. It does collect all the results, and we can use the `model` column to separate the two simulations. Since we are so far just simulating on subject with each model, it makes litlle sense to compare individual preditions. We just plot the population prediction (`PRED`):
```{r,eval=TRUE}
ggplot(simres.2models,aes(TIME,PRED,colour=model))+geom_line()+
facet_wrap(c("trt"),scales="free")
```
For simplicity, we shall show the rest of the examples for just one
model. Any of them could be run on multiple models the same way as
shown above.
<!-- subproblems vs repetition of input data -->
## A few basic additional arguments to `NMsim`
The first couple of examples were run with the bare minimum of
arguments - estimation control stream and simulation data set. You are
obviously encouraged to read the help of `NMsim` to learn about the
many useful features it has, and you will learn some more in this
vignette. But there are a few arguments that you should learn about at
this point already. Here they are:
* `dir.sims` Path to a folder in which all generated files will be
stored. Use this to avoid `NMsim` will write into the directories
where your estimation models are. They may not belong there at
all. You may want to separate the model development step from the
post-processing step. You are encouraged to explore what `NMsim`
leaves in this directory (you will find fully reproducible
simulation Nonmem runs including simulation input data).
* `dir.res` while `dir.sims` contains all the Nonmem files, you can
specify a separate directory just for compressed results. The `rds`
files containing information about where the simulations were
performed will also be saved here. This means that as soon as
results have been read once, all contents of `dir.sims` can be
purged without loss of critical data. This can save a lot of disk
space.
* `name.sim` Give your simulation a meaningfull name. We did not do
this above, so `NMsim` called it "noname".
* `table.vars` Very important. This redefines the output table section
from the estimation control stream to the simulation control
stream. The estimation control stream may have too many variables
printed (which will make Nonmem slow), or it may not have some that
are useful for the simulation analysis. See how this is used
below. Once you get used to this argument, you will use it very
frequently.
* `wait` Wait for simulation to be done and return the resulting data?
If not a path to the rds file to be read with `NMreadSim` will be
returned.
* `reuse.results` If `TRUE` and results are found on file, those will
be read instead of rerunning the simulation.
* `seed` A numeric value that will be used in Nonmem's $SIMULATION section
You will learn about a few more arguments in the next examples.
## More subjects and prediction intervals
To create a prediction interval based on the selected model, we need
to simulate multiple new subjects. There are two ways to easily obtain
that. One is to repeat (`rbind`) the simulation input dataset, one
repetetion per new subject, and then update the `ID` column to get
distinct subjects.
### Multiple subjects created in simulation input data
The follwing shows how one could generate 1000 subjects using
`data.table`. (I use `data.table` a lot, if you can provide a good way
to do this without, I am happy to include that).
```{r,eval=FALSE,collapse=TRUE,class.source = "fold-hide"}
dat.sim.1000 <- NMdata::egdt(
as.data.table(dat.sim)[,!("ID")]
,
data.table(ID=1:1000)
)
dat.sim.1000[,ID:=.GRP,by=.(ID,trt)]
## order with respect to new IDs
setorder(dat.sim.1000,trt,ID,TIME,EVID)
## check dataset
NMcheckData(dat.sim.1000,type.data="sim")
```
We now simulate 1000 subjects by plugging in this data object:
```{r,sim-1000-data,eval=FALSE}
simres.n1000.1 <- NMsim(file.mod=file.mod,
data=dat.sim.1000,
dir.sims="~/NMsim_vignette", ## where to store simulation files
name.sim="N1000_datarep"
)
```
```{r,include=FALSE,eval=TRUE}
simres.n1000.1 <- NMreadSim("simulate-results/NMsim_xgxr021_N1000_datarep_paths.rds")
```
### Multiple subjects generated by Nonmem
The other way to simulate multiple subjects is making use of Nonmem's `SUBPROBLEMS` simulation feature which makes Nonmem rerun the simulation the specified number of times. Notice that to do this, we use the `dat.sim` data without the 1000 replications. We then make use of the NMREP column generated by `NMdata::NMscanData` to redefine the `ID` column:
```{r,sim-1000-subproblem,eval=FALSE}
simres.n1000.2 <- NMsim(file.mod=file.mod,
data=dat.sim,
subproblems=1000,
dir.sims="~/NMsim_vignette", ## where to save and run Nonmem simulations
dir.res="simulate-results", ## where to save simulation results
name.sim="N1000_subproblems",
)
simres.n1000.2 <- as.data.table(simres.n1000.2)[,ID:=.GRP,by=.(NMREP,ID,trt)]
```
```{r,include=FALSE,eval=TRUE}
simres.n1000.2 <- NMreadSim("simulate-results/NMsim_xgxr021_N1000_subproblems_paths.rds")
```
The two approaches are computationally about equally fast, the most significant difference probably being in Nonmem reading a smaller or larger simulation input data file. Unless the input dataset becomes very large, it is merely a question of preference of the modeler which one to use. In a case where the simulated patients need different dosing or sample schedules, the manual construction of the data is needed - because it's not a straightforward replication.
### The prediction interval
We now plot a prediction interval - in this case based on the results of the simulation using `SUBPROBLEMS`; this makes no difference to how to derive the prediction interval.
```{r,eval=TRUE}
simres.pi <- as.data.table(simres.n1000.2)[,setNames(as.list(quantile(IPRED,probs=c(.05,.5,.95))),cc(ll,median,ul)),
by=.(trt,TIME)]
simres.pi$type <- "pi"
simres.pi$pi.cover <- "90%"
p.pi.typ <- ggplot(simres.pi,aes(TIME,fill=trt))+
geom_ribbon(aes(ymin=ll,ymax=ul,alpha=pi.cover))+
geom_line(aes(y=median,colour=trt))+
scale_alpha_manual(values=c("90%"=.5))+
labs(x="Hours since first dose",y="Concentration (ng/mL)")
p.pi.typ
```
## Read previously generated simulations
There is no need to save simulation results because they are already saved by `NMsim`. Instead, use arguments `dir.sims`, `dir.res` and `name.sim` to make sure to get a meaningful structure for the generated files. Then read the results with `NMreadSim()`.
```{r,eval=FALSE}
simres.n1000.1 <- NMreadSim("simulate-results/NMsim_xgxr021_N1000_datarep_paths.rds")
```
In fact, that is also what `NMsim` does once Nonmem has run.