---
title: "Introduction to `luajr`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to `luajr`}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
luajr allows you to run [Lua](https://www.lua.org) code from R.
Lua is a lightweight, simple, and fast scripting language that is used in a
variety of settings. The standard Lua interpreter is already reasonably fast,
but there is also a just-in-time compiler for Lua called
[LuaJIT](https://luajit.org) that is even faster. luajr uses LuaJIT.
This is **not** a guide to Lua or LuaJIT; it is a quick-start guide to luajr for
people who already know how to program in Lua. See the
[Lua web site](https://www.lua.org) for resources related to coding in Lua.
# Running Lua code: `lua()` and `lua_shell()`
```{r setup}
library(luajr)
```
To get a feel for luajr or to run "one-off" Lua code from your R project, use
`lua()` and `lua_shell()`.
When you pass a character string to `lua()`, it is run as Lua code:
```{r}
lua("return 'Hello ' .. 'world!'")
```
Assignments to global variables will persist between calls to `lua()`:
```{r}
lua("my_animal = 'walrus'")
lua("return my_animal")
```
This is because luajr maintains a "default Lua state" which holds all global
variables. This default Lua state is opened the first time a package function
is used. You can create your own, separate Lua states, or reset the default Lua
state (see [Lua states](#states), below).
Assignments to local variables will *not* persist between calls to `lua()`:
```{r}
lua("local my_animal = 'donkey'")
lua("return my_animal")
```
In this case, the second line returns `"walrus"` because the local variable
`my_animal` goes out of scope after the first call to `lua()` ends, so the
second call to `lua()` is referring back to the global variable `my_animal` from
before.
You can include more than one statement in the code run by `lua()`:
```{r}
lua("local my_veg = 'potato'; local my_dish = my_veg .. ' pie'; return my_dish")
```
You can also use the `filename` argument to `lua()` to load and run a Lua
source file, instead of running the contents of a string.
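As a rough sketch of the `filename` route, the chunk below writes a one-line Lua script to a temporary file and runs it. The script contents and the variable names `script` and `greeting` are made up for illustration, and the chunk assumes that `lua()` can be called with only `filename` supplied.

```{r}
library(luajr)

# Write a small, hypothetical Lua script to a temporary file.
script = tempfile(fileext = ".lua")
writeLines("return 'Hello from a file'", script)

# Run the file rather than a string of Lua code.
greeting = lua(filename = script)
print(greeting)

# Clean up the temporary file.
unlink(script)
```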
Call `lua_shell()` to open an interactive Lua shell at the R prompt. This can
be helpful for debugging or for testing Lua statements.
# Calling Lua functions from R: `lua_func()`
The key piece of functionality for luajr is probably `lua_func()`. This allows
you to call Lua functions from R.
The first argument to `lua_func()`, `func`, is a string that should evaluate to
a Lua function. `lua_func()` then returns an R function that can be used to call
that Lua function from R. For example, you can use `lua_func()` to access an
existing Lua function from R:
```{r}
luatype = lua_func("type")
luatype(TRUE)
```
Here, `"type"` is just referring to the built-in Lua function `type` which
returns a string describing the Lua type of the value passed to it. You can also
use `lua_func()` to refer to a previously defined function in the default Lua
state:
```{r}
lua("function squared(x) return x^2 end")
lua("return squared(4)")
sq = lua_func("squared")
sq(8)
```
Or you can use `lua_func()` to define an anonymous Lua function:
```{r}
timestwo = lua_func("function(x) return x*2 end")
timestwo(123)
```
Under the hood, `lua_func()` just takes its first parameter (a string), adds
`"return "` to the front of it, executes it as Lua code, and registers the
result as the function.
The second argument to `lua_func()`, `argcode`, is also very important.
`argcode` determines how the arguments passed to the function from R are
translated into Lua values for use inside the function.
The permissible arg codes are:
* `'s'`: **s**implest Lua type (the default)
* `'a'`: **a**rray type
* `'1'`: same as `'s'`, but require that the argument has length 1
* ...
* `'9'`: same as `'s'`, but require that the argument has length 9
* `'v'`: pass by **v**alue
* `'r'`: pass by **r**eference
The kinds of R values that can be passed to Lua functions, and their behaviour
under different arg codes, are summarized in the following table:
|R type |Example R value |arg code `'s'` |`'a'` |`'1'` |`'2'` |`'v'` |`'r'` |
|:--------------:|:--------------:|-----------------|-----------------|------------|-----------------|---------------------------------------------|-----------------------------------------------|
|`NULL` |`NULL` |`nil` |`nil` |`nil` |`nil` |`nil` |`nil` |
|`logical(1)` |`TRUE` |`true` |`{true}` |`true` |**error** |`luajr.logical({true})` |`luajr.logical_r({true})` |
|`integer(1)` |`1L` |`1` |`{1}` |`1` |**error** |`luajr.integer({1})` |`luajr.integer_r({1})` |
|`numeric(1)` |`3.14159` |`3.14159` |`{3.14159}` |`3.14159` |**error** |`luajr.numeric({3.14159})` |`luajr.numeric_r({3.14159})` |
|`character(1)` |`"howdy"` |`"howdy"` |`{"howdy"}` |`"howdy"` |**error** |`luajr.character({"howdy"})` |`luajr.character_r({"howdy"})` |
|`logical(nn)` |`c(TRUE, FALSE)`|`{true, false}` |`{true, false}` |**error** |`{true, false}` |`luajr.logical({true, false})` |`luajr.logical_r({true, false})` |
|`integer(nn)` |`1:2` |`{1, 2}` |`{1.0, 2.0}` |**error** |`{1.0, 2.0}` |`luajr.integer({1, 2})` |`luajr.integer_r({1, 2})` |
|`numeric(nn)` |`exp(0:1)` |`{1, 2.71828...}`|`{1, 2.71828...}`|**error** |`{1, 2.71828...}`|`luajr.numeric({1, 2.71828...})` |`luajr.numeric_r({1, 2.71828...})` |
|`character(nn)` |`letters[1:2]` |`{"a", "b"}` |`{"a", "b"}` |**error** |`{"a", "b"}` |`luajr.character({"a", "b"})` |`luajr.character_r({"a", "b"})` |
|`list(...)` |`list(1, b='b')`|`{[1]=1, b='b'}` |**error** |**error** |**error** |`x = luajr.list(); x[1] = luajr.numeric({1});`<br/>`x.b = luajr.character({'b'}); return x`|`x = luajr.list(); x[1] = luajr.numeric_r({1});`<br/>`x.b = luajr.character_r({'b'}); return x`|
|external pointer|`lua_open()` |`userdata:…` |`userdata:…` |`userdata:…`|`userdata:…` |`userdata:…` |`userdata:…` |
Above, `nn` stands for an integer that is greater than 1; in the examples, it
stands specifically for 2.
There should be one character in `argcode` for every argument of the function,
but the string is "recycled" when there are more arguments passed than
characters in the `argcode` string. So, for example, just passing `"s"` as
`argcode` means all parameters will be passed as the **s**implest Lua type,
while if `argcode` is `"sr"`, then the first argument has arg code `"s"`, the
second argument has arg code `"r"`, the third argument has arg code `"s"`, etc.
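To make the recycling rule concrete, here is a small sketch. The helper `arg_types` is hypothetical; it only reports the Lua type of each argument it receives. Going by the table above, with `argcode = "sa"` the first and third arguments should arrive as plain Lua numbers and the second as a table.

```{r}
library(luajr)

# Hypothetical helper: report the Lua type of each of three arguments.
arg_types = lua_func(
    "function(a, b, c) return type(a) .. ', ' .. type(b) .. ', ' .. type(c) end",
    "sa")

# "sa" is recycled to "s", "a", "s" for the three arguments.
types = arg_types(1, 2, 3)
print(types)
```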
When a vector (logical vector, integer vector, numeric vector, or character
vector) is passed from R to Lua by **r**eference, modifications made to the
elements of that passed-in vector persist back in the R calling frame. For
example:
```{r}
values = c(1.0, 2.0, 3.0)
keep = lua_func("function(x) x[1] = 999 end", "v") # passed by value
keep(values)
print(values)
change = lua_func("function(x) x[1] = 999 end", "r") # passed by reference
change(values)
print(values)
```
Vectors can never be resized by reference; only their already-existing elements
can be changed by reference.
Lists are always passed by value, but their vector elements can be passed
either by value or by reference, depending on the arg code. In order to make
changes to a list, it needs to be returned to the calling function. For example:
```{r}
x = list(1)
f1 = lua_func("function(x) x[1][1] = 999; x.a = 42; end", "v")
f1(x)
print(x)
f2 = lua_func("function(x) x[1][1] = 999; x.a = 42; end", "r")
f2(x)
print(x)
```
Note that in the examples above, we modify `x[1][1]`, not just `x[1]`. That is
because, even when arg code `'r'` is used, the list itself is passed by value;
only its vector elements are passed by reference. Setting `x[1] = 999` would
only reassign an element of the list `x`, so it would have no effect on the R
object passed in. By contrast, setting `x[1][1] = 999` modifies the passed-in
vector `x[1]`, which can be passed by reference, so this does change the R
object that is passed in.
To modify a list, we have to return its new value from the function:
```{r}
x = list(1)
f3 = lua_func("function(x) x[1][1] = 999; x.a = 42; return x; end", "v")
x = f3(x)
print(x)
f4 = lua_func("function(x) x[1] = luajr.numeric({888, 999}); return x; end", "v")
x = f4(x)
print(x)
```
Here, because we are returning a modified value, we can add elements to the list
(`f3`) and change whole entries of the list (`f4`).
# Benchmarking
In the same way that using R packages
[cpp11](https://CRAN.R-project.org/package=cpp11) or
[Rcpp](https://CRAN.R-project.org/package=Rcpp) allows you to
bridge your R code with C++ code that runs faster than R, you can use `luajr` to
bridge your R code with Lua code that runs faster than R.
In general, C/C++ code runs about 5-1,000 times faster than the equivalent R
code.
In my experience, luajr code often presents a similar speedup, about the same as
C/C++ code, or in the worst case, maybe half as fast. Sometimes luajr code can
be faster than C/C++, though usually it isn't quite as good.
Why use luajr then? Rcpp and cpp11 require a C++ toolchain (e.g. gcc or clang)
and can involve long compilation times, whereas luajr does not. This means
that luajr is usable when a C++ compiler isn't available, or when compilation
times are prohibitive or an annoyance.
In the following sections, we look at two aspects of benchmarking. In the first
example, we compare different ways of passing vectors to Lua functions relative
to R. In the second example, we compare more fundamentally the difference in
running a whole algorithm in Lua versus R.
## Parameter passing example: sum of squares
For reasonably long vectors, passing by reference (`'r'`) is faster than passing
by value (`'v'`), which is faster than passing by simplify (`'s'`, `'a'`, or
`'1'`–`'9'`). But for relatively short vectors, passing by simplify can
avoid some overhead, and for very simple operations on very short vectors, it
might not be worth using Lua at all.
To illustrate the above points, here is some code that takes a numeric vector
and calculates the sum of squares, comparing a pure R function with three
alternatives written in Lua, respectively passing the vector by reference, by
value, and by simplify.
```{r}
v1 = rnorm(1e1)
v4 = rnorm(1e4)
v7 = rnorm(1e7)
lua("sum2 = function(x) local s = 0; for i=1,#x do s = s + x[i]*x[i] end; return s end")
sum2 = function(x) sum(x*x)
sum2_r = lua_func("sum2", "r")
sum2_v = lua_func("sum2", "v")
sum2_s = lua_func("sum2", "s")
# Comparing the results of each function:
sum2(v1) # Pure R version
sum2_r(v1) # luajr pass-by-reference
sum2_v(v1) # luajr pass-by-value
sum2_s(v1) # luajr pass-by-simplify
```
The time taken to sum `v1`, `v4` and `v7`, depending upon the function kind, on
a 2019-era MacBook Pro is summarised in this table:
|Call |Number of summands | Method | Median runtime |
|:---------------|:-----------------:|:-------------------------:|---------------:|
|`sum2(v1)` | 10 | Pure R | **1 µs** |
|`sum2_r(v1)` | 10 | `luajr` pass by reference | 6 µs |
|`sum2_v(v1)` | 10 | `luajr` pass by value | 7 µs |
|`sum2_s(v1)` | 10 | `luajr` pass by simplify | 4 µs |
| | | | |
|`sum2(v4)` | 10,000 | Pure R | 63 µs |
|`sum2_r(v4)` | 10,000 | `luajr` pass by reference | **18 µs** |
|`sum2_v(v4)` | 10,000 | `luajr` pass by value | 93 µs |
|`sum2_s(v4)` | 10,000 | `luajr` pass by simplify | 78 µs |
| | | | |
|`sum2(v7)` | 10,000,000 | Pure R | 49,000 µs |
|`sum2_r(v7)` | 10,000,000 | `luajr` pass by reference | **13,000 µs** |
|`sum2_v(v7)` | 10,000,000 | `luajr` pass by value | 96,000 µs |
|`sum2_s(v7)` | 10,000,000 | `luajr` pass by simplify | 124,000 µs |
When the vector is 10 elements long, the R version wins handily (1 µs versus
4 to 7 µs for the Lua versions).
When the vector is 10,000 elements long, the pass-by-reference Lua version is
fastest, at roughly 3.5 times the speed of pure R, while the other methods are
all comparable to one another.
When the vector is 10,000,000 elements long, the story is similar.
This is an example where luajr doesn't add that much speed—the function
is relatively short, and there is a certain amount of overhead in invoking it,
while the R function doesn't have the overhead of transferring control between
languages.
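The timing table in this section reports median runtimes, but does not say how they were measured. One way to collect comparable numbers, assuming the [microbenchmark](https://CRAN.R-project.org/package=microbenchmark) package is installed, is sketched below. The choice of tool and of `times = 100` is an assumption, not necessarily what was used for the table, and results will vary by machine. The function definitions are repeated so the chunk runs on its own.

```{r}
library(luajr)

# Same definitions as earlier in this section.
lua("sum2 = function(x) local s = 0; for i=1,#x do s = s + x[i]*x[i] end; return s end")
sum2 = function(x) sum(x*x)
sum2_r = lua_func("sum2", "r")
v4 = rnorm(1e4)

# Only run the timing if microbenchmark is available (an assumed tool).
if (requireNamespace("microbenchmark", quietly = TRUE)) {
    timings = microbenchmark::microbenchmark(
        sum2(v4), sum2_r(v4),
        times = 100
    )
    # Show the median runtime for each expression.
    print(summary(timings)[, c("expr", "median")])
}
```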
## Processing time example: logistic map
Consider the following:
```{r}
logistic_map_R = function(x0, burn, iter, A)
{
    result_x = numeric(length(A) * iter)
    j = 1
    for (a in A) {
        x = x0
        for (i in 1:burn) {
            x = a * x * (1 - x)
        }
        for (i in 1:iter) {
            result_x[j] = x
            x = a * x * (1 - x)
            j = j + 1
        }
    }
    return (list2DF(list(a = rep(A, each = iter), x = result_x)))
}

logistic_map_L = lua_func(
"function(x0, burn, iter, A)
    local dflen = #A * iter
    local result = luajr.dataframe()
    result.a = luajr.numeric_r(dflen, 0)
    result.x = luajr.numeric_r(dflen, 0)
    local j = 1
    for k,a in pairs(A) do
        local x = x0
        for i = 1, burn do
            x = a * x * (1 - x)
        end
        for i = 1, iter do
            result.a[j] = a
            result.x[j] = x
            x = a * x * (1 - x)
            j = j + 1
        end
    end
    return result
end", "sssr")
```
Here we are comparing two different versions (R versus Lua) of running a
parameter sweep of the logistic map, a chaotic dynamical system popularized by
Bob May in a [1976 Nature article](https://sites.ifi.unicamp.br/aguiar/files/2014/10/May-Nature-1976.pdf).
The output looks like this:
```{r}
logistic_map = logistic_map_L(0.5, 50, 100, 200:385/100)
plot(logistic_map$a, logistic_map$x, pch = ".")
```
The times taken by each function on a 2019-era MacBook Pro are as follows:
|Call | Method | Median runtime |
|---------------------------------------------|-------------------|---------------:|
|`logistic_map_R(0.5, 50, 100, 200:385/100)` | R function | 1900 µs |
|`logistic_map_L(0.5, 50, 100, 200:385/100)` | Lua function | **200 µs** |
The version written in Lua is around 10 times faster than the version in R.
The speedup was much more notable in an earlier test where the R version first
created the data frame and then performed the iteration, i.e. with the line
`result$x[j] = x` instead of `result_x[j] = x`. The median runtime for that R
version was two orders of magnitude slower; the extra overhead associated with
`data.frame` methods was pointed out by Tim Taylor.
# Working with Lua States: `lua_open()`, `lua_reset()` {#states}
All the functions mentioned above (`lua()`, `lua_shell()`, and `lua_func()`)
can also take an argument `L` that specifies a particular Lua state that the
function operates in.
When `L = NULL` (the default) the functions operate on the default Lua state.
But you can also open alternative Lua states using `lua_open()`, and then by
passing the result as the parameter `L`, specify that the function operates in
that specific state. For example:
```{r}
L1 = lua_open()
lua("a = 2")
lua("a = 4", L = L1)
lua("return a")
lua("return a", L = L1)
```
There is no `lua_close` in luajr because Lua states are closed automatically
when they are garbage collected in R.
`lua_reset()` resets the default Lua state:
```{r}
lua("a = 2")
lua("return a")
lua_reset()
lua("return a")
#> NULL
```
To reset a non-default Lua state `L` returned by `lua_open()`, just do
`L = lua_open()` again. The memory previously used by `L` will be cleaned up at
the next garbage collection.
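A minimal sketch of resetting a non-default state in this way follows. The names `L2`, `b`, `before`, and `after` are chosen only for illustration. After the second `lua_open()`, the global `b` should no longer be defined in `L2`.

```{r}
library(luajr)

# Open a separate state and define a global in it.
L2 = lua_open()
lua("b = 10", L = L2)
before = lua("return b", L = L2)

# "Reset" by replacing the state with a fresh one.
L2 = lua_open()
after = lua("return b", L = L2)

print(before)
print(after)
```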
# Further reading
For notes on the `luajr` Lua module -- which contains functions and types for
interacting with R from Lua code -- see `vignette("luajr-module")`.