Basic vignette
richfitz committed Mar 12, 2015
1 parent e780fb9 commit 6bd0ef6
Showing 5 changed files with 243 additions and 1 deletion.
2 changes: 2 additions & 0 deletions .Rbuildignore
@@ -6,3 +6,5 @@ rrlite_.*.tar.gz
^.travis.yml$
extra
^docker$
^.*\.Rproj$
^\.Rproj\.user$
1 change: 1 addition & 0 deletions .gitignore
@@ -7,3 +7,4 @@ inst/rlite.COPYING
notes.md
extra/redis-doc
docker/
inst/doc
4 changes: 3 additions & 1 deletion DESCRIPTION
@@ -9,5 +9,7 @@ License: BSD_2_clause + file LICENSE
LazyData: true
Author: Rich FitzJohn
Maintainer: Rich FitzJohn <rich.fitzjohn@gmail.com>
Suggests: testthat
Suggests: testthat,
knitr
Imports: R6
VignetteBuilder: knitr
8 changes: 8 additions & 0 deletions README.md
@@ -5,6 +5,14 @@

R interface to rlite https://github.com/seppo0010/rlite

# Installation

Because the source currently contains a submodule, the usual approach with `devtools::install_github` does not work, but this does:

```
devtools::install_git("https://github.com/richfitz/rrlite", args="--recursive")
```

# Usage

## High level
229 changes: 229 additions & 0 deletions vignettes/rrlite.Rmd
@@ -0,0 +1,229 @@
---
title: "rrlite introduction"
author: "Rich FitzJohn"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{rrlite introduction}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

What is this thing?

- `rrlite` is an R interface to the [`rlite`](https://github.com/seppo0010/rlite) library.
- `rlite` is a self-contained, serverless, zero-configuration, transactional redis-compatible database engine. `rlite` is to `Redis` what SQLite is to SQL.
- `Redis` is a *data structures* server; at the simplest level it can be used as a key-value store, but it can store other data types (hashes, lists, sets and more).

`rrlite` makes it easy to use `Redis` features without having to install and run a server just to save bits of data. It can persist data to disk, or it can keep data in ephemeral in-memory storage. The package will mostly be useful to package developers who need to store reasonable amounts of data, and to people developing HPC applications where the data to be addressed is more than can fit into memory.

```{r}
library(rrlite)
```

# Interfaces

There are three levels of interface built into the package, with the idea that these provide different levels of abstraction to build off.

* Highest level `rdb`: This is an example of what one might build on top of `rrlite`. It provides a very simple key/value store for saving and restoring any R object that can be serialised.
* Medium level `rlite`: This is the main interface. It provides access to `r length(ls(rlite())) - 1L` `Redis` commands as a set of user-friendly R functions that do basic error checking. This is the level that new applications would be built from, and `rdb` is built on top of `rlite`.
* Lowest level `rlite_context`: This is direct access to the `rlite` library; commands are passed unchecked, and this can work in an asynchronous mode. The `rlite` interface is built on top of this level.

# Usage

## High level

Create a new database in memory (this is not written to disk, and I've not actually worked out how to save to disk once the database is created...)
```{r}
db <- rdb()
```

Newly created databases are empty; they have no keys:
```{r}
db$keys()
```

R objects can be stored against keys, for example:
```{r}
db$set("mykey", 1:10)
db$keys()
```

Retrieve the value of a key with `$get`:
```{r}
db$get("mykey")
```

Trying to get a nonexistent key does not throw an error but returns `NULL`:
```{r}
db$get("no_such_key")
```

That's it. Arbitrary R objects can be stored in keys, and they will be returned intact with few exceptions (the exceptions are things like `rdb` itself which includes an "external pointer" object which can't be serialised - see `?serialize` for more information):

```{r}
db$set("mtcars", mtcars)
identical(db$get("mtcars"), mtcars)
db$set("a_function", sin)
db$get("a_function")(pi / 2) # 1
```

This seems really silly, but is potentially very useful. There are file-based key/value systems on CRAN, and this would be another, but backed by a potentially very efficient store (and without the overhead of disk access).

## Intermediate level

This is the richest level, and a full demonstration is out of scope for this tutorial. This gives you all the power of `Redis`, but you are more limited in what you can store.

```{r}
r <- rlite()
```

The `rlite` object is an [`R6`](https://github.com/wch/R6) class with many methods, corresponding to different `Redis` commands. For example, `SET` and `GET`:

```{r}
r$SET("mykey", "mydata") # set the key "mykey" to the value "mydata"
r$GET("mykey")
```

The value must be a string or will be coerced into one. So if you want to save an arbitrary R object, you need to convert it to a string. The `object_to_string` function and its inverse `string_to_object` can help here:

```{r}
s <- object_to_string(1:10)
s # ew. but this does encode everything about this object
string_to_object(s) # here's the original back
```

So:
```{r}
r$SET("mylist", object_to_string(1:10))
r$GET("mylist")
string_to_object(r$GET("mylist"))
```

This is how the `rdb` object is implemented!
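
The pattern described above (serialise to a string on `set`, restore on `get`, return `NULL` for a missing key) can be sketched in base R alone. This is only an illustration under stated assumptions: `obj_to_str`, `str_to_obj` and `toy_store` are hypothetical names, an environment stands in for the `rlite` database, and `serialize` is just one plausible way to produce the string; none of this is claimed to be the actual `rdb` code.

```r
# Hypothetical helpers (not part of rrlite): one way to turn an R object
# into a string and back, using base R serialisation in ASCII mode.
obj_to_str <- function(x) rawToChar(serialize(x, NULL, ascii = TRUE))
str_to_obj <- function(s) unserialize(charToRaw(s))

# Hypothetical toy key/value store. An environment stands in for the
# database; only strings are ever stored in it.
toy_store <- function() {
  strings <- new.env()
  list(
    set = function(key, value) {
      assign(key, obj_to_str(value), envir = strings)
      invisible(NULL)
    },
    get = function(key) {
      # Environments return NULL for names that have never been assigned
      s <- strings[[key]]
      if (is.null(s)) NULL else str_to_obj(s)
    },
    keys = function() ls(strings)
  )
}

db_sketch <- toy_store()
db_sketch$set("mykey", 1:10)
db_sketch$get("mykey")        # the original integer vector
db_sketch$get("no_such_key")  # NULL
```

The point of the sketch is the division of labour: the store itself only ever sees strings, and all knowledge of R objects lives in the two conversion helpers.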

However, `Redis` offers far better ways of holding lists, if that is the aim:

```{r}
r$RPUSH("mylist2", 1:10)
```

(the returned value `10` indicates that the list "mylist2" is 10 elements long). There are [lots of commands](http://redis.io/commands/#list) for operating on lists, but you can do things like


* get an element by its index (note that this uses C-style 0-based indexing for consistency with the `Redis` documentation rather than R's semantics)
```{r}
r$LINDEX("mylist2", 1)
```

* set an element by its index
```{r}
r$LSET("mylist2", 1, "carrot")
```

* get all of a list:
```{r}
r$LRANGE("mylist2", 0, -1)
```

* or part of it:
```{r}
r$LRANGE("mylist2", 0, 2)
```

* pop elements off the front or back
```{r}
r$LLEN("mylist2")
r$LPOP("mylist2")
r$RPOP("mylist2")
r$LLEN("mylist2")
```
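
The index conventions used by `LINDEX` and `LRANGE` differ from R's in three ways: offsets start at 0, ranges include both end points, and negative offsets count back from the end of the list. A rough base R sketch of how an `LRANGE`-style range maps onto R indexing is below; `lrange_sketch` is a hypothetical helper written for illustration and is not part of `rrlite` or `rlite`.

```r
# Hypothetical helper: emulate LRANGE-style indexing on an ordinary R vector.
lrange_sketch <- function(x, start, stop) {
  n <- length(x)
  # Negative offsets count from the end: -1 is the last element
  if (start < 0) start <- n + start
  if (stop < 0) stop <- n + stop
  # Clamp out-of-range offsets to the valid 0-based range
  start <- max(start, 0)
  stop <- min(stop, n - 1)
  # An empty input or an inverted range gives an empty result
  if (n == 0 || start > stop) return(x[0])
  # Shift by one to move from 0-based offsets to R's 1-based indices
  x[(start + 1):(stop + 1)]
}

x <- as.character(1:10)
lrange_sketch(x, 0, -1)   # everything
lrange_sketch(x, 0, 2)    # the first three elements
lrange_sketch(x, -3, -1)  # the last three elements
```

This is also why the S3 methods later in this vignette subtract one from the R index before calling `LINDEX` and `LSET`.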

Of course, each element of the list can be an R object if you run it through `object_to_string`:

```{r}
r$LPUSH("mylist2", object_to_string(1:10))
```

but you'll be responsible for converting that back:

```{r}
dat <- r$LRANGE("mylist2", 0, 2)
dat
dat[[1]] <- string_to_object(dat[[1]])
dat
```

## Low level interface

Probably best not to use this unless you really want to, but this provides the most complete interface to `rlite` with the least amount of checking. If complete speed is the aim then working via this interface and doing just the checking that you want might be the best bet.

```{r}
con <- rlite_context(":memory:")
con$run(c("SET", "foo", "bar"))
con$run(c("GET", "foo"))
```

You can run asynchronously

```{r}
con$write(c("set", "foo", "bar"))
con$read()
```

but you're responsible for making sure there are no pending replies by matching every `write` call with a `read` call. This might be useful for blocking operations.

# Potential applications

Because `rrlite` exposes all of `rlite`, you can roll your own data structures.

First, a generator object that sets up a new list at `key` within the database `r`.

```{r}
rlist <- function(..., key=NULL, r=rlite(":memory:")) {
  if (is.null(key)) {
    key <- paste(sample(letters, 32, replace=TRUE), collapse="")
  }
  dat <- vapply(c(...), object_to_string, character(1))
  r$RPUSH(key, dat)
  ret <- list(r=r, key=key)
  class(ret) <- "rlist"
  ret
}
```

Then some S3 methods that work with this object. I've only implemented `length` and `[[`, but `[` would be useful here too as would `print`.

```{r}
length.rlist <- function(x) {
  x$r$LLEN(x$key)
}
`[[.rlist` <- function(x, i, ...) {
  string_to_object(x$r$LINDEX(x$key, i - 1L))
}
`[[<-.rlist` <- function(x, i, value, ...) {
  x$r$LSET(x$key, i - 1L, object_to_string(value))
  x
}
```

Then we have this weird object we can add things to.
```{r}
obj <- rlist(1:10)
length(obj) # 10
obj[[3]]
obj[[3]] <- "an element"
obj[[3]]
```

The object has reference semantics so that assignment does *not* make a copy:

```{r}
obj2 <- obj
obj2[[2]] <- obj2[[2]] * 2
obj[[2]] == obj2[[2]]
```
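
Reference semantics arise here because `obj` and `obj2` hold the same database handle and the same key, so both names address the same stored data. The effect can be reproduced in base R alone by using an environment as a stand-in for the database. The `envlist` class below is hypothetical and exists only for this illustration; it is not something `rrlite` provides.

```r
# Hypothetical class: a list-like object whose data lives in an environment.
# Copying the object copies the reference to the environment, not the data.
envlist <- function(...) {
  store <- new.env()
  store$items <- list(...)
  structure(list(store = store), class = "envlist")
}
`[[.envlist` <- function(x, i, ...) {
  x$store$items[[i]]
}
`[[<-.envlist` <- function(x, i, value) {
  x$store$items[[i]] <- value  # modifies the shared environment in place
  x
}

a <- envlist(1, 2, 3)
b <- a
b[[2]] <- b[[2]] * 2
a[[2]] == b[[2]]  # TRUE: both names see the update

# An ordinary list has copy semantics, so the same steps diverge
p <- list(1, 2, 3)
q <- p
q[[2]] <- q[[2]] * 2
p[[2]] == q[[2]]  # FALSE: p is unchanged
```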

What would be nice is a set of tools for working with any R/`Redis` package that can convert R objects into `Redis` data structures so that they can be accessed in pieces even if they are far too big to fit into memory. Of course, these objects could be read/written by programs *other* than R if they also support `Redis`.
