/
RSQLite.Rmd
127 lines (95 loc) · 3.73 KB
/
RSQLite.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
---
title: "RSQLite"
author: "Hadley Wickham"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{RSQLite}
%\VignetteEngine{knitr::rmarkdown}
\usepackage[utf8]{inputenc}
---
```{r, echo = FALSE}
knitr::opts_chunk$set(comment = "#>", collapse = TRUE)
```
RSQLite is the easiest way to use a database from R because the package itself contains [SQLite](https://www.sqlite.org); no external software is needed. This vignette will walk you through the basics of using a SQLite database.
RSQLite is a DBI-compatible interface which means you primarily use functions defined in the DBI package, so you should always start by loading DBI, not RSQLite:
```{r}
library(DBI)
```
## Creating a new database
To create a new SQLite database, you simply supply the filename to `dbConnect()`:
```{r}
mydb <- dbConnect(RSQLite::SQLite(), "my-db.sqlite")
dbDisconnect(mydb)
unlink("my-db.sqlite")
```
If you just need a temporary database, use either `""` (for an on-disk database) or `":memory:"` or `"file::memory:"` (for a in-memory database). This database will be automatically deleted when you disconnect from it.
```{r}
mydb <- dbConnect(RSQLite::SQLite(), "")
dbDisconnect(mydb)
```
## Loading data
You can easily copy an R data frame into a SQLite database with `dbWriteTable()`:
```{r}
mydb <- dbConnect(RSQLite::SQLite(), "")
dbWriteTable(mydb, "mtcars", mtcars)
dbWriteTable(mydb, "iris", iris)
dbListTables(mydb)
```
## Queries
Issue a query with `dbGetQuery()`:
```{r}
dbGetQuery(mydb, 'SELECT * FROM mtcars LIMIT 5')
```
Not all R variable names are valid SQL variable names, so you may need to escape them with `"`:
```{r}
dbGetQuery(mydb, 'SELECT * FROM iris WHERE "Sepal.Length" < 4.6')
```
If you need to insert the value from a user into a query, don't use `paste()`! That makes it easy for a malicious attacker to insert SQL that might damager your database or reveal sensitive information. Instead, use a parameterised query:
```{r}
dbGetQuery(mydb, 'SELECT * FROM iris WHERE "Sepal.Length" < :x',
params = list(x = 4.6))
```
This is a little more typing, but much much safer.
## Batched queries
If you run a query and the results don't fit in memory, you can use `dbSendQuery()`, `dbFetch()` and `dbClearResults()` to retrieve the results in batches. By default `dbFetch()` will retrieve all available rows: use `n` to set the maximum number of rows to return.
```{r}
rs <- dbSendQuery(mydb, 'SELECT * FROM mtcars')
while (!dbHasCompleted(rs)) {
df <- dbFetch(rs, n = 10)
print(nrow(df))
}
dbClearResult(rs)
```
## Multiple parameterised queries
You can use the same approach to run the same parameterised query with different parameters. Call `dbBind()` to set the parameters:
```{r}
rs <- dbSendQuery(mydb, 'SELECT * FROM iris WHERE "Sepal.Length" < :x')
dbBind(rs, param = list(x = 4.5))
nrow(dbFetch(rs))
dbBind(rs, param = list(x = 4))
nrow(dbFetch(rs))
dbClearResult(rs)
```
You can also pass multiple parameters in one call to `dbBind()`:
```{r}
rs <- dbSendQuery(mydb, 'SELECT * FROM iris WHERE "Sepal.Length" = :x')
dbBind(rs, param = list(x = seq(4, 4.4, by = 0.1)))
nrow(dbFetch(rs))
dbClearResult(rs)
```
## Statements
DBI has new functions `dbSendStatement()` and `dbExecute()`,
which are the counterparts of `dbSendQuery()` and `dbGetQuery()`
for SQL statements that do not return a tabular result,
such as inserting records into a table, updating a table,
or setting engine parameters.
It is good practice, although currently not enforced, to use the new functions
when you don't expect a result.
```{r}
dbExecute(mydb, 'DELETE FROM iris WHERE "Sepal.Length" < 4')
rs <- dbSendStatement(mydb, 'DELETE FROM iris WHERE "Sepal.Length" < :x')
dbBind(rs, param = list(x = 4.5))
dbGetRowsAffected(rs)
dbClearResult(rs)
```