-
Notifications
You must be signed in to change notification settings - Fork 0
/
use_case.Rmd
112 lines (90 loc) · 4.02 KB
/
use_case.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
---
title: "Using sqlhelper in packages"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Using sqlhelper in packages}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
`sqlhelper` was written to streamline SQL integration in packages designed to
meet organizations' particular analytical needs, in stable data contexts. The
pattern of integrating SQL with R code in analysis packages is intended to:
* make use of the data-wrangling capacity of databases and SQL;
* take advantage of R's packaging system to write reliable, understandable code;
* keep all the code for a task in one place.
This note describes how to use `sqlhelper` safely inside a package.
## Connections
If more than one connection is needed for a project, the easiest approach will be to define them in a yaml file as described in `vignette("connections")`. The yaml file is placed under `inst` in the root of the package and accessed with `devtools::system.file()` (see [r-pkgs](https://r-pkgs.org/misc.html#sec-misc-inst) on using installed files).
```r
sqlhelper::connect(
system.file( "sqlhelper_connection_conf.yml" ),
exclusive=TRUE
)
```
Connections that cannot be defined this way may be defined within the package. How this is handled will depend on considerations such as how long each connection will be needed for, how widely it will need to be shared between other package components, and whether it will need to be exposed to users.
If a connection is needed once and may be used and closed during the execution
of a single function it may be sufficient to define it within the namespace of
that function and close it on or before the function exits, for example:
```r
get_some_data <- function(){
conn <- DBI::dbConnect(a_driver, "a connection string")
d <- sqlhelper::run_files(
system.file("SQL/my_sql_file.SQL"),
default.conn = conn
)
DBI::dbDisconnect(conn)
d
}
```
A connection that needs to be shared across functions or function calls, but not
exposed to users, may be stored in an environment in the package's top-level namespace, for example:
```r
assign("connection_store",
new.env(parent = emptyenv()),
environment())
connect <- function(){
assign(
"c1",
DBI::dbConnect(a_driver, "a connection string"),
envir = connection_store
)
}
get_some_data <- function(){
sqlhelper::run_files(
system.file("SQL/my_sql_file.SQL"),
default.conn = connection_store$c1
)
}
# This is a bit belt-and-braces, but thorough.
disconnect <- function(){
DBI::dbDisconnect(connection_store$c1)
connection_store$c1 <- NULL
rm(list=c("c1), envir=connection_store)
}
```
(this approach is in fact more or less the one take by `sqlhelper` itself)
## SQL files
If multiple connections are required but cannot be defined in yaml it will
somewhat diminish the ability to control execution on a query-by-query basis and
may produce a need to split queries into more files than would otherwise be
necessary. mf a package contains many SQL files, it may be useful to store them
in a SQL directory under `inst/`. They may be accessed in the same way, with
`system.file()`, either `system.file("SQL", "file_name.SQL")` or
`system.file("SQL/file_name.SQL")` (the latter _may_ be somewhat less portable).
It is often convenient to define SQL parameters in a different scope to the
calling scope of `run_files()`. In this case it is important to ensure that the
package does not interfere with the user's global environment. The easiest way
to do this is to store them in an environment (e.g. in the same way as
illustrated for connections, above) and pass them to the `values` parameter of
`run_files()`, `run_queries()` or `prepare_sql()`. See `run_queries()` or
`vignette("execution")` for examples.
## Exit
It may be desirable to close `sqlhelper`'s connections when a calling package
has completed it's operations. This can be achieved easily with
`sqlhelper::disconnect()`.