-
Notifications
You must be signed in to change notification settings - Fork 1
/
README.Rmd
125 lines (94 loc) · 6.09 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
---
title: "rpg"
author: "Timothy H. Keitt"
date: "02/08/2017"
output:
html_document:
keep_md: yes
---
[![CRAN Version](http://www.r-pkg.org/badges/version/rpg)](http://www.r-pkg.org/badges/version/rpg) [![CRAN Downloads](http://cranlogs.r-pkg.org/badges/rpg)](http://cran.rstudio.com/web/packages/rpg/index.html) [![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/thk686/rpg?branch=master&svg=true)](https://ci.appveyor.com/project/thk686/rpg) [![Travis-CI Build Status](https://travis-ci.org/thk686/rpg.svg?branch=master)](https://travis-ci.org/thk686/rpg) [![Depsy](http://depsy.org/api/package/cran/rpg/badge.svg)](http://depsy.org/package/r/rpg)
This package wraps PostgreSQL's libpq, a library for interacting with a PostreSQL database. Unlike other database access packages for R, ```rpg``` is designed to be specific to PostgreSQL and, as such, exposes a more of the functionality of libpq. A great deal of thought went into making ```rpg``` simple to use. The major difference between ```rpg``` and most other database packages is that ```rpg``` does not use an object-oriented model. There are no connection objects, result objects and so on. This simplifies the interface and makes using ```rpg``` a lot like using psql, PostgreSQL's command line interface. I basically wrote ```rpg``` as a nice comfy environment for my own work. If you are building infrastructure, you probably want to investigate ```DBI``` and [RPostgres](https://github.com/rstats-db/RPostgres). There is also the excellent [RPostgreSQL](https://cran.r-project.org/web/packages/RPostgreSQL/index.html) package on CRAN.
### Installation
```
install.packages(rpg) # Released
devtools::install_github("thk686/rpg") # Development
```
### Queries
```rpg``` supports parameterized queries and prepared statements.
```{r}
library(rpg)
createdb("exampledb"); connect("exampledb")
data(mtcars); write_table(mtcars)
fetch("select * from mtcars where mpg > $1", 30)
prepared = prepare("select * from mtcars where mpg > $1")
prepared(25); fetch(); prepared(30); fetch()
```
### Connection stack
Even though the database connection is implicit, you are not restricted to a single connection. Connections can be managed on a stack.
```{r}
push_conn()
createdb("exampledb2"); connect("exampledb2")
data(iris); write_table(iris); list_tables()
swap_conn(); list_tables()
show_conn_stack()
swap_conn(); describe_table("iris")
pop_conn(); get_conn_info("dbname")
dropdb("exampledb2")
```
### Database cursors and foreach
The ```rpg``` package integrates nicely with the iterators package. This allows you to use ```foreach``` to iterate over query results. This is perhaps the most unique feature of ```rpg```.
```{r}
library(foreach)
c1 = cursor("SELECT * FROM mtcars", by = 8)
x = foreach(i = c1, .combine = rbind) %do% { i$mpg }
print(x, digits = 2)
```
This usage is compatible with ```%dopar%```; you can access these cursors from multiple threads of execution and even across a heterogeneous cluster of machines. See the documentation for ```cursor``` for an example.
Note: after some additional experimentation, I have concluded that parallel processing of cursors in PostgreSQL does not in fact work. You can iterate over a cursor on multiple backends. However it appears that only one of these backends will do any work, probably to ensure the access is serialized. My guess is that requests are forwarded to the portal where the cursor was opened.
### Asynchronous queries
You can submit asynchronous queries with ```rpg```. When combined with the connection stack, this allows you to launch queries, push the connection, make a new connection while the first query is completing. By swapping the first connection back to the active state, you can then check for completion or try to cancel the transaction.
```{r}
async_query("SELECT a.* FROM mtcars a, mtcars b, mtcars c") # >30,000 rows
count = 0
repeat {
status = async_status()
if (status == "BUSY") {
if (count > 2) {
cat("calling cancel...\n")
cancel() }}
else break
cat("busy... \n")
Sys.sleep(1)
count = count + 1 }
print(status)
finish_async()
```
### Transactions and save points
A PostgreSQL-specific feature are save points for nested transactions. ```rpg``` provides a convenient wrapper.
```{r}
sp1 = savepoint(); write_table(iris)
sp2 = savepoint(); data(Loblolly); write_table(Loblolly); list_tables()
rollback(sp2); list_tables()
rollback(sp1); list_tables()
```
### Object serialization and storage
The ```stow``` and ```stow.image``` functions write entire R objects into a database. You can stow your entire working environment and reload it anywhere else you can reconnect to the database. (A facility listing and commenting images is in the works.)
```{r}
stow("mtcars")
rm(list = objects())
ls()
list_stowed()
retrieve("mtcars")
ls()
disconnect()
dropdb("exampledb")
```
### Additional features
1. Intelligent handling of passwords: ```rpg``` will query for the password only if it is needed. You can set a default password.
1. Easy creation of connection aliases: call ```make_service``` and your connection settings will be saved as a named service. You only have to type complex connection options once. If you used a password, that will be saved as well.
1. Lots of options for getting and setting connection options and defaults.
1. Easy tracing of data flowing between client and server.
1. High bandwidth options for reading/writing bulk data from/to the database.
### News
2/7/17 -- Gave up trying to build libpq on the fly and instead borrowed the excellent work done by the [RPostgres](https://github.com/rstats-db/RPostgres) team. RPostgres is looking very promising for DBI work.
4/15/16 -- I ripped out and replaced the build setup. I was using the autoconf bits and libpq files distributed with RPostgreSQL (a very good package you should check out). However it had a few peculiarities, like never using the included libpq on Linux. Also, I could not check the libpq version number. So now the package will check your postgres install using the pg_config command and if its not found or the version is not new enough, then libpq is downloaded and built.