DBI-based adapter for Presto for the statistical programming language R.

RPresto

RPresto is a DBI-based adapter for Presto, the open source distributed SQL query engine for running interactive analytic queries.

Installation

RPresto is available on both CRAN and GitHub. To install the CRAN version, use

install.packages('RPresto')

To install the development version from GitHub, use

devtools::install_github('prestodb/RPresto')

Examples

The standard DBI approach works with RPresto:

library('DBI')

con <- dbConnect(
  RPresto::Presto(),
  host='http://localhost',
  port=7777,
  user=Sys.getenv('USER'),
  schema='<schema>',
  catalog='<catalog>',
  source='<source>'
)

res <- dbSendQuery(con, 'SELECT 1')
# dbFetch without arguments only returns the current chunk, so we need to
# loop until the query completes.
while (!dbHasCompleted(res)) {
    chunk <- dbFetch(res)
    print(chunk)
}

res <- dbSendQuery(con, 'SELECT CAST(NULL AS VARCHAR)')
# Because chunk sizes are unpredictable with Presto, fetching a custom
# number of rows is not supported, so the following call raises an error:
# testthat::expect_error(dbFetch(res, 5))

# To get all rows using dbFetch, pass in a -1 argument
print(dbFetch(res, -1))

# An alternative is to use dbGetQuery directly

# `source` for iris.sql()
source(system.file('tests', 'testthat', 'utilities.R', package='RPresto'))

iris <- dbGetQuery(con, paste("SELECT * FROM", iris.sql()))

dbDisconnect(con)
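
The chunk-by-chunk loop shown earlier can also collect the chunks into a single data.frame instead of printing them. The following is a minimal sketch, not part of RPresto: `fetch_all_chunks` is a hypothetical helper name, and it assumes `con` is an open connection created with `dbConnect` as above.

```r
# Hypothetical helper (not part of RPresto): send a query, fetch every
# chunk until the query completes, and bind the chunks into one data.frame.
fetch_all_chunks <- function(con, sql) {
  res <- DBI::dbSendQuery(con, sql)
  # Release the result set even if fetching fails part-way through.
  on.exit(DBI::dbClearResult(res), add = TRUE)

  chunks <- list()
  while (!DBI::dbHasCompleted(res)) {
    # dbFetch without arguments returns only the current chunk.
    chunks[[length(chunks) + 1]] <- DBI::dbFetch(res)
  }

  # rbind ignores zero-row and zero-column data frames; the result is NULL
  # if no chunk was fetched at all.
  do.call(rbind, chunks)
}

# Usage, assuming `con` is an open RPresto connection:
# result <- fetch_all_chunks(con, 'SELECT 1')
```

In practice `dbGetQuery(con, sql)` or `dbFetch(res, -1)` gives the same result in one call; the helper only makes the accumulation step explicit.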

We also include dplyr integration.

library(dplyr)

db <- src_presto(
  host='http://localhost',
  port=7777,
  user=Sys.getenv('USER'),
  schema='<schema>',
  catalog='<catalog>',
  source='<source>'
)

# Assuming you have a table like iris in the database
iris <- tbl(db, 'iris')

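# as(sepal_length, 0.0) is translated to a SQL CAST to the Presto type
# matching the R value 0.0 (a double)
iris %>%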
  group_by(species) %>%
  summarise(mean_sepal_length = mean(as(sepal_length, 0.0))) %>%
  arrange(species) %>%
  collect()

How RPresto works

Presto exposes its interface via a REST-based API [1]. We use the httr package to make the API calls and jsonlite to reshape the returned data into a data.frame. Note that, as of now, only read operations are supported.
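
To illustrate the reshaping step, here is a minimal sketch that turns one JSON response body into a data.frame with jsonlite. It is not RPresto's actual implementation. The response body is hand-written sample data in the general shape of the Presto HTTP protocol (a `columns` array and a row-oriented `data` array), and the sketch assumes no NULL values and only scalar column types.

```r
library(jsonlite)

# Hand-written sample response (an assumption for illustration, not real
# server output): column metadata plus row-oriented data.
response_body <- '{
  "id": "example_query_id",
  "columns": [
    {"name": "species", "type": "varchar"},
    {"name": "n", "type": "bigint"}
  ],
  "data": [["setosa", 50], ["versicolor", 50]],
  "stats": {"state": "FINISHED"}
}'

# Keep the nested list structure so rows stay as lists of mixed types.
parsed <- fromJSON(response_body, simplifyVector = FALSE)

# Column names come from the metadata, not from the data rows.
column_names <- vapply(parsed$columns, function(col) col$name, character(1))

# Pivot the row-oriented data into one vector per column.
columns <- lapply(seq_along(column_names), function(i) {
  sapply(parsed$data, function(row) row[[i]])
})
names(columns) <- column_names

chunk <- as.data.frame(columns, stringsAsFactors = FALSE)
print(chunk)
```

A real client would also have to follow the `nextUri` link in each response until the query finishes, and map Presto types and NULLs to R types, which this sketch leaves out.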

RPresto has been tested on Presto 0.100.

License

RPresto is BSD-licensed. We also provide an additional patent grant.

[1] See https://github.com/prestodb/presto/wiki/HTTP-Protocol for a description of the API.
