piersharding/dplyr-calcite

dplyr-calcite

dplyrcalcite is a database connector that lets dplyr work with Apache Calcite. dplyr is the next iteration of plyr (from Hadley Wickham), focused on tools for working with data frames (hence the d in the name).

Installing dependencies

The interface to Apache Calcite is driven by the Calcite JDBC driver, so the driver must be installed first. One of the easiest ways to do this is to follow the installation instructions in these documents:

  • How-to: http://calcite.apache.org/docs/howto.html
  • Tutorial: http://calcite.apache.org/docs/tutorial.html
  • SQL reference: http://calcite.apache.org/docs/reference.html

In short:

$ git clone https://github.com/apache/calcite.git
$ cd calcite
$ mvn install -DskipTests -Dcheckstyle.skip=true

The examples provided in the data directory depend on the sample CSV file driver implementation, so that driver must be built and available on the class path.
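As a rough illustration of what a model file for the CSV driver can look like, the sketch below writes a minimal model.json from R using only base functions. It follows the shape used in the Calcite CSV tutorial; the schema name `DEMO`, the `csv` directory, and the temporary output location are hypothetical, and the actual contents of this repository's data/model.json may differ.

```r
# Sketch only: write a minimal Calcite model file for the CSV adapter.
# "DEMO" and the "csv" directory are hypothetical names chosen for illustration.
model_dir <- file.path(tempdir(), "calcite-demo")
dir.create(file.path(model_dir, "csv"), recursive = TRUE, showWarnings = FALSE)

model_path <- file.path(model_dir, "model.json")
writeLines(c(
  '{',
  '  "version": "1.0",',
  '  "defaultSchema": "DEMO",',
  '  "schemas": [',
  '    {',
  '      "name": "DEMO",',
  '      "type": "custom",',
  '      "factory": "org.apache.calcite.adapter.csv.CsvSchemaFactory",',
  '      "operand": {',
  '        "directory": "csv"',
  '      }',
  '    }',
  '  ]',
  '}'
), model_path)

# Show where the file was written so it can be inspected.
message("Model written to: ", model_path)
```

With a file like this, each CSV file placed in the `csv` directory would be exposed as a table in the `DEMO` schema, assuming the CSV adapter classes are on the class path.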

Install dependent R packages

Install RJDBC and assertthat:

  • the latest released version from CRAN with

    install.packages(c("RJDBC", "assertthat"))

Next, install lazyeval:

  • the latest development version from GitHub with

    devtools::install_github("hadley/lazyeval")

Next, install dplyr, using either:

  • the latest released version from CRAN with

    install.packages("dplyr")
  • the latest development version from GitHub with

    devtools::install_github("hadley/dplyr")

Finally, install dplyr-calcite from GitHub:

    devtools::install_github("piersharding/dplyr-calcite")
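Before connecting, it can help to confirm that all of the packages above are actually installed. The following is a minimal sketch using only base R; the helper name `is_installed` is hypothetical, and the package list is simply the one named in this section.

```r
# Sketch only: report which of the packages named above are installed.
required <- c("RJDBC", "assertthat", "lazyeval", "dplyr", "dplyrcalcite")

# system.file() returns "" when a package cannot be found,
# and does not load the package (so no JVM is started for RJDBC).
is_installed <- function(pkg) nzchar(system.file(package = pkg))

status <- vapply(required, is_installed, logical(1))
missing_pkgs <- names(status)[!status]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```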

To get started, read the notes below, then read help(src_calcite).

If you encounter a clear bug, please file an issue with a minimal reproducible example on GitHub.

src_calcite

Connect to the database:

library(dplyrcalcite)

# optionally set the class path for the Calcite JDBC connector - alternatively use the CLASSPATH environment variable
options(dplyr.jdbc.classpath = "~/.m2/repository")

# To connect to a database first create a src:
lhm <- src_calcite('./data/model.json')
lhm

# Simple query:
batting <- tbl(lhm, "Batting")
dim(batting)
colnames(batting)
head(batting)

See dplyr for many more examples.
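As a sketch of the kind of dplyr pipeline that could follow the simple query above, the example below runs a few verbs on a small local data frame standing in for the `batting` table. The column names (`playerID`, `AB`, `H`) and all values are made up for illustration; whether each verb translates to Calcite SQL through `src_calcite` is an assumption not confirmed here.

```r
library(dplyr)

# Hypothetical stand-in for the remote "Batting" table; values are invented.
batting_local <- data.frame(
  playerID = c("player_a", "player_a", "player_b", "player_b"),
  yearID   = c(2001L, 2002L, 2001L, 2002L),
  AB       = c(100L, 200L, 0L, 150L),
  H        = c(30L, 50L, 0L, 45L),
  stringsAsFactors = FALSE
)

# Drop seasons with no at-bats, total per player, then rank by batting average.
career <- batting_local %>%
  filter(AB > 0) %>%
  group_by(playerID) %>%
  summarise(at_bats = sum(AB), hits = sum(H)) %>%
  mutate(avg = hits / at_bats) %>%
  arrange(desc(avg))

print(career)
```

Against a live connection, the same pipeline would start from `tbl(lhm, "Batting")` instead of `batting_local`, assuming the table exposes columns with these names.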

License

GPL-3.0 (LICENSE.txt). The repository also contains a LICENSE file of an unrecognized type.
