This repository has been archived by the owner on Aug 12, 2020. It is now read-only.

partial readme update
ajdamico committed Jun 13, 2016
1 parent 750bc44 commit ed718af
Showing 4 changed files with 72 additions and 8 deletions.
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -18,3 +18,4 @@
.*\.ag
.*\.msc
^Readme\.md
^speed_comparisons\.png
1 change: 0 additions & 1 deletion DESCRIPTION
@@ -12,4 +12,3 @@ Depends: R (>= 3.2.0)
Imports: DBI (>= 0.3.1), digest (>= 0.6.4), methods, codetools
Suggests: testthat, survey, nycflights13, RSQLite, dplyr, gdata
Collate: mapi.R dbi.R dbapply.R dplyr.R control.R embedded.R
VignetteBuilder:knitr
78 changes: 71 additions & 7 deletions README.md
@@ -5,7 +5,10 @@
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/MonetDBLite)](http://cran.r-project.org/package=MonetDBLite)
[![](http://cranlogs.r-pkg.org/badges/MonetDBLite)](http://monetdb.cwi.nl/testweb/web/sisyphus/wilbur.png)

[MonetDBLite](https://www.monetdb.org/blog/monetdblite-r) is a SQL database that runs inside the [R environment for statistical computing](https://www.r-project.org/) and does not require the installation of any external software. It is similar in functionality to [RSQLite](http://cran.r-project.org/package=RSQLite) and works seamlessly with [the dplyr grammar of data manipulation](https://cran.rstudio.com/web/packages/dplyr/vignettes/databases.html), but typically completes queries much faster due to its *columnar* storage architecture and bulk query processing model. MonetDBLite is based on [MonetDB](https://www.monetdb.org/Home).
[MonetDBLite](https://www.monetdb.org/blog/monetdblite-r) is a SQL database that runs inside the [R environment for statistical computing](https://www.r-project.org/) and does not require the installation of any external software. It is similar in functionality to [RSQLite](http://cran.r-project.org/package=RSQLite) and works seamlessly with [the dplyr grammar of data manipulation](https://cran.rstudio.com/web/packages/dplyr/vignettes/databases.html), but typically completes queries much faster due to its *columnar* storage architecture and bulk query processing model. MonetDBLite is based on free and open-source [MonetDB](https://www.monetdb.org/Home), a product of the [Centrum Wiskunde & Informatica](http://cwi.nl).

Both `MonetDBLite` and `RSQLite` rely on `library(DBI)` for their selection of functions, making the transfer of legacy `RSQLite` code over to `MonetDBLite` a cinch. MonetDBLite behaves like a blazingly fast RSQLite.
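
As a rough illustration of that portability, the sketch below runs one helper against both backends. The helper name `count_rows` and the table name `mtcars_port_demo` are invented for this example, and it assumes that both `RSQLite` and `MonetDBLite` are installed.

```R
library(DBI)

# hypothetical helper: it calls only generic DBI functions, so it does not
# depend on which backend the connection belongs to
count_rows <- function(con, df, table_name) {
  if (!dbExistsTable(con, table_name)) dbWriteTable(con, table_name, df)
  dbGetQuery(con, paste0("SELECT COUNT(*) AS n FROM ", table_name))$n
}

# legacy RSQLite code path (in-memory database)
sqlite_con <- dbConnect(RSQLite::SQLite(), ":memory:")
n_sqlite <- count_rows(sqlite_con, mtcars, "mtcars_port_demo")
dbDisconnect(sqlite_con)

# the same code with MonetDBLite: only the dbConnect() call changes
monet_con <- dbConnect(MonetDBLite::MonetDBLite(), tempdir())
n_monet <- count_rows(monet_con, mtcars, "mtcars_port_demo")
dbDisconnect(monet_con, shutdown = TRUE)
```

Keeping backend-specific code confined to the `dbConnect()` call is what makes the switch small; everything after it is plain DBI.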


## Installation

@@ -23,42 +26,101 @@

If you encounter a bug, please file a minimal reproducible example on [github](https://github.com/hannesmuehleisen/MonetDBLite/issues). For questions and other discussion, please use [stack overflow](http://stackoverflow.com/questions/tagged/monetdblite) with the tag `monetdblite`. The development version of MonetDBLite endures [sisyphean perpetual testing](http://monetdb.cwi.nl/testweb/web/sisyphus/) on both unix and windows machines.

## Speed Comparisons

MonetDBLite outperforms other SQL databases available in R and ranks competitively among other high-performance computing options available to R users.

![Alt text](speed_comparisons.png?raw=true "Speed Comparisons")

## Painless Startup

## Startup
MonetDBLite provides a [`DBI`](http://cran.r-project.org/package=DBI) interface.

MonetDBLite provides a [`DBI`](http://cran.r-project.org/package=DBI) interface. To startup a server, create a DBI connection as follows:
### Temporary Database

To create a server (or to reconnect to a previously-initiated one), create a DBI connection as follows:

```R
library(DBI)
dbdir <- tempdir()
con <- dbConnect(MonetDBLite::MonetDBLite(), dbdir)
```

If you want to keep the database around for later, change `dbdir` to point to some meaningful path.
### Permanent Database

## Data Import
If you want to store the database permanently on your local machine, you will need to initiate the `dbdir` using an empty folder on your local machine.

Writing a R `data.frame` into MonetDBLite is efficient, the easiest way of creating a table is through [`dbWriteTable`](http://www.inside-r.org/packages/cran/DBI/docs/dbWriteTable):
```R
library(DBI)
dbdir <- "C:/path/to/database_directory"
con <- dbConnect(MonetDBLite::MonetDBLite(), dbdir)
```

#### Notes

1. MonetDB often suffers hiccups when using network drives. Whenever possible, connect to a MonetDBLite server stored on the same machine as the R session.
2. If you do not specify a database directory, i.e. `con <- dbConnect(MonetDBLite::MonetDBLite())`, the server is initiated within your current working directory.
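
The permanent-database setup described above could be scripted roughly as follows. The folder name `monetdblite_demo` is a placeholder chosen for this sketch; substitute any empty folder on a local (non-network) drive.

```R
library(DBI)

# hypothetical location for the database files; any empty local folder works
dbdir <- file.path(path.expand("~"), "monetdblite_demo")

# create the folder if it does not exist yet
if (!dir.exists(dbdir)) dir.create(dbdir, recursive = TRUE)

# passing dbdir explicitly avoids falling back to the working directory
con <- dbConnect(MonetDBLite::MonetDBLite(), dbdir)

# release the database directory when finished
dbDisconnect(con, shutdown = TRUE)
```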

## Data Importation

### Copying a `data.frame` object into a table within the MonetDBLite database

Writing an R `data.frame` into MonetDBLite is efficient. The most straightforward way of transferring a `data.frame` into a table stored within the database is through [`dbWriteTable`](http://www.inside-r.org/packages/cran/DBI/docs/dbWriteTable):

```R
# directly copy a data.frame object to a table within the database
dbWriteTable(con, "mtcars", mtcars)
```

### Copying a CSV file into a table within the MonetDBLite database

You can also directly import from a CSV by providing a file name instead of a `data.frame` to `dbWriteTable`:

```R
# construct an example CSV file on the local disk
csvfile <- tempfile()
write.csv(mtcars, csvfile, row.names = FALSE)

# directly copy a csv file to a table within the database
dbWriteTable(con, "mtcars2", csvfile)
```

### Manually constructing a SQL table

The SQL interface of MonetDBLite can also be used to manually create a table and import data:
```R
# construct an example CSV file on the local disk
csvfile <- tempfile()
write.csv(mtcars, csvfile, row.names = FALSE)

# start a SQL transaction
dbBegin(con)
# construct an empty table within the database, using a manually-defined structure
dbSendQuery(con, "CREATE TABLE mtcars3 (mpg DOUBLE PRECISION, cyl INTEGER, disp DOUBLE PRECISION, hp INTEGER, drat DOUBLE PRECISION, wt DOUBLE PRECISION, qsec DOUBLE PRECISION, vs INTEGER, am INTEGER, gear INTEGER, carb INTEGER)")
# copy the contents of a CSV file into the database, using the MonetDB COPY INTO command
dbSendQuery(con, paste0("COPY OFFSET 2 INTO mtcars3 FROM '", csvfile, "' USING DELIMITERS ',','\n','\"' NULL as ''"))
# finalize the SQL transaction
dbCommit(con)
```

Note how we wrap the two commands in a transaction using `dbBegin` and `dbCommit`. This creates all-or-nothing semantics. See the MonetDB documentation for details on [how to create a table](https://www.monetdb.org/Documentation/Manuals/SQLreference/Tables) and [how to perform bulk input](https://www.monetdb.org/Documentation/Manuals/SQLreference/CopyInto).
#### Notes

1. Note how we wrap the two commands in a transaction using `dbBegin` and `dbCommit`. This creates all-or-nothing semantics. See the MonetDB documentation for details on [how to create a table](https://www.monetdb.org/Documentation/Manuals/SQLreference/Tables) and [how to perform bulk input](https://www.monetdb.org/Documentation/Manuals/SQLreference/CopyInto).
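
To make the all-or-nothing behaviour concrete, here is a minimal sketch that forces a failure halfway through a transaction and rolls it back with DBI's `dbRollback`. The table names `demo_rollback` and `no_such_table` are invented for the example, and it assumes that a failed statement raises an R error, which `tryCatch` then intercepts.

```R
library(DBI)
con <- dbConnect(MonetDBLite::MonetDBLite(), tempdir())

# start a SQL transaction
dbBegin(con)
committed <- tryCatch({
  # first statement succeeds inside the open transaction
  dbSendQuery(con, "CREATE TABLE demo_rollback (x INTEGER)")
  # second statement fails on purpose: the target table does not exist
  dbSendQuery(con, "INSERT INTO no_such_table VALUES (1)")
  dbCommit(con)
  TRUE
}, error = function(e) {
  # undo everything done since dbBegin(), including the CREATE TABLE
  dbRollback(con)
  FALSE
})

# record whether the table created inside the failed transaction survived
table_survived <- dbExistsTable(con, "demo_rollback")

dbDisconnect(con, shutdown = TRUE)
```

If the transaction semantics hold as described, `committed` ends up `FALSE` and `table_survived` is `FALSE` as well, because the `CREATE TABLE` is discarded along with the failed insert.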


## Data Export

The contents of an entire table within the database can be transferred to an R `data.frame` object with [`dbReadTable`](http://www.inside-r.org/packages/cran/DBI/docs/dbReadTable). Since MonetDBLite is most useful for the storage and analysis of large datasets, there might be limited utility to copying an entire table into working RAM in R. The `dbReadTable` function and a SQL `SELECT * FROM tablename` command are equivalent:

```R
# directly copy a table within the database to an R data.frame object
x <- dbReadTable(con, "mtcars")

# the equivalent SQL query also returns an R data.frame object
y <- dbGetQuery(con, "SELECT * FROM mtcars")
```
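
When the full table is too large to pull into RAM, one option is to let the database do the aggregation and return only the summary. The sketch below is illustrative: the table name `mtcars_demo` and the column aliases are chosen for this example.

```R
library(DBI)
con <- dbConnect(MonetDBLite::MonetDBLite(), tempdir())

# load the example table once (skipped if it already exists)
if (!dbExistsTable(con, "mtcars_demo")) dbWriteTable(con, "mtcars_demo", mtcars)

# aggregate inside the database; only one row per cylinder count reaches R
by_cyl <- dbGetQuery(con, "
  SELECT cyl, AVG(mpg) AS avg_mpg, COUNT(*) AS n
  FROM mtcars_demo
  GROUP BY cyl
  ORDER BY cyl
")

dbDisconnect(con, shutdown = TRUE)
```

Here `by_cyl` is an ordinary `data.frame` with one row per distinct value of `cyl`, so the transfer cost depends on the number of groups rather than the number of rows in the table.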



## Shutdown

@@ -67,3 +129,5 @@ MonetDBLite allows multiple concurrent *connections* to a single database, but d
```R
dbDisconnect(con, shutdown=TRUE)
```

MonetDBLite does not allow multiple R sessions to connect to a single database concurrently. As soon as a single R session loads an embedded server, that server is locked to other R console windows.
Binary file added speed_comparisons.png
