bwlewis/duckdb_and_r

My thoughts on DuckDB and R with examples

R packages discussed in these notes include duckdb (of course), dplyr, data.table, fst, xts, RSQLite, and vroom, plus a little Python Pandas by way of reticulate.

The notes exhibit a mild disdain for SQL. For a much more comprehensive discussion of the difficulties with SQL, see these really interesting notes by Jamie Brandon: https://scattered-thoughts.net/writing/against-sql/. As an alternative to SQL, I generally prefer dplyr.
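
To make the SQL versus dplyr contrast concrete, here is a minimal sketch that runs the same aggregation both ways against an in-memory DuckDB database. It assumes the DBI, duckdb, dplyr, and dbplyr packages are installed. The built-in mtcars data set and the query itself are illustrative choices, not examples taken from these notes.

```r
library(DBI)
library(duckdb)
library(dplyr)   # tbl() on a database connection also needs dbplyr installed

# Illustrative setup: an in-memory DuckDB database holding the mtcars data
con <- dbConnect(duckdb::duckdb())
dbWriteTable(con, "mtcars", mtcars)

# The aggregation written directly in SQL
sql_result <- dbGetQuery(con, "
  SELECT cyl, AVG(mpg) AS mean_mpg
  FROM mtcars
  GROUP BY cyl
  ORDER BY cyl
")

# The same aggregation written with dplyr verbs; dbplyr translates the
# pipeline to SQL and DuckDB executes it when collect() is called
dplyr_result <- tbl(con, "mtcars") %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE)) %>%
  arrange(cyl) %>%
  collect()

dbDisconnect(con, shutdown = TRUE)
```

Both forms run inside DuckDB; the dplyr version only changes how the query is expressed, not where it executes.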

These notes present several interesting, if somewhat eclectic, data-sciency examples. For more comprehensive and straight-up database-style performance comparisons, see the excellent work by H2O here: https://h2oai.github.io/db-benchmark/ (where both R's data.table and DuckDB perform very well in general).

Also, check out https://github.com/pola-rs/polars for a remarkably high-performance new data frame implementation written in Rust and, for now, geared to Python. This is the first data frame-like environment I have seen that really gives R's data.table competition, aside from KDB+ of course.

Main overview:

The easy pieces:

A SQL rant born out of frustration while compiling these notes appears here:

Declarative, Schmerative
