Working proposal for adding RSQLite (SQLite/json1) #25

Merged
merged 22 commits into from Jul 2, 2019
Conversation

rfhb (Collaborator) commented May 11, 2019

This is a PR to supersede #24, addressing review comments and extending functionality.

This PR includes a working implementation of nodbi for RSQLite, analogous to the existing backends (in particular MongoDB). It also includes a nested, ragged JSON dataset (contacts) for testing.

Since 2016, RSQLite has shipped with an SQLite version that includes the json1 extension, which makes it usable as a JSON store for small projects or for exploration.
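As a minimal sketch of what the json1 extension makes possible (this is not the code in this PR): the table name `contacts`, its two-column layout, and the sample documents below are assumptions for illustration. Only DBI/RSQLite calls and SQLite's `json_extract()` are used.

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical layout: one row per document, JSON stored as text
dbExecute(con, "CREATE TABLE contacts (id TEXT PRIMARY KEY, json TEXT)")
dbExecute(
  con,
  "INSERT INTO contacts (id, json) VALUES (?, ?)",
  params = list(
    c("1", "2"),
    c('{"name": "Ann", "age": 31, "tags": ["a", "b"]}',
      '{"name": "Bob", "age": 25}')
  )
)

# json_extract() pulls values out of the stored JSON, both for
# selecting fields and for filtering documents
adults <- dbGetQuery(
  con,
  "SELECT id, json_extract(json, '$.name') AS name
     FROM contacts
    WHERE json_extract(json, '$.age') > 30"
)

dbDisconnect(con)
```

The documents need not share the same keys, so ragged data such as the second document (no `tags`) can be stored and queried without a fixed schema.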

The query method is long because it translates the JSON strings given for query and fields into SQL.
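To give a rough idea of the kind of translation involved, here is a sketch that turns a MongoDB-style fields string into a list of `json_extract()` calls. The helper `fields_to_sql()` and the column name `json` are hypothetical and are not taken from this PR; the sketch assumes trusted, top-level field names and does no escaping.

```r
# Hypothetical helper, not nodbi's implementation: build the SELECT
# list for a MongoDB-style `fields` JSON string such as
# '{"name": 1, "age": 1}'
fields_to_sql <- function(fields, column = "json") {
  wanted <- jsonlite::fromJSON(fields)
  # Keep only the fields flagged with 1 (include)
  keep <- names(wanted)[unlist(wanted) == 1]
  paste0(
    sprintf("json_extract(%s, '$.%s') AS \"%s\"", column, keep, keep),
    collapse = ", "
  )
}

sql <- fields_to_sql('{"name": 1, "age": 1, "email": 0}')
```

Nested fields, query operators, and input sanitising would all need additional handling, which is presumably where most of the lines go.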

Thanks

sckott (Contributor) commented May 22, 2019

thanks @rfhb sorry for the delay, vacation and other work stuff.

Will have a look at this hopefully today

rfhb (Collaborator, Author) commented May 22, 2019

Thanks. I am still working on the query method to make it work exactly like the mongo method. Perhaps I can ping you here once the additional work is ready.

sckott (Contributor) left a comment

Create takes quite a long time for sqlite compared to all backends except etcd. Wondering if create can be sped up at all?

library(nodbi)
library(ggplot2)

src <- src_sqlite("test.sqlite")
system.time(docdb_create(src, "diamonds", diamonds))
#> 139 sec

src_mongo <- src_mongo()
system.time(docdb_create(src_mongo, "diamonds", diamonds))
#> 0.8 sec 

src_redis <- src_redis()
system.time(docdb_create(src_redis, "diamonds", diamonds))
#> 0.1 sec

src_elastic <- src_elastic()
system.time(docdb_create(src_elastic, "diamonds", diamonds))
#> 16 sec

src_couch <- src_couchdb()
system.time(docdb_create(src_couch, "diamonds", diamonds))
#> 80 sec

src_etcd <- src_etcd()
system.time(docdb_create(src_etcd, "/diamonds", diamonds))
#> really long time

sckott (Contributor) commented Jun 11, 2019

any thoughts @rfhb ?

rfhb (Collaborator, Author) commented Jun 12, 2019

Many thanks @sckott for getting back - I have been working on this over the last few weeks. Happy to hear your thoughts on the latest commits!

sckott (Contributor) commented Jun 14, 2019

thanks @rfhb - can you address my question about performance?

rfhb (Collaborator, Author) commented Jun 15, 2019

thanks @sckott - performance is much improved, as mentioned in my comment #25 (comment). Here are figures from a laptop:

library(nodbi)
library(ggplot2) # provides the diamonds dataset

## database in file system
unlink("test.sqlite")
src <- src_sqlite("test.sqlite")
system.time(docdb_create(src, "diamonds", diamonds))
#> 20 sec elapsed time

## memory-based database
src <- src_sqlite()
system.time(docdb_create(src, "diamonds", diamonds))
#> 0.8 sec elapsed time

The difference in execution time is solely due to RSQLite::dbWriteTable() (or the DBI generic DBI::dbAppendTable()), which takes longer when writing to disk than to memory.
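For anyone who wants to check the disk versus memory difference without nodbi, a rough sketch follows. The generated `docs` data frame and the helper `write_docs()` are stand-ins made up for this illustration; only `DBI::dbWriteTable()` is timed, and timings will vary by machine.

```r
library(DBI)

# Hypothetical stand-in for the documents: one JSON string per row
n <- 10000
docs <- data.frame(
  id = as.character(seq_len(n)),
  json = sprintf(
    '{"carat": %.2f, "price": %d}',
    runif(n), sample.int(20000L, n, replace = TRUE)
  ),
  stringsAsFactors = FALSE
)

# Write the same data to a given SQLite database and time only the write
write_docs <- function(dbname) {
  con <- dbConnect(RSQLite::SQLite(), dbname)
  on.exit(dbDisconnect(con))
  elapsed <- system.time(dbWriteTable(con, "diamonds", docs))[["elapsed"]]
  rows <- dbGetQuery(con, "SELECT count(*) AS n FROM diamonds")$n
  list(elapsed = elapsed, rows = rows)
}

disk_file <- tempfile(fileext = ".sqlite")
on_disk <- write_docs(disk_file)     # database in the file system
in_memory <- write_docs(":memory:")  # memory-based database
unlink(disk_file)
```

Comparing `on_disk$elapsed` with `in_memory$elapsed` shows how much of the time is spent in the write itself.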

sckott (Contributor) commented Jun 15, 2019

thanks @rfhb - leaving on vacation for a week - will take another look then - sorry about the delay

rfhb (Collaborator, Author) commented Jul 2, 2019

Hi, would you like to have another look at the PR, please? I am now starting to use it for another project and believe I have addressed the comments so far. Many thanks.

sckott (Contributor) commented Jul 2, 2019

having a look now

sckott (Contributor) commented Jul 2, 2019

You're right, the performance is very good in memory, and much improved when writing to disk.

sckott added this to the v0.3 milestone Jul 2, 2019

sckott (Contributor) commented Jul 2, 2019

looks good, thanks for your contribution, assigned to the next milestone https://github.com/ropensci/nodbi/milestone/3

sckott merged commit d8a1e46 into ropensci:master Jul 2, 2019
rfhb deleted the sqlite-addition branch July 23, 2019 07:03

2 participants