
DB or RDS? which one is more performant? #1

Open
ivokwee opened this issue Jul 6, 2023 · 3 comments

Comments

@ivokwee
Contributor

ivokwee commented Jul 6, 2023

Quick question: which of the two is more performant, say, with the database or RDS file stored in an S3 bucket in the cloud?

@ivokwee
Contributor Author

ivokwee commented Jul 7, 2023

I have done some tests. RDS is the slowest, because the whole RDS object must be read and rewritten every time a new message is added; this becomes slow once you have thousands of messages. DBI is much faster, because the entire table does not need to be written for each message.
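To illustrate the difference (this is a sketch of the general pattern, not the package's actual internals): the RDS approach does a full read-modify-write per message, while the DBI approach issues a single INSERT via `dbWriteTable(append = TRUE)`.

```r
library(DBI)
library(RSQLite)

# RDS pattern: read the whole object, append one row, write the whole file back.
# Cost grows with the number of stored messages.
add_message_rds <- function(file, row) {
  chat <- if (file.exists(file)) readRDS(file) else row[0, ]
  saveRDS(rbind(chat, row), file)  # rewrites the entire file every time
}

# DBI pattern: a single INSERT; the existing rows are never rewritten.
con <- dbConnect(RSQLite::SQLite(), "chat-data.db")
dbWriteTable(con, "chat",
             data.frame(user = "alice", time = "00:00:00", message = "hi"),
             append = TRUE)
dbDisconnect(con)
```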

I also implemented a CSV connector. It uses data.table::fread and fwrite, which are fast. Each message is simply appended to the file, so the whole table does not need to be written. Furthermore, if we do not need the whole chat history, we can read only the last, say, 100 lines, so the entire table does not need to be read either. I think it would be good for the DBI connector to also retrieve only the last 100 chat posts.
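The two CSV tricks can be sketched like this (`read_last` is a hypothetical helper, not the connector's actual function; `fwrite(append = TRUE)` and `fread(skip = ...)` are the relevant data.table features):

```r
library(data.table)

# Appending: one message becomes one new line, the file is never rewritten.
fwrite(data.table(user = "alice", time = "00:00:00", message = "hi"),
       "chat-data.csv", append = TRUE)

# Reading only the last n data rows instead of the whole table.
read_last <- function(file, n = 100) {
  nms   <- names(fread(file, nrows = 0))   # column names from the header line
  total <- nrow(fread(file, select = 1L))  # cheap single-column pass to count rows
  fread(file,
        skip = total - min(n, total) + 1L, # +1 skips the header line as well
        header = FALSE, col.names = nms)
}
```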

@ivokwee
Contributor Author

ivokwee commented Jul 7, 2023

Code to benchmark the three connectors. DBI is fastest, CSV is good, RDS is slow.

# One connection per backend
c1 <- shinyChatR:::RDSConnection$new("chat-data.rds")
c2 <- shinyChatR:::CSVConnection$new("chat-data.csv", nlast = 100)
c3 <- shinyChatR:::DBConnection$new(db_file = "chat-data.db")

# One paragraph of lorem ipsum as the message body
msg <- lorem::ipsum(1)[[1]]

# RDS: rewrites the whole object on every insert
system.time(
  for (i in 1:4000) {
    c1$insert_message(msg, user = "alice", time = "00:00:00")
    d1 <- c1$get_data()
  }
)

# CSV: appends one line per insert, reads only the last `nlast` rows
system.time(
  for (i in 1:4000) {
    c2$insert_message(msg, user = "alice", time = "00:00:00")
    d2 <- c2$get_data()
  }
)

# DBI (SQLite): a single INSERT per message
system.time(
  for (i in 1:4000) {
    c3$insert_message(msg, user = "alice", time = "00:00:00")
    d3 <- c3$get_data()
  }
)

@julianschmocker
Owner

This is good to know. I guess that for large amounts of data, DBI with a SQLite connection is not optimal. In that case a server-based database like PostgreSQL is preferred.
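Switching backends should mostly be a matter of the DBI connection; a minimal sketch, assuming the RPostgres driver (host, database name, and credentials below are placeholders):

```r
library(DBI)

# Connect to a server-based PostgreSQL database instead of a local SQLite file.
con <- dbConnect(
  RPostgres::Postgres(),
  host     = "db.example.com",          # placeholder host
  dbname   = "chat",                    # placeholder database
  user     = "chat_user",               # placeholder user
  password = Sys.getenv("PGPASSWORD")   # keep credentials out of the code
)

# Inserts then work exactly as with SQLite:
dbWriteTable(con, "chat",
             data.frame(user = "alice", time = "00:00:00", message = "hi"),
             append = TRUE)
dbDisconnect(con)
```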
