Use databases for saving #99

Open
JoeyBF opened this issue Jun 14, 2022 · 4 comments

Comments

@JoeyBF
Collaborator

JoeyBF commented Jun 14, 2022

It would be more convenient to store resolutions using some kind of database, e.g. SQLite.

  • Since we have tens of millions of very small files, much of the disk space they occupy is filesystem overhead. Aggregating them would help with the excessive disk usage tracked in "Improve save file disk utilization" (#76).
  • A database would make it easy to handle save files with ACID guarantees. I keep running into problems when the same files are accessed concurrently.
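To make the overhead point concrete, here is a rough sketch of the arithmetic. It assumes a simplified model in which the filesystem allocates whole fixed-size blocks per file and ignores inodes, directory entries, and any inlining of tiny files. The function names and the block size used in the usage note are illustrative assumptions, not measurements from our save directories.

```rust
/// Bytes allocated for a file of `len` bytes when space is handed out in
/// whole blocks of `block_size` bytes (simplified model, see assumptions above).
fn allocated_bytes(len: u64, block_size: u64) -> u64 {
    // Round up to the next multiple of the block size.
    ((len + block_size - 1) / block_size) * block_size
}

/// Fraction of the allocated space that is padding rather than payload.
fn overhead_fraction(file_sizes: &[u64], block_size: u64) -> f64 {
    let payload: u64 = file_sizes.iter().sum();
    let allocated: u64 = file_sizes
        .iter()
        .map(|&len| allocated_bytes(len, block_size))
        .sum();
    if allocated == 0 {
        0.0
    } else {
        1.0 - payload as f64 / allocated as f64
    }
}
```

Under this model, four 1 KiB files on a filesystem with hypothetical 4 KiB blocks waste three quarters of their allocated space, while the same payload stored as a single aggregate fits in one block.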

I would like to be assigned to this issue if that's ok. We can discuss the database schema here or on Zulip.

@JoeyBF
Collaborator Author

JoeyBF commented Jun 15, 2022

Update: I looked into it, and SQLite might be a bad choice. On the one hand, very small resolutions are in the ideal size range for SQLite. On the other hand, our record resolutions take up multiple TB and require concurrent access by several hundred threads (e.g. to compute products), which is the opposite of ideal. That said, something like PostgreSQL would work great for the large resolutions but would be massive overkill for the small ones.

Maybe we should abstract over the backing databases so we can choose at compile/run time?
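One possible shape for such an abstraction, as a minimal sketch using only the standard library. The trait `SaveBackend`, both implementations, and `open_backend` are hypothetical names invented for illustration and do not exist in the codebase. The in-memory store merely stands in for a real database client, and keys are assumed to be plain file names.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical interface that every storage backend would implement.
pub trait SaveBackend {
    fn save(&mut self, key: &str, data: &[u8]) -> io::Result<()>;
    fn load(&self, key: &str) -> io::Result<Option<Vec<u8>>>;
}

/// In-memory stand-in for a database-backed store.
#[derive(Default)]
pub struct MemoryBackend {
    records: HashMap<String, Vec<u8>>,
}

impl SaveBackend for MemoryBackend {
    fn save(&mut self, key: &str, data: &[u8]) -> io::Result<()> {
        self.records.insert(key.to_owned(), data.to_vec());
        Ok(())
    }

    fn load(&self, key: &str) -> io::Result<Option<Vec<u8>>> {
        Ok(self.records.get(key).cloned())
    }
}

/// One file per record inside a root directory.
pub struct DirectoryBackend {
    root: PathBuf,
}

impl DirectoryBackend {
    pub fn new(root: PathBuf) -> io::Result<Self> {
        fs::create_dir_all(&root)?;
        Ok(Self { root })
    }
}

impl SaveBackend for DirectoryBackend {
    fn save(&mut self, key: &str, data: &[u8]) -> io::Result<()> {
        fs::write(self.root.join(key), data)
    }

    fn load(&self, key: &str) -> io::Result<Option<Vec<u8>>> {
        match fs::read(self.root.join(key)) {
            Ok(bytes) => Ok(Some(bytes)),
            // A missing record is not an error, just an absent value.
            Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
            Err(e) => Err(e),
        }
    }
}

/// Run-time selection: pick a backend by name and return it as a trait object.
pub fn open_backend(kind: &str, root: PathBuf) -> io::Result<Box<dyn SaveBackend>> {
    match kind {
        "memory" => Ok(Box::new(MemoryBackend::default())),
        "directory" => Ok(Box::new(DirectoryBackend::new(root)?)),
        other => Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            format!("unknown backend: {other}"),
        )),
    }
}
```

Returning `Box<dyn SaveBackend>` covers the run-time case. For compile-time selection, the calling code could instead be generic over `B: SaveBackend`, which avoids dynamic dispatch at the cost of fixing the backend when the binary is built.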

@dalcde
Contributor

dalcde commented Jun 15, 2022 via email

@JoeyBF
Collaborator Author

JoeyBF commented Jun 15, 2022

You're right. In that case we might want to add a new "database" feature that connects to PostgreSQL. That way we can keep the low-overhead save format for small resolutions.
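As a rough sketch of how such a feature gate could look in Rust: this assumes a `database = []` entry under `[features]` in `Cargo.toml`, and the function name and returned labels are hypothetical placeholders rather than existing code.

```rust
/// Compiled only when the crate is built with `--features database`.
/// A real implementation would connect to PostgreSQL here.
#[cfg(feature = "database")]
pub fn default_backend() -> &'static str {
    "postgres"
}

/// Fallback used when the feature is off: the existing low-overhead save files.
#[cfg(not(feature = "database"))]
pub fn default_backend() -> &'static str {
    "files"
}
```

Because exactly one of the two definitions is compiled, builds without the feature would carry no database dependency at all.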

@JoeyBF
Collaborator Author

JoeyBF commented Jun 17, 2022

I'll have to add some methods in fp for saving to and loading from SQL, so I think it would be better to finalize #53 beforehand, and possibly the upcoming matrix rewrite as well.


2 participants