Store dependency table in different format #300
Comments
As a side note, |
We now have benchmark results comparing CSV, pickle, and parquet files for storing the dependency table, available at https://github.com/audeering/audb/tree/a8bb3367a37fae79601e189ccac76a1a12105bae/benchmarks#audbdependencies-loadingwriting-to-file. We first focus on the results for reading, as reading will be performed more often than writing.
When looking at writing performance we get:
With those results in mind, it seems reasonable to switch to storing the dependency table directly as a parquet file, both on the server and in the cache.
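For readers who want to try a comparison of this kind locally, here is a minimal timing sketch. It is not the benchmark linked above: the table is synthetic, the column names (`file`, `archive`, `version`) are assumptions rather than the actual dependency table schema, and the parquet case only runs if `pyarrow` is installed.

```python
import csv
import os
import pickle
import tempfile
import time

# Synthetic stand-in for a dependency table.
# Column names are illustrative assumptions, not the real audb schema.
FIELDS = ["file", "archive", "version"]
ROWS = [
    {"file": f"audio/{i:06d}.wav", "archive": f"archive-{i % 100}", "version": "1.0.0"}
    for i in range(10_000)
]


def write_csv(path):
    with open(path, "w", newline="") as fp:
        writer = csv.DictWriter(fp, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(ROWS)


def read_csv(path):
    with open(path, newline="") as fp:
        return list(csv.DictReader(fp))


def write_pickle(path):
    with open(path, "wb") as fp:
        pickle.dump(ROWS, fp, protocol=pickle.HIGHEST_PROTOCOL)


def read_pickle(path):
    with open(path, "rb") as fp:
        return pickle.load(fp)


FORMATS = {"csv": (write_csv, read_csv), "pickle": (write_pickle, read_pickle)}

try:
    # Parquet support is optional here because pyarrow is a third-party package.
    import pyarrow as pa
    import pyarrow.parquet as pq

    def write_parquet(path):
        table = pa.table({name: [row[name] for row in ROWS] for name in FIELDS})
        pq.write_table(table, path)

    def read_parquet(path):
        return pq.read_table(path)

    FORMATS["parquet"] = (write_parquet, read_parquet)
except ImportError:
    pass


def timed(func, path):
    """Return the wall-clock seconds needed to run func(path)."""
    start = time.perf_counter()
    func(path)
    return time.perf_counter() - start


results = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, (writer, reader) in FORMATS.items():
        path = os.path.join(tmp, f"deps.{name}")
        results[name] = {
            "write_s": timed(writer, path),
            "read_s": timed(reader, path),
        }

for name, times in results.items():
    print(f"{name:8s} write {times['write_s']:.4f} s   read {times['read_s']:.4f} s")
```

A single run like this is noisy. For anything beyond a rough impression, the measurement should be repeated and run on a table of realistic size.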
Solved by #372.
For tables, we support CSV to provide them in a human-readable format, but this is not necessary for the dependency table. In addition, the dependency table is frequently accessed to gather basic information about a database.
I think it would make sense to switch to another format when storing it for new databases. It should be fast to read, and ideally support reading only parts of it, such as selected columns or rows, to make sure it will always fit in memory.
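As an illustration of what partial reading could look like, the following sketch uses parquet via `pyarrow`, one candidate format that supports both column selection and row filtering. The column names and values are made up for the example and are not the real dependency table schema. The helper `demo()` is hypothetical, and it returns `None` if `pyarrow` is not installed.

```python
import os
import tempfile

try:
    import pyarrow as pa
    import pyarrow.parquet as pq
except ImportError:  # pyarrow is a third-party package
    pa = pq = None


def demo():
    """Write a tiny table, then read back only some columns or rows."""
    if pq is None:
        return None

    # Illustrative data; not the actual dependency table layout.
    table = pa.table(
        {
            "file": ["a.wav", "b.wav", "c.wav"],
            "archive": ["x", "y", "z"],
            "version": ["1.0.0", "1.0.0", "1.1.0"],
        }
    )

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "deps.parquet")
        pq.write_table(table, path)

        # Column projection: only the requested columns are loaded.
        columns_only = pq.read_table(path, columns=["file", "version"])

        # Row filtering: only rows matching the predicate are returned.
        rows_only = pq.read_table(path, filters=[("version", "==", "1.1.0")])

        return columns_only.column_names, rows_only.to_pydict()


print(demo())
```

Neither CSV nor pickle offers this kind of selective access without parsing or unpickling the whole file first, which is the main reason a columnar format is attractive for a table that may grow large.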