Support incremental backup #1248
Comments
RocksDB's backup support: How to backup RocksDB. But COPY DATABASE is different from BACKUP DATABASE, as COPY is much simpler. We might also need to write some metadata to the target directory (to store the start/end time). |
Backup also involves backing up manifest files etc.
Necessary metadata I came up with:
|
Can we use Parquet metadata https://parquet.apache.org/docs/file-format/metadata/ for our metadata? Using fewer files reduces the chance of data corruption. |
If "our metadata" refers to the catalog/schema/table name, the data time range and the backup time, then yes, we are going to write these to the metadata section of the Parquet footer, just like Arrow does. We don't have a separate metadata file now. |
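As a rough illustration of the comment above: the Parquet footer carries a list of string key/value pairs, so the backup metadata has to be flattened into strings before it is written. The sketch below shows only that flattening and the reverse parse, using the standard library. The `BackupMetadata` struct, its field names and the `greptime:`-prefixed keys are all hypothetical, and nothing here calls a real Parquet writer.

```rust
/// Hypothetical description of one exported table file.
#[derive(Debug, Clone, PartialEq)]
pub struct BackupMetadata {
    pub catalog: String,
    pub schema: String,
    pub table: String,
    pub data_start_millis: i64,
    pub data_end_millis: i64,
    pub backup_time_millis: i64,
}

impl BackupMetadata {
    /// Flattens the metadata into string key/value pairs.
    /// The key names are made up for this sketch.
    pub fn to_key_values(&self) -> Vec<(String, String)> {
        vec![
            ("greptime:catalog".to_string(), self.catalog.clone()),
            ("greptime:schema".to_string(), self.schema.clone()),
            ("greptime:table".to_string(), self.table.clone()),
            ("greptime:data_start".to_string(), self.data_start_millis.to_string()),
            ("greptime:data_end".to_string(), self.data_end_millis.to_string()),
            ("greptime:backup_time".to_string(), self.backup_time_millis.to_string()),
        ]
    }

    /// Rebuilds the metadata from key/value pairs.
    /// Returns `None` if a key is missing or a number fails to parse.
    pub fn from_key_values(kvs: &[(String, String)]) -> Option<Self> {
        let get = |key: &str| kvs.iter().find(|kv| kv.0 == key).map(|kv| kv.1.clone());
        Some(BackupMetadata {
            catalog: get("greptime:catalog")?,
            schema: get("greptime:schema")?,
            table: get("greptime:table")?,
            data_start_millis: get("greptime:data_start")?.parse().ok()?,
            data_end_millis: get("greptime:data_end")?.parse().ok()?,
            backup_time_millis: get("greptime:backup_time")?.parse().ok()?,
        })
    }
}
```

Keeping everything in the footer means a reader can recover the time range from the data file alone, which matches the "no separate metadata file" point.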
Closed via #1240 |
What problem does the new feature solve?
Given that all tables in GreptimeDB contain a timestamp column, we can allow users to back up the data of a database within a specified time range into a directory, with one file per table.
What does the feature do?
Implement some SQL syntax like:
which exports the rows within the given time range from all tables in that database to the target directory. All exported rows of one table will reside in the same Parquet file.
Or maybe we can skip SQL and use the HTTP Admin API first as a prototype.
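The export semantics described here (filter every table by timestamp, then emit one file per table) can be sketched with the standard library alone. This is only an illustration under assumptions: the `Row` type, the half-open `[start, end)` range and the `<table>.parquet` naming are hypothetical and not taken from the proposal.

```rust
use std::collections::BTreeMap;

/// Hypothetical row shape: a millisecond timestamp plus an opaque payload.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub ts_millis: i64,
    pub payload: String,
}

/// Selects, per table, the rows whose timestamp falls in `[start, end)`.
/// Each map entry corresponds to what would become one exported file.
pub fn select_for_export(
    tables: &BTreeMap<String, Vec<Row>>,
    start_millis: i64,
    end_millis: i64,
) -> BTreeMap<String, Vec<Row>> {
    tables
        .iter()
        .map(|(name, rows)| {
            let selected: Vec<Row> = rows
                .iter()
                .filter(|r| r.ts_millis >= start_millis && r.ts_millis < end_millis)
                .cloned()
                .collect();
            (name.clone(), selected)
        })
        .collect()
}

/// Hypothetical naming scheme: one file per table in the target directory.
pub fn export_path(target_dir: &str, table: &str) -> String {
    format!("{}/{}.parquet", target_dir.trim_end_matches('/'), table)
}
```

Whether the end of the range is inclusive is one of the details the final syntax would need to pin down, because it decides whether consecutive incremental backups overlap.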
Implementation challenges
Writer support can simplify the stream Parquet writer implementation (feat: upgrade opendal #1245).
Change ParquetWriter to a stream writer that does not hold all Parquet content in memory and write it to the underlying storage in one go, which may cause huge memory consumption in this case (feat: buffered parquet writer #1263). See greptimedb/src/storage/src/sst/parquet.rs, lines 107 to 109 in 8e0fdb0.
COPY DATABASE or HTTP API handler.
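To illustrate the memory concern behind the buffered writer item, here is a minimal sketch of a writer that forwards encoded chunks to the underlying sink once a size threshold is reached, instead of accumulating the whole file first. It is not the implementation from #1263: `BufferedChunkWriter`, its threshold and the flush counter are hypothetical names, and the chunks stand in for encoded Parquet row groups.

```rust
use std::io::{self, Write};

/// Hypothetical writer that buffers chunks and flushes them to `inner`
/// whenever the buffer reaches `threshold` bytes.
pub struct BufferedChunkWriter<W: Write> {
    inner: W,
    buffer: Vec<u8>,
    threshold: usize,
    flushes: usize,
}

impl<W: Write> BufferedChunkWriter<W> {
    pub fn new(inner: W, threshold: usize) -> Self {
        BufferedChunkWriter { inner, buffer: Vec::new(), threshold, flushes: 0 }
    }

    /// Appends one encoded chunk and flushes if the threshold is reached,
    /// so the buffer stays close to `threshold` rather than the file size.
    pub fn write_chunk(&mut self, chunk: &[u8]) -> io::Result<()> {
        self.buffer.extend_from_slice(chunk);
        if self.buffer.len() >= self.threshold {
            self.flush_buffer()?;
        }
        Ok(())
    }

    fn flush_buffer(&mut self) -> io::Result<()> {
        if self.buffer.is_empty() {
            return Ok(());
        }
        self.inner.write_all(&self.buffer)?;
        self.buffer.clear();
        self.flushes += 1;
        Ok(())
    }

    /// Flushes any remaining bytes and returns the sink and the flush count.
    pub fn finish(mut self) -> io::Result<(W, usize)> {
        self.flush_buffer()?;
        self.inner.flush()?;
        Ok((self.inner, self.flushes))
    }
}
```

With a threshold of a few megabytes, peak memory is bounded by the threshold plus the largest single chunk, regardless of how many rows a table exports.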