Track each SST's timestamp information as user properties #8959

Closed
riversand963 opened this issue Sep 25, 2021 · 5 comments

@riversand963
Contributor

riversand963 commented Sep 25, 2021

(Internal ref: T86717137)
We have been working on adding support for the user-defined timestamp feature. An application can specify a timestamp when writing each key-value pair.
Each SST file consists of multiple (potentially many) data blocks and several metadata blocks. Among the metadata blocks, there is one called the properties block that tracks some pre-defined properties of the SST file.
We should track timestamp-related information in each SST file's properties block, e.g. the min and max timestamps of all keys in the file. Otherwise, the SST file is hardly self-contained: without timestamp info, it's hard to tell whether the keys in the SST have timestamps or not. (In fact, the timestamp info can also be inferred from the Comparator, but the Comparator info stored in the properties block is a string, so it is more convenient to store the timestamp info as well.)

A few places to consider

  • When opening a database, RocksDB allows the application to pass a list of TablePropertiesCollector objects (a member of the advanced column family options). Each collector's methods will be invoked at certain points while building an SST file.
  • utilities/table_properties_collectors contains a few example properties collectors. They collect information about the keys in the SST files while processing the keys.
  • Maybe we should mandate such a user properties collector if application enables timestamp for a column family.

We can track this information in the properties block to enable additional validation.
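
The collector idea above could be sketched roughly as follows. This is a minimal, standalone illustration and not the RocksDB TablePropertiesCollector interface: the class name `MinMaxTimestampCollector`, its methods, and the property names `example.timestamp.min` / `example.timestamp.max` are all hypothetical. It assumes the timestamp is a fixed-width suffix of each user key and that timestamps are encoded so that bytewise order matches time order (for example, fixed-width big-endian).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a properties collector. It only mirrors the
// pattern "called once per key, then once when the file is finished".
class MinMaxTimestampCollector {
 public:
  // ts_size is the assumed fixed timestamp width in bytes; 0 means the
  // column family does not use timestamps.
  explicit MinMaxTimestampCollector(std::size_t ts_size) : ts_size_(ts_size) {}

  // Called for every key added to the file being built.
  // Returns false if the key is too short to carry a timestamp.
  bool AddUserKey(const std::string& user_key) {
    if (ts_size_ == 0) {
      return true;  // Timestamps disabled: nothing to record.
    }
    if (user_key.size() < ts_size_) {
      return false;
    }
    // Assumption: the timestamp is the trailing ts_size_ bytes of the key.
    const std::string ts = user_key.substr(user_key.size() - ts_size_);
    // std::string compares bytes as unsigned values, which is why the
    // encoding must sort bytewise for this sketch to be correct.
    if (!seen_any_ || ts < min_ts_) min_ts_ = ts;
    if (!seen_any_ || ts > max_ts_) max_ts_ = ts;
    seen_any_ = true;
    return true;
  }

  // Called once when the file is complete. Emits nothing if no timestamped
  // key was seen, so a reader can tell "no timestamps" from "has timestamps".
  std::map<std::string, std::string> Finish() const {
    std::map<std::string, std::string> props;
    if (seen_any_) {
      assert(min_ts_.size() == ts_size_ && max_ts_.size() == ts_size_);
      props["example.timestamp.min"] = min_ts_;
      props["example.timestamp.max"] = max_ts_;
    }
    return props;
  }

 private:
  std::size_t ts_size_;
  bool seen_any_ = false;
  std::string min_ts_;
  std::string max_ts_;
};
```

A reader of the file could then treat the absence of both properties as "keys carry no timestamps", which is the kind of validation the properties block would enable.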

We follow Google C++ style guide (https://google.github.io/styleguide/cppguide.html).

@riversand963
Contributor Author

@wolfkdy @eharry

@riversand963
Contributor Author

@wolfkdy @eharry just want to clarify, would you be interested in this one? If you have not started, can I take this back and re-assign? If I understand correctly, #8957 can achieve the same purpose for you.

@eharry
Contributor

eharry commented Oct 5, 2021

@riversand963 We are very willing to do this work. The development work has already begun. We will submit a patch for review this weekend.

@sunlike-Lipo
Contributor

I posted a pull request for this issue:
#8997

@sunlike-Lipo
Contributor

sunlike-Lipo commented Oct 30, 2021

@riversand963 I reposted a new pull request for this issue:
#9093

sunlike-Lipo added a commit to sunlike-Lipo/rocksdb that referenced this issue Nov 8, 2021
sunlike-Lipo added a commit to sunlike-Lipo/rocksdb that referenced this issue Nov 8, 2021
facebook-github-bot pushed a commit that referenced this issue Nov 19, 2021
Summary:
Track each SST's timestamp information as user properties #8959

RocksDB supports the user-defined timestamp feature. An application can specify a timestamp
when writing each key-value pair. When data is flushed from memory to disk, it is written to files called SST files.
Each SST file consists of multiple data blocks and several metadata blocks. Among the metadata
blocks, there is one called the properties block that tracks some pre-defined properties of the SST file.

This PR collects the min and max timestamps of all keys in the file as properties. With those
properties, it is easier to tell whether the keys in an SST file have timestamps.

The changes involved are as follows:

1) Add a class TimestampTablePropertiesCollector to collect the min/max timestamps as keys are added to the table.
   How TimestampTablePropertiesCollector compares the timestamps of keys is defined by the
   user, who implements the Comparator::CompareTimestamp function in the user-defined comparator.
2) Add corresponding unit tests.
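
To illustrate why the comparison in item 1 is delegated to the user's comparator, here is a hedged, standalone sketch. It does not use the RocksDB headers: `TimestampComparator`, `U64LittleEndianComparator`, `TimestampRange`, `UpdateRange`, and the encode/decode helpers are hypothetical names that only mirror the idea of Comparator::CompareTimestamp. It assumes 8-byte little-endian timestamps, an encoding for which plain bytewise comparison gives the wrong order.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical interface mirroring the idea of a user-defined timestamp
// comparison: negative if a < b, zero if equal, positive if a > b.
class TimestampComparator {
 public:
  virtual ~TimestampComparator() = default;
  virtual int CompareTimestamp(const std::string& a,
                               const std::string& b) const = 0;
};

// Encodes v as 8 bytes, least significant byte first.
std::string EncodeU64LE(std::uint64_t v) {
  std::string out(8, '\0');
  for (int i = 0; i < 8; ++i) {
    out[i] = static_cast<char>((v >> (8 * i)) & 0xff);
  }
  return out;
}

// Decodes an 8-byte little-endian timestamp.
std::uint64_t DecodeU64LE(const std::string& s) {
  assert(s.size() == 8);
  std::uint64_t v = 0;
  for (int i = 0; i < 8; ++i) {
    v |= static_cast<std::uint64_t>(static_cast<unsigned char>(s[i]))
         << (8 * i);
  }
  return v;
}

// Compares timestamps by decoded numeric value, not by raw bytes.
class U64LittleEndianComparator : public TimestampComparator {
 public:
  int CompareTimestamp(const std::string& a,
                       const std::string& b) const override {
    const std::uint64_t x = DecodeU64LE(a);
    const std::uint64_t y = DecodeU64LE(b);
    return (x < y) ? -1 : ((x > y) ? 1 : 0);
  }
};

// Running min/max of the timestamps seen so far.
struct TimestampRange {
  bool has_value = false;
  std::string min;
  std::string max;
};

// Folds one timestamp into the range using the supplied comparator.
void UpdateRange(const TimestampComparator& cmp, const std::string& ts,
                 TimestampRange* range) {
  if (!range->has_value || cmp.CompareTimestamp(ts, range->min) < 0) {
    range->min = ts;
  }
  if (!range->has_value || cmp.CompareTimestamp(ts, range->max) > 0) {
    range->max = ts;
  }
  range->has_value = true;
}
```

With this encoding, the raw bytes of 1 sort after the raw bytes of 256, so a collector that compared bytes directly would report a wrong minimum; routing the comparison through the comparator avoids that.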

Pull Request resolved: #9093

Reviewed By: ltamasi

Differential Revision: D32406927

Pulled By: riversand963

fbshipit-source-id: 25922971b7e67bacf4d53a1fb67c4c5ddaa61573