Implement TTL support for Pinot upsert #9529
Comments
We summarized the challenges and thoughts for partial upsert in this design. Please review. cc @Jackie-Jiang @chenboat @yupeng9
After discussion with @Jackie-Jiang, @yupeng9, and @chenboat, we can break the feature down into the following parts.
Thanks for summarizing it. Part 1.3 is not required. The controller will ask the server for the snapshot, and the controller is then responsible for uploading it.
This was referenced Jan 3, 2023
Closed
Apache Pinot has provided native upsert support since v0.6.0 (#4261). It allows users to modify existing records and has successfully onboarded many use cases. We have observed that Pinot upsert clusters usually have high heap memory usage, because the upsert metadata (primaryKeyIndexes and validDocIndexes) is stored on the heap of Pinot hosts. For use cases with high primary-key cardinality, the heap usage of these upsert tables usually becomes the hardware resource bottleneck.
For some use cases, records that share a primary key are updated frequently during a time window, and after that window they are no longer updated. In these use cases, each primary key has a lifecycle and becomes inactive after the time window. Currently, these primary keys do not expire until the table's retention period ends, so they remain in primaryKeyIndexes. We propose introducing a TTL (time-to-live) for Pinot primary keys: primary keys expire after the TTL, and inactive keys can be removed from the upsert metadata to save heap space.
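To make the idea concrete, here is a minimal Python sketch of a primary-key index that drops keys once they fall outside the TTL window. It is an illustration only, not Pinot's implementation: the names `TtlPrimaryKeyIndex`, `RecordLocation`, `upsert`, and `remove_expired_keys` are hypothetical, and using the largest comparison time seen so far as the clock is an assumption this issue does not specify.

```python
from dataclasses import dataclass
from typing import Dict, Hashable


@dataclass
class RecordLocation:
    """Hypothetical stand-in for one entry of the upsert metadata."""
    segment: str
    doc_id: int
    comparison_time: int  # e.g. event time of the latest record for the key


class TtlPrimaryKeyIndex:
    """Toy in-memory primary-key index with TTL-based expiry (illustrative only)."""

    def __init__(self, ttl: int) -> None:
        self.ttl = ttl
        # Assumption: the "clock" is the largest comparison time ingested so far.
        self.largest_seen_time = 0
        self._locations: Dict[Hashable, RecordLocation] = {}

    def upsert(self, key: Hashable, segment: str, doc_id: int,
               comparison_time: int) -> bool:
        """Record the newest location for a key; return False for stale records."""
        self.largest_seen_time = max(self.largest_seen_time, comparison_time)
        current = self._locations.get(key)
        if current is not None and current.comparison_time > comparison_time:
            return False  # an existing record is newer, keep it
        self._locations[key] = RecordLocation(segment, doc_id, comparison_time)
        return True

    def remove_expired_keys(self) -> int:
        """Drop keys not updated within the TTL window; return how many were dropped."""
        watermark = self.largest_seen_time - self.ttl
        expired = [key for key, loc in self._locations.items()
                   if loc.comparison_time < watermark]
        for key in expired:
            del self._locations[key]
        return len(expired)

    def __contains__(self, key: Hashable) -> bool:
        return key in self._locations

    def __len__(self) -> int:
        return len(self._locations)


index = TtlPrimaryKeyIndex(ttl=100)
index.upsert("order-1", "seg_0", 0, comparison_time=10)
index.upsert("order-2", "seg_0", 1, comparison_time=50)
index.upsert("order-1", "seg_0", 2, comparison_time=60)   # refreshes order-1
index.upsert("order-3", "seg_1", 0, comparison_time=155)  # watermark becomes 55
removed = index.remove_expired_keys()                     # drops only order-2
```

In this sketch, heap usage is bounded by the number of keys updated within the TTL window rather than by the number of keys within the retention period. One consequence is that a late update for an already expired key would be treated as a brand-new key.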
A few challenges that we want to solve: