Materialize candidates for whats-hot algo #1086
await db.schema
  .createView('algo_whats_hot_view')
  .materialized()
  .as(viewQb)
  .execute()
Just wanted to flag this for you @dholms: this will lock `post` and `post_agg` for loading the initial data into the view. Future refreshes will be done `concurrently`, which does not require locking.
it's not possible to do the initial create concurrently is it?
I tried to get that going by creating the view `with no data`, then running a concurrent refresh afterwards, but it doesn't work: apparently the view already has to be populated in order to perform a concurrent refresh.
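For reference, the ordering constraint discussed above can be sketched as plain SQL strings. This is only an illustration, not the PR's migration: the helper names `createViewStatements` and `refreshStatement` and the choice of `cid` as the unique column are assumptions. The PostgreSQL rules it encodes are that `REFRESH MATERIALIZED VIEW CONCURRENTLY` cannot be used on an unpopulated view and needs a unique index on the view.

```typescript
// Hypothetical helpers that only build SQL text; nothing here talks to a database.
// Identifiers are interpolated as-is, so they must be trusted constants.

// Statements for the initial setup, in the order they must run.
function createViewStatements(
  view: string,
  selectSql: string,
  uniqueCol: string, // assumption: a column that is unique per row in the view
): string[] {
  return [
    // WITH DATA populates the view right away; a view created WITH NO DATA
    // cannot be refreshed concurrently until it has been populated once.
    `CREATE MATERIALIZED VIEW ${view} AS ${selectSql} WITH DATA`,
    // A concurrent refresh requires at least one unique index on the view.
    `CREATE UNIQUE INDEX ${view}_${uniqueCol}_idx ON ${view} (${uniqueCol})`,
  ]
}

// Statement for later refreshes.
function refreshStatement(view: string, concurrently: boolean): string {
  return concurrently
    ? `REFRESH MATERIALIZED VIEW CONCURRENTLY ${view}`
    : `REFRESH MATERIALIZED VIEW ${view}`
}
```

In this sketch the initial load is the only non-concurrent step; every later call would use `refreshStatement(view, true)`.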
    .where('candidate.score', '<', maxScore)
    .where('post.cid', '!=', cursorCid)
}
const keyset = new ScoreKeyset(ref('candidate.score'), ref('candidate.cid'))
good call switching this to a real keyset
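To illustrate what a keyset over `(score, cid)` buys, here is a minimal in-memory sketch. It is not the `ScoreKeyset` implementation from this repo: the `Candidate` and `Cursor` types and the `page` function are hypothetical, and a descending order on both columns is assumed. The point is that ties on `score` are broken by `cid`, so rows sharing the cursor's score are neither skipped nor repeated.

```typescript
type Candidate = { score: number; cid: string }
type Cursor = { score: number; cid: string }

// Sort order: score descending, then cid descending as the tiebreaker.
function compareDesc(a: Candidate, b: Candidate): number {
  if (a.score !== b.score) return b.score - a.score
  return a.cid < b.cid ? 1 : a.cid > b.cid ? -1 : 0
}

// Keyset condition: the row sorts strictly after the cursor position.
// In SQL terms: score < :score OR (score = :score AND cid < :cid)
function isAfterCursor(row: Candidate, cursor: Cursor): boolean {
  return (
    row.score < cursor.score ||
    (row.score === cursor.score && row.cid < cursor.cid)
  )
}

// Returns one page plus the cursor pointing at its last row.
function page(
  rows: Candidate[],
  limit: number,
  cursor?: Cursor,
): { items: Candidate[]; cursor: Cursor | undefined } {
  const sorted = [...rows].sort(compareDesc)
  const remaining = cursor
    ? sorted.filter((r) => isAfterCursor(r, cursor))
    : sorted
  const items = remaining.slice(0, limit)
  const last = items[items.length - 1]
  return {
    items,
    cursor: last ? { score: last.score, cid: last.cid } : undefined,
  }
}
```

By contrast, a filter that only requires a strictly lower score would drop any remaining rows that tie with the last row of the previous page.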
const { ref } = db.dynamic

// materialized views are difficult to change,
awesome
this looks great 🙌
* Setup whats-hot materialized view w/ params, run view maintainer in service entrypoint
* Update whats-hot to use new materialized view, tidy
* Update migration to create with no data
* Revert whats-hot migration change for no data
* Bump refresh rate for view down to 1min
The whats-hot algo does some expensive work: it paginates over a dynamic score for all posts from the past day (i.e. the feed candidates).
This work replaces the candidate-scoring subquery with a materialized view that is refreshed concurrently every two minutes (configurable). The materialized view is parameterized using a separate table so that we can alter the like threshold (default: 10) and the interval over which we grab candidates (default: 1 day).
One complexity here is how the refresh is scheduled. This piggy-backs off the `Leader` abstraction to ensure there's no more than one node maintaining the materialized view at a time. The user who created the view also needs to be the user who performs the refresh, due to Postgres limitations.
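As a rough illustration of the single-maintainer idea, the sketch below simulates several nodes running the same scheduler tick against a shared lock. It does not reflect the actual `Leader` API: `InMemoryLeaderLock` and `maintainerTick` are hypothetical names, the lock lives in memory rather than in the database, and everything is synchronous to keep the example self-contained (a real refresh would be async).

```typescript
// Hypothetical in-memory stand-in for a leader lock.
class InMemoryLeaderLock {
  private holder: string | null = null

  // Takes the lock if it is free; returns true only for the current holder.
  tryAcquire(nodeId: string): boolean {
    if (this.holder === null) this.holder = nodeId
    return this.holder === nodeId
  }

  // Gives up leadership; a no-op for nodes that do not hold the lock.
  release(nodeId: string): void {
    if (this.holder === nodeId) this.holder = null
  }
}

// One scheduler tick on one node: refresh the view only if this node leads.
function maintainerTick(
  nodeId: string,
  lock: InMemoryLeaderLock,
  refresh: (by: string) => void,
): boolean {
  if (!lock.tryAcquire(nodeId)) return false
  refresh(nodeId)
  return true
}

// Usage: three nodes tick at once, only the first to take the lock refreshes.
const lock = new InMemoryLeaderLock()
const refreshedBy: string[] = []
const record = (by: string): void => {
  refreshedBy.push(by)
}
const tickResults = ['node-a', 'node-b', 'node-c'].map((id) =>
  maintainerTick(id, lock, record),
)
```

If the leader goes away and releases the lock, the next node to tick would take over the refresh.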