Materialize candidates for whats-hot algo #1086

Merged 7 commits on May 26, 2023

devinivy
Collaborator

The whats-hot algo does expensive work: it paginates over a dynamically computed score for every post from the past day (i.e. the feed candidates).

This work replaces the subquery that scores the candidates with a materialized view that is refreshed concurrently every two minutes (configurable). The materialized view is parameterized through a separate table, so we can alter the like threshold (default: 10) and the interval over which we grab candidates (default: 1 day).
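
As a rough illustration of the parameterization described above, the sketch below filters posts by a like threshold and a lookback interval held in a separate params object. It is an in-memory stand-in, not the actual view query: the names `WhatsHotParams`, `PostRow`, and `selectCandidates` are hypothetical, the inclusive threshold is an assumption, and the scoring itself is left out because the description does not spell it out.

```typescript
// Hypothetical shapes; the real view reads these values from a params table.
interface WhatsHotParams {
  likeThreshold: number // default in the PR description: 10
  intervalMs: number // default in the PR description: 1 day
}

interface PostRow {
  cid: string
  likeCount: number
  indexedAt: number // epoch milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000

const defaultParams: WhatsHotParams = { likeThreshold: 10, intervalMs: DAY_MS }

// Keep posts inside the lookback window that meet the like threshold.
// Assumption: the threshold is inclusive (>=); the PR does not say.
function selectCandidates(
  posts: PostRow[],
  params: WhatsHotParams,
  now: number,
): PostRow[] {
  const cutoff = now - params.intervalMs
  return posts.filter(
    (p) => p.indexedAt > cutoff && p.likeCount >= params.likeThreshold,
  )
}
```

Keeping the two knobs in their own structure means they can change without touching the selection logic, which mirrors the motivation for keeping them in a separate table rather than hard-coding them in the view definition.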

One complexity here is how the refresh is scheduled. This piggybacks on the Leader abstraction to ensure that no more than one node maintains the materialized view at a time. Due to Postgres limitations, the user who created the view must also be the one who performs the refresh.
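
The scheduling idea above might look roughly like the following. This is a minimal sketch under stated assumptions, not the code in this PR: `ViewMaintainerDeps`, `isLeader`, `runSql`, `refreshIfLeader`, and `startViewMaintainer` are hypothetical names, and the real Leader abstraction may expose a different interface. The dependencies are injected so the sketch runs without a database.

```typescript
// Hypothetical dependencies, injected so the sketch needs no database.
interface ViewMaintainerDeps {
  isLeader: () => boolean // stand-in for the Leader abstraction
  runSql: (sql: string) => Promise<void> // stand-in for a db client
}

const REFRESH_SQL = 'REFRESH MATERIALIZED VIEW CONCURRENTLY algo_whats_hot_view'

// Run one maintenance tick: refresh only when this node holds leadership.
// Resolves to true if a refresh was issued, false otherwise.
async function refreshIfLeader(deps: ViewMaintainerDeps): Promise<boolean> {
  if (!deps.isLeader()) return false
  await deps.runSql(REFRESH_SQL)
  return true
}

// Schedule ticks on a fixed interval (two minutes by default, matching the
// description). Returns a function that stops the loop.
function startViewMaintainer(
  deps: ViewMaintainerDeps,
  intervalMs: number = 2 * 60 * 1000,
): () => void {
  const timer = setInterval(() => {
    refreshIfLeader(deps).catch((err) => {
      console.error('view refresh failed', err)
    })
  }, intervalMs)
  return () => clearInterval(timer)
}
```

A service entrypoint would call `startViewMaintainer` once per node; nodes that are not the leader simply skip each tick, so at most one refresh is in flight across the cluster as long as leadership is exclusive.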

Comment on lines +65 to +69
await db.schema
  .createView('algo_whats_hot_view')
  .materialized()
  .as(viewQb)
  .execute()
Collaborator Author

Just wanted to flag this for you @dholms: this will lock post and post_agg while the initial data is loaded into the view. Future refreshes will be done concurrently, which does not require locking.

Collaborator

it's not possible to do the initial create concurrently, is it?

Collaborator Author

I tried to get that going by creating the view with no data, then running a concurrent refresh afterwards, but it doesn't work: apparently in order to perform a concurrent refresh the view already has to be populated.
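
The ordering constraint described in this thread can be sketched as a list of statements: populate the view at creation time, add a unique index (Postgres requires one before it allows a concurrent refresh), and only then refresh concurrently. The helper name `materializedViewSetup`, the index naming scheme, and the choice of indexed column are assumptions for illustration; the view's SELECT body is passed in rather than guessed.

```typescript
// Build the statements in the order Postgres needs them.
// `selectSql` is the view's defining query, supplied by the caller.
// Identifiers are interpolated unescaped, so pass trusted constants only.
function materializedViewSetup(
  viewName: string,
  uniqueColumn: string,
  selectSql: string,
): string[] {
  return [
    // WITH DATA populates the view immediately; a view created
    // WITH NO DATA cannot be refreshed concurrently.
    `CREATE MATERIALIZED VIEW ${viewName} AS ${selectSql} WITH DATA`,
    // REFRESH ... CONCURRENTLY needs at least one unique index on the view.
    `CREATE UNIQUE INDEX ${viewName}_${uniqueColumn}_idx ON ${viewName} (${uniqueColumn})`,
    // Later refreshes can run without blocking reads of the view.
    `REFRESH MATERIALIZED VIEW CONCURRENTLY ${viewName}`,
  ]
}
```

Only the first statement loads data without the concurrent option, which is why the initial create is the step flagged for locking above.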

    .where('candidate.score', '<', maxScore)
    .where('post.cid', '!=', cursorCid)
}
const keyset = new ScoreKeyset(ref('candidate.score'), ref('candidate.cid'))
Collaborator

good call switching this to a real keyset
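
For context on why a real keyset matters here, the following in-memory sketch pages over (score, cid) pairs with a row-wise cursor comparison, so rows that tie on score are neither skipped nor repeated. It is not the `ScoreKeyset` implementation: `Candidate`, `Cursor`, `isAfterCursor`, and `page` are hypothetical names, and the descending order on both columns is an assumption.

```typescript
interface Candidate {
  score: number
  cid: string
}

interface Cursor {
  score: number
  cid: string
}

// Order by score descending, then cid descending as the tiebreaker.
function compareDesc(a: Candidate, b: Candidate): number {
  if (a.score !== b.score) return b.score - a.score
  if (a.cid === b.cid) return 0
  return a.cid < b.cid ? 1 : -1
}

// True when `c` sorts strictly after the cursor in the ordering above,
// i.e. the row-wise comparison (score, cid) < (cursor.score, cursor.cid).
function isAfterCursor(c: Candidate, cursor: Cursor): boolean {
  return (
    c.score < cursor.score ||
    (c.score === cursor.score && c.cid < cursor.cid)
  )
}

// Return one page plus the cursor for the next page (undefined when empty).
function page(
  candidates: Candidate[],
  limit: number,
  cursor?: Cursor,
): { items: Candidate[]; cursor?: Cursor } {
  const sorted = candidates.slice().sort(compareDesc)
  const rest = cursor ? sorted.filter((c) => isAfterCursor(c, cursor)) : sorted
  const items = rest.slice(0, limit)
  const last = items[items.length - 1]
  return {
    items,
    cursor: last ? { score: last.score, cid: last.cid } : undefined,
  }
}
```

A filter that only requires a strictly lower score and a different cid would drop any remaining rows that share the cursor's score; the two-column comparison keeps them.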


const { ref } = db.dynamic

// materialized views are difficult to change,
Collaborator

awesome

Collaborator

dholms left a comment

this looks great 🙌

devinivy merged commit 9eb817d into main on May 26, 2023
devinivy deleted the whats-hot-materialized branch on May 26, 2023 19:05
mloar pushed a commit to mloar/atproto that referenced this pull request Sep 26, 2023
* Setup whats-hot materialized view w/ params, run view maintainer in service entrypoint

* Update whats-hot to use new materialized view, tidy

* Update migration to create with no data

* Revert whats-hot migration change for no data

* Bump refresh rate for view down to 1min
mloar pushed a commit to mloar/atproto that referenced this pull request Nov 15, 2023