Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update aggregate user repopulate to use the past two weeks #2069

Merged
merged 15 commits into from
Nov 19, 2021

Conversation

isaacsolo
Copy link
Contributor

@isaacsolo isaacsolo commented Nov 15, 2021

Description

Running a full refresh with a truncate blocks other queries on aggregate_user. Since there are several dependencies on aggregate_user, we cannot simply do a drop and replace on that. We'd have to drop and recreate those dependencies too.

This change reduces the frequency of refreshes to about once a day and only refreshes user data from the past two weeks, reducing the frequency and duration of potential blocking.

Tests

Run locally against prod snapshot.
Verified updates in different states: refresh past two weeks, full recreation when cache is empty, regular updates.

How will this change be monitored?

@isaacsolo isaacsolo marked this pull request as ready for review November 15, 2021 19:18
most_recent_indexed_aggregate_block = 0
session.execute("TRUNCATE TABLE {}".format(table_name))
elif latest_indexed_block_num % REFRESH_COUNTER == 0:
Copy link
Contributor Author

@isaacsolo isaacsolo Nov 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not liking this random behavior anymore. Gonna change this to use a last completed timestamp saved in redis, then update based on that.

@isaacsolo isaacsolo changed the title Update aggregate user re-create to the past two weeks Update aggregate user repopulate to use the past two weeks Nov 16, 2021
Copy link
Contributor

@dmanjunath dmanjunath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one question about a potential bug but otherwise looks good!

most_recent_indexed_aggregate_block = 0
session.execute("TRUNCATE TABLE {}".format(table_name))
redis.set(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should move these redis.set calls after the session.execute. otherwise i think this could cause some inconsistency? like to avoid the case where a condition like variable gets set in redis but the worker gets restarted before completing so the db isn't actually updated. we should move it near the other redis.set after the execute

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea moved it

@isaacsolo isaacsolo merged commit 5ebf1c8 into master Nov 19, 2021
@isaacsolo isaacsolo deleted the is-agg-user-truncate branch November 19, 2021 20:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants