Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix MAU reaping where reserved users are specified. #6168

Merged
merged 7 commits into from Oct 11, 2019

Conversation

@neilisfragile
Copy link
Contributor

neilisfragile commented Oct 4, 2019

The bug I am fixing is quite hard to explain, but in short, in some cases the reaping background task would delete too many users from the monthly_active_user table if reserved users had been specified.

@neilisfragile neilisfragile changed the title Fix mua reaping where reserved are specified. Fix mau reaping where reserved are specified. Oct 4, 2019
@neilisfragile neilisfragile requested a review from matrix-org/synapse-core Oct 4, 2019
for i in range(initial_users):
user = "@user%d:server" % i
email = "user%d@example.com" % i
self.store.upsert_monthly_active_user(user)

This comment has been minimized.

Copy link
@erikjohnston

erikjohnston Oct 8, 2019

Member
Suggested change
self.store.upsert_monthly_active_user(user)
self.get_success(self.store.upsert_monthly_active_user(user))

This comment has been minimized.

Copy link
@erikjohnston

erikjohnston Oct 8, 2019

Member

etc for other store functions

synapse/storage/monthly_active_users.py Outdated Show resolved Hide resolved
query_args = []
query_args.extend(self.reserved_users)
query_args.append(safe_guard)
query_args.extend(self.reserved_users)

This comment has been minimized.

Copy link
@erikjohnston

erikjohnston Oct 8, 2019

Member

You can write this as query_args = [*self.reserved_users, safe_guard, *self.reserved_users]


# Update reserved_users to ensure it stays in sync, this is important
# for reaping.
self.reserved_users = users

This comment has been minimized.

Copy link
@erikjohnston

erikjohnston Oct 8, 2019

Member

This feels wrong, as there's nothing to say that get_registered_reserved_users_count is going to get called to make this accurate. Can't we query the database to get the reserved_users when we need it? (And potentially cache it?)

This comment has been minimized.

Copy link
@neilisfragile

neilisfragile Oct 9, 2019

Author Contributor

Done, opted not to cache because method is called every hour and ought to be light.

else:
safe_guard = max_mau_value - len(self.reserved_users)
# Must be greater than zero for postgres
safe_guard = safe_guard if safe_guard > 0 else 0

This comment has been minimized.

Copy link
@erikjohnston

erikjohnston Oct 8, 2019

Member

Is this not just safe_guard = max(max_mau_value - len(self.reserved_users), 0). Also can we rename it to something like num_users_to_remove?

This comment has been minimized.

Copy link
@neilisfragile

neilisfragile Oct 9, 2019

Author Contributor

have renamed, but the sense is the opposite, because it is number of non reserved users to save.

@richvdh richvdh changed the title Fix mau reaping where reserved are specified. Fix MAU reaping where reserved users are specified. Oct 8, 2019
@neilisfragile neilisfragile requested a review from erikjohnston Oct 9, 2019
Copy link
Member

erikjohnston left a comment

Just a couple of formatting nits, otherwise good to go

",".join(questionmarks)
)
query_args.extend(reserved_users)
sql = base_sql + """ AND user_id NOT IN ({})""".format(question_marks)

This comment has been minimized.

Copy link
@erikjohnston

erikjohnston Oct 10, 2019

Member
Suggested change
sql = base_sql + """ AND user_id NOT IN ({})""".format(question_marks)
sql = base_sql + " AND user_id NOT IN ({})".format(question_marks)
ORDER BY timestamp DESC
LIMIT ?
)
AND user_id NOT IN ({})""".format(

This comment has been minimized.

Copy link
@erikjohnston

erikjohnston Oct 10, 2019

Member
Suggested change
AND user_id NOT IN ({})""".format(
AND user_id NOT IN ({})
""".format(
user_id = yield self.hs.get_datastore().get_user_id_by_threepid(
tp["medium"], tp["address"]
)
if user_id:
count = count + 1
return count
users = users + (user_id,)

This comment has been minimized.

Copy link
@erikjohnston

erikjohnston Oct 10, 2019

Member

It'd probably be easier to return a list rather than using tuples, really.

@neilisfragile neilisfragile merged commit a0d0ba7 into develop Oct 11, 2019
18 checks passed
18 checks passed
buildkite/synapse Build #4918 passed (22 minutes, 20 seconds)
Details
buildkite/synapse/check-sample-config Passed (2 minutes, 3 seconds)
Details
buildkite/synapse/check-style Passed (1 minute, 47 seconds)
Details
buildkite/synapse/isort Passed (19 seconds)
Details
buildkite/synapse/mypy Passed (58 seconds)
Details
buildkite/synapse/newspaper-newsfile Passed (44 seconds)
Details
buildkite/synapse/packaging Passed (1 minute, 17 seconds)
Details
buildkite/synapse/pipeline Passed (3 seconds)
Details
buildkite/synapse/python-3-dot-5-slash-postgres-9-dot-5 Passed (17 minutes, 59 seconds)
Details
buildkite/synapse/python-3-dot-5-slash-sqlite Passed (7 minutes, 18 seconds)
Details
buildkite/synapse/python-3-dot-5-slash-sqlite-slash-old-deps Passed (8 minutes, 20 seconds)
Details
buildkite/synapse/python-3-dot-6-slash-sqlite Passed (7 minutes, 21 seconds)
Details
buildkite/synapse/python-3-dot-7-slash-postgres-11 Passed (18 minutes, 10 seconds)
Details
buildkite/synapse/python-3-dot-7-slash-postgres-9-dot-5 Passed (17 minutes, 36 seconds)
Details
buildkite/synapse/python-3-dot-7-slash-sqlite Passed (7 minutes, 6 seconds)
Details
buildkite/synapse/sytest-python-3-dot-5-slash-postgres-9-dot-6-slash-monolith Passed (13 minutes, 58 seconds)
Details
buildkite/synapse/sytest-python-3-dot-5-slash-postgres-9-dot-6-slash-workers Passed (14 minutes, 17 seconds)
Details
buildkite/synapse/sytest-python-3-dot-5-slash-sqlite-slash-monolith Passed (13 minutes, 53 seconds)
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.