-
-
Notifications
You must be signed in to change notification settings - Fork 7.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
database: Add database index for private messages. #9753
Conversation
Hello @zulip/server-api, @zulip/server-production members, this pull request was labeled with the "area: api", "area: production" labels, so you may want to check it out! |
pinging @timabbott for review. It'd be great to have this reviewed and merged in order to start with adding another flag in #7459. |
@shubham-padia this looks good, except that I'm a bit worried that the migration to add the new flag might be really slow and need to be batched. The only real way to test that is to try it, so I'm going to test-deploy this to chat.zulip.org and see what happens. If it doesn't go well, we'll need to batch it (e.g. by user). |
OK, conclusion was that the migration's runtime was at least 2 minutes on chat.zulip.org, which is way too long for a synchronous migration. So we'll need to do it in batches, using |
Cool! Working on that. |
5f6c1d0
to
4a1cd48
Compare
@timabbott I've updated the PR with the requested changes |
apps: StateApps, schema_editor: DatabaseSchemaEditor) -> None: | ||
UserMessage = apps.get_model("zerver", "UserMessage") | ||
UserProfile = apps.get_model("zerver", "UserProfile") | ||
user_profiles = UserProfile.objects.all() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good overall, but one comment.
It's a simple optimization here to just fetch userids with .values('id')
. I know it's not the bottleneck here, but you're not buying much simplicity to pull in the fat user objects for all Nk users.
stream_objects = UserMessage.objects.filter(message__recipient__type = 2, | ||
user_profile = user_profile) | ||
private_objects.update(flags=F('flags').bitor(UserMessage.flags.is_private)) | ||
stream_objects.update(flags=F('flags').bitand(~UserMessage.flags.is_private)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if it would be cheaper to do it like this:
- clear all flags where is_private = True (probably very few, since the old flag was rare)
- set all flags for private_objects
This will result in fewer updates and allow you to replace the second query that has to join to Recipient via Message with a simpler query.
I added a couple comments, both of which are easy-to-implement performance optimizations. |
@showell Thanks for the review, I've updated the PR :) |
Thanks @shubham-padia! Your changes address my comments, and overall LGTM, but I'll wait for Tim to merge this, for fairly obvious reasons. 😄 |
This seems to be really slow; we end up doing a table scan of |
Why is it table scanning Message when message_id is part of the join? Also aren’t there more PM recipients than users? I suppose we could force all flags to 1 then invert them for UMs attached to stream recipients. I suspect this might just be inherently slow and we should strategize to make the slowness mostly transparent to users? |
I'm currently testing this as the main migration block:
It also seems fairly slow; this will end up being a fairly expensive migration to execute :/. |
(Regardless, I think that this branch is missing a commit to make the |
Since #6896 only mentioned adding the index, I thought making the narrow use the index was a follow-up issue, I'll add the commit in this PR then. |
The reason it's table-scanning Message is because even though we have an index on |
Why isn’t it starting with UserMessage rows that match the user_profile_id index and then finding the Message in o(1) and filtering on recipient id? |
The query plan is below:
|
seems to do considerably better. |
zerver/tests/test_messages.py
Outdated
self.send_personal_message(self.example_email("hamlet"), user_profile.email, | ||
content="test") | ||
message = most_recent_message(user_profile) | ||
assert(UserMessage.objects.get(user_profile=user_profile, message=message).flags.is_private.is_set) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.assertTrue
is the right way to write this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(In general, one doesn't use the assert
keyword in unit tests; it produces less nice output than the self.assert*
methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, I'll push the change with the other changes and will take care of this is future :)
I tweaked the first commit to add a variable I think we need to also add a check in the
(which I bet 500s with an invalid flag name) so we'll need to add something new with some tests to give a nice "Invalid flag 'foo'" type error message. |
d1999a4
to
100746a
Compare
@timabbott This is ready for another review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a few of the changes mentioned here, and pushed the branch back. Can you re-test the migration and fix the tests?
zerver/lib/actions.py
Outdated
# `flags` is a list. Do not use `flags` here by mistake. | ||
flagattr = getattr(UserMessage.flags, flag) | ||
else: | ||
raise JsonableError(_("Invalid flag: %s" % (flag,))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should do an early-exit with JsonableError
. Also, I renamed flags
to valid_flags
so we don't need the comment.
UserMessage = apps.get_model("zerver", "UserMessage") | ||
Message = apps.get_model("zerver", "Message") | ||
i = 0 | ||
total = Message.objects.count() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This total
approach is buggy, because if messages are being sent while this runs, the total we queried before is wrong. At least if we're using atomic=False
which we should for this expensive migration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Using the looping flow from migration 0177 is robust against this sort of thing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(we need to adapt it a bit for the fact the query is sparse, though; see the code I just pushed)
zerver/models.py
Outdated
# Zulip backend, and don't make sense to expose to the API. A | ||
# good example is is_private, which is just a denormalization of | ||
# message.recipient_type for database query performance. | ||
NON_API_FLAGS = {"is_private"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a commit that also puts active_mobile_push_notification
in this list.
zerver/tests/test_events.py
Outdated
|
||
with self.assertRaises(JsonableError): | ||
do_update_message_flags(user_profile, get_client("website"), 'remove', 'invalid', [message]) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_events.py
is really just for event system races; we should move your test changes. git grep messages/flags
shows that test_messages.py
is probably where this belongs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@timabbott Can you have a look at test_messages.py
, I'm not sure under which test case class should this fall?
(When adding the test initially I had a look at test_messages.py
, but I didn't find any relevant test case classes.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add a new test function in MessageAccessTests
, next to the starring tests.
zerver/tests/test_messages.py
Outdated
"test") | ||
|
||
for msg in self.get_messages(): | ||
self.assertTrue('is_private' not in msg['flags']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a self.assertNotIn
.
10ef506
to
0aa5580
Compare
I test-deployed this on chat.zulip.org, and the index works and seems to fix the original performance problem, which is great! So just a bit more cleanup and we can merge this. |
I guess one of the more important pieces of cleanup is we should have the migration abort early if there are no messages in the database (the CI failure):
|
5750976
to
f3f97ad
Compare
See the comment for why this is correct; basically, this flag is used only for internal accounting, and would only confuse API clients.
The reasoning here is similar to `is_private`; this flag is only used for internal accounting inside the Zulip server.
Raise error if flag is present in NON_API_FLAGS or is not present in UserMessage.flags.
@timabbott This is ready for another review. |
do_update_message_flags(user_profile, get_client("website"), 'remove', first_non_api_flag, [message]) | ||
|
||
with self.assertRaises(JsonableError): | ||
do_update_message_flags(user_profile, get_client("website"), 'remove', 'invalid', [message]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In our tests, we usually prefer API interaction to calling raw actions.py
methods where possible. I'll just redo this that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
)
- user_profile = self.example_user('hamlet')
- for first_non_api_flag in UserMessage.NON_API_FLAGS:
- break
- with self.assertRaises(JsonableError):
- do_update_message_flags(user_profile, get_client("website"), 'remove', first_non_api_flag, [message])
+ self.login(self.example_email("hamlet"))
+ result = self.client_post("/json/messages/flags",
+ {"messages": ujson.dumps([message]),
+ "op": "add",
+ "flag": "invalid"})
+ self.assert_json_error(result, "Invalid flag: 'invalid'")
- with self.assertRaises(JsonableError):
- do_update_message_flags(user_profile, get_client("website"), 'remove', 'invalid', [message])
+ result = self.client_post("/json/messages/flags",
+ {"messages": ujson.dumps([message]),
+ "op": "add",
+ "flag": "is_private"})
+ self.assert_json_error(result, "Invalid flag: 'is_private'")
+
+ result = self.client_post("/json/messages/flags",
+ {"messages": ujson.dumps([message]),
+ "op": "add",
+ "flag": "active_mobile_push_notification"})
+ self.assert_json_error(result, "Invalid flag: 'active_mobile_push_notification'")
def change_star(self, messages: List[int], add: bool=True, **kwargs: Any) -> HttpResponse:
Nice, I changed the one thing I mentioned above, and merged this as the series of commits ending with bdaff17 (note that I also moved the more extensive commit message and closes to the last commit). Thanks @shubham-padia!! |
Fixes #6896.
For commit 1:
For
set_initial_value_of_is_private_flag
, I tried out two approaches for the migration.The first was :
And the second one was the currently implemented one.
Although I suspected that the second approach maybe faster, I checked the execution time for both the approaches. The first approach took about ~1.44s for my db and second one about ~0.066s. So I chose to go with the second approach.
For (hah, don't think I can call this testing) testing the migration, I ran
similarly for
private_objects
.For commit 2:
Other commits like 8bb812c also added the sql to the changelog, since there was no changelog file rn, I haven't added it yet.
I've also added the migration to
upgrade-zulip-stage-2
script. From what the script stated, this is used to upgrade to a newer version of Zulip and not any specific version, so I added the migration to it like the other commits.