database: Add database index for private messages. #9753

shubham-padia · 2018-06-14T16:58:49Z

For commit 1:
For set_initial_value_of_is_private_flag, I tried out two approaches for the migration.

The first was:

    UserMessage = apps.get_model("zerver", "UserMessage")
    all_objects = UserMessage.objects.all()

    for obj in all_objects:
        recipient_type = obj.message.recipient.type
        if recipient_type == 1 or recipient_type == 3:
            obj.flags |= UserMessage.flags.is_private
        else:
            obj.flags &= ~UserMessage.flags.is_private
        obj.save(update_fields=['flags'])

And the second one was the currently implemented one.

Although I suspected that the second approach would be faster, I checked the execution time for both approaches. The first took about 1.44s on my db and the second about 0.066s, so I chose to go with the second approach.
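To make the difference between the two approaches concrete, here is a minimal sketch that uses Python's `sqlite3` module instead of Django and Zulip's real schema. The `usermessage` table, its denormalized `recipient_type` column, and the helper names are assumptions made for illustration; the bit value 2048 is the one that appears in the query plan quoted later in this thread. It only shows that a per-row loop and a set-based UPDATE reach the same end state, with the latter issuing two statements instead of one per row.

```python
import sqlite3

IS_PRIVATE = 2048  # bit value as seen in the query plan quoted in this thread


def make_db() -> sqlite3.Connection:
    """Build a toy table; recipient_type is denormalized here for brevity."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE usermessage "
        "(id INTEGER PRIMARY KEY, recipient_type INTEGER, flags INTEGER)"
    )
    # Recipient types cycle through 2, 3, 1; every row starts with flags = 1.
    rows = [(i, (i % 3) + 1, 1) for i in range(1, 1001)]
    conn.executemany("INSERT INTO usermessage VALUES (?, ?, ?)", rows)
    return conn


def update_per_row(conn: sqlite3.Connection) -> None:
    """First approach: read every row, flip the bit in Python, write it back."""
    rows = conn.execute(
        "SELECT id, recipient_type, flags FROM usermessage"
    ).fetchall()
    for row_id, recipient_type, flags in rows:
        if recipient_type in (1, 3):
            flags |= IS_PRIVATE
        else:
            flags &= ~IS_PRIVATE
        conn.execute(
            "UPDATE usermessage SET flags = ? WHERE id = ?", (flags, row_id)
        )


def update_in_bulk(conn: sqlite3.Connection) -> None:
    """Second approach: let the database apply the bitwise change per set."""
    conn.execute(
        "UPDATE usermessage SET flags = flags | ? WHERE recipient_type IN (1, 3)",
        (IS_PRIVATE,),
    )
    conn.execute(
        "UPDATE usermessage SET flags = flags & ~? WHERE recipient_type = 2",
        (IS_PRIVATE,),
    )


def snapshot(conn: sqlite3.Connection) -> list:
    """Return (id, flags) pairs in a stable order for comparison."""
    return conn.execute("SELECT id, flags FROM usermessage ORDER BY id").fetchall()
```

Running both functions on separate copies of the toy database and comparing `snapshot()` results is a quick way to check that the bulk form is a faithful replacement before timing it.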

For testing the migration (hah, don't think I can call this testing), I ran:

    for obj in stream_objects:
        if obj.message.recipient.type != 2 or obj.flags.is_private:
            print('fail')

and did the same for private_objects.

For commit 2:
Other commits like 8bb812c also added the SQL to the changelog; since there is no changelog file right now, I haven't added it yet.
I've also added the migration to upgrade-zulip-stage-2 script. From what the script stated, this is used to upgrade to a newer version of Zulip and not any specific version, so I added the migration to it like the other commits.

zulipbot · 2018-06-14T16:58:51Z

Hello @zulip/server-api, @zulip/server-production members, this pull request was labeled with the "area: api", "area: production" labels, so you may want to check it out!

shubham-padia · 2018-06-19T13:18:38Z

pinging @timabbott for review. It'd be great to have this reviewed and merged in order to start with adding another flag in #7459.

timabbott · 2018-06-19T17:57:18Z

@shubham-padia this looks good, except that I'm a bit worried that the migration to add the new flag might be really slow and need to be batched. The only real way to test that is to try it, so I'm going to test-deploy this to chat.zulip.org and see what happens. If it doesn't go well, we'll need to batch it (e.g. by user).

timabbott · 2018-06-19T18:02:33Z

OK, conclusion was that the migration's runtime was at least 2 minutes on chat.zulip.org, which is way too long for a synchronous migration. So we'll need to do it in batches, using atomic=False, and probably looping over the users (one transaction per user).
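As a rough illustration of what "one transaction per user" means, the sketch below uses `sqlite3` with a toy `usermessage` table; the schema and function names are assumptions and not Zulip's real code. In a Django migration the equivalent would presumably be `atomic = False` on the migration class plus a `transaction.atomic()` block per user, so that each commit is short and other writers are not blocked for the whole run.

```python
import sqlite3

IS_PRIVATE = 2048  # illustrative bit value, matching the query plan in this thread


def make_db() -> sqlite3.Connection:
    """Toy table with recipient_type denormalized onto each row (an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE usermessage (id INTEGER PRIMARY KEY, user_id INTEGER, "
        "recipient_type INTEGER, flags INTEGER)"
    )
    rows = [
        (1, 10, 1, 0),  # private message row for user 10
        (2, 10, 2, 0),  # stream message row for user 10
        (3, 11, 3, 0),  # group private message row for user 11
        (4, 12, 2, 0),  # stream message row for user 12
    ]
    conn.executemany("INSERT INTO usermessage VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def set_flag_per_user(conn: sqlite3.Connection) -> list:
    """Update one user's rows per transaction and return the users processed."""
    user_ids = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT user_id FROM usermessage ORDER BY user_id"
        )
    ]
    for user_id in user_ids:
        # `with conn` commits on success and rolls back on an exception,
        # so a failure affects only the current user's batch.
        with conn:
            conn.execute(
                "UPDATE usermessage SET flags = flags | ? "
                "WHERE user_id = ? AND recipient_type IN (1, 3)",
                (IS_PRIVATE, user_id),
            )
    return user_ids
```

Because each batch commits independently, an interrupted run leaves earlier users updated; re-running is safe here since OR-ing the same bit twice is idempotent.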

shubham-padia · 2018-06-19T18:38:03Z

Cool! Working on that.

shubham-padia · 2018-06-19T20:33:54Z

@timabbott I've updated the PR with the requested changes

showell · 2018-06-20T13:46:53Z

zerver/migrations/0173_user_message_add_is_private_flag.py

+        apps: StateApps, schema_editor: DatabaseSchemaEditor) -> None:
+    UserMessage = apps.get_model("zerver", "UserMessage")
+    UserProfile = apps.get_model("zerver", "UserProfile")
+    user_profiles = UserProfile.objects.all()


Code looks good overall, but one comment.

It's a simple optimization here to just fetch userids with .values('id'). I know it's not the bottleneck here, but you're not buying much simplicity to pull in the fat user objects for all Nk users.

showell · 2018-06-20T13:54:10Z

zerver/migrations/0173_user_message_add_is_private_flag.py

+            stream_objects = UserMessage.objects.filter(message__recipient__type = 2,
+                                                        user_profile = user_profile)
+            private_objects.update(flags=F('flags').bitor(UserMessage.flags.is_private))
+            stream_objects.update(flags=F('flags').bitand(~UserMessage.flags.is_private))


I'm wondering if it would be cheaper to do it like this:

clear all flags where is_private = True (probably very few, since the old flag was rare)

set all flags for private_objects

This will result in fewer updates and allow you to replace the second query that has to join to Recipient via Message with a simpler query.
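The two-step idea can be sketched as follows, again with `sqlite3` and a toy table rather than Zulip's schema (the table layout, the denormalized `recipient_type` column, and the function name are assumptions). Step one clears the bit only on rows where it is already set, which needs no join at all; step two sets it on the private rows.

```python
import sqlite3

IS_PRIVATE = 2048  # illustrative bit value, matching the query plan in this thread


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE usermessage "
        "(id INTEGER PRIMARY KEY, recipient_type INTEGER, flags INTEGER)"
    )
    rows = [
        (1, 2, 1),               # stream row, bit not set
        (2, 2, 1 | IS_PRIVATE),  # stream row with a stale bit from the old flag
        (3, 1, 0),               # private row, bit not set yet
        (4, 3, IS_PRIVATE),      # group private row, bit already set
        (5, 2, 0),               # stream row, bit not set
    ]
    conn.executemany("INSERT INTO usermessage VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def clear_then_set(conn: sqlite3.Connection) -> tuple:
    """Return (rows cleared, rows set) for the two-step update."""
    with conn:
        # Step 1: touch only rows that currently carry the bit.
        cleared = conn.execute(
            "UPDATE usermessage SET flags = flags & ~? WHERE (flags & ?) != 0",
            (IS_PRIVATE, IS_PRIVATE),
        ).rowcount
        # Step 2: set the bit on private (type 1) and group private (type 3) rows.
        marked = conn.execute(
            "UPDATE usermessage SET flags = flags | ? WHERE recipient_type IN (1, 3)",
            (IS_PRIVATE,),
        ).rowcount
    return cleared, marked
```

Stream rows that never had the bit (rows 1 and 5 in the toy data) are not written at all, which is where the saving comes from when the old flag was rare.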

showell · 2018-06-20T13:57:38Z

I added a couple comments, both of which are easy-to-implement performance optimizations.

shubham-padia · 2018-06-20T15:06:43Z

@showell Thanks for the review, I've updated the PR :)

showell · 2018-06-20T16:13:15Z

Thanks @shubham-padia! Your changes address my comments, and overall LGTM, but I'll wait for Tim to merge this, for fairly obvious reasons. 😄

timabbott · 2018-06-23T20:13:03Z

This seems to be really slow; we end up doing a table scan of zerver_message for each user. I think we want to do it one-recipient-at-a-time instead, since that avoids the table scan issue.

showell · 2018-06-23T20:21:56Z

Why is it table scanning Message when message_id is part of the join? Also aren’t there more PM recipients than users? I suppose we could force all flags to 1 then invert them for UMs attached to stream recipients. I suspect this might just be inherently slow and we should strategize to make the slowness mostly transparent to users?

timabbott · 2018-06-23T20:22:36Z

I'm currently testing this as the main migration block:

    Recipient = apps.get_model("zerver", "Recipient")
    recipient_ids = Recipient.objects.filter(type__in=[1,3]).values_list("id", flat=True)

    total = len(recipient_ids)
    i = 0
    for recipient_id in recipient_ids:
        i += 1
        print("Processing %s %s/%s" % (recipient_id, i, total))
        sys.stdout.flush()
        with transaction.atomic():
            private_objects = UserMessage.objects.filter(message__recipient_id = recipient_id)
            private_objects.update(flags=F('flags').bitor(UserMessage.flags.is_private))

It also seems fairly slow; this will end up being a fairly expensive migration to execute :/.

timabbott · 2018-06-23T20:23:00Z

(Regardless, I think that this branch is missing a commit to make the is:private narrow actually use the new index by doing a query on UserMessage?).

shubham-padia · 2018-06-23T20:26:03Z

Since #6896 only mentioned adding the index, I thought making the narrow use the index was a follow-up issue. I'll add the commit in this PR then.

timabbott · 2018-06-23T20:26:58Z

The reason it's table-scanning Message is because even though we have an index on Message__recipient_id, when the list of recipient IDs is a few thousand long (as it is), there isn't an efficient way for Postgres to filter Message for the matching rows using the index, so instead it table scans and then filters.

showell · 2018-06-23T20:30:18Z

Why isn’t it starting with UserMessage rows that match the user_profile_id index and then finding the Message in O(1) and filtering on recipient id?

timabbott · 2018-06-23T20:40:05Z

The query plan is below:

zulip=> explain analyze UPDATE "zerver_usermessage" SET "flags" = ("zerver_usermessage"."flags" | 2048) WHERE "zerver_usermessage"."id" IN (SELECT V0."id" AS Col1 FROM "zerver_usermessage" V0 INNER JOIN "zerver_message" V2 ON (V0."message_id" = V2."id") WHERE (V0."user_profile_id" = 79 AND V2."recipient_id" IN (SELECT U0."id" AS Col1 FROM "zerver_recipient" U0 WHERE U0."type" IN (1, 3))));

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on zerver_usermessage  (cost=203478.70..403933.86 rows=23562 width=44) (actual time=1261.376..1261.376 rows=0 loops=1)
   ->  Nested Loop  (cost=203478.70..403933.86 rows=23562 width=44) (actual time=1260.864..1261.109 rows=10 loops=1)
         ->  HashAggregate  (cost=203478.13..203713.75 rows=23562 width=22) (actual time=1260.806..1260.855 rows=10 loops=1)
               ->  Hash Join  (cost=146558.77..203419.23 rows=23562 width=22) (actual time=811.029..1260.782 rows=10 loops=1)
                     Hash Cond: (v2.recipient_id = u0.id)
                     ->  Hash Join  (cost=146274.47..202581.59 rows=23562 width=20) (actual time=803.195..1238.252 rows=64840 loops=1)
                           Hash Cond: (v0.message_id = v2.id)
                           ->  Index Scan using zerver_usermessage_06037614 on zerver_usermessage v0  (cost=0.57..52390.90 rows=23562 width=14) (actual time=0.071..288.141 rows=64840 loops=1)
                                 Index Cond: (user_profile_id = 79)
                           ->  Hash  (cost=134730.40..134730.40 rows=664040 width=14) (actual time=802.696..802.696 rows=602819 loops=1)
                                 Buckets: 65536  Batches: 2  Memory Usage: 14156kB
                                 ->  Seq Scan on zerver_message v2  (cost=0.00..134730.40 rows=664040 width=14) (actual time=0.046..590.753 rows=602819 loops=1)
                     ->  Hash  (cost=171.69..171.69 rows=9009 width=10) (actual time=5.239..5.239 rows=8913 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 383kB
                           ->  Seq Scan on zerver_recipient u0  (cost=0.00..171.69 rows=9009 width=10) (actual time=0.017..2.796 rows=8913 loops=1)
                                 Filter: (type = ANY ('{1,3}'::integer[]))
                                 Rows Removed by Filter: 248
         ->  Index Scan using zerver_usermessage_pkey on zerver_usermessage  (cost=0.57..8.49 rows=1 width=26) (actual time=0.021..0.022 rows=1 loops=10)
               Index Cond: (id = v0.id)
 Total runtime: 1261.809 ms
(20 rows)

timabbott · 2018-06-23T20:47:16Z

    UserProfile = apps.get_model("zerver", "UserProfile")
    user_profile_ids = UserProfile.objects.all().order_by("id").values_list("id", flat=True)
    Recipient = apps.get_model("zerver", "Recipient")
    Subscription = apps.get_model("zerver", "Subscription")

    total = len(user_profile_ids)
    for user_id in user_profile_ids:
        print(user_id, total)
        sys.stdout.flush()
        recipient_ids = Subscription.objects.filter(
            recipient__type__in=[1,3], user_profile_id=user_id).values_list("recipient_id", flat=True)
        # recipient_ids = Recipient.objects.filter(type__in=[1,3]).values_list("id", flat=True)
        with transaction.atomic():
            # We only need to do this because a previous migration didn't clean the field.
            private_objects = UserMessage.objects.filter(user_profile__id = user_id,
                                                         message__recipient_id__in=recipient_ids)
            private_objects.update(flags=F('flags').bitor(UserMessage.flags.is_private))
        print("Processed %s/%s" % (user_id, total))

seems to do considerably better.
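The shape of that per-user query can be illustrated outside Django with a small `sqlite3` sketch. The four toy tables, their columns, and the function names below are assumptions that only loosely mirror Zulip's models. The point is that each user's private recipient IDs come from a short, user-specific list (their subscriptions) instead of the list of every private recipient on the server.

```python
import sqlite3

IS_PRIVATE = 2048  # illustrative bit value, matching the query plan in this thread

SCHEMA = """
CREATE TABLE recipient (id INTEGER PRIMARY KEY, type INTEGER);
CREATE TABLE subscription (user_profile_id INTEGER, recipient_id INTEGER);
CREATE TABLE message (id INTEGER PRIMARY KEY, recipient_id INTEGER);
CREATE TABLE usermessage (id INTEGER PRIMARY KEY, user_profile_id INTEGER,
                          message_id INTEGER, flags INTEGER DEFAULT 0);
"""


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO recipient VALUES (?, ?)", [(1, 1), (2, 2), (3, 3)])
    conn.executemany(
        "INSERT INTO subscription VALUES (?, ?)",
        [(10, 1), (10, 2), (10, 3), (11, 2), (11, 3)],
    )
    conn.executemany("INSERT INTO message VALUES (?, ?)", [(1, 1), (2, 2), (3, 3)])
    conn.executemany(
        "INSERT INTO usermessage (id, user_profile_id, message_id) VALUES (?, ?, ?)",
        [(1, 10, 1), (2, 10, 2), (3, 10, 3), (4, 11, 2), (5, 11, 3)],
    )
    conn.commit()
    return conn


def private_recipient_ids(conn: sqlite3.Connection, user_id: int) -> list:
    """Recipient IDs of type 1 or 3 that this user is subscribed to."""
    rows = conn.execute(
        "SELECT s.recipient_id FROM subscription s "
        "JOIN recipient r ON r.id = s.recipient_id "
        "WHERE s.user_profile_id = ? AND r.type IN (1, 3) "
        "ORDER BY s.recipient_id",
        (user_id,),
    )
    return [row[0] for row in rows]


def flag_private_for_user(conn: sqlite3.Connection, user_id: int) -> int:
    """Set the bit on this user's rows for their private recipients only."""
    ids = private_recipient_ids(conn, user_id)
    if not ids:
        return 0
    placeholders = ",".join("?" * len(ids))
    with conn:  # one transaction per user
        cur = conn.execute(
            "UPDATE usermessage SET flags = flags | ? "
            "WHERE user_profile_id = ? AND message_id IN "
            f"(SELECT id FROM message WHERE recipient_id IN ({placeholders}))",
            [IS_PRIVATE, user_id, *ids],
        )
    return cur.rowcount
```

Whether a subscription-derived list covers every private message a user has received depends on details of the real data model that this sketch does not attempt to capture.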

timabbott · 2018-08-08T17:11:15Z

zerver/tests/test_messages.py

+        self.send_personal_message(self.example_email("hamlet"), user_profile.email,
+                                   content="test")
+        message = most_recent_message(user_profile)
+        assert(UserMessage.objects.get(user_profile=user_profile, message=message).flags.is_private.is_set)


self.assertTrue is the right way to write this.

(In general, one doesn't use the assert keyword in unit tests; it produces less informative output than the self.assert* methods.)

Cool, I'll push the change with the other changes and will take care of this in future :)
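For readers who have not seen the difference in output, here is a small standalone sketch using only the standard `unittest` module. The test case and its `is_set` attribute are hypothetical; both tests fail on purpose so that the failure text can be compared.

```python
import unittest


def collect_failures() -> dict:
    """Run two deliberately failing tests and return their failure text by name."""

    class FlagChecks(unittest.TestCase):
        is_set = False  # hypothetical stand-in for a flag lookup

        def test_bare_assert(self) -> None:
            assert self.is_set

        def test_assert_true(self) -> None:
            self.assertTrue(self.is_set)

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FlagChecks)
    result = unittest.TestResult()
    suite.run(result)
    # result.failures holds (test, formatted traceback) pairs.
    return {test.id().rsplit(".", 1)[-1]: text for test, text in result.failures}


if __name__ == "__main__":
    for name, text in sorted(collect_failures().items()):
        print(name, "->", text.strip().splitlines()[-1])
```

The `assertTrue` failure ends with a message such as `False is not true`, while the bare `assert` carries no message of its own. Bare `assert` statements are also skipped entirely when Python runs with `-O`.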

timabbott · 2018-08-08T17:26:18Z

I tweaked the first commit to add a variable NON_API_FLAGS and pushed back to this PR.

I think we need to also add a check in the update_message_flags view to block client interaction with flags in the NON_API_FLAGS list; it looks like we don't currently have validation of the flag validity beyond this:

    flagattr = getattr(UserMessage.flags, flag)

(which I bet 500s with an invalid flag name) so we'll need to add something new with some tests to give a nice "Invalid flag 'foo'" type error message.

shubham-padia · 2018-08-09T20:29:39Z

@timabbott This is ready for another review.

timabbott

I made a few of the changes mentioned here, and pushed the branch back. Can you re-test the migration and fix the tests?

timabbott · 2018-08-09T21:07:43Z

zerver/lib/actions.py

+        # `flags` is a list. Do not use `flags` here by mistake.
+        flagattr = getattr(UserMessage.flags, flag)
+    else:
+        raise JsonableError(_("Invalid flag: %s" % (flag,)))


We should do an early-exit with JsonableError. Also, I renamed flags to valid_flags so we don't need the comment.
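A minimal sketch of the early-exit shape, under stated assumptions: `JsonableError` below is a stand-in exception and not Zulip's class, `ALL_FLAGS` is an illustrative subset of flag names, and `validate_flag` is a hypothetical helper. The error text follows the "Invalid flag: 'foo'" format discussed in this thread.

```python
class JsonableError(Exception):
    """Stand-in for Zulip's JsonableError (hypothetical minimal version)."""


# Illustrative subset of flag names; the real list lives on UserMessage.
ALL_FLAGS = ["read", "starred", "is_private", "active_mobile_push_notification"]

# Flags used only for internal accounting, never settable through the API.
NON_API_FLAGS = {"is_private", "active_mobile_push_notification"}


def validate_flag(flag: str) -> str:
    """Reject unknown or internal-only flags before doing any other work."""
    valid_flags = [name for name in ALL_FLAGS if name not in NON_API_FLAGS]
    if flag not in valid_flags:
        # Early exit: nothing below this point runs for a bad flag.
        raise JsonableError("Invalid flag: '%s'" % (flag,))
    return flag
```

Checking membership first means the later attribute lookup on the flags object only ever sees names known to be valid, so a typo in a client request turns into a clean error instead of an unhandled exception.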

timabbott · 2018-08-09T21:08:19Z

zerver/migrations/0182_set_initial_value_is_private_flag.py

+    UserMessage = apps.get_model("zerver", "UserMessage")
+    Message = apps.get_model("zerver", "Message")
+    i = 0
+    total = Message.objects.count()


This total approach is buggy: if messages are being sent while this runs, the total we queried before is wrong. That is at least the case if we're using atomic=False, which we should for this expensive migration.

(Using the looping flow from migration 0177 is robust against this sort of thing)

(we need to adapt it a bit for the fact the query is sparse, though; see the code I just pushed)
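The sketch below illustrates the general idea of looping over ID ranges without trusting a total computed up front. It is not the code from migration 0177 or from this branch; the `sqlite3` schema, the batch size, and the function names are assumptions. An empty batch is not treated as the end of the data, because IDs can be sparse, and the newest ID is re-read on every pass so that rows added during the run are still covered.

```python
import sqlite3
from typing import Optional

IS_PRIVATE = 2048  # illustrative bit value, matching the query plan in this thread


def make_db(message_ids: list) -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE message (id INTEGER PRIMARY KEY, recipient_type INTEGER);"
        "CREATE TABLE usermessage (id INTEGER PRIMARY KEY, message_id INTEGER,"
        " flags INTEGER DEFAULT 0);"
    )
    for message_id in message_ids:
        # Toy rule: odd IDs are private (type 1), even IDs are stream (type 2).
        recipient_type = 1 if message_id % 2 else 2
        conn.execute("INSERT INTO message VALUES (?, ?)", (message_id, recipient_type))
        conn.execute("INSERT INTO usermessage (message_id) VALUES (?)", (message_id,))
    conn.commit()
    return conn


def newest_message_id(conn: sqlite3.Connection) -> Optional[int]:
    """Return the largest message ID, or None when the table is empty."""
    return conn.execute("SELECT MAX(id) FROM message").fetchone()[0]


def set_flag_by_id_range(conn: sqlite3.Connection, batch_size: int = 3) -> int:
    """Process (lower, upper] ID windows; return how many batches ran."""
    lower = 0
    batches = 0
    while True:
        upper = lower + batch_size
        with conn:  # one short transaction per window
            conn.execute(
                "UPDATE usermessage SET flags = flags | ? "
                "WHERE message_id > ? AND message_id <= ? AND message_id IN "
                "(SELECT id FROM message WHERE recipient_type IN (1, 3))",
                (IS_PRIVATE, lower, upper),
            )
        batches += 1
        # Re-read the newest ID each time instead of relying on an old count.
        newest = newest_message_id(conn)
        if newest is None or upper >= newest:
            return batches
        lower = upper
```

With message IDs 1, 2, 8, and 9 and a batch size of 3, the middle window matches nothing, yet the loop continues until the window's upper bound reaches the newest ID.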

timabbott · 2018-08-09T21:13:04Z

zerver/models.py

+    # Zulip backend, and don't make sense to expose to the API.  A
+    # good example is is_private, which is just a denormalization of
+    # message.recipient_type for database query performance.
+    NON_API_FLAGS = {"is_private"}


I added a commit that also puts active_mobile_push_notification in this list.

timabbott · 2018-08-09T21:13:53Z

zerver/tests/test_events.py

+
+        with self.assertRaises(JsonableError):
+            do_update_message_flags(user_profile, get_client("website"), 'remove', 'invalid', [message])
+


test_events.py is really just for event system races; we should move your test changes. git grep messages/flags shows that test_messages.py is probably where this belongs.

@timabbott Can you have a look at test_messages.py? I'm not sure which test case class this should fall under.
(When adding the test initially I had a look at test_messages.py, but I didn't find any relevant test case classes.)

I'd add a new test function in MessageAccessTests, next to the starring tests.

timabbott · 2018-08-09T21:14:08Z

zerver/tests/test_messages.py

+                                   "test")
+
+        for msg in self.get_messages():
+            self.assertTrue('is_private' not in msg['flags'])


I think there's a self.assertNotIn.
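`assertNotIn` is indeed part of the standard `unittest.TestCase` API. The short sketch below, with a hypothetical flags list and two deliberately failing tests, shows how its failure text differs from wrapping `not in` inside `assertTrue`.

```python
import unittest


def collect_failures() -> dict:
    """Run two deliberately failing tests and return their failure text by name."""

    class FlagListChecks(unittest.TestCase):
        flags = ["read", "is_private"]  # hypothetical message flags

        def test_assert_true(self) -> None:
            self.assertTrue("is_private" not in self.flags)

        def test_assert_not_in(self) -> None:
            self.assertNotIn("is_private", self.flags)

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FlagListChecks)
    result = unittest.TestResult()
    suite.run(result)
    return {test.id().rsplit(".", 1)[-1]: text for test, text in result.failures}


if __name__ == "__main__":
    for name, text in sorted(collect_failures().items()):
        print(name, "->", text.strip().splitlines()[-1])
```

The `assertNotIn` failure names both the offending item and the container it was found in, which is usually enough to diagnose the problem without rerunning the test.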

timabbott · 2018-08-09T21:45:36Z

I test-deployed this on chat.zulip.org, and the index works and seems to fix the original performance problem, which is great! So just a bit more cleanup and we can merge this.

timabbott · 2018-08-09T21:46:22Z

I guess one of the more important pieces of cleanup is we should have the migration abort early if there are no messages in the database (the CI failure):

Aug 09 21:31:59   File "/srv/zulip-venv-cache/c5ebf65c6930c83dc2e5ef4011cf534e70de3346/zulip-py3-venv/lib/python3.6/site-packages/django/core/management/__init__.py", line 364, in execute_from_command_line
Aug 09 21:31:59     utility.execute()
Aug 09 21:31:59   File "/srv/zulip-venv-cache/c5ebf65c6930c83dc2e5ef4011cf534e70de3346/zulip-py3-venv/lib/python3.6/site-packages/django/core/management/__init__.py", line 356, in execute
Aug 09 21:31:59     self.fetch_command(subcommand).run_from_argv(self.argv)
Aug 09 21:31:59   File "/srv/zulip-venv-cache/c5ebf65c6930c83dc2e5ef4011cf534e70de3346/zulip-py3-venv/lib/python3.6/site-packages/django/core/management/base.py", line 283, in run_from_argv
Aug 09 21:31:59     self.execute(*args, **cmd_options)
Aug 09 21:31:59   File "/srv/zulip-venv-cache/c5ebf65c6930c83dc2e5ef4011cf534e70de3346/zulip-py3-venv/lib/python3.6/site-packages/django/core/management/base.py", line 330, in execute
Aug 09 21:31:59     output = self.handle(*args, **options)
Aug 09 21:31:59   File "/srv/zulip-venv-cache/c5ebf65c6930c83dc2e5ef4011cf534e70de3346/zulip-py3-venv/lib/python3.6/site-packages/django/core/management/commands/migrate.py", line 204, in handle
Aug 09 21:31:59     fake_initial=fake_initial,
Aug 09 21:31:59   File "/srv/zulip-venv-cache/c5ebf65c6930c83dc2e5ef4011cf534e70de3346/zulip-py3-venv/lib/python3.6/site-packages/django/db/migrations/executor.py", line 115, in migrate
Aug 09 21:31:59     state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
Aug 09 21:31:59   File "/srv/zulip-venv-cache/c5ebf65c6930c83dc2e5ef4011cf534e70de3346/zulip-py3-venv/lib/python3.6/site-packages/django/db/migrations/executor.py", line 145, in _migrate_all_forwards
Aug 09 21:31:59     state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
Aug 09 21:31:59   File "/srv/zulip-venv-cache/c5ebf65c6930c83dc2e5ef4011cf534e70de3346/zulip-py3-venv/lib/python3.6/site-packages/django/db/migrations/executor.py", line 244, in apply_migration
Aug 09 21:31:59     state = migration.apply(state, schema_editor)
Aug 09 21:31:59   File "/srv/zulip-venv-cache/c5ebf65c6930c83dc2e5ef4011cf534e70de3346/zulip-py3-venv/lib/python3.6/site-packages/django/db/migrations/migration.py", line 129, in apply
Aug 09 21:31:59     operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
Aug 09 21:31:59   File "/srv/zulip-venv-cache/c5ebf65c6930c83dc2e5ef4011cf534e70de3346/zulip-py3-venv/lib/python3.6/site-packages/django/db/migrations/operations/special.py", line 193, in database_forwards
Aug 09 21:31:59     self.code(from_state.apps, schema_editor)
Aug 09 21:31:59   File "/home/circleci/zulip/zerver/migrations/0182_set_initial_value_is_private_flag.py", line 28, in set_initial_value_of_is_private_flag
Aug 09 21:31:59     if count == 0 and range_end >= Message.objects.last().id:
Aug 09 21:31:59 AttributeError: 'NoneType' object has no attribute 'id'
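The failure comes from calling `.id` on the result of `last()`, which returns `None` when there are no rows. A minimal pure-Python sketch of a guarded stop condition follows; the `Message` class, `last_message`, and `should_stop` are hypothetical stand-ins modeled on the condition shown in the traceback, not the code that was merged.

```python
from typing import List, Optional


class Message:
    """Hypothetical stand-in for a message row with an integer ID."""

    def __init__(self, id: int) -> None:
        self.id = id


def last_message(messages: List[Message]) -> Optional[Message]:
    """Mimic a queryset's last(): the final row, or None when there are none."""
    return messages[-1] if messages else None


def should_stop(count: int, range_end: int, messages: List[Message]) -> bool:
    """Decide whether the batch loop is finished, tolerating an empty table."""
    last = last_message(messages)
    if last is None:
        # No messages at all, so there is nothing to migrate.
        return True
    return count == 0 and range_end >= last.id
```

An alternative with the same effect is to check for an empty table once at the top of the migration function and return before entering the loop.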

shubham-padia · 2018-08-09T22:59:23Z

@timabbott This is ready for another review.

timabbott · 2018-08-09T23:07:35Z

zerver/tests/test_messages.py

+            do_update_message_flags(user_profile, get_client("website"), 'remove', first_non_api_flag, [message])
+
+        with self.assertRaises(JsonableError):
+            do_update_message_flags(user_profile, get_client("website"), 'remove', 'invalid', [message])


In our tests, we usually prefer API interaction to calling raw actions.py methods where possible. I'll just redo this that way.

         )
-        user_profile = self.example_user('hamlet')
-        for first_non_api_flag in UserMessage.NON_API_FLAGS:
-            break
-        with self.assertRaises(JsonableError):
-            do_update_message_flags(user_profile, get_client("website"), 'remove', first_non_api_flag, [message])
+        self.login(self.example_email("hamlet"))
+        result = self.client_post("/json/messages/flags",
+                                  {"messages": ujson.dumps([message]),
+                                   "op": "add",
+                                   "flag": "invalid"})
+        self.assert_json_error(result, "Invalid flag: 'invalid'")
 
-        with self.assertRaises(JsonableError):
-            do_update_message_flags(user_profile, get_client("website"), 'remove', 'invalid', [message])
+        result = self.client_post("/json/messages/flags",
+                                  {"messages": ujson.dumps([message]),
+                                   "op": "add",
+                                   "flag": "is_private"})
+        self.assert_json_error(result, "Invalid flag: 'is_private'")
+
+        result = self.client_post("/json/messages/flags",
+                                  {"messages": ujson.dumps([message]),
+                                   "op": "add",
+                                   "flag": "active_mobile_push_notification"})
+        self.assert_json_error(result, "Invalid flag: 'active_mobile_push_notification'")
 
     def change_star(self, messages: List[int], add: bool=True, **kwargs: Any) -> HttpResponse:

timabbott · 2018-08-09T23:12:24Z

Nice, I changed the one thing I mentioned above, and merged this as the series of commits ending with bdaff17 (note that I also moved the more extensive commit message and closes to the last commit). Thanks @shubham-padia!!

zulipbot added size: L area: api area: production priority: high labels Jun 14, 2018

shubham-padia force-pushed the 6896 branch 4 times, most recently from 5f6c1d0 to 4a1cd48 Compare June 19, 2018 20:30

shubham-padia force-pushed the 6896 branch from 4a1cd48 to ae60100 Compare June 19, 2018 20:45

showell reviewed Jun 20, 2018

shubham-padia force-pushed the 6896 branch from ae60100 to 86876b8 Compare June 20, 2018 15:04

timabbott reviewed Aug 8, 2018

timabbott force-pushed the 6896 branch from d1d71da to 8d69b5f Compare August 8, 2018 17:21

shubham-padia force-pushed the 6896 branch from 8d69b5f to 618b8ef Compare August 9, 2018 19:22

zulipbot added size: XL and removed size: L labels Aug 9, 2018

shubham-padia force-pushed the 6896 branch 2 times, most recently from d1999a4 to 100746a Compare August 9, 2018 20:13

timabbott mentioned this pull request Aug 9, 2018

export: Set UserMessage.flags.is_private properly in build_usermessage #10262

Closed

timabbott force-pushed the 6896 branch from 100746a to bf6be19 Compare August 9, 2018 21:14

timabbott reviewed Aug 9, 2018

timabbott force-pushed the 6896 branch 2 times, most recently from 10ef506 to 0aa5580 Compare August 9, 2018 21:27

shubham-padia force-pushed the 6896 branch 3 times, most recently from 5750976 to f3f97ad Compare August 9, 2018 22:30

shubham-padia and others added 6 commits August 10, 2018 04:11

models: Do not leak is_private UserMessage flag to the API.

e375cf9

See the comment for why this is correct; basically, this flag is used only for internal accounting, and would only confuse API clients.

models: Do not leak 'active_mobile_push_notification' flag to API.

a0141ac

The reasoning here is similar to `is_private`; this flag is only used for internal accounting inside the Zulip server.

actions.py: Block client interaction with flags in the NON_API_FLAGS.

53c211e

Raise error if flag is present in NON_API_FLAGS or is not present in UserMessage.flags.

actions.py: Set is_private flag in do_send_messages.

224e9b1

migrations: Set initial value for is_private flags.

adb8768

Fixes zulip#6896.

narrow: Use is_private flag index for is:private.

c651ddf

shubham-padia force-pushed the 6896 branch from f3f97ad to c651ddf Compare August 9, 2018 22:43

timabbott reviewed Aug 9, 2018

timabbott closed this Aug 9, 2018

