chore: fix missing index on courier list count #3002

jonas-jonas · 2023-01-04T18:45:35Z

Changes the order in which the count of all courier_messages is fetched from the database, to avoid scoping in the query and thus doing full table scans. Also adds missing indexes for the status and recipient filter.

Related issue(s)

Checklist

I have read the contributing guidelines.
I have referenced an issue containing the design document if my change
introduces a new feature.
I am following the
contributing code guidelines.
I have read the security policy.
I confirm that this pull request does not address a security
vulnerability. If this pull request addresses a security vulnerability, I
confirm that I got the approval (please contact
security@ory.sh) from the maintainers to push
the changes.
I have added tests that prove my fix is effective or that my feature
works.
I have added or changed the documentation.

Further Comments

aeneasr · 2023-01-05T10:34:58Z

persistence/sql/migrations/sql/20230104193739000000_courier_list_index.up.sql

@@ -0,0 +1,3 @@
+CREATE INDEX courier_messages_status_idx ON courier_messages (status);
+
+CREATE INDEX courier_messages_status_idx ON courier_messages (recipient);


Indices are a way to tell the database which queries it should expect, so that it can optimize the lookup for these. That's why indices should always include all columns for a specific query. Optimally, indices are built in such a way that they can be re-used elsewhere.

https://www.cockroachlabs.com/docs/v22.2/schema-design-indexes

https://www.cockroachlabs.com/docs/v22.2/create-index

https://www.cockroachlabs.com/docs/stable/indexes.html

https://www.cockroachlabs.com/docs/v22.2/schema-design-indexes#best-practices

https://www.cockroachlabs.com/docs/v22.2/performance-best-practices-overview#unique-id-best-practices

Queries can benefit from an index even if they only filter a prefix of its columns. For example, if you create an index of columns (A, B, C), queries filtering (A) or (A, B) can use the index, so you don't need to also index (A).

The more variety a column has, the better it is in the index, because it avoids hot spots. So always put the column with the highest "entropy" first:

Suggested change

CREATE INDEX courier_messages_status_idx ON courier_messages (recipient);

CREATE INDEX courier_messages_recipient_status_idx ON courier_messages (nid, recipient, status);

In the example below, if recipient is not required, you should be able to change the query to

if filter.Recipient != "" { q = q.Where("recipient=?", filter.Recipient) } else { q = q.Where("recipient != ?", "") }

so that you don't need to add another index (please verify this first though).

Furthermore, you're creating the same index name twice, which will result in an error.

I have now done some more testing, and benchmarking on a CRDB instance.

The following results have been gathered using EXPLAIN (opt, verbose) on CRDB.

To cover all possible queries in the "ListMessages" method with an index, we would need 4 indexes:

CREATE INDEX … (nid, created_at DESC, id) if neither status nor recipient is supplied

CREATE INDEX … (nid, status, created_at DESC, id) if only status is supplied

CREATE INDEX … (nid, recipient, created_at DESC, id) if only recipient is supplied

CREATE INDEX … (nid, status, recipient, created_at DESC, id) if both are supplied

IIUC, this would probably hinder the performance, because it would make writes slower. It is also not scalable, if we want to add a filter for say template_type in the future.

I'd propose to only create CREATE INDEX … (nid, created_at DESC, id) to cover the case where neither status nor recipient is supplied, as that is probably the most common case. Additionally, we can add separate indexes on status and recipient: CREATE INDEX ... (status) (and (recipient) respectively).

I have looked at the query plans for the cases and these are the results:

filter cost

no filter cost=14.82

status = 3 cost=18.47

recipient = 'x@ory.sh' cost=18.47

status = 3 AND recipient = 'x@ory.sh' cost=18.48

(Without the separate indices on status and recipient the cost would be cost=21.37 for the queries filtering for them.)

Another improvement would be to use the STORING clause in the index (CREATE INDEX … (nid, created_at DESC, id) STORING (body, template_data, etc...)), to avoid the index join with the table courier_messages and return the data from the index directly, which would bring the cost for the query with no filters down to 9.37. But this requires us to define the columns in the index definition, which is possible, but IMO out of the scope of this PR. Let me know how you want me to proceed here.

And to add, since I didn't it include in the comments yet: Adding recipient != '' to the query and creating the index with CREATE INDEX ... (nid, recipient, status, created_at DESC, id) does have an effect:

filter cost

recipient != '' AND status != -1 cost=17.32

recipient != '' AND status = 3 cost=17.32

recipient = 'x@ory.sh' AND status != -1 cost=18.48

status = 3 AND recipient = 'x@ory.sh' cost=14.87

So in essence, it slows down the query with no filters, and speeds up the others a bit. IMO that's not really what we want, because it's more likely that users want to see just any message, until we have proper filtering, using a search engine. But do let me know what you think!

What's the expected query pattern here, realistically? The status column has low entropy and, if combined with a search for recipient, a scan through the result set is not a problem. A sequential scan isn't a problem if the set to scan is small. Combined indices beyond these:

CREATE INDEX … (nid, created_at DESC, id)

CREATE INDEX … (nid, status, created_at DESC, id)

CREATE INDEX … (nid, recipient, created_at DESC, id)

probably don't give us anything useful.

I recommend you check if those indices get used in ListMessages (even if more filters are active) and call it a day.

Thanks @alnr. From my POV, users would initially go to the email delivery view (where we show this data) with no filters active. Then they might filter for Abandoned messages and might search for a specific recipient. So we definitely need an index on the query with no filters, might need an index on the status query, and might need one on the recipient filter query. While possible, a query with both a status and a recipient filter is not that useful, so probably not to be expected.

I have settled on creating CREATE INDEX … (nid, created_at DESC, id) and indices for status (already exists) and recipient (new in this PR).

That way, we get these costs:

filter cost

no filter cost=14.82

status = 3 cost=18.47

recipient = 'x@ory.sh' cost=18.47

status = 3 AND recipient = 'x@ory.sh' cost=18.48

If we additionally create:

CREATE INDEX … (nid, status, created_at DESC, id)

CREATE INDEX … (nid, recipient, created_at DESC, id)

We get cost=14.82 on the queries that have either of the fields, but not for both, which would be fine, IMO. Question is, what is the overhead for writes, if we have all 3 of these indices active? I'd say we will do many more writes to this table, than this admin query is executed.

Updating 3 indices shouldn't be a problem. I'd say go for it and revisit if there's a problem.

Alright, I added the 2 missing indices to the PR. Should be good to go now.

As far as I understand, we always have these query components:

nid, created_at

If the user applies filters, they also add recipient, status, or both. Thus, this should handle all query cases:

CREATE INDEX … (nid ASC, created_at ASC, recipient ASC, status ASC) CREATE INDEX … (nid ASC, created_at ASC, status ASC)

Given that you put recipient first:

q := p.GetConnection(ctx).Where("nid=?", p.NetworkID(ctx)) if filter.Recipient != "" { q = q.Where("recipient=?", filter.Recipient) } if filter.Status != nil { q = q.Where("status=?", *filter.Status) }

Please note that created_at DESC has no effect, as we explicitly use ORDER BY created_at DESC, which is not using the index for sorting but instead scans all ranges and orders them. Typically, we sort stuff ASC in indicies

From my testing, I can now say, that we need to put recipient and status first, before created_at in the index, otherwise the DB refuses to use the index in the query. Also see below on desc vs asc in the index.

Please note that created_at DESC has no effect, as we explicitly use ORDER BY created_at DESC, which is not using the index for sorting but instead scans all ranges and orders them. Typically, we sort stuff ASC in indicies

That is not true from my testing on a CRDB, as you can see here:

CREATE INDEX ... (nid, created_at); // this index is not used in the query plan and cost = 17.27

CREATE INDEX ... (nid, created_at DESC); // index is used in the query plan and cost = 14.82

But indeed, id does not seem to be needed. I'll remove it.

persistence/sql/migrations/sql/20230104193739000000_courier_list_index.up.sql

jonas-jonas · 2023-01-06T15:51:52Z

This PR is now ready for another review. Please have a look at #3002 (comment) where I have written down the analysis of the costs of these indices.

jonas-jonas · 2023-01-06T16:03:00Z

courier/test/persistence.go


 				require.NoError(t, err)
 				require.Len(t, ms, 0)
 				require.Equal(t, int64(0), tc)
+


This is what a failure of the test case looks like: https://github.com/ory/kratos/actions/runs/3856498251/jobs/6572792385#step:15:3563

That's terrifying.

Yup, luckily this was very unlikely, as there had to have been three preconditions:

two messages with a timestamp collision (microsecond precision)

the first message needed to be a page token

the second message needed to have a UUID that had a higher "string" value (e.g. 1: "00...", 2: "11...")

codecov · 2023-01-06T18:19:46Z

Codecov Report

Merging #3002 (ca35b45) into master (ca35b45) will not change coverage.
The diff coverage is n/a.

❗ Current head ca35b45 differs from pull request most recent head 31cc37b. Consider uploading reports for the commit 31cc37b to get more accurate results

@@           Coverage Diff           @@
##           master    #3002   +/-   ##
=======================================
  Coverage   77.59%   77.59%           
=======================================
  Files         310      310           
  Lines       19265    19265           
=======================================
  Hits        14948    14948           
  Misses       3180     3180           
  Partials     1137     1137

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

…ourierList

persistence/sql/migrations/sql/20230104193739000000_courier_list_index.up.sql

aeneasr · 2023-01-10T14:46:02Z

courier/test/persistence.go

@@ -126,12 +126,60 @@ func TestPersister(ctx context.Context, newNetworkUnlessExisting NetworkWrapper,
 			assert.Equal(t, messages[len(messages)-1].ID, ms[0].ID)

 			t.Run("on another network", func(t *testing.T) {


…ourierList

chore: fix missing index on courier list count

6069360

jonas-jonas requested review from aeneasr and zepatrik as code owners January 4, 2023 18:45

aeneasr requested changes Jan 5, 2023

View reviewed changes

jonas-jonas self-assigned this Jan 6, 2023

chore: add reproduction test for nid escape

cf156c7

jonas-jonas requested a review from hperl January 6, 2023 13:57

chore: fix list courier messages index

f7fe751

jonas-jonas commented Jan 6, 2023

View reviewed changes

persistence/sql/migrations/sql/20230104193739000000_courier_list_index.up.sql Outdated Show resolved Hide resolved

chore: update ory/x

99ef939

jonas-jonas commented Jan 6, 2023

View reviewed changes

chore: fix index syntax

e70720a

jonas-jonas added 4 commits January 7, 2023 20:02

chore: fix index ordering

cb66eaf

chore: add status and recipient indices

aaf49e5

Merge remote-tracking branch 'origin/master' into fix/missingIndexOnC…

aa80826

…ourierList

Merge remote-tracking branch 'origin/master' into fix/missingIndexOnC…

c6f8da6

…ourierList

aeneasr reviewed Jan 10, 2023

View reviewed changes

persistence/sql/migrations/sql/20230104193739000000_courier_list_index.up.sql Outdated Show resolved Hide resolved

aeneasr reviewed Jan 10, 2023

View reviewed changes

jonas-jonas added 2 commits January 10, 2023 16:47

chore: remove id from indices

815a593

Merge remote-tracking branch 'origin/master' into fix/missingIndexOnC…

31cc37b

…ourierList

aeneasr approved these changes Jan 10, 2023

View reviewed changes

aeneasr merged commit 3b50711 into master Jan 10, 2023

aeneasr deleted the fix/missingIndexOnCourierList branch January 10, 2023 23:46

CNLHC pushed a commit to seekthought/kratos that referenced this pull request May 16, 2023

fix: missing index on courier list count (ory#3002)

aaa51e5

peturgeorgievv pushed a commit to senteca/kratos-fork that referenced this pull request Jun 30, 2023

fix: missing index on courier list count (ory#3002)

4b68996

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore: fix missing index on courier list count #3002

chore: fix missing index on courier list count #3002

jonas-jonas commented Jan 4, 2023

aeneasr Jan 5, 2023

This comment was marked as outdated.

jonas-jonas Jan 6, 2023

jonas-jonas Jan 7, 2023

alnr Jan 9, 2023

jonas-jonas Jan 9, 2023

alnr Jan 9, 2023

jonas-jonas Jan 9, 2023

aeneasr Jan 10, 2023

jonas-jonas Jan 10, 2023

jonas-jonas commented Jan 6, 2023

jonas-jonas Jan 6, 2023

alnr Jan 9, 2023

jonas-jonas Jan 9, 2023

codecov bot commented Jan 6, 2023 •

edited

aeneasr Jan 10, 2023

		@@ -0,0 +1,3 @@
		CREATE INDEX courier_messages_status_idx ON courier_messages (status);

		CREATE INDEX courier_messages_status_idx ON courier_messages (recipient);

	CREATE INDEX courier_messages_status_idx ON courier_messages (recipient);
	CREATE INDEX courier_messages_recipient_status_idx ON courier_messages (nid, recipient, status);

filter	cost
`no filter`	`cost=14.82`
`status = 3`	`cost=18.47`
`recipient = 'x@ory.sh'`	`cost=18.47`
`status = 3 AND recipient = 'x@ory.sh'`	`cost=18.48`

filter	cost
`recipient != '' AND status != -1`	`cost=17.32`
`recipient != '' AND status = 3`	`cost=17.32`
`recipient = 'x@ory.sh' AND status != -1`	`cost=18.48`
`status = 3 AND recipient = 'x@ory.sh'`	`cost=14.87`

		@@ -126,12 +126,60 @@ func TestPersister(ctx context.Context, newNetworkUnlessExisting NetworkWrapper,
		assert.Equal(t, messages[len(messages)-1].ID, ms[0].ID)

		t.Run("on another network", func(t *testing.T) {

chore: fix missing index on courier list count #3002

chore: fix missing index on courier list count #3002

Conversation

jonas-jonas commented Jan 4, 2023

Related issue(s)

Checklist

Further Comments

Choose a reason for hiding this comment

This comment was marked as outdated.

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jonas-jonas commented Jan 6, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov bot commented Jan 6, 2023 • edited

Codecov Report

Choose a reason for hiding this comment

codecov bot commented Jan 6, 2023 •

edited