Conversation

@miketheman (Member):
When a journal entry changes, the `update_project_last_serial` trigger
fires, which runs `maintain_project_last_serial()`.

This function looks for the `max(id)` for a given `name`, meaning it
first has to find all records for `name`, and then find the highest
`id` (at the "end" of the records).

A compound index with `name` first and `id` second keeps the most
recent `id` at the **top** of the index for each name, so this lookup
should return much faster, speeding up trigger/function executions,
which are sadly somewhat opaque to our instrumentation.
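
For reference, a minimal sketch of the index being added (the name `journals_name_id_idx` matches the query plans below; the actual migration may differ, e.g. production would likely build it with `CONCURRENTLY`):

```sql
-- Compound index: equality lookup on name, with id ordered within each name.
-- A btree can be scanned backwards, so max(id) is simply the last entry
-- for a given name -- no need to visit every row for that name.
CREATE INDEX journals_name_id_idx ON journals (name, id);
```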

Signed-off-by: Mike Fiedler <miketheman@gmail.com>
@miketheman miketheman requested a review from a team as a code owner November 14, 2025 18:37
@miketheman (Member, Author):

Here are some comparisons of the query plans.

The effective function query is:

```sql
SELECT max(id) as last_serial
    FROM journals
    WHERE journals.name = 'foo';
```

**production (without this new index)**

Ignore the costs and timings; they are polluted by my earlier test queries. What's important is the plan structure.

```
warehouse=> explain analyze SELECT max(id) as last_serial
                    FROM journals
                    WHERE journals.name = 'foo';
                                                        QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1590.48..1590.49 rows=1 width=4) (actual time=1.925..1.926 rows=1 loops=1)
   ->  Bitmap Heap Scan on journals  (cost=7.55..1589.48 rows=402 width=4) (actual time=1.921..1.922 rows=3 loops=1)
         Recheck Cond: (name = 'foo'::text)
         Heap Blocks: exact=1
         ->  Bitmap Index Scan on journals_name_idx  (cost=0.00..7.45 rows=402 width=0) (actual time=1.214..1.215 rows=3 loops=1)
               Index Cond: (name = 'foo'::text)
 Planning Time: 0.070 ms
 Execution Time: 1.947 ms
```

**development**

_without the new index_

```
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=93.21..93.22 rows=1 width=4) (actual time=0.908..0.910 rows=1 loops=1)
   ->  Bitmap Heap Scan on journals  (cost=4.60..93.15 rows=23 width=4) (actual time=0.893..0.895 rows=4 loops=1)
         Recheck Cond: (name = 'foo'::text)
         Heap Blocks: exact=1
         ->  Bitmap Index Scan on journals_name_idx  (cost=0.00..4.60 rows=23 width=0) (actual time=0.400..0.401 rows=4 loops=1)
               Index Cond: (name = 'foo'::text)
 Planning Time: 1.248 ms
 Execution Time: 0.962 ms
```

Same structural approach; the costs and timings are not important since the data and resources are vastly different.
But there's no combined index today that covers both `name` and `id`, so there's a recheck condition.
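
To confirm what's already in place, the existing indexes on the table can be listed from the catalog (only `journals_name_idx` appears in the plans above; any other names shown would be environment-specific):

```sql
-- List every index currently defined on journals, with its definition.
SELECT indexname, indexdef
  FROM pg_indexes
 WHERE tablename = 'journals';
```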

_with the new index_

```
                                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.79..0.80 rows=1 width=4) (actual time=0.847..0.850 rows=1 loops=1)
   InitPlan 1
     ->  Limit  (cost=0.42..0.79 rows=1 width=4) (actual time=0.831..0.833 rows=1 loops=1)
           ->  Index Only Scan using journals_name_id_idx on journals  (cost=0.42..8.83 rows=23 width=4) (actual time=0.827..0.828 rows=1 loops=1)
                 Index Cond: (name = 'foo'::text)
                 Heap Fetches: 0
 Planning Time: 0.940 ms
 Execution Time: 0.918 ms
```

So we can see that with the new index applied, a single index-only scan satisfies the query. We'll still hit the function N times for every journal update via the trigger, but at least now each lookup should be faster.
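
The `InitPlan 1` / `Limit` structure in the plan shows the planner transforming the aggregate into an ordered index probe, roughly equivalent to:

```sql
-- max(id) becomes "read the last (name, id) index entry for this name":
SELECT id
  FROM journals
 WHERE name = 'foo'
 ORDER BY id DESC
 LIMIT 1;
```

That's why only one index entry needs to be read per call, and `Heap Fetches: 0` means the table itself is never touched.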

@ewdurbin ewdurbin merged commit 661c06f into pypi:main Nov 14, 2025
21 checks passed
@ewdurbin (Member):

```
warehouse=> explain analyze SELECT max(id) as last_serial
                    FROM journals
                    WHERE journals.name = 'foo';
                                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=1.89..1.90 rows=1 width=4) (actual time=0.015..0.016 rows=1 loops=1)
   InitPlan 1
     ->  Limit  (cost=0.56..1.89 rows=1 width=4) (actual time=0.014..0.014 rows=1 loops=1)
           ->  Index Only Scan using journals_name_id_idx on journals  (cost=0.56..559.95 rows=422 width=4) (actual time=0.013..0.013 rows=1 loops=1)
                 Index Cond: (name = 'foo'::text)
                 Heap Fetches: 0
 Planning Time: 0.081 ms
 Execution Time: 0.024 ms
(8 rows)
```

Not too bad!

@miketheman miketheman deleted the miketheman/19033-follow-up-index branch November 14, 2025 19:39
