Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pgsql: Reduce cost of GetNotification by 2.5x #279

Merged
merged 1 commit into from
Dec 4, 2016

Conversation

Quentin-M
Copy link
Contributor

@Quentin-M Quentin-M commented Dec 4, 2016

This PR optimizes specifically searchNotificationLayerIntroducingVulnerability, called twice by GetNotification.

By delaying the Layer join to the very end, we can cut the query costs from 540,836 to 219,477. Below are results of EXPLAIN ANALYZE on one of the worth case found today. We can observe a reduction of execution time from 5470.466 ms down to 2521.073 ms. This query fully uses the index added in #278. When executed with the parameters used in #278 (redacted), a >10x gain is seen.

Original query

WITH subquery AS (
  SELECT l.ID, l.name
  FROM Vulnerability_Affects_FeatureVersion vafv, FeatureVersion fv, Layer_diff_FeatureVersion ldfv, Layer l
  WHERE l.id >= $2
    AND vafv.vulnerability_id = $1
    AND vafv.featureversion_id = fv.id
    AND ldfv.featureversion_id = fv.id
    AND ldfv.modification = 'add'
    AND ldfv.layer_id = l.id
  ORDER BY l.ID
)
SELECT *
FROM subquery
LIMIT $3;

Limit  (cost=540862.93..540864.93 rows=100 width=278) (actual time=5466.087..5466.611 rows=100 loops=1)
  CTE subquery
    ->  Sort  (cost=540836.30..540862.93 rows=10650 width=92) (actual time=5466.078..5466.186 rows=100 loops=1)
          Sort Key: l.id
          Sort Method: external sort  Disk: 11456kB
          ->  Nested Loop  (cost=217.06..540123.90 rows=10650 width=92) (actual time=282.663..5170.733 rows=132227 loops=1)
                ->  Nested Loop  (cost=216.62..478346.31 rows=13379 width=4) (actual time=39.503..3052.648 rows=144392 loops=1)
                      Join Filter: (vafv.featureversion_id = ldfv.featureversion_id)
                      ->  Nested Loop  (cost=0.72..165.36 rows=15 width=8) (actual time=0.022..0.056 rows=3 loops=1)
                            ->  Index Only Scan using vulnerability_affects_feature_vulnerability_id_featureversi_key on vulnerability_affects_featureversion vafv  (cost=0.43..40.56 rows=15 width=4) (actual time=0.011..0.018 rows=3 loops=1)
                                  Index Cond: (vulnerability_id = 1367197)
                                  Heap Fetches: 3
                            ->  Index Only Scan using featureversion_pkey on featureversion fv  (cost=0.29..8.31 rows=1 width=4) (actual time=0.005..0.006 rows=1 loops=3)
                                  Index Cond: (id = vafv.featureversion_id)
                                  Heap Fetches: 3
                      ->  Bitmap Heap Scan on layer_diff_featureversion ldfv  (cost=215.90..31692.20 rows=14922 width=8) (actual time=13.179..914.892 rows=48131 loops=3)
                            Recheck Cond: (featureversion_id = fv.id)
                            Rows Removed by Index Recheck: <redacted>
                            Filter: (modification = 'add'::modification)
                            Rows Removed by Filter: 190
                            Heap Blocks: exact=35226 lossy=80070
                            ->  Bitmap Index Scan on layer_diff_featureversion_featureversion_id_layer_id_idx  (cost=0.00..212.17 rows=15060 width=0) (actual time=10.916..10.916 rows=48320 loops=3)
                                  Index Cond: (featureversion_id = fv.id)
                ->  Index Scan using layer_pkey on layer l  (cost=0.44..4.61 rows=1 width=92) (actual time=0.009..0.010 rows=1 loops=144392)
                      Index Cond: ((id = ldfv.layer_id) AND (id >= $2))
  ->  CTE Scan on subquery  (cost=0.00..213.00 rows=10650 width=278) (actual time=5466.083..5466.409 rows=100 loops=1)
Planning time: 1.421 ms
Execution time: 5470.466 ms

New query

EXPLAIN ANALYZE WITH LDFV AS ( 
  SELECT ldfv.layer_id 
  FROM Vulnerability_Affects_FeatureVersion vafv, FeatureVersion fv, Layer_diff_FeatureVersion ldfv 
  WHERE ldfv.layer_id >= $2 
    AND vafv.vulnerability_id = $1
    AND vafv.featureversion_id = fv.id 
    AND ldfv.featureversion_id = fv.id 
    AND ldfv.modification = 'add' 
  ORDER BY ldfv.ID 
) 
SELECT l.id, l.name
FROM LDFV, Layer l 
WHERE LDFV.layer_id = l.id
LIMIT $3

 Limit  (cost=218662.94..219477.65 rows=100 width=92) (actual time=2517.235..2518.658 rows=100 loops=1)
   CTE ldfv
     ->  Sort  (cost=218633.61..218662.50 rows=11556 width=8) (actual time=2517.210..2517.323 rows=100 loops=1)
           Sort Key: ldfv_1.id
           Sort Method: external sort  Disk: 2328kB
           ->  Nested Loop  (cost=179.94..217853.79 rows=11556 width=8) (actual time=35.591..2299.412 rows=132227 loops=1)
                 Join Filter: (vafv.featureversion_id = ldfv_1.featureversion_id)
                 ->  Nested Loop  (cost=0.72..165.36 rows=15 width=8) (actual time=0.022..0.065 rows=3 loops=1)
                       ->  Index Only Scan using vulnerability_affects_feature_vulnerability_id_featureversi_key on vulnerability_affects_featureversion vafv  (cost=0.43..40.56 rows=15 width=4) (actual time=0.011..0.017 rows=3 loops=1)
                             Index Cond: (vulnerability_id = 1367197)
                             Heap Fetches: 3
                       ->  Index Only Scan using featureversion_pkey on featureversion fv  (cost=0.29..8.31 rows=1 width=4) (actual time=0.007..0.009 rows=1 loops=3)
                             Index Cond: (id = vafv.featureversion_id)
                             Heap Fetches: 3
                 ->  Bitmap Heap Scan on layer_diff_featureversion ldfv_1  (cost=179.22..14351.46 rows=12888 width=12) (actual time=11.860..667.419 rows=44076 loops=3)
                       Recheck Cond: ((featureversion_id = fv.id) AND (layer_id >= $2))
                       Rows Removed by Index Recheck: <redacted>
                       Filter: (modification = 'add'::modification)
                       Rows Removed by Filter: 190
                       Heap Blocks: exact=24973 lossy=80092
                       ->  Bitmap Index Scan on layer_diff_featureversion_featureversion_id_layer_id_idx  (cost=0.00..176.00 rows=13008 width=0) (actual time=10.231..10.231 rows=44265 loops=3)
                             Index Cond: ((featureversion_id = fv.id) AND (layer_id >= $2))
   ->  Nested Loop  (cost=0.44..94148.66 rows=11556 width=92) (actual time=2517.233..2518.458 rows=100 loops=1)
         ->  CTE Scan on ldfv  (cost=0.00..231.12 rows=11556 width=4) (actual time=2517.215..2517.548 rows=100 loops=1)
         ->  Index Scan using layer_pkey on layer l  (cost=0.44..8.12 rows=1 width=92) (actual time=0.004..0.005 rows=1 loops=100)
               Index Cond: (id = ldfv.layer_id)
 Planning time: 0.762 ms
 Execution time: 2521.073 ms

@Quentin-M Quentin-M added area/performance related to improving application performance component/database labels Dec 4, 2016
By delaying the Layer join to the very end, we can cut the query costs from 540,836 to 219,477.

See Pull Request for details.
@Quentin-M Quentin-M changed the title pgsql: Reduce cost of GetNotification by 3.4x pgsql: Reduce cost of GetNotification by 2.5x Dec 4, 2016
@jzelinskie jzelinskie merged commit dab6e49 into master Dec 4, 2016
@jzelinskie jzelinskie deleted the searchintro_optimize branch December 4, 2016 17:08
@Quentin-M
Copy link
Contributor Author

FWIW, straight-forward request to ensure correctness.

WITH
  S1 AS (
      SELECT l.ID, l.name
      FROM Vulnerability_Affects_FeatureVersion vafv, FeatureVersion fv, Layer_diff_FeatureVersion ldfv, Layer l
      WHERE l.id >= $1
        AND vafv.vulnerability_id = $2
        AND vafv.featureversion_id = fv.id
        AND ldfv.featureversion_id = fv.id
        AND ldfv.modification = 'add'
        AND ldfv.layer_id = l.id
      ORDER BY l.ID
  ),
  S2 AS (
      SELECT ldfv.layer_id
      FROM Vulnerability_Affects_FeatureVersion vafv, FeatureVersion fv, Layer_diff_FeatureVersion ldfv
      WHERE ldfv.layer_id >= $1
        AND vafv.vulnerability_id = $2
        AND vafv.featureversion_id = fv.id
        AND ldfv.featureversion_id = fv.id
        AND ldfv.modification = 'add'
      ORDER BY ldfv.ID
    )
SELECT COUNT(*) - $3 FROM
(
  (
    SELECT *
    FROM S1
    LIMIT $3
  )
  UNION
  (
    SELECT l.id, l.name
    FROM S2, Layer l
    WHERE S2.layer_id = l.id
    LIMIT $3
  )
) x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/performance related to improving application performance
Development

Successfully merging this pull request may close these issues.

None yet

2 participants