Simplify query in column_definitions() for Postgres #27743

Merged
merged 1 commit into from Jan 30, 2017

Projects

None yet

6 participants

@jordanlewis
Contributor

column_definitions() for Postgres needs to fetch the collation for every column, if present. Previously, it did so using a correlated subquery - a subquery that references results from the outer scope.

This patch updates the query to remove the subquery in favor of a simpler and more efficient JOIN clause.

Running the two queries through EXPLAIN against Postgres additionally shows that the original form with a correlated subquery requires a Nested Loop Left Join, while the new form with a simple JOIN can use a more efficient Merge Left Join.

Old plan:

jordan=# explain SELECT a.attname,
  format_type(a.atttypid, a.atttypmod),
  pg_get_expr(d.adbin, d.adrelid),
  a.attnotnull,
  a.atttypid,
  a.atttypmod,
  (SELECT c.collname
   FROM pg_collation c, pg_type t
   WHERE c.oid = a.attcollation
   AND t.oid = a.atttypid
   AND a.attcollation <> t.typcollation),
   col_description(a.attrelid, a.attnum) AS comment
FROM pg_attribute a LEFT JOIN pg_attrdef d
ON a.attrelid = d.adrelid AND a.attnum = d.adnum
WHERE a.attrelid = 'a'::regclass
AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=4.46..74.93 rows=3 width=119)
   Join Filter: ((a.attrelid = d.adrelid) AND (a.attnum = d.adnum))
   ->  Index Scan using pg_attribute_relid_attnum_index on pg_attribute a  (cost=0.28..12.26 rows=3 width=83)
         Index Cond: ((attrelid = '3525321'::oid) AND (attnum > 0))
         Filter: (NOT attisdropped)
   ->  Materialize  (cost=4.18..12.66 rows=4 width=38)
         ->  Bitmap Heap Scan on pg_attrdef d  (cost=4.18..12.64 rows=4 width=38)
               Recheck Cond: (adrelid = '3525321'::oid)
               ->  Bitmap Index Scan on pg_attrdef_adrelid_adnum_index  (cost=0.00..4.18 rows=4 width=0)
                     Index Cond: (adrelid = '3525321'::oid)
   SubPlan 1
     ->  Nested Loop  (cost=0.29..16.34 rows=1 width=64)
           ->  Index Scan using pg_collation_oid_index on pg_collation c  (cost=0.15..8.17 rows=1 width=64)
                 Index Cond: (oid = a.attcollation)
           ->  Index Scan using pg_type_oid_index on pg_type t  (cost=0.15..8.17 rows=1 width=0)
                 Index Cond: (oid = a.atttypid)
                 Filter: (a.attcollation <> typcollation)
(17 rows)

New plan:

jordan=# explain SELECT a.attname,
  format_type(a.atttypid, a.atttypmod),
  pg_get_expr(d.adbin, d.adrelid),
  a.attnotnull,
  a.atttypid,
  a.atttypmod,
  c.collname,
  col_description(a.attrelid, a.attnum) AS comment
FROM pg_attribute a
LEFT JOIN pg_attrdef d
ON a.attrelid = d.adrelid AND a.attnum = d.adnum
LEFT JOIN pg_collation c
ON c.oid = a.attcollation
LEFT JOIN pg_type t
ON t.oid = a.atttypid AND a.attcollation <> t.typcollation
WHERE a.attrelid = 'collation_test'::regclass
AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum;
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Merge Left Join  (cost=38.41..39.23 rows=3 width=179)
   Merge Cond: (a.attnum = d.adnum)
   Join Filter: (a.attrelid = d.adrelid)
   ->  Sort  (cost=25.73..25.73 rows=3 width=143)
         Sort Key: a.attnum
         ->  Hash Right Join  (cost=12.30..25.70 rows=3 width=143)
               Hash Cond: (c.oid = a.attcollation)
               ->  Seq Scan on pg_collation c  (cost=0.00..11.69 rows=269 width=68)
               ->  Hash  (cost=12.26..12.26 rows=3 width=83)
                     ->  Index Scan using pg_attribute_relid_attnum_index on pg_attribute a  (cost=0.28..12.26 rows=3 width=83)
                           Index Cond: ((attrelid = '3525342'::oid) AND (attnum > 0))
                           Filter: (NOT attisdropped)
   ->  Sort  (cost=12.68..12.69 rows=4 width=38)
         Sort Key: d.adnum
         ->  Bitmap Heap Scan on pg_attrdef d  (cost=4.18..12.64 rows=4 width=38)
               Recheck Cond: (adrelid = '3525342'::oid)
               ->  Bitmap Index Scan on pg_attrdef_adrelid_adnum_index  (cost=0.00..4.18 rows=4 width=0)
                     Index Cond: (adrelid = '3525342'::oid)
(18 rows)
@senny senny was assigned by rails-bot Jan 19, 2017
@rails-bot

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @senny (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

+ FROM pg_attribute a
+ LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum
+ LEFT JOIN pg_collation c ON a.attcollation = c.oid
+ LEFT JOIN pg_type t on a.atttypid = t.oid AND a.attcollation <> t.typcollation
@kamipo
kamipo Jan 19, 2017 Member

Only this line's on is lower case.

@jordanlewis
jordanlewis Jan 19, 2017 Contributor

Done.

@jordanlewis
Contributor

Thanks for your review, @kamipo! What can I do to help get this PR reviewed and merged?

@jordanlewis
Contributor

Hi! Is there anything else I can do to help get an eye or two on this one? @senny

Thanks!

@rafaelfranca
Member

Could you squash your commits?

@jordanlewis jordanlewis Simplify Postgres query for column_definitions()
column_definitions() needs to fetch the collation for every column, if
present. Previously, it did so using a correlated subquery - a subquery
that references results from the outer scope.

This patch updates the query to remove the subquery in favor of a
simpler and more efficient JOIN clause.

Running the two queries through EXPLAIN against Postgres additionally
shows that the original form with a correlated subquery requires a
Nested Loop Left Join, while the new form with a simple JOIN can use a
more efficient Merge Left Join.
f9c6cbd
@jordanlewis
Contributor

Squashed and rebased.

@rafaelfranca rafaelfranca merged commit eda7f26 into rails:master Jan 30, 2017

2 checks passed

codeclimate no new or fixed issues
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@jordanlewis
Contributor

Thank you, @rafaelfranca!

@jordanlewis jordanlewis referenced this pull request in cockroachdb/cockroach Feb 8, 2017
Open

sql: support ActiveRecord through postgresql adapter #12792

11 of 13 tasks complete
@jordanlewis jordanlewis deleted the jordanlewis:simplify-postgres-column-definitions branch Feb 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment