Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use single query to retrieve indexes in PostgreSQL #45381

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -87,8 +87,13 @@ def indexes(table_name) # :nodoc:
scope = quoted_scope(table_name)

result = query(<<~SQL, "SCHEMA")
SELECT distinct i.relname, d.indisunique, d.indkey, pg_get_indexdef(d.indexrelid), t.oid,
pg_catalog.obj_description(i.oid, 'pg_class') AS comment, d.indisvalid
SELECT distinct i.relname, d.indisunique, d.indkey, pg_get_indexdef(d.indexrelid),
pg_catalog.obj_description(i.oid, 'pg_class') AS comment, d.indisvalid,
ARRAY(
SELECT pg_get_indexdef(d.indexrelid, k + 1, true)
FROM generate_subscripts(d.indkey, 1) AS k
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm 30% sure this needs an ORDER BY k to be Technically Correct 🤔

This also feels like a behaviour change for non-column-reference expressions. Is that true?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the ORDER BY k is not needed. indkey returns a vector (in postgres terminology) with column numbers from the table. So for table users(id, name, email, created_at) and index on (created_at, email) it returns [4, 3].

This also feels like a behaviour change for non-column-reference expressions.

Can you provide an example?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern was for an index that mixed columns and non-column expressions, but per Discord conversation, I now see that the changed codepath (around line 125 below) is only taken for indexes that consist exclusively of table columns: as soon as any expression is present, we hit if indkey.include?(0), and things diverge.

(It separately seems less-than-ideal that we skip over the "richer" behaviour for those index columns that are non-expression references to table columns, but that's not relevant to this change.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm also happy to assume that generate_subscripts produces rows in a defined order within the immediate subquery, even without an explicit ORDER BY, just on the basis that set returning functions are weird.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added ORDER BY k to be sure. I see that was used in other examples on the internets, but I did not get it why we need to use it, because seems like generate_subscripts result should already be sorted.

(It separately seems less-than-ideal that we skip over the "richer" behaviour for those index columns that are non-expression references to table columns, but that's not relevant to this change.)

Probably I should got to sleep, but I did not get what is the problem here? 🤔 I would appreciate if you can provide a concrete example where the new approach won't work.

ORDER BY k
) AS columns
FROM pg_class t
INNER JOIN pg_index d ON t.oid = d.indrelid
INNER JOIN pg_class i ON d.indexrelid = i.oid
Expand All @@ -105,9 +110,9 @@ def indexes(table_name) # :nodoc:
unique = row[1]
indkey = row[2].split(" ").map(&:to_i)
inddef = row[3]
oid = row[4]
comment = row[5]
valid = row[6]
comment = row[4]
valid = row[5]
columns = row[6]

using, expressions, where = inddef.scan(/ USING (\w+?) \((.+?)\)(?: WHERE (.+))?\z/m).flatten

Expand All @@ -117,12 +122,8 @@ def indexes(table_name) # :nodoc:
if indkey.include?(0)
columns = expressions
else
columns = Hash[query(<<~SQL, "SCHEMA")].values_at(*indkey).compact
SELECT a.attnum, a.attname
FROM pg_attribute a
WHERE a.attrelid = #{oid}
AND a.attnum IN (#{indkey.join(",")})
SQL
decoder = PG::TextDecoder::Array.new
columns = decoder.decode(columns)

# add info on sort order (only desc order is explicitly specified, asc is the default)
# and non-default opclasses
Expand Down