Add support for PostgreSQL operator classes to add_index #19090

Merged
merged 1 commit into from Dec 1, 2017

Contributor

gregnavis commented Feb 26, 2015

Use case

I needed to use trigrams when SELECT-ing from a table, and I wanted to keep using schema.rb. Unfortunately, it wasn't possible to create an appropriate index. The required query is:

CREATE INDEX users_name ON users USING gist (name gist_trgm_ops);

The gist_trgm_ops after name is the operator class to use when using the index. Currently it's possible to specify ... USING gist (name) but there's no way of adding the operator class after name.

PostgreSQL is the only affected database.

Solution

Operator classes can be explicitly specified in add_index as:

add_index :users, :name, using: :gist, opclass: :gist_trgm_ops
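For illustration only, here is a minimal plain-Ruby sketch of the statement such a call is meant to produce. The helper name create_index_sql and its quoting are assumptions made for this example; it is not the code from this PR and it does not use the real adapter API.

```ruby
# Hypothetical helper (not part of Rails): builds the CREATE INDEX statement
# that add_index with an :opclass option is intended to generate.
def create_index_sql(table, column, using:, opclass: nil)
  index_name = "index_#{table}_on_#{column}"
  # Append the operator class after the quoted column only when one is given.
  column_sql = [%("#{column}"), opclass].compact.join(" ")
  %(CREATE INDEX "#{index_name}" ON "#{table}" USING #{using} (#{column_sql}))
end

puts create_index_sql(:users, :name, using: :gist, opclass: :gist_trgm_ops)
# prints: CREATE INDEX "index_users_on_name" ON "users" USING gist ("name" gist_trgm_ops)
```

Without opclass: the column list is just the quoted column name, so existing calls are unaffected in this sketch.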

Changes

  • added opclass to IndexDefinition and made it a valid add_index option
  • added support for opclass to SchemaDumper
  • test cases for the changes above

Issues

Below are the issues I ran into. For each one I present my decision and the rationale for it. Any feedback is welcome! Hopefully some improvement is possible.

Syntax

I wasn't sure what the best syntax would be. I considered

add_index :users, {name: :gist_trgm_ops}, using: :gist

but it places PostgreSQL-specific data where the user might not expect it.

I decided to use a new option as this makes the implementation very simple and makes the opclasses used explicit. The tradeoff is that the column names must be specified twice.

Extraneous whitespace (resolved)

There's always a space after a column name used in the index even when no operator class is specified. For example ... USING gist (name) is turned into ... USING gist (name ). I decided that this makes the code simpler at the expense of a tiny ugliness in the test suite. Additionally, multiple spaces already appear in some statements, e.g. in CREATE INDEX when UNIQUE is not present.

Member

cristianbica commented Feb 26, 2015

There's similar work at #18499

Member

sgrif commented Feb 26, 2015

Is there a reason simply passing a string to :using is insufficient for this?

Contributor

gregnavis commented Feb 26, 2015

It's great to see that Rails is improving in this area. I can see that my PR is smaller and more focused than the other PR.

@sgrif If you do

add_index(:users, :name, :using => 'gist (name gist_trgm_ops)')

then the query is

CREATE  INDEX  "index_users_on_name" ON "users" USING gist (name gist_trgm_ops) ("name" )

Please notice that the columns specified as the second argument to add_index are listed after USING. The result is invalid syntax. Or did you have something else in mind?

Member

sgrif commented Feb 26, 2015

That seems like a bug. I'd rather just make sure it's possible to pass a string to using properly

Contributor

gregnavis commented Feb 26, 2015

@sgrif that may be an option. There are some issues though. If we pass using: :gist then we must use the columns specified as the second argument to add_index. When we pass using: 'gist(name gist_trgm_ops)' then we should ignore the columns specified in the second argument because we can pass something entirely different here.

@sgrif Please tell me whether using: 'gist(name gist_trgm_ops)' is the syntax you had in mind.

Member

sgrif commented Feb 26, 2015

Yes, that is the syntax I had in mind. We can probably just drop the automatic arguments if you pass a string to using.

Contributor

gregnavis commented Feb 26, 2015

What is nice about your approach is that it is simpler. I can see one more issue, though. If I pass using: 'gist(name gist_trgm_ops)' and don't pass the array of columns (I assume this is what you meant by automatic arguments) then the migration or schema.rb will break after switching to a database other than PostgreSQL.

I don't know what is in greater alignment with core values of Rails: the ability to switch the database without breaking migrations or schema.rb or a simpler syntax and implementation. Could you give a hint?

Member

sgrif commented Feb 26, 2015

In this case, the simpler syntax and implementation.

Member

sgrif commented Feb 26, 2015

using: :gist already isn't portable.

Contributor

gregnavis commented Feb 26, 2015

Sorry, I was imprecise. My question was: do we need the ability to switch to a database other than PostgreSQL and still be able to run migrations or schema.rb, albeit with a different result (e.g. an ordinary index)? It's a form of partial portability: everything will work, but you won't get the features that the other database doesn't support.

If the answer is "No, we don't need that", then your approach will be way simpler.

Member

sgrif commented Feb 26, 2015

No, we don't need that.

Contributor

gregnavis commented Feb 27, 2015

Great! That simplifies a lot.

I also came up with another syntax for this:

# Allow opclasses to be specified in column_name.
# Currently the whole string is quoted and treated as a column name.
add_index :users, 'name gist_trgm_ops', using: :gist

# or expect columns and opclasses to appear in using: when
# it's specified as a string (no column_name in this case)
add_index :users, using: 'gist (name gist_trgm_ops)'

The latter is what you suggested @sgrif, right?

I think the first syntax is more compatible with what we currently have. The downside is that it breaks compatibility for people who use spaces in column names (are there any? 😄)

Which do you think is better?

Member

sgrif commented Feb 27, 2015

I think it should be:

add_index :users, :name, using: "gist (name gist_trgm_ops)"

Simply so we can continue to have the column name for index naming purposes.

Contributor

gregnavis commented Feb 28, 2015

What if using: is a string, e.g. "gist"? This happens in one of the tests.

An option is to test %w(gin gist hash btree).include?(options[:using].downcase). If so, then do what the current code does (i.e. index columns specified in column_name). Otherwise insert options[:using] into the query without inserting column_name (which would be used only to name the index).
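A rough sketch of that check in plain Ruby, for illustration only. The helper name using_clause and the quoting are hypothetical and not taken from the PR; to_s is added here on the assumption that using: may also arrive as a symbol.

```ruby
# Index methods named in the comment above; the list is illustrative only.
KNOWN_INDEX_METHODS = %w(gin gist hash btree).freeze

# Hypothetical helper: returns the text that follows USING in CREATE INDEX.
def using_clause(column_names, using)
  if KNOWN_INDEX_METHODS.include?(using.to_s.downcase)
    # A plain index method: index the columns passed to add_index.
    quoted = Array(column_names).map { |column| %("#{column}") }.join(", ")
    "#{using} (#{quoted})"
  else
    # Anything else is treated as literal SQL; the column names would then
    # be used only to name the index.
    using.to_s
  end
end

puts using_clause(:name, "gist")
puts using_clause(:name, "gist (name gist_trgm_ops)")
```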

What do you think?

Contributor

gregnavis commented Mar 2, 2015

I updated the PR with Sean's suggestions. I'd love to hear your feedback!

Member

matthewd commented Mar 2, 2015

@sgrif this seems like quite a perversion of the existing call syntax to me. 😕

Not to mention the danger of people supplying different column names in the two places...

Member

sgrif commented Mar 2, 2015

@matthewd what would you like to see?

Contributor

gregnavis commented May 25, 2015

@matthewd, @sgrif, @cristianbica is there a chance we can make some progress on this? Thanks!

Member

matthewd commented May 25, 2015

I guess the spelling I'd consider to be most reflective of the underlying PostgreSQL syntax would be:

add_index :users, [[:name, :gist_trgm_ops]], using: :gist

# or perhaps a more extensible:
add_index :users, [[:name, opclass: :gist_trgm_ops]], using: :gist

.. which isn't particularly pretty... but may still be an improvement over how we currently do things?

Otherwise, again in line with :order and :length, the consistent-with-precedent approach would be to add a top level :opclass option, which can either be a string (applies to all columns) or a hash (keys are column names).

Note that even if we adopted my above suggestion of [column, options] pairs, we could still support a top-level option as applying to all the columns -- meaning you could ignore that syntax for all the more common single-column / consistent-opclass indexes.
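To make the [column, options] spelling concrete, here is a small plain-Ruby sketch of how such pairs could be split into column names plus a per-column opclass hash. The function split_column_specs is hypothetical and only illustrates the idea; nothing like it is claimed to exist in Rails.

```ruby
# Hypothetical normalizer: accepts plain column names, [column, opclass]
# pairs, or [column, { opclass: ... }] pairs, and returns the column names
# together with a hash of per-column operator classes.
def split_column_specs(specs)
  columns = []
  opclasses = {}
  Array(specs).each do |spec|
    # A bare symbol yields name = spec and extra = nil.
    name, extra = spec
    opclass = extra.is_a?(Hash) ? extra[:opclass] : extra
    columns << name
    opclasses[name] = opclass if opclass
  end
  [columns, opclasses]
end

p split_column_specs([[:name, :gist_trgm_ops]])
p split_column_specs([:organisation_id, [:name, opclass: :gist_trgm_ops]])
```

A top-level option applying to all columns could then be layered on top of this by filling in any column that has no entry of its own.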


You seem to have done a slightly-too-good job of revising history here, so I can't actually see whether any of the above resembles how you had it before @sgrif suggested the current form.

But I do feel that conflating the USING parameter with the index column list would be an error: they are no more related than are the table name and the column list.

Contributor

gregnavis commented May 25, 2015

@matthewd, I reverted the previous version of the code and the pull request message. The usage I implemented looked like:

add_index :users, :name, using: :gist, opclasses: {name: :gist_trgm_ops}

So this is in line with :order and :length (except I should change :opclasses to :opclass for consistency). How should I continue from the code that is currently in this PR?

Contributor

swalkinshaw commented Aug 13, 2015

@grn I'm having this same issue but with to_tsvector. Could this PR be more generic to support functions as well? Having the opclasses option limits this to your use case.

Example:

CREATE INDEX widget_name_search_idx ON widgets USING gin(to_tsvector('english', name))

The problem is the same here since you can pass a string to using but Rails still adds the column name at the end.

add_index :widgets, :name, name: 'widget_name_search_idx', using: "gin(to_tsvector('english', name))"
Contributor

lsylvester commented Aug 27, 2015

I have also run into this with the to_tsvector function. I think that it would be good to allow indexes to be created on any expression.

If we are going to allow any expression to be indexed, then I think that the syntax

add_index :users, :name, using: "gist (name gist_trgm_ops)"

has a few issues.

First, there may be times when you want to index an expression without using a custom index type, like an index on a LOWER function. Here, you would have to specify the index type even though you are not changing it from the default, i.e.

add_index :users, :name, using: "btree (LOWER(name))"

instead of

add_index :users, "LOWER(name)"

Second, in the case where you want to index multiple columns, you would have to repeat all of the columns in the using clause, i.e.

add_index :users, :organisation_id, :name, using: "btree (organisation_id, LOWER(name))"

instead of

add_index :users, :organisation_id, "LOWER(name)", using: :btree

Third, it might be required to have multiple indexes on the same column with different functions/operator classes, but if the name is generated using only the column name then the names will conflict.

For example, you might need to have indexes like:

add_index :users, :name, using: "gist (name gist_trgm_ops)"
add_index :users, :name

These would support both equality and similarity searches, but both indexes would have the same name by default.

I think that it would be better to either require the name to be specified if there is an expression, or automatically generate the name based on the whole expression instead of just the column.

add_index :users, :name                               #=> creates index "index_users_on_name"
add_index :users, "name gist_trgm_ops", using: :gist  #=> creates index "index_users_on_name_gist_trgm_ops"
add_index :users, "LOWER(name)"                       #=> creates index "index_users_on_lower_name"
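As a rough illustration of the naming idea, the three names above can be derived from the whole expression with a few lines of plain Ruby. The function expression_index_name is a hypothetical sketch, not Rails' actual index_name logic.

```ruby
# Hypothetical sketch: derive an index name from the whole indexed expression
# by lowercasing it and collapsing every run of non-alphanumeric characters
# into a single underscore.
def expression_index_name(table, expression)
  slug = expression.to_s.downcase
                   .gsub(/[^a-z0-9]+/, "_")
                   .gsub(/\A_+|_+\z/, "")
  "index_#{table}_on_#{slug}"
end

puts expression_index_name(:users, :name)
puts expression_index_name(:users, "name gist_trgm_ops")
puts expression_index_name(:users, "LOWER(name)")
```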
Member

matthewd commented Aug 27, 2015

"I think that it would be good to allow for indexes to be created on any expression"

I agree... but that sounds more like #13684; while they're written close together in the SQL, I think opclasses are ultimately unrelated.

YorickPeterse Mar 3, 2016

If it's of any use, I just backported these changes to GitLab (see https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/2987 for the exact changes) and they work like a charm. If anybody wants to backport these as well they can dump the following code somewhere in their Rails application (e.g. an initializer): https://gist.github.com/YorickPeterse/00a4364ec11e3b63c2c3

jrjang pushed a commit to jrjang/gitlab-ce that referenced this pull request Mar 3, 2016

Backport Rails support for PostgreSQL opclasses
This is needed to support creating/dumping/loading indexes that use the
gin_trgm_ops operator class on PostgreSQL. These changes are taken from
Rails pull request rails/rails#19090.

jrjang pushed a commit to jrjang/gitlab-ce that referenced this pull request Mar 4, 2016

Backport Rails support for PostgreSQL opclasses
This is needed to support creating/dumping/loading indexes that use the
gin_trgm_ops operator class on PostgreSQL. These changes are taken from
Rails pull request rails/rails#19090.

kamipo added a commit to kamipo/rails that referenced this pull request Mar 6, 2016

Add Expression Indexes and Operator Classes support for PostgreSQL
Example:

    create_table :users do |t|
      t.string :name
      t.index -> { 'lower(name)' }
      t.index -> { 'name varchar_pattern_ops' }
    end

Fixes #19090, #21765, #21819.

rspeicher added a commit to gitlabhq/gitlabhq that referenced this pull request Mar 11, 2016

Backport Rails support for PostgreSQL opclasses
This is needed to support creating/dumping/loading indexes that use the
gin_trgm_ops operator class on PostgreSQL. These changes are taken from
Rails pull request rails/rails#19090.

kamipo added a commit to kamipo/rails that referenced this pull request Apr 21, 2016

Add Expression Indexes and Operator Classes support for PostgreSQL
Example:

    create_table :users do |t|
      t.string :name
      t.index -> { 'lower(name)' }
      t.index -> { 'name varchar_pattern_ops' }
    end

Fixes #19090, #21765, #21819.

kamipo added a commit to kamipo/rails that referenced this pull request Apr 24, 2016

Add Expression Indexes and Operator Classes support for PostgreSQL
Example:

    create_table :users do |t|
      t.string :name
      t.index 'lower(name) varchar_pattern_ops'
    end

Fixes #19090.
Fixes #21765.
Fixes #21819.
Fixes #24359.

jeremy closed this in edc2b77 Apr 24, 2016

Member

jeremy commented Apr 24, 2016

Reopening for full-fledged support. Now covered by allowing literal SQL.

jeremy reopened this Apr 24, 2016

soulnafein Apr 26, 2016

@jeremy What's the plan for this? Nice to see it reopened. Sadly, I was hoping to add new, faster indices today.

Member

jeremy commented Apr 26, 2016

@soulnafein If you're on Rails 5 beta, check out #23393

Contributor

gregnavis commented Jun 4, 2016

@matthewd, @jeremy, @sgrif, I can block a few hours to work on this. I'd like to have your input on the desired shape of this feature.

@matthewd said:

Otherwise, again in line with :order and :length, the consistent-with-precedent approach would be to add a top level :opclass option, which can either be a string (applies to all columns) or a hash (keys are column names).

and I think I agree with this approach. If we decide to support another syntax (e.g. add_index :users, [[:email, :gin_trgm_ops]]) then we can do it in a separate PR.

So the question is: what should be the syntax for adding an operator class?

Contributor

gregnavis commented Jun 15, 2016

@matthewd, @jeremy, @sgrif, any updates? No worries if you're too busy. I'll ping you in a week or two.

esebastian referenced this pull request in civio/onodo Jul 5, 2016

Closed

Implementar Search en Explore #22

Contributor

gregnavis commented Sep 1, 2016

@matthewd, @jeremy, @sgrif, any thoughts? It seems we need to decide on the syntax we'd like to use.

@lettergram

@matthewd, @jeremy, @sgrif any updates on this?

Member

matthewd commented Mar 24, 2017

Oops.. I'd left this to give a chance for second opinions, but then failed to come back to it. Sorry @gregnavis 😟

I like this implementation.

From a quick scroll through to reacquaint myself, I've spotted:

  • rename opclasses to opclass as you mentioned;
  • even though we're already far from perfect on this query construction, it's probably worth avoiding the space when no opclass is set;
  • opclass can be a non-hash, in which case that value applies to all columns (and the dumper should presumably take advantage of that, especially for the special-but-common case of a single-column index).
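The second point can be illustrated with a short plain-Ruby sketch that emits the opclass, and the space before it, only when one is set for the column. The helper quoted_columns_with_opclass and its string-keyed hash are assumptions for this example, not the PR's code.

```ruby
# Hypothetical helper: renders the column list for CREATE INDEX.
# `opclasses` maps column names (as strings) to operator class names.
def quoted_columns_with_opclass(column_names, opclasses = {})
  column_names.map do |name|
    opclass = opclasses[name.to_s]
    quoted = %("#{name}")
    # No trailing space when the column has no operator class.
    opclass ? "#{quoted} #{opclass}" : quoted
  end.join(", ")
end

puts quoted_columns_with_opclass([:name])
puts quoted_columns_with_opclass([:id, :name], "name" => :gist_trgm_ops)
```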

The inddef parsing is taking some liberties in assuming things about the names (e.g., that neither the column nor opclass name contains a space), but it looks like desc_order_columns, for example, is already being similarly presumptuous. So it seems fine to call that someone else's future problem.

I haven't looked at how bad the conflicts are after having neglected this for so long 😕

Contributor

gregnavis commented Mar 24, 2017

No problem, @matthewd! I know you're super-busy with other stuff. I'll try to address these issues and rebase it on top of master next week. Stay tuned!

maclover7 added needs work and removed needs feedback labels Mar 24, 2017

Contributor

gregnavis commented May 13, 2017

@matthewd I rebased the branch (the conflicts weren't that bad) and addressed the issues you mentioned.

@matthewd @jeremy @sgrif - please review and let me know if anything else should be done before merging

Contributor

gregnavis commented Jul 7, 2017

@matthewd @jeremy @sgrif I'm floating this to the top of your inboxes. If there's anything I could do to make the PR better please let me know.

@matthewd

Thanks for the reminder!

One little thing, and this looks ready to go! 🚀

@@ -391,6 +391,25 @@ def default_index_type?(index) # :nodoc:
private
+ def add_index_opclass(column_names, options = {})
+ opclass = case options[:opclass]
+ when String

@matthewd

matthewd Jul 8, 2017

Member

I think this misses Symbol?

Seems like it might read more easily if it were flipped a bit, making this the else clause -- then there's no need for a separate empty-hash branch.
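
A minimal sketch of the flipped branch, in plain Ruby without ActiveSupport, so `transform_keys(&:to_sym)` stands in for `symbolize_keys`. `normalize_opclass` is a hypothetical helper name used only for this illustration; it is not a method from this PR.

```ruby
# Hypothetical helper illustrating the review suggestion; not the PR's code.
def normalize_opclass(opclass)
  if opclass.is_a?(Hash)
    # Per-column operator classes: normalize the keys to symbols.
    opclass.transform_keys(&:to_sym)
  else
    # String, Symbol, or nil: the same value applies to every column,
    # so no dedicated Symbol or empty-hash branch is needed.
    Hash.new { |hash, column| hash[column] = opclass.to_s }
  end
end

# A `when String` clause does not match a Symbol:
puts String === :gist_trgm_ops

by_symbol = normalize_opclass(:gist_trgm_ops)
by_hash   = normalize_opclass("name" => :gist_trgm_ops)

puts by_symbol[:name]         # the Symbol is handled by the else branch
puts by_hash[:name]           # per-column value taken from the Hash
puts by_hash[:email].inspect  # columns missing from the Hash have no opclass
```

Making the Hash check the only special case means anything that is not a Hash, including a Symbol, goes through `to_s` in the else branch.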

- index_name, index_type, index_columns, index_options, index_algorithm, index_using, comment = add_index_options(table_name, column_name, options)
- execute("CREATE #{index_type} INDEX #{index_algorithm} #{quote_column_name(index_name)} ON #{quote_table_name(table_name)} #{index_using} (#{index_columns})#{index_options}").tap do
+ index_name, index_type, index_columns_and_opclasses, index_options, index_algorithm, index_using, comment = add_index_options(table_name, column_name, options)
+ execute("CREATE #{index_type} INDEX #{index_algorithm} #{quote_column_name(index_name)} ON #{quote_table_name(table_name)} #{index_using} (#{index_columns_and_opclasses})#{index_options}").tap do

@kamipo

kamipo Jul 8, 2017

Member

Is changing the index_columns name necessary? It already includes index sort orders as well.

@gregnavis

gregnavis Jul 10, 2017

Contributor

I wanted the name to reflect the content. I don't have a strong opinion here so I can revert if you think it's unnecessary.

@@ -494,7 +494,39 @@ def test_dump_foreign_key_targeting_different_schema
end
end
-class DefaultsUsingMultipleSchemasAndDomainTest < ActiveRecord::PostgreSQLTestCase
+class SchemaIndexOpclassTest < ActiveRecord::TestCase

@kamipo

kamipo Jul 8, 2017

Member

s/ActiveRecord::TestCase/ActiveRecord::PostgreSQLTestCase/

+ end
+end
+
+class DefaultsUsingMultipleSchemasAndDomainTest < ActiveSupport::TestCase

@kamipo

kamipo Jul 8, 2017

Member

s/ActiveSupport::TestCase/ActiveRecord::PostgreSQLTestCase/

@gregnavis

Contributor

gregnavis commented Jul 11, 2017

@matthewd @kamipo thanks for the feedback! I updated the PR as per your suggestions. The build seems to be failing for reasons unrelated to my changes.

@gregnavis

Contributor

gregnavis commented Nov 29, 2017

@matthewd @kamipo I just rebased the PR onto the latest master. Please let me know whether there's anything I can do to help get it merged.

@gregnavis

Contributor

gregnavis commented Nov 29, 2017

There's a CodeClimate error and it seems it prefers:

        def add_index_opclass(column_names, options = {})
          opclass = if options[:opclass].is_a?(Hash)
            options[:opclass].symbolize_keys
          else
            Hash.new { |hash, column| hash[column] = options[:opclass].to_s }
          end
          # ...
        end

to

        def add_index_opclass(column_names, options = {})
          opclass = if options[:opclass].is_a?(Hash)
                      options[:opclass].symbolize_keys
                    else
                      Hash.new { |hash, column| hash[column] = options[:opclass].to_s }
                    end
          # ...
        end

Is that style really preferred? If so, I'll update the PR.

@matthewd

Member

matthewd commented Nov 30, 2017

I believe it will acquiesce to:

      def add_index_opclass(column_names, options = {})
        opclass =
          if options[:opclass].is_a?(Hash)
            options[:opclass].symbolize_keys
          else
            Hash.new { |hash, column| hash[column] = options[:opclass].to_s }
          end
        # ...
      end
+ super
+ end
+
+ # See http://www.postgresql.org/docs/current/static/errcodes-appendix.html

@matthewd

matthewd Nov 30, 2017

Member

Tiny mismerge: we've lost an s

Add support for PostgreSQL operator classes to add_index
Add support for specifying non-default operator classes in PostgreSQL
indexes. An example CREATE INDEX query that becomes possible is:

    CREATE INDEX users_name ON users USING gist (name gist_trgm_ops);

Previously it was possible to specify the `gist` index but not the
custom operator class. The `add_index` call for the above query is:

    add_index :users, :name, using: :gist, opclass: {name: :gist_trgm_ops}
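
As a rough sketch of how a per-column operator class could end up in the column list of the `CREATE INDEX` statement above: `quote_identifier` and `index_column_list` are hypothetical helpers for illustration only, the quoting is a simplified stand-in for the adapter's real identifier quoting, and `opclass_by_column` is assumed to be a Hash with symbol keys.

```ruby
# Hypothetical helpers; not the adapter's actual implementation.
def quote_identifier(name)
  # Simplified quoting: wrap in double quotes and double any embedded quote.
  %("#{name.to_s.gsub('"', '""')}")
end

def index_column_list(column_names, opclass_by_column = {})
  column_names.map do |column|
    opclass = opclass_by_column[column.to_sym]
    # Append the operator class only when one is set for this column.
    [quote_identifier(column), opclass].compact.join(" ")
  end.join(", ")
end

columns = index_column_list([:name], { name: :gist_trgm_ops })
puts %(CREATE INDEX "users_name" ON "users" USING gist (#{columns}))
```

Columns without an entry in the Hash are emitted as the quoted name alone, with no trailing space.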
@gregnavis

Contributor

gregnavis commented Nov 30, 2017

Eagle eye! I updated the PR. Please take another look, @matthewd.

@matthewd matthewd merged commit 8e7b9e2 into rails:master Dec 1, 2017

2 checks passed

codeclimate All good!
continuous-integration/travis-ci/pr The Travis CI build passed
@matthewd

Member

matthewd commented Dec 1, 2017

🎉

Sorry it took so long... and it got dropped so many times along the way 😞

And thanks for persisting -- I'm really glad to have this in.

Great work! ❤️

@gregnavis

Contributor

gregnavis commented Dec 1, 2017

Woohoo! 🚀 Thank you @matthewd and everyone else for making it happen.

@gregnavis gregnavis deleted the gregnavis:support-postgresql-operator-classes-in-indexes branch Jan 3, 2018
