ActiveRecord query changing when a dot/period is in condition value #950

Closed
lighthouse-import opened this Issue May 16, 2011 · 27 comments

Comments

Projects
None yet
@lighthouse-import

Imported from Lighthouse. Original ticket at: http://rails.lighthouseapp.com/projects/8994/tickets/6611
Created by Corey Ward - 2011-03-24 03:00:22 UTC

I have created a barebones Rails app demonstrating this bug: https://github.com/coreyward/bug-demo

I also have a Question open on StackOverflow about this: http://stackoverflow.com/questions/5199235/activerecord-query-changing-when-a-dot-period-is-in-condition-value

I don't have the skills/knowledge to know A) where this is coming from or B) how to fix it. Any help would be very much appreciated.


This bug occurs when you have a "child" model (one that belongs_to another model) with an order scope on it and you try to set a where condition.

So for the following models:

class Person < ActiveRecord::Base
  has_many :items
end

class Item < ActiveRecord::Base
  belongs_to :person
  default_scope order(:ordinal)
end

With the following calls:

Person.includes(:items).where(:name => 'John')
# or
Person.includes(:items).find_by_name 'John'

The items loaded are ordered by the ordinal on each Item.

If you introduce a period into the name, though (e.g. "John.Smith"), the order scope is ignored.

To see this in action...

$ git clone git://github.com/coreyward/bug-demo.git
$ cd bug-demo
$ rake db:create db:migrate db:seed
$ rails s

And then fire up http://localhost:3000/ and see for yourself.

@lighthouse-import

This comment has been minimized.

Show comment
Hide comment
@lighthouse-import

lighthouse-import May 16, 2011

Imported from Lighthouse.
Comment by Anuj Dutta - 2011-03-24 03:44:43 UTC

Thanks for such a detailed info and the demo application. Ran the app and can confirm that this is a bug. I am not sure what is causing it yet but I am investigating.

Imported from Lighthouse.
Comment by Anuj Dutta - 2011-03-24 03:44:43 UTC

Thanks for such a detailed info and the demo application. Ran the app and can confirm that this is a bug. I am not sure what is causing it yet but I am investigating.

@nocache

This comment has been minimized.

Show comment
Hide comment
@nocache

nocache Sep 9, 2011

We are still experiencing this issue with Rails 3.1
Can it please be re-opened

nocache commented Sep 9, 2011

We are still experiencing this issue with Rails 3.1
Can it please be re-opened

@tenderlove tenderlove reopened this Sep 12, 2011

@kranzky

This comment has been minimized.

Show comment
Hide comment
@kranzky

kranzky Sep 12, 2011

Thanks @tenderlove; we're getting this issue in 3.1 with an autocomplete which fails (with an exception) iff the user enters data that contains a period. The work-around for this is to do user_data.gsub!('.', '_'), which works as we're doing a LIKE query anway, but one concern is whether we're hitting changes in query behaviour elsewhere that don't manifest themselves so obviously.

kranzky commented Sep 12, 2011

Thanks @tenderlove; we're getting this issue in 3.1 with an autocomplete which fails (with an exception) iff the user enters data that contains a period. The work-around for this is to do user_data.gsub!('.', '_'), which works as we're doing a LIKE query anway, but one concern is whether we're hitting changes in query behaviour elsewhere that don't manifest themselves so obviously.

@tenderlove

This comment has been minimized.

Show comment
Hide comment
@tenderlove

tenderlove Sep 12, 2011

Member

@jasonhutchens makes sense. Is this only happening on PG, or any database? Can you possibly show code that breaks? Thanks.

Member

tenderlove commented Sep 12, 2011

@jasonhutchens makes sense. Is this only happening on PG, or any database? Can you possibly show code that breaks? Thanks.

@kranzky

This comment has been minimized.

Show comment
Hide comment
@kranzky

kranzky Sep 13, 2011

@tenderlove sorry for the delay; we will answer those questions as soon as we're able

kranzky commented Sep 13, 2011

@tenderlove sorry for the delay; we will answer those questions as soon as we're able

@jroes

This comment has been minimized.

Show comment
Hide comment
@jroes

jroes Sep 14, 2011

Contributor

Confirmed on SQLite with https://github.com/coreyward/bug-demo and rails-HEAD after a few changes to make the project 3.1-compatible.

I started down the path of writing a failing unit test but adding new items to the AR fixtures easily causes a bunch of unintended failures. Any advice?

Contributor

jroes commented Sep 14, 2011

Confirmed on SQLite with https://github.com/coreyward/bug-demo and rails-HEAD after a few changes to make the project 3.1-compatible.

I started down the path of writing a failing unit test but adding new items to the AR fixtures easily causes a bunch of unintended failures. Any advice?

@tommeier

This comment has been minimized.

Show comment
Hide comment
@tommeier

tommeier Oct 12, 2011

Contributor

Hit this bug too when firing an includes statement on a query with periods in it. (postgres db, rails 3.1.1)

Contributor

tommeier commented Oct 12, 2011

Hit this bug too when firing an includes statement on a query with periods in it. (postgres db, rails 3.1.1)

@peterjm

This comment has been minimized.

Show comment
Hide comment
@peterjm

peterjm Oct 14, 2011

Contributor

I've also encountered this bug. My version of it was slightly different (in my case, the select query was ignored, not the order query) so I created a different sample application to demonstrate the error. See https://github.com/peterjm/rails-activerecord-bug-demo

I've confirmed the error with both sqlite3 and mysql2.

Contributor

peterjm commented Oct 14, 2011

I've also encountered this bug. My version of it was slightly different (in my case, the select query was ignored, not the order query) so I created a different sample application to demonstrate the error. See https://github.com/peterjm/rails-activerecord-bug-demo

I've confirmed the error with both sqlite3 and mysql2.

@tommeier

This comment has been minimized.

Show comment
Hide comment
@tommeier

tommeier Oct 14, 2011

Contributor

^^ nice one @peterjm, that makes it even clearer with the 2 characters test, I also tested your sample over postgres, and same deal, so it seems to be independent of database choice.

Also I tested the same sample with rails 3.1.1, same deal.

Contributor

tommeier commented Oct 14, 2011

^^ nice one @peterjm, that makes it even clearer with the 2 characters test, I also tested your sample over postgres, and same deal, so it seems to be independent of database choice.

Also I tested the same sample with rails 3.1.1, same deal.

@tommeier

This comment has been minimized.

Show comment
Hide comment
@tommeier

tommeier Oct 14, 2011

Contributor

So as far as i can tell, its this :

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation.rb#L523

Returning both 'people' and 'xx' as valid table names, causing it to then alias the string (having a table found count greater than 1), and burning the previously defined values.

Contributor

tommeier commented Oct 14, 2011

So as far as i can tell, its this :

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation.rb#L523

Returning both 'people' and 'xx' as valid table names, causing it to then alias the string (having a table found count greater than 1), and burning the previously defined values.

@peterjm

This comment has been minimized.

Show comment
Hide comment
@peterjm

peterjm Oct 14, 2011

Contributor

Yep, I think you're right -- forcing tables_in_string to return an empty array definitely fixes my problem.

I think fixing this might be tricky -- I'll need to dive deeper than my current knowledge into the inner workings of ActiveRecord.

Contributor

peterjm commented Oct 14, 2011

Yep, I think you're right -- forcing tables_in_string to return an empty array definitely fixes my problem.

I think fixing this might be tricky -- I'll need to dive deeper than my current knowledge into the inner workings of ActiveRecord.

@ghost ghost assigned tenderlove Oct 16, 2011

@tenderlove

This comment has been minimized.

Show comment
Hide comment
@tenderlove

tenderlove Oct 19, 2011

Member

I've been looking in to this. The problem is that there is no way for us to tell the difference between someone referencing a column and string literal without a full blown SQL parser. The problem stems from the fact that we allow users to add arbitrary snippets of SQL. Arbitrary snippets of SQL, in addition to reconciling eager loaded tables within those snippets makes everything fail.

We can't tell if someone is referencing a "table like thing" from a string literal because string literal quoting changes from database to database. As far as I can tell, we have three options:

  1. Disallow arbitrary SQL
  2. Write a SQL parser for every database
  3. Document the problem and move on

I'm firmly in favor of option 3, unless someone more clever than I can come up with a solution. Maybe @jonleighton?

Anyway, here is a test case that will fail:

diff --git a/activerecord/test/cases/associations/has_many_associations_test.rb b/activerecord/test/cases/associations/has_many_associations_test.rb
index cddd2a6..bf4625c 100644
--- a/activerecord/test/cases/associations/has_many_associations_test.rb
+++ b/activerecord/test/cases/associations/has_many_associations_test.rb
@@ -52,6 +52,12 @@ class HasManyAssociationsTest < ActiveRecord::TestCase
     Client.destroyed_client_ids.clear
   end

+  def test_select_with_dot_in_name
+    Developer.where(:name => 'aaron.patterson')
+    .includes(:audit_logs)
+    .select('developers.id as omg').order('omg').to_a
+  end
+
   def test_create_from_association_should_respect_default_scope
     car = Car.create(:name => 'honda')
     assert_equal 'honda', car.name
Member

tenderlove commented Oct 19, 2011

I've been looking in to this. The problem is that there is no way for us to tell the difference between someone referencing a column and string literal without a full blown SQL parser. The problem stems from the fact that we allow users to add arbitrary snippets of SQL. Arbitrary snippets of SQL, in addition to reconciling eager loaded tables within those snippets makes everything fail.

We can't tell if someone is referencing a "table like thing" from a string literal because string literal quoting changes from database to database. As far as I can tell, we have three options:

  1. Disallow arbitrary SQL
  2. Write a SQL parser for every database
  3. Document the problem and move on

I'm firmly in favor of option 3, unless someone more clever than I can come up with a solution. Maybe @jonleighton?

Anyway, here is a test case that will fail:

diff --git a/activerecord/test/cases/associations/has_many_associations_test.rb b/activerecord/test/cases/associations/has_many_associations_test.rb
index cddd2a6..bf4625c 100644
--- a/activerecord/test/cases/associations/has_many_associations_test.rb
+++ b/activerecord/test/cases/associations/has_many_associations_test.rb
@@ -52,6 +52,12 @@ class HasManyAssociationsTest < ActiveRecord::TestCase
     Client.destroyed_client_ids.clear
   end

+  def test_select_with_dot_in_name
+    Developer.where(:name => 'aaron.patterson')
+    .includes(:audit_logs)
+    .select('developers.id as omg').order('omg').to_a
+  end
+
   def test_create_from_association_should_respect_default_scope
     car = Car.create(:name => 'honda')
     assert_equal 'honda', car.name
@peterjm

This comment has been minimized.

Show comment
Hide comment
@peterjm

peterjm Oct 20, 2011

Contributor

Could there be some way to pass down a hint to the Relation that the string contains SQL, or is known not to contain SQL.

I'm just thinking out loud here, but here's a couple possibilities:

  1. Add some way to flag a string as not containing SQL, i.e. Developer.where(:name => 'aaron.patterson'.sql_safe) and then have Relation#tables_in_string understand the sql_safe flag.
  2. Add a method to Relations to prevent any of the custom strings being parsed as SQL, i.e. Developer.where(:name => 'aaron.patterson').no_sql.to_a. (This seems pretty ugly to me, but I'm just brainstorming).
Contributor

peterjm commented Oct 20, 2011

Could there be some way to pass down a hint to the Relation that the string contains SQL, or is known not to contain SQL.

I'm just thinking out loud here, but here's a couple possibilities:

  1. Add some way to flag a string as not containing SQL, i.e. Developer.where(:name => 'aaron.patterson'.sql_safe) and then have Relation#tables_in_string understand the sql_safe flag.
  2. Add a method to Relations to prevent any of the custom strings being parsed as SQL, i.e. Developer.where(:name => 'aaron.patterson').no_sql.to_a. (This seems pretty ugly to me, but I'm just brainstorming).
@jonleighton

This comment has been minimized.

Show comment
Hide comment
@jonleighton

jonleighton Oct 21, 2011

Member

@tenderlove, @NZKoz was talking about deprecating the tables_in_string magic a while back. I would be in favour of that. We could then simply document that if people are referencing stuff in string conditions that they wish to be eager loaded, then they must use eager_load(:foo). What do you think?

Member

jonleighton commented Oct 21, 2011

@tenderlove, @NZKoz was talking about deprecating the tables_in_string magic a while back. I would be in favour of that. We could then simply document that if people are referencing stuff in string conditions that they wish to be eager loaded, then they must use eager_load(:foo). What do you think?

@NZKoz

This comment has been minimized.

Show comment
Hide comment
@NZKoz

NZKoz Oct 21, 2011

Member

people's apps would break in the upgrade, but they'd break predictably (no such table whatevers) rather than 'randomly' issuing a much more complicated query. Seems like a slightly painful transition but beats the ... madness we have right now :)

Member

NZKoz commented Oct 21, 2011

people's apps would break in the upgrade, but they'd break predictably (no such table whatevers) rather than 'randomly' issuing a much more complicated query. Seems like a slightly painful transition but beats the ... madness we have right now :)

@jonleighton

This comment has been minimized.

Show comment
Hide comment
@jonleighton

jonleighton Oct 21, 2011

Member

@NZKoz I don't think there's anything stopping us implementing it as a deprecation before removing entirely?

Member

jonleighton commented Oct 21, 2011

@NZKoz I don't think there's anything stopping us implementing it as a deprecation before removing entirely?

@NZKoz

This comment has been minimized.

Show comment
Hide comment
@NZKoz

NZKoz Oct 21, 2011

Member

True, a deprecation warning would catch every intentional use of the 'fallback' strategy, it'd just fire false positives where this issue is concerned. But that's fine ;)

Cheers,

Koz

On Friday, 21 October 2011 at 9:42 PM, Jon Leighton wrote:

@NZKoz I don't think there's anything stopping us implementing it as a deprecation before removing entirely?

Reply to this email directly or view it on GitHub:
#950 (comment)

Member

NZKoz commented Oct 21, 2011

True, a deprecation warning would catch every intentional use of the 'fallback' strategy, it'd just fire false positives where this issue is concerned. But that's fine ;)

Cheers,

Koz

On Friday, 21 October 2011 at 9:42 PM, Jon Leighton wrote:

@NZKoz I don't think there's anything stopping us implementing it as a deprecation before removing entirely?

Reply to this email directly or view it on GitHub:
#950 (comment)

@tenderlove

This comment has been minimized.

Show comment
Hide comment
@tenderlove

tenderlove Oct 21, 2011

Member

I like the idea of adding an eager_load rather than trying to intuit from the strings what tables people want to load. 👍

Member

tenderlove commented Oct 21, 2011

I like the idea of adding an eager_load rather than trying to intuit from the strings what tables people want to load. 👍

@jonleighton

This comment has been minimized.

Show comment
Hide comment
@jonleighton

jonleighton Oct 21, 2011

Member

Ok, I added a 3.2 milestone for this.

Member

jonleighton commented Oct 21, 2011

Ok, I added a 3.2 milestone for this.

@nocache

This comment has been minimized.

Show comment
Hide comment
@nocache

nocache Oct 24, 2011

+1
an explicit eager_load sounds like a good plan

nocache commented Oct 24, 2011

+1
an explicit eager_load sounds like a good plan

@jeremy

This comment has been minimized.

Show comment
Hide comment
@jeremy

jeremy Dec 29, 2011

Member

This feels like a disproportionate response to this failure case.

I'd propose leaving include as-is and finding a workaround or tweak to the heuristic to fix this particular issue.

Agreed that we don't want to write a whole sql parser, but the batch vs joined loading strategy heuristic has worked great.

Saying "include" is at a higher level of abstraction -- it's meant to hide the actual strategy. "I want all this stuff loaded, thanks! also, with these conditions and orderings on it, btw"

So, if anything, I think we should stop over-detecting qualified column names in conditions. And add API, like the references than @jonleighton suggested, to make these ambiguous cases explicit.

Member

jeremy commented Dec 29, 2011

This feels like a disproportionate response to this failure case.

I'd propose leaving include as-is and finding a workaround or tweak to the heuristic to fix this particular issue.

Agreed that we don't want to write a whole sql parser, but the batch vs joined loading strategy heuristic has worked great.

Saying "include" is at a higher level of abstraction -- it's meant to hide the actual strategy. "I want all this stuff loaded, thanks! also, with these conditions and orderings on it, btw"

So, if anything, I think we should stop over-detecting qualified column names in conditions. And add API, like the references than @jonleighton suggested, to make these ambiguous cases explicit.

@jonleighton

This comment has been minimized.

Show comment
Hide comment
@jonleighton

jonleighton Dec 29, 2011

Member

As discussed in campfire, we will investigate using references to explicitly state that a table is referenced. E.g.

Post.includes(:comments).where("comments.title = 'lol'").references(:comments)
# => triggers a JOIN

In most cases, it should be possible to automatically build up the references, e.g.

Post.includes(:comments).where(:comments => { :title => 'lol' })
# automatically adds 'comments' to the references

Where the user is using string SQL, they must add their own references for includes to work properly. So after this, we can deprecate the dodgy SQL-parsing regexp.

It's unlikely that I'll do this in time for 3.2, but if someone else wants to pick it up then be my guest...

Reopening for now, then.

Member

jonleighton commented Dec 29, 2011

As discussed in campfire, we will investigate using references to explicitly state that a table is referenced. E.g.

Post.includes(:comments).where("comments.title = 'lol'").references(:comments)
# => triggers a JOIN

In most cases, it should be possible to automatically build up the references, e.g.

Post.includes(:comments).where(:comments => { :title => 'lol' })
# automatically adds 'comments' to the references

Where the user is using string SQL, they must add their own references for includes to work properly. So after this, we can deprecate the dodgy SQL-parsing regexp.

It's unlikely that I'll do this in time for 3.2, but if someone else wants to pick it up then be my guest...

Reopening for now, then.

@oleander

This comment has been minimized.

Show comment
Hide comment
@oleander

oleander Jun 23, 2012

I'm not sure I've missed something, but I can't get it to work.

Works

Gig.select("gigs.*, (SELECT 1 FROM gigs LIMIT 1)").includes(:song).limit(1)
SELECT gigs.*, (SELECT 1 FROM gigs LIMIT 1) FROM `gigs` LIMIT 1
SELECT `songs`.* FROM `songs` WHERE `songs`.`id` IN (1)

Does not work (subquery contains a dot)

Gig.select("gigs.*, (SELECT 1 FROM gigs g2 INNER JOIN songs ON songs.id = g2.song_id LIMIT 1)").includes(:song).limit(1)
SELECT `gigs`.`id` AS t0_r0, `gigs`.`song_id` AS t0_r1, `gigs`.`channel_id` AS t0_r2, `gigs`.`time` AS t0_r3, `gigs`.`listed` AS t0_r4, `songs`.`id` AS t1_r0, `songs`.`artist_id` AS t1_r1, `songs`.`title` AS t1_r2, `songs`.`grade` AS t1_r3, `songs`.`length` AS t1_r4, `songs`.`play_count` AS t1_r5, `songs`.`gigs_count` AS t1_r6 FROM `gigs` LEFT OUTER JOIN `songs` ON `songs`.`id` = `gigs`.`song_id` LIMIT 1

Where did the subquery go?

I'm using rails 4.0.0.beta, e1838bf with the mysql2 gem version 0.3.10.

I'm not sure I've missed something, but I can't get it to work.

Works

Gig.select("gigs.*, (SELECT 1 FROM gigs LIMIT 1)").includes(:song).limit(1)
SELECT gigs.*, (SELECT 1 FROM gigs LIMIT 1) FROM `gigs` LIMIT 1
SELECT `songs`.* FROM `songs` WHERE `songs`.`id` IN (1)

Does not work (subquery contains a dot)

Gig.select("gigs.*, (SELECT 1 FROM gigs g2 INNER JOIN songs ON songs.id = g2.song_id LIMIT 1)").includes(:song).limit(1)
SELECT `gigs`.`id` AS t0_r0, `gigs`.`song_id` AS t0_r1, `gigs`.`channel_id` AS t0_r2, `gigs`.`time` AS t0_r3, `gigs`.`listed` AS t0_r4, `songs`.`id` AS t1_r0, `songs`.`artist_id` AS t1_r1, `songs`.`title` AS t1_r2, `songs`.`grade` AS t1_r3, `songs`.`length` AS t1_r4, `songs`.`play_count` AS t1_r5, `songs`.`gigs_count` AS t1_r6 FROM `gigs` LEFT OUTER JOIN `songs` ON `songs`.`id` = `gigs`.`song_id` LIMIT 1

Where did the subquery go?

I'm using rails 4.0.0.beta, e1838bf with the mysql2 gem version 0.3.10.

@dfens

This comment has been minimized.

Show comment
Hide comment
@dfens

dfens Sep 23, 2014

Contributor

Still in 3.2.19. Any chance to fix this?

Contributor

dfens commented Sep 23, 2014

Still in 3.2.19. Any chance to fix this?

@eileencodes

This comment has been minimized.

Show comment
Hide comment
@eileencodes

eileencodes Sep 23, 2014

Member

@dfens 3.2.19 is no longer supported for bug fixes, only security fixes so it will not be back ported into 3.2.19.

Member

eileencodes commented Sep 23, 2014

@dfens 3.2.19 is no longer supported for bug fixes, only security fixes so it will not be back ported into 3.2.19.

@ponny

This comment has been minimized.

Show comment
Hide comment
@ponny

ponny Nov 4, 2014

I'm on 4.0.11 and just started getting this.

ponny commented Nov 4, 2014

I'm on 4.0.11 and just started getting this.

@ausminternet

This comment has been minimized.

Show comment
Hide comment
@ausminternet

ausminternet Aug 27, 2015

same here with this code:
scope :label, -> label { joins(:labels).where("labels.name = ?", label) }
with label = foo.bar on latest rails with Postgres

same here with this code:
scope :label, -> label { joins(:labels).where("labels.name = ?", label) }
with label = foo.bar on latest rails with Postgres

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment