Skip to content
This repository

ActiveRecord query changing when a dot/period is in condition value #950

Closed
lighthouse-import opened this Issue May 15, 2011 · 23 comments
Lighthouse Import

Imported from Lighthouse. Original ticket at: http://rails.lighthouseapp.com/projects/8994/tickets/6611
Created by Corey Ward - 2011-03-24 03:00:22 UTC

I have created a barebones Rails app demonstrating this bug: https://github.com/coreyward/bug-demo

I also have a Question open on StackOverflow about this: http://stackoverflow.com/questions/5199235/activerecord-query-changing-when-a-dot-period-is-in-condition-value

I don't have the skills/knowledge to know A) where this is coming from or B) how to fix it. Any help would be very much appreciated.


This bug occurs when you have a "child" model (one that belongs_to another model) with an order scope on it and you try to set a where condition.

So for the following models:

class Person < ActiveRecord::Base
  has_many :items
end

class Item < ActiveRecord::Base
  belongs_to :person
  default_scope order(:ordinal)
end

With the following calls:

Person.includes(:items).where(:name => 'John')
# or
Person.includes(:items).find_by_name 'John'

The items loaded are ordered by the ordinal on each Item.

If you introduce a period into the name, though (e.g. "John.Smith"), the order scope is ignored.

To see this in action...

$ git clone git://github.com/coreyward/bug-demo.git
$ cd bug-demo
$ rake db:create db:migrate db:seed
$ rails s

And then fire up http://localhost:3000/ and see for yourself.

Lighthouse Import

Imported from Lighthouse.
Comment by Anuj Dutta - 2011-03-24 03:44:43 UTC

Thanks for such a detailed info and the demo application. Ran the app and can confirm that this is a bug. I am not sure what is causing it yet but I am investigating.

Lighthouse Import lighthouse-import closed this May 21, 2011
Glenn Tweedie

We are still experiencing this issue with Rails 3.1
Can it please be re-opened

Aaron Patterson tenderlove reopened this September 11, 2011
Jason Hutchens

Thanks @tenderlove; we're getting this issue in 3.1 with an autocomplete which fails (with an exception) iff the user enters data that contains a period. The work-around for this is to do user_data.gsub!('.', '_'), which works as we're doing a LIKE query anway, but one concern is whether we're hitting changes in query behaviour elsewhere that don't manifest themselves so obviously.

Aaron Patterson
Owner

@jasonhutchens makes sense. Is this only happening on PG, or any database? Can you possibly show code that breaks? Thanks.

Jason Hutchens

@tenderlove sorry for the delay; we will answer those questions as soon as we're able

Jonathan Roes

Confirmed on SQLite with https://github.com/coreyward/bug-demo and rails-HEAD after a few changes to make the project 3.1-compatible.

I started down the path of writing a failing unit test but adding new items to the AR fixtures easily causes a bunch of unintended failures. Any advice?

Tom Meier

Hit this bug too when firing an includes statement on a query with periods in it. (postgres db, rails 3.1.1)

Peter McCracken

I've also encountered this bug. My version of it was slightly different (in my case, the select query was ignored, not the order query) so I created a different sample application to demonstrate the error. See https://github.com/peterjm/rails-activerecord-bug-demo

I've confirmed the error with both sqlite3 and mysql2.

Tom Meier

^^ nice one @peterjm, that makes it even clearer with the 2 characters test, I also tested your sample over postgres, and same deal, so it seems to be independent of database choice.

Also I tested the same sample with rails 3.1.1, same deal.

Tom Meier

So as far as i can tell, its this :

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation.rb#L523

Returning both 'people' and 'xx' as valid table names, causing it to then alias the string (having a table found count greater than 1), and burning the previously defined values.

Peter McCracken

Yep, I think you're right -- forcing tables_in_string to return an empty array definitely fixes my problem.

I think fixing this might be tricky -- I'll need to dive deeper than my current knowledge into the inner workings of ActiveRecord.

Aaron Patterson
Owner

I've been looking in to this. The problem is that there is no way for us to tell the difference between someone referencing a column and string literal without a full blown SQL parser. The problem stems from the fact that we allow users to add arbitrary snippets of SQL. Arbitrary snippets of SQL, in addition to reconciling eager loaded tables within those snippets makes everything fail.

We can't tell if someone is referencing a "table like thing" from a string literal because string literal quoting changes from database to database. As far as I can tell, we have three options:

  1. Disallow arbitrary SQL
  2. Write a SQL parser for every database
  3. Document the problem and move on

I'm firmly in favor of option 3, unless someone more clever than I can come up with a solution. Maybe @jonleighton?

Anyway, here is a test case that will fail:

diff --git a/activerecord/test/cases/associations/has_many_associations_test.rb b/activerecord/test/cases/associations/has_many_associations_test.rb
index cddd2a6..bf4625c 100644
--- a/activerecord/test/cases/associations/has_many_associations_test.rb
+++ b/activerecord/test/cases/associations/has_many_associations_test.rb
@@ -52,6 +52,12 @@ class HasManyAssociationsTest < ActiveRecord::TestCase
     Client.destroyed_client_ids.clear
   end

+  def test_select_with_dot_in_name
+    Developer.where(:name => 'aaron.patterson')
+    .includes(:audit_logs)
+    .select('developers.id as omg').order('omg').to_a
+  end
+
   def test_create_from_association_should_respect_default_scope
     car = Car.create(:name => 'honda')
     assert_equal 'honda', car.name
Peter McCracken

Could there be some way to pass down a hint to the Relation that the string contains SQL, or is known not to contain SQL.

I'm just thinking out loud here, but here's a couple possibilities:

  1. Add some way to flag a string as not containing SQL, i.e. Developer.where(:name => 'aaron.patterson'.sql_safe) and then have Relation#tables_in_string understand the sql_safe flag.

  2. Add a method to Relations to prevent any of the custom strings being parsed as SQL, i.e. Developer.where(:name => 'aaron.patterson').no_sql.to_a. (This seems pretty ugly to me, but I'm just brainstorming).

Jon Leighton
Owner

@tenderlove, @NZKoz was talking about deprecating the tables_in_string magic a while back. I would be in favour of that. We could then simply document that if people are referencing stuff in string conditions that they wish to be eager loaded, then they must use eager_load(:foo). What do you think?

Michael Koziarski
Owner

people's apps would break in the upgrade, but they'd break predictably (no such table whatevers) rather than 'randomly' issuing a much more complicated query. Seems like a slightly painful transition but beats the ... madness we have right now :)

Jon Leighton
Owner

@NZKoz I don't think there's anything stopping us implementing it as a deprecation before removing entirely?

Michael Koziarski
Owner
Aaron Patterson
Owner

I like the idea of adding an eager_load rather than trying to intuit from the strings what tables people want to load. :+1:

Jon Leighton
Owner

Ok, I added a 3.2 milestone for this.

Glenn Tweedie

+1
an explicit eager_load sounds like a good plan

Jon Leighton jonleighton closed this in c99d507 December 29, 2011
Jeremy Kemper
Owner

This feels like a disproportionate response to this failure case.

I'd propose leaving include as-is and finding a workaround or tweak to the heuristic to fix this particular issue.

Agreed that we don't want to write a whole sql parser, but the batch vs joined loading strategy heuristic has worked great.

Saying "include" is at a higher level of abstraction -- it's meant to hide the actual strategy. "I want all this stuff loaded, thanks! also, with these conditions and orderings on it, btw"

So, if anything, I think we should stop over-detecting qualified column names in conditions. And add API, like the references than @jonleighton suggested, to make these ambiguous cases explicit.

Jon Leighton
Owner

As discussed in campfire, we will investigate using references to explicitly state that a table is referenced. E.g.

Post.includes(:comments).where("comments.title = 'lol'").references(:comments)
# => triggers a JOIN

In most cases, it should be possible to automatically build up the references, e.g.

Post.includes(:comments).where(:comments => { :title => 'lol' })
# automatically adds 'comments' to the references

Where the user is using string SQL, they must add their own references for includes to work properly. So after this, we can deprecate the dodgy SQL-parsing regexp.

It's unlikely that I'll do this in time for 3.2, but if someone else wants to pick it up then be my guest...

Reopening for now, then.

Jon Leighton jonleighton reopened this December 29, 2011
Jon Leighton jonleighton closed this in d13627d January 16, 2012
Matt Bridges mattdbridges referenced this issue from a commit in mattdbridges/rails January 16, 2012
Jon Leighton Deprecate inferred JOINs with includes + SQL snippets.
See the CHANGELOG for details.

Fixes #950.
a2dab46
Linus Oleander

I'm not sure I've missed something, but I can't get it to work.

Works

Gig.select("gigs.*, (SELECT 1 FROM gigs LIMIT 1)").includes(:song).limit(1)
SELECT gigs.*, (SELECT 1 FROM gigs LIMIT 1) FROM `gigs` LIMIT 1
SELECT `songs`.* FROM `songs` WHERE `songs`.`id` IN (1)

Does not work (subquery contains a dot)

Gig.select("gigs.*, (SELECT 1 FROM gigs g2 INNER JOIN songs ON songs.id = g2.song_id LIMIT 1)").includes(:song).limit(1)
SELECT `gigs`.`id` AS t0_r0, `gigs`.`song_id` AS t0_r1, `gigs`.`channel_id` AS t0_r2, `gigs`.`time` AS t0_r3, `gigs`.`listed` AS t0_r4, `songs`.`id` AS t1_r0, `songs`.`artist_id` AS t1_r1, `songs`.`title` AS t1_r2, `songs`.`grade` AS t1_r3, `songs`.`length` AS t1_r4, `songs`.`play_count` AS t1_r5, `songs`.`gigs_count` AS t1_r6 FROM `gigs` LEFT OUTER JOIN `songs` ON `songs`.`id` = `gigs`.`song_id` LIMIT 1

Where did the subquery go?

I'm using rails 4.0.0.beta, e1838bf with the mysql2 gem version 0.3.10.

Erich Menge erichmenge referenced this issue from a commit December 22, 2012
Commit has since been removed from the repository and is no longer available.
Erich Menge erichmenge referenced this issue from a commit December 22, 2012
Commit has since been removed from the repository and is no longer available.
Erich Menge erichmenge referenced this issue from a commit December 22, 2012
Commit has since been removed from the repository and is no longer available.
Erich Menge erichmenge referenced this issue from a commit December 22, 2012
Commit has since been removed from the repository and is no longer available.
Erich Menge erichmenge referenced this issue from a commit December 22, 2012
Commit has since been removed from the repository and is no longer available.
Kevin Sjöberg KevinSjoberg referenced this issue from a commit January 30, 2013
Commit has since been removed from the repository and is no longer available.
Kevin Sjöberg KevinSjoberg referenced this issue in activerecord-hackery/ransack January 30, 2013
Merged

Update Ransack::Constants#escape_wildcards #191

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.