Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

virtual attributes not retrieved using select(),where() and includes() #15185

Closed
jreichert opened this issue May 19, 2014 · 41 comments

Comments

@jreichert
Copy link

commented May 19, 2014

Virtual attributes are properly assigned to the object when combining includes() and select(), but not when combining those with where(). Example:

p = Post.includes(:comments).select("posts.*, 1 as testval")
p.first.title
=> "My Post"
p.first.testval
=> 1

p = Post.includes(:comments).select("posts.*, 1 as testval").where("comments.id < ?",1000).references(:comments)
p.first.title
=> "My Post"
p.first.testval
NoMethodError: undefined method `testval' for #Post:0x007fcd6b93eb68

@sgrif

This comment has been minimized.

Copy link
Contributor

commented May 19, 2014

It looks like the reason this happens is due to the column aliasing that occurs as a result of the joined eager loading. Not sure what the correct behavior is here, given that it is not specified whether testval should be an attribute of the resulting Post, or an attribute of the resulting Comment.

@sgrif

This comment has been minimized.

Copy link
Contributor

commented May 19, 2014

One possible solution here would be to not eager load the comments in the same query if you've ever called select, and have it fall back to doing a separate query for comments, even though we'd already joined. Is there a case where that could cause unexpected results?

@jreichert

This comment has been minimized.

Copy link
Author

commented May 19, 2014

That was my suspicion as well as to the root cause - after column aliasing in a LEFT JOIN the result set is t0_r0...t0_rN, t1_r0...t1_rN, testval - and then there is no indication of where testval belongs.

So two separate issues: how the issue should be resolved, and how to handle my use case.

For the first, it would be great to add in syntax similar to references() for where clauses:

Relation.select("posts.*, 1 as :post.testval").references(:post).where...

That would remove the ambiguity about assignment of the attribute. This would then assign it a new alias in t0_rN when performing the left join. However, as a test I tried to change the select to

select("posts.*, 1 as t0_r2")
puts p.first.t0_r2

but that didn't work either. I'm not sure how the t0_rN values get mapped back to model attributes under the hood, so this probably requires some adjustment (perhaps because you can't reference the t0_rN aliases directly as attributes).

If this is infeasible for some reason, we should probably at least update the docs to indicate that this combination is unsupported due to the ambiguity in assigning the column to a table.

As to the second question about how to resolve my particular use case, I'm not entirely sure that what you're suggesting works. I'm dumping the output of the results into a JBuilder template that looks something like:

json.array! @posts do |post|
  json.array! (post.comments) do |comment|
    # do stuff here with comment data
  end
end

Which seems like the exact use case where includes() is handy. Otherwise I'm not sure how to associate the comments with the respective posts before I get to the template. Maybe some fancy combination of eager_load and preload that I'm not thinking of?

@sgrif

This comment has been minimized.

Copy link
Contributor

commented May 19, 2014

The suggestion I had above would still eager load the comments, it would just fall back to the default behavior, where it executes:

SELECT posts.* FROM "posts" WHERE ...
SELECT comments.* FROM "comments" WHERE post_id IN (...)
@jreichert

This comment has been minimized.

Copy link
Author

commented May 19, 2014

As I think about it this is quite hard to turn into SQL under the hood that can be easily mapped back to an object; even if it was updated such that you could 'reference' the model to which the column should be mapped, you would need some fancy SQL parsing on the select("...") method to make it work - and as it stands, I believe that part of the parser is pretty basic right now. It's just taking the passed string and sticking it verbatim in the SELECT clause.

To change it as I suggested would require:

  • looking for a references() method attached to the select
  • verifying that the select() contains a string (or strings), and not a hash of param names
  • creating the SQL fragment '1 as t0_rX', where X is the next available column for Post
  • before executing, store in memory the fact that t0_rX should be mapped back to Post.attribute[:testval]
  • map the value of t0_rN back to Post.attributes[:testval]
@jreichert

This comment has been minimized.

Copy link
Author

commented May 19, 2014

I see; I misunderstood your original point about loading the comments separately to mean that should be handled in the calling code. I'm still not entirely sure that what you are suggesting works for my use case, which is actually:

Post.includes(:comments).select("posts.*, 1 as testval").where("posts.active = 1 OR comments.id < 1000").references(:comments)

In which case I need an outer join to get both all active Posts and Posts with Comments having id less than 1000. I don't think this use case is covered by your suggestion, since using includes() with where() as it is today gives you an outer join as you'd expect, but changing it as per your suggestion would change it to an inner join (and thus remove a bunch of the needed Posts from the result set).

I suppose that it could be split up into two queries, but the first would still need to be an OUTER JOIN rather than an INNER JOIN, and the eagerly-loaded comments just wouldn't be loaded from the first query (they would, as you suggested, be loaded and then mapped by the second query).

@jreichert

This comment has been minimized.

Copy link
Author

commented May 19, 2014

It is also a bit limiting in that there would be no way to assign testval to Comment instead of Post if that is what was desired, but that seems like a particularly confusing edge case that may not be worth the trouble to cover.

@sgrif

This comment has been minimized.

Copy link
Contributor

commented May 19, 2014

That's not what I'm suggesting. If you had changed includes to joins,
it works. (Of course we lose the eager loading behavior). Given that the
only time this fails is when we are joining to an eager loaded table,
triggering aliasing of the columns in order to select from both tables, my
suggestion is that if you have called select, the query occurs as if you
had called joins, instead of includes. Then we perform the second query
as we would have for includes without references.

On Mon, May 19, 2014 at 9:07 PM, jreichert notifications@github.com wrote:

I see; I misunderstood your original point about loading the comments
separately to mean that should be handled in the calling code. I'm still
not entirely sure that what you are suggesting works for my use case, which
is actually:

Post.includes(:comments).select("posts.*, 1 as testval").where("posts.active = 1 OR comments.id < 1000").references(:comments)

In which case I need an outer join to get both all active Posts and Posts
with Comments having id less than 1000. I don't think this use case is
covered by your suggestion, since using includes() with where() as it is
today gives you an outer join as you'd expect, but changing it as per your
suggestion would change it to an inner join (and thus remove a bunch of the
needed Posts from the result set).

I suppose that it could be split up into two queries, but the first would
still need to be an OUTER JOIN rather than an INNER JOIN, and the
eagerly-loaded comments just wouldn't be loaded from the first query (they
would, as you suggested, be loaded and then mapped by the second query).


Reply to this email directly or view it on GitHubhttps://github.com//issues/15185#issuecomment-43558028
.

Thanks,
Sean Griffin

@jreichert

This comment has been minimized.

Copy link
Author

commented May 19, 2014

In the use case I outlined, swapping out 'includes' for 'joins' only results in the correct behavior if you use an outer join; otherwise you'll filter out all posts that are active but have no comments:

table id active
posts 1 false
posts 2 true
table id post_id
comments 1 2

The query should return both Post objects, since Post 2 is active and Post 1 has comments with id < 1000. If you just swapped includes(:comments) for joins(:comments) then Post 1 wouldn't be correctly retrieved because the inner join would filter it out.

@rails-bot

This comment has been minimized.

Copy link

commented Nov 19, 2014

This issue has been automatically marked as stale because it has not been commented on for at least
three months.

The resources of the Rails team are limited, and so we are asking for your help.

If you can still reproduce this error on the 4-1-stable, 4-0-stable branches or on master,
please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions.

@rails-bot rails-bot added the stale label Nov 19, 2014

@rafaelfranca rafaelfranca added attached PR and removed stale labels Jan 2, 2015

@dougc84

This comment has been minimized.

Copy link

commented Feb 17, 2015

I have this same issue - Rails 4.1.6, an upgrade to latest 4.1.x doesn't fix anything.

Basically, there's a select statement, with a custom, named attribute, a couple joins, and an includes, just like is mentioned above. The removal of the includes makes it work but does not return the correct results. Alternative approaches slow down everything exponentially.

@p1366

This comment has been minimized.

Copy link

commented Sep 3, 2015

We all have the same issue. The issue is in ignoring any columns/aliases when .eager_load is used.

Short and simple fix is to implement support of tN_attrname column aliases.

P.S. Its very annoying issue for me, because with the fact of condition absence in .preload it makes impossible to do short and pretty AR code for simple thing which can be implemented in raw sql in 15mins:
how to avoid N+1 when master entity should have calculated column(and where/order by it) and has_many entities should be preloaded with condition, like
User.select('calculate(lalala) as cal_ed').eager_load(:posts).where(posts: { field: variable }) # lost cal_ed
or
User.select(...).preload(posts, conditions: { ... }) # cal_ed present but impossible at all

@dmgr

This comment has been minimized.

Copy link

commented Sep 22, 2015

This is a monkey patch I have prepared to workaround this issue:

module ActiveRecord
  Base.send :attr_accessor, :_row_

  module Associations
    class JoinDependency
      JoinBase && class JoinPart
        def instantiate_with_row(row, *args)
          instantiate_without_row(row, *args).tap { |i| i._row_ = row }
        end
        alias_method_chain :instantiate, :row
      end
    end
  end
end

Then you can get a column value from a row:

Post.includes(:comments).select("posts.*, 1 as testval").first._row_['test_val']
@ttrmw

This comment has been minimized.

Copy link

commented Jan 25, 2016

@dmgr your monkey patch seems to break eager_load? I get: ArgumentError Exception: wrong number of arguments (1 for 2).

@Envek

This comment has been minimized.

Copy link
Contributor

commented Jun 16, 2016

I'm suffering from it too. My use case is here: http://stackoverflow.com/questions/37720669/add-computable-column-to-multi-table-select-clause-with-eager-load

And I'd prefer to avoid separate queries (preload behavior) as Oracle have limitation of 1000 ids in single query and it's relatively easy to hit it when loading one-to-many associations where many side of relation have tens or hundreds records.

@Envek

This comment has been minimized.

Copy link
Contributor

commented Jun 18, 2016

@dmgr Thank you! Your monkey patch works fine with ActiveRecord 4.2.6!

@dmgr

This comment has been minimized.

Copy link

commented Jun 19, 2016

@ttrmw I am not sure what exactly the problem may be, but you could try updated monkey patch, where I have changed:

def instantiate_with_row row, aliases

to

def instantiate_with_row(row, *args)

@Envek I am glad that it helped you :)

@alekseyl

This comment has been minimized.

Copy link

commented Sep 23, 2016

Hi, everyone.
I also struck with this problem in my project, also it was impossible to step out of virtual attribute. Breaking single query to couple queries also was unacceptable solution. So I came up with my own way of doing it which I put to a simple gem:

https://github.com/alekseyl/rails_select_on_includes

You are free to use it. It works only for base class now, but if you need some advanced select to child objects, I think I could upgrade it.

@mswiszcz

This comment has been minimized.

Copy link

commented Oct 7, 2016

I wanted to bump this up, I am having the same problems on rails 4.2.7, trying to use sql SUM method aliased as amount_sum, fetch labels for products as well with include, and scoping with several wheres`

@alekseyl

This comment has been minimized.

Copy link

commented Oct 7, 2016

Did you try my monkey-gem?

https://github.com/alekseyl/rails_select_on_includes

@mswiszcz

This comment has been minimized.

Copy link

commented Oct 7, 2016

@alekseyl Yes, it works perfectly, but I would love to see this fixed in rails. In ruby ecosystem gems have tendency to get abandoned, so I think fixes like this should be incorporated in rails.

Nevertheless, thank you for you work, at the moment it solves the problem 👌

@oddlyfunctional

This comment has been minimized.

Copy link

commented Oct 22, 2016

+1

@voltechs

This comment has been minimized.

Copy link

commented Dec 22, 2016

So, why isn't this fixed in core? Is it intended to be a supported feature?

@pasivuorio

This comment has been minimized.

Copy link

commented Dec 29, 2016

Has someone found a workaround for this problem in Rails 5? rails_select_on_includes gem worked well in Rails 4.2, but when migrating to 5.0.1 it gives following exception:

An ActiveRecord::StatementInvalid occurred in modelname#search:
PG::IndeterminateDatatype: ERROR: could not determine data type of parameter $4

@alekseyl

This comment has been minimized.

Copy link

commented Dec 29, 2016

Write me an issue on gem, and I'll fix it on upcomming holydays.

Upd. Noticed comment before noticing issue )

@alekseyl

This comment has been minimized.

Copy link

commented Jan 7, 2017

Added rails 5 support for my gem.
rails 4 now must rewrite Gemfile to:
gem 'rails_select_on_includes', '~> 0.4.3'

rails 5 now must rewrite Gemfile to:
gem 'rails_select_on_includes', '~> 0.5.0'

@gduprey

This comment has been minimized.

Copy link

commented Feb 23, 2017

So the rails_select_on_includes gem is a brilliant solution -- works perfectly!

That said, agree with others that if it's this gem can do it, it really should be a supported part of AR when using eager_load. virtual_attributes (in our case, in the form of SQL DB functions) are absolutely critical. Making multiple round-trip calls to the database server is slow and costly compared to bottling everything you know you need (dependent tables, computed values, DB functions) in a single query.

eager_load is so useful to efficient model use when you know you're going to be needing other tables, I'm not sure why folks think the first solution is to break it into separate queries (especially given the preload method that uses a separate query with a big IN statement is subject to size limits). I've been doing SQL for the better part of 20 years and inner joins, when dealing with 1-1 associations in a single DB query, is critical to efficient queries.

@smatyas

This comment has been minimized.

Copy link

commented May 30, 2017

The rails_select_on_includes patch from @alekseyl works perfectly for me too.
We have to use aggregated and virtual fields in our queries for performance reasons, so from my point of view it should be a must to handle this seamlessly in an ORM.

WendyBeth added a commit to littlelines/otwarchive that referenced this issue May 30, 2017

Adds rails_select_on_includes gem
This fixes a problem raised in the UsersController (and potentially elsewhere)
where Mysql was blowing up because a virtual attribute defined in a select
statement wasn't a database column.

See the issue this is discussed in: rails/rails#15185
And the Gem: https://github.com/alekseyl/rails_select_on_includes

zz9pzza added a commit to otwcode/otwarchive that referenced this issue Jun 5, 2017

Ao3 5022 rails 4 dot 1 (#2927)
* AO3-5022 Bump Rails to 4.1.16

* AO3-5022 Remove with_scope on Work model

* AO3-5022 Bump Authlogic version

* AO3-5022 Bump to Authlogic 3.4.6

* AO3-5022 Resolve old association syntax for archive_faq

* Fixes triggering of CSRF protection in bookmarks controller spec

`get :action, format: :js` triggers CSRF protection in Rails 4.1
Replace with `xhr :get, :action, format: :js`

* Re-adds support for Minitest assertions in Cucumber steps

* Fixes Authlogic::Session::Activateion::NotActivated error in logout user step

* Fixes login step authlogic bugs

* Removes 'group_by' virtual attribute if @fandoms is empty in users controller

* Updates syntax for getting all Language ordered by short

* Fixes admin invitations feature specs

The old syntax `find(:all)` or `find(:first)` is incorrect - find now only
accepts one argument, which it expects to be the id of the object it's looking
for. This should fix the remaining spec failures in admin_invitations.feature

* Fixes syntax for scope on user

* Adds link_to_function helper

This is in place of rewriting the html/javascript everywhere to be unobtrusive:

http://guides.rubyonrails.org/working_with_javascript_in_rails.html#unobtrusive_javascript

Which is the recommended approach.

* Fixes query syntax in scope in Question model

* Fixes MySql2 syntax error by calling 'size' instead of 'count'

The bug this is addressing is described here:

rails/rails#15138

I chose to use `size` in these cases because the collections are already
loading. This by-steps the bug and isn't making another call to the database.

* Fixes problem with AR query in SeriesController

* Fix AR queries in Pseud model

* Fixes the calling of a virtual attribute on an empty array of Fandom objects in
the pseuds controller

* Fixes AR query syntax errors found in comments_and_kudos/kudos.feature

* Refactors helper method to eliminate bug

Can't call `size` on a collection that hasn't been loaded, and `count` needs to
go because of the known 4.1 but with select and count.

* Fixes AR query syntax in pseud model scope

* Fixes MySql count with select bug in challenge assignments partial

Removes whitespace

* Fixes bug where `size` was being called on `nil`

* Fixes bug with order of operations

one = 'one' && true will make one == true and return true
(one = 'one') && true will make one == 'one' and return true

* Adds rails_select_on_includes gem

This fixes a problem raised in the UsersController (and potentially elsewhere)
where Mysql was blowing up because a virtual attribute defined in a select
statement wasn't a database column.

See the issue this is discussed in: rails/rails#15185
And the Gem: https://github.com/alekseyl/rails_select_on_includes

* Commits Gemfile.lock

* Fixes `Mysql2::Error: Unknown column 'users.login'` in pseud model

Eager loading the user in two queries in the pseud model (the `by_byline` scope
and the `parse_byline` method) was causing Mysql to throw an Unknown column
error. Using a join instead fixes the problem.

* Fixes problem with includes statement

The includes statements in question referenced a has_many association that used
conditions (approved_conditions) that applied to an association of the
association. (`Work.includes(:approved_collections)` where `approved_collections`
are conditions that have collections that meet a certain criteria). By
moving the scopes that determine whether a collection is
approved/rejected/approved by user into the collectible model, the Work model
doesn't have to guess what is meant by the reference to the columns
`collection_items.attributes`.

* Removes rails_select_on_includes

It caused all other previously passing tests to start failing and didn't end up
actually solving the problem it was supposed to.

* Calls load on @fandoms in UsersController to make sure it's loaded before the
view calls `.empty?` on it

* Fixes `find :all` reference

* Fixes outdated `find :all` reference

* Fixes 'count' bug

* Fixes typo in pseud model scope

* Fixes outdated `find :all` syntax in media and tags controllers

* Calls `load` on @fandoms variable in pseuds controller to prevent `empty?` from
erroring out when it's called on @fandoms in the view

* Fixes outed `find(:first...) syntax in app/helpers/tag_helpers

* Adds Authlogic fix to tag steps that need it

* Fixes variable reference

* Fixes Authlogic problem with deleting account step

* Fixes feature failures in users caused by outdated `find :all` syntax

* Fixes works feature failures by updated outdated `find :all` syntax

* Fixes challenge_signups_controller_spec failure

* Fixes get requests with js format returning cross origin not allowed error

in comments controller spec

* Checks that expected params exist before passing them to update_attributes

* Fixes get ... format: :js line to not raise cross origin error

* Fixes outdated `find(:all)` syntax in tags controller

* Try

* Assign varible

* Try this

* And the second case

* Corgis are still lovely

* Fixes mysql syntax problem with `count` in SpamReport method

* Shortens syntax in external authors controller conditional

* Adds TODO/note about link_to_function definition

* Fixes typo in work model

* Removes duplicate line in user step definition

* Removes comment for useless gem

* Makes stylistic change - `size == 0` is better as `size.zero?`
@DiegoSalazar

This comment has been minimized.

Copy link

commented Nov 29, 2017

I was able to use rails_select_on_includes on Rails 4.1.16 but needed a small patch: alekseyl/rails_select_on_includes#10

@mengu

This comment has been minimized.

Copy link

commented Dec 14, 2017

hi. i believe this must be fixed in AR core. @alekseyl would you be interested in sending a PR?

@sgrif

This comment has been minimized.

Copy link
Contributor

commented Dec 14, 2017

I'm going to go ahead and close this, as there's been very little movement on it for 3 years. I don't think this is strictly a bug, but I do think this is a gotcha that we can (possibly) improve in a backwards compatible way.

I'm not sure if my comment from 2014 still holds or not. That approach is only tenable if we can be certain there is no way to access a column specified in a manual select when used in conjunction with .includes and .references (e.g. something that triggers column aliasing).

I would still accept a PR that addresses this, if it follows the approach outlined back in my original comments, and if the PR description provides enough information to demonstrate that there is no way this is going to break behavior someone is relying on, or if the PR provides a deprecation cycle and clear migration path for people relying on that behavior.

Looking through the gem linked, that is not an implementation that would be accepted if it were submitted as a PR.

With all of that said, I consider this a gotcha, not a bug -- And I don't see any movement happening any time soon, so I'm going to close this issue.

@sgrif sgrif closed this Dec 14, 2017

@mengu

This comment has been minimized.

Copy link

commented Dec 15, 2017

hi @sgrif

consider the following scenario: we have 3 models: post, comment, author. we get posts with Post.joins(:author).includes(:author).all. that's fine. we'd like to get the comment count as well:

Post.joins(:author).includes(:author).select("(SELECT COUNT(*) FROM comments WHERE comments.post_id = posts.id) AS comment_count").

while this generates the correct query, comment_count will not be available in post instances where as it will be available if we don't eager load authors. how is this not a bug?

@tilhoft

This comment has been minimized.

Copy link

commented Feb 1, 2018

I think this is still a Bug, i'm sad about this issue was closed.

@ncri

This comment has been minimized.

Copy link

commented Feb 1, 2018

Yep I agree, just stumbled upon this too.

@chrishough

This comment has been minimized.

Copy link

commented Mar 2, 2018

I just hit this as well, and yes, this is still a bug

@jstayton

This comment has been minimized.

Copy link

commented Mar 23, 2018

If you're OK with one additional query – as opposed to the eager loading behavior of includes with where – you can use preload + joins instead, which still allows virtual attributes to be accessed. My guess is that's a totally acceptable trade-off for most use-cases. That said, I agree that this is an annoying gotcha that people obviously stumble upon (case in point: this thread).

@chrishough

This comment has been minimized.

Copy link

commented Mar 23, 2018

@jstayton do you have an example?

@kalinchuk

This comment has been minimized.

Copy link

commented Nov 3, 2018

Bug still exists. Any updates?

@Tarens2

This comment has been minimized.

Copy link

commented Nov 7, 2018

If you do queries something like this
posts = Post.includes(:comments).select("posts.*, 1 as testval").where("posts.active = 1 OR comments.id < 1000").references(:comments)

you can access custom columns by this code posts.first.attributes["testval"], for example it is for first post

@delongGao

This comment has been minimized.

Copy link

commented Mar 14, 2019

Still having the same issue with rails 5.2.2 and postgresql. Basically all select statements are ignored when used with includes and references.

@joshmn

This comment has been minimized.

Copy link

commented Apr 15, 2019

Really kills me. I'm not familiar enough with the underlying parts of ActiveRecord to even attempt at giving it a go, but if someone wants to give me a hint I can take a look.

Item.includes(sales_order_items: [:sales_order]).references(sales_order_items: [:sales_order]).where('sales_orders.txn_date > ?', 3.months.ago).select("items.*, sum(sales_order_items.quantity_remaining) as total_quantity").group('sales_order_items.id, sales_orders.id, items.id').first.total_quantity

But if I plug that in ApplicationRecord.connection.execute(@items.to_sql).to_a.first and key down, I can see that it is returned, just not mapped.

Edit: https://github.com/alekseyl/rails_select_on_includes fixes this. Thanks @alekseyl — don't know how this isn't in AR core.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.