Support for joins in ActiveRecord::Relation#update_all #522

Closed
dhh opened this Issue May 11, 2011 · 17 comments

Projects

None yet
@dhh
Member
dhh commented May 11, 2011
class Todo < ActiveRecord::Base
  default_scope where(trashed: false)
  has_many :comments, as: :commentable
end

class Todolist < ActiveRecord::Base
  default_scope where(trashed: false)
  has_many   :todos
  has_many   :comments, through: :todos
end

todolist.comments.update_all trashed: true

Produces:

ActiveRecord::StatementInvalid: Mysql2::Error: Unknown column 'todos.trashed' in 'where clause': UPDATE `comments` SET `trashed` = 1 WHERE `comments`.`trashed` = 0 AND `todos`.`trashed` = 0 AND `todos`.`todolist_id` = 968316918 AND (`comments`.`commentable_type` = 'Todo') ORDER BY created_at
@tenderlove tenderlove was assigned May 11, 2011
@jonleighton
Member

@dhh does this happen on master? I think it might be fixed by a change I did this morning.

@dhh
Member
dhh commented May 12, 2011

Same error on master.

@jonleighton jonleighton was assigned May 12, 2011
@jonleighton
Member

Okay, I have reassigned to me as I'm sure @tenderlove has plenty to do already :)

@jonleighton
Member

I have been looking into this. Here's my initial failing test case:

  def test_update_has_many_through
    post  = posts(:thinking)
    david = people(:david)

    post.people << david
    post.people.update_all :first_name => 'Bob'

    assert_equal 'Bob', david.first_name
  end

It's actually nothing to do with default_scope. It is to do with the fact that ActiveRecord::Relation#update_all pays no attention to any joins etc within the scope. As such, this can't be a regression, it's just a feature we never had in the first place.

To implement it we will need to use subqueries in the WHERE. So the above would become:

UPDATE `comments` SET `trashed` = 1
WHERE `comments`.`id` IN (
  SELECT `comments`.`id`
  FROM `comments`
  INNER JOIN `todos` ON `comments`.`todo_id` = `todos`.`id`
  WHERE `comments`.`trashed` = 0 AND `todos`.`trashed` = 0 AND `todos`.`todolist_id` = 968316918
) ORDER BY created_at

Hopefully I will return to this when 3.1 is out.

@nolman
Contributor
nolman commented Jun 2, 2011

@daicoden and I at @square were pairing on this and we were able to put something together along the lines of:

Unfortunately we are still on 2.3.11.

class Comment
  class << self
    def trash_all
      sql = "UPDATE #{quoted_table_name} "
      add_joins!(sql, {})
      sql << "SET #{sanitize_sql_for_assignment({:trashed => true})} "
      add_conditions!(sql, {})
      connection.execute(sql)
    end
  end
end

Now you can call todolist.comments(:conditions => {:trashed => false}).trash_all
This results in the following SQL:

UPDATE `comments` INNER JOIN `todos` ON `todos`.id = `comments`.todo_id SET `trashed` = 1 WHERE (`comments`.`trashed` = 0 AND `todos`.`todolist_id` = 968316918) 

Hope this helps!

@joeellis
Contributor

Came across the same issue and found the following solution here (http://wiseleyb.tumblr.com/post/3112570940/default-scope-and-update-all-dont-play-nicely) if its of any help in fixing the bug or anyone looking for just a quick interim solution. (currently working with my Rails 3.1.0.rc4 app)

Just add this monkeypatch in development.rb / environment.rb / whereever, and use the update_all! and delete_all! methods when dealing with relations that are default_scoped.

    module ActiveRecordMixins
      class ActiveRecord::Base
        def self.update_all!(*args)
          self.send(:with_exclusive_scope) { self.update_all(*args) }
        end
        def self.delete_all!(*args)
          self.send(:with_exclusive_scope) { self.delete_all(*args) }
        end
      end
    end
@titanous
titanous commented Aug 3, 2011

Works fine on 3.0.9 but is broken on 3.1.0rc5. Seems like a regression to me.

def test_joins_update_all
  assert Comment.joins(:post).where('posts.id IS NOT NULL').update_all(:post_id => 1)
end
@jonleighton
Member

Okay, I looked into this.

On 3.0.9, we end up just slapping an INNER JOIN in the query. This actually works on mysql, though it doesn't work on sqlite3 and postgresql.

So this has never been properly supported on all databases, but it inadvertently worked on mysql.

So I have re-targeted this for 3.1 to fix the regression on mysql.

Then, for 3.2 we should implement a solution that works on other databases.

If anyone's interested, here's my test script.

@sbleon
sbleon commented Aug 4, 2011

I think jonleighton's proposal is the correct default behavior as it's RDBMS-independent. The RDBMS-specific adapters could then choose to implement a more specific UPDATE/JOIN which would probably be more performant than the "UPDATE/WHERE id IN" version.

FWIW, the Postgres-specific version of the query above is:

UPDATE `comments`
SET `trashed` = 1
FROM `todos`
WHERE `todos`.id = `comments`.todo_id  AND `comments`.`trashed` = 0 AND `todos`.`todolist_id` = 968316918
@sbleon sbleon added a commit to sbleon/rails that referenced this issue Aug 7, 2011
@sbleon sbleon Fix of issue #522 (update_all fails with joins) 1edab99
@sbleon
sbleon commented Aug 8, 2011

This commit adds support for joins in update_all to Rails 3.0. You can test it in my update_all_join_support branch.

I think my commit needs a little more work (SQL sanitization, refactoring.) If it's done soon, though, can new features still be added to 3.0.10?

I think a very similar approach would work for Rails 3.1 if anyone wants to adapt my code.

Thanks to @jonleighton for the test case and test script that I used as a starting point! My slightly more robust version of the test script is here.

@jonleighton jonleighton added a commit to jonleighton/rails that referenced this issue Aug 11, 2011
@jonleighton jonleighton Support updates with joins. Fixes #522. bb40a97
@jonleighton
Member

Thanks @sbleon for working on a fix. However, I wanted to use arel to generate the queries, and to use the connection adapter to handle the various db-specific bits.

I have a fix ready and waiting here: jonleighton@rails:master...update_all_with_joins

It required some changes in arel itself, so I haven't merged this code because I am waiting for @tenderlove to do a new arel release (otherwise merging will just cause tests to break).

@sbleon
sbleon commented Aug 15, 2011

@jonleighton, thanks for the update. I like what you've done, and allowing
the adapters to override the behavior will create a lot more flexibility.

I actually realized after I developed my fix that the "subquery for IDs"
strategy wouldn't work for my specific case, because the UPDATE clause
needed to reference the joined tables, which it can't do when they're in a
subquery. I think it would be possible to make this work in the adapters,
but I'm not sure how this would interact with the limit and offset options.
Anyway, mine is an edge case and not as important as fixing the basic
update_all behavior.

On Mon, Aug 15, 2011 at 5:41 AM, jonleighton <
reply@reply.github.com>wrote:

Thanks @sbleon for working on a fix. However, I wanted to use arel to
generate the queries, and to use the connection adapter to handle the
various db-specific bits.

I have a fix ready and waiting here:
jonleighton@rails:master...update_all_with_joins

It required some changes in arel itself, so I haven't merged this code
because I am waiting for @tenderlove to do a new arel release (otherwise
merging will just cause tests to break).

Reply to this email directly or view it on GitHub:
#522 (comment)

@jonleighton jonleighton added a commit that closed this issue Aug 15, 2011
@jonleighton jonleighton Support updates with joins. Fixes #522.
Conflicts:

	activerecord/lib/active_record/connection_adapters/mysql_adapter.rb
	activerecord/test/cases/relations_test.rb
044fb8c
@neglectedvalue

Hi! It does not work on PostgreSQL and produces:

PG::Error: ERROR:  missing FROM-clause entry for table "tbl"
@bbuchalter

I was expecting this type of statement to work with update_all, but it does not:

  def test_update_all_with_includes_references
    comments = Comment.includes(:post).references(:post).where('posts.id' => posts(:welcome).id).limit(1)
    assert_equal 1, comments.update_all(:posts_id => posts(:thinking).id)
  end

It yields the following SQL,which is obviously invalid.

UPDATE "comments" SET "posts_id" = 2 WHERE "comments"."id" IN (SELECT "comments"."id" FROM "comments" WHERE "posts"."id" = 1 LIMIT 1)

I suppose I expected it to work because a statement like this behaves as expected, creating the LEFT OUTTER JOIN:

Comment.includes(:post).references(:post).where('posts.id' => posts(:welcome).id).limit(1)
SELECT  "comments"."id" AS t0_r0, "comments"."post_id" AS t0_r1, "comments"."body" AS t0_r2, "comments"."type" AS t0_r3, "comments"."tags_count" AS t0_r4, "comments"."children_count" AS t0_r5, "comments"."parent_id" AS t0_r6, "comments"."author_id" AS t0_r7, "comments"."author_type" AS t0_r8, "comments"."resource_id" AS t0_r9, "comments"."resource_type" AS t0_r10, "comments"."developer_id" AS t0_r11, "posts"."id" AS t1_r0, "posts"."author_id" AS t1_r1, "posts"."title" AS t1_r2, "posts"."body" AS t1_r3, "posts"."type" AS t1_r4, "posts"."comments_count" AS t1_r5, "posts"."taggings_with_delete_all_count" AS t1_r6, "posts"."taggings_with_destroy_count" AS t1_r7, "posts"."tags_count" AS t1_r8, "posts"."tags_with_destroy_count" AS t1_r9, "posts"."tags_with_nullify_count" AS t1_r10 FROM "comments" LEFT OUTER JOIN "posts" ON "posts"."id" = "comments"."post_id" WHERE "posts"."id" = 1 LIMIT

Is this something that update_all should support?

@JasonPoll

Using Rails 4.1.6 on Postgresql and am still experiencing the same issue when calling update_all when the relation has an left-outer joins.

@davidcpell

I've also run into this problem on Rails 4.1.4 and PostgreSQL with update_all and an inner join

Nhl::Play.where(player_id: nil).where.not(game_player_id: nil).joins(:game_player).update_all("plays.player_id = game_players.player_id")
# SQL:
UPDATE "plays" SET plays.player_id = game_players.player_id WHERE "plays"."id" IN (SELECT "plays"."id" FROM "plays" INNER JOIN "game_players" ON "game_players"."id" = "plays"."game_player_id" WHERE "plays"."player_id" IS NULL AND ("plays"."game_player_id" IS NOT NULL))
# Error:
G::UndefinedTable: ERROR:  missing FROM-clause entry for table "game_players"
@MhdSyrwan

Also, i tried an example like the following:

Product.joins(:purchase_orders).update_all('product.price = purchase_orders.price')

The query above generates the following query (resulting in a Missing From Clause error):

UPDATE "products" SET products.price = purchase_orders.price WHERE "products"."id" IN (SELECT "products"."id" FROM "products" INNER JOIN "purchase_orders" ON "purchase_orders"."id" = "products"."purchase_order_id")

While i think a query like this should be generated:

UPDATE products
SET price = purchase_orders.price
FROM purchase_orders
WHERE purchase_orders.id = products.purchase_order_id;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment