Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Method with_permissions_to generating very inefficient queries. #100

Open
lillq opened this Issue Apr 20, 2011 · 3 comments

Comments

Projects
None yet
2 participants

lillq commented Apr 20, 2011

The following code generates sql as I would expect and with a large number of items this runs very fast. In my case ~3ms.

me = User.find_by_name("me")
with_user(me) do
  FeedItem.with_permissions_to(:read).limit(11)
end

Generates:

SELECT DISTINCT "feed_items".id 
FROM "feed_items" 
LEFT OUTER JOIN "users" ON "users"."id" = "feed_items"."user_id" 
LEFT OUTER JOIN "roles_users" ON "users"."id" = "roles_users"."user_id" 
LEFT OUTER JOIN "roles" ON "roles"."id" = "roles_users"."role_id" 
LEFT OUTER JOIN "roles_users" "roles_users_users" ON "roles_users_users"."user_id" = "users"."id" 
LEFT OUTER JOIN "users" "recipient_users_feed_items" ON "recipient_users_feed_items"."id" = "feed_items"."recipient_user_id" 
LEFT OUTER JOIN "roles_users" "roles_users_join" ON "recipient_users_feed_items"."id" = "roles_users_join"."user_id" 
LEFT OUTER JOIN "roles" "roles_users_2" ON "roles_users_2"."id" = "roles_users_join"."role_id" 
LEFT OUTER JOIN "roles_users" "roles_users_users_2" ON "roles_users_users_2"."user_id" = "recipient_users_feed_items"."id" 
WHERE (("roles"."id" = NULL) OR ("roles"."id" IS NULL) OR ("roles"."id" IN (1,3,4,6,2,8,5) AND 
      "users"."disabled" = 'f' AND "roles_users_2"."id" IN (1,3,4,6,2,8,5) AND 
      "recipient_users_feed_items"."disabled" = 'f' AND "feed_items"."recipient_user_id" IS NOT NULL) OR 
      ("roles"."id" IN (1,3,4,6,2,8,5) AND "users"."disabled" = 'f' AND "feed_items"."recipient_user_id" IS NULL) OR 
      ("roles"."id" IN (1,3,4,6,2,8,5) AND "users"."disabled" = 'f' AND "roles_users_2"."id" IN (1,3,4,6,2,8,5) AND 
      "recipient_users_feed_items"."disabled" = 'f' AND "feed_items"."recipient_user_id" IS NOT NULL) OR 
      ("roles"."id" IN (1,3,4,6,2,8,5) AND "users"."disabled" = 'f' AND "feed_items"."recipient_user_id" IS NULL)) LIMIT 11

But adding an order clause to this query generates a very inefficient query. In my case this runs at ~300ms. Roughly 100 times slower

me = User.find_by_name("me")
with_user(me) do
  FeedItem.with_permissions_to(:read).order("feed_items.id DESC").limit(11)
end

Generates:

SELECT * FROM (
  SELECT DISTINCT ON ("feed_items".id) "feed_items".id, feed_items.id AS alias_0 
  FROM "feed_items" 
  LEFT OUTER JOIN "users" ON "users"."id" = "feed_items"."user_id" 
  LEFT OUTER JOIN "roles_users" ON "users"."id" = "roles_users"."user_id" 
  LEFT OUTER JOIN "roles" ON "roles"."id" = "roles_users"."role_id" 
  LEFT OUTER JOIN "roles_users" "roles_users_users" ON "roles_users_users"."user_id" = "users"."id" 
  LEFT OUTER JOIN "users" "recipient_users_feed_items" ON "recipient_users_feed_items"."id" = "feed_items"."recipient_user_id" 
  LEFT OUTER JOIN "roles_users" "roles_users_join" ON "recipient_users_feed_items"."id" = "roles_users_join"."user_id" 
  LEFT OUTER JOIN "roles" "roles_users_2" ON "roles_users_2"."id" = "roles_users_join"."role_id" 
  LEFT OUTER JOIN "roles_users" "roles_users_users_2" ON "roles_users_users_2"."user_id" = "recipient_users_feed_items"."id" 
  WHERE (("roles"."id" = NULL) OR ("roles"."id" IS NULL) OR ("roles"."id" IN (1,3,4,6,2,8,5) AND "users"."disabled" = 'f' AND 
        "roles_users_2"."id" IN (1,3,4,6,2,8,5) AND "recipient_users_feed_items"."disabled" = 'f' AND 
        "feed_items"."recipient_user_id" IS NOT NULL) OR ("roles"."id" IN (1,3,4,6,2,8,5) AND "users"."disabled" = 'f' AND 
        "feed_items"."recipient_user_id" IS NULL) OR ("roles"."id" IN (1,3,4,6,2,8,5) AND "users"."disabled" = 'f' AND 
        "roles_users_2"."id" IN (1,3,4,6,2,8,5) AND "recipient_users_feed_items"."disabled" = 'f' AND 
        "feed_items"."recipient_user_id" IS NOT NULL) OR ("roles"."id" IN (1,3,4,6,2,8,5) AND "users"."disabled" = 'f' AND 
        "feed_items"."recipient_user_id" IS NULL))
) AS id_list 
ORDER BY id_list.alias_0 DESC 
LIMIT 11

Here the Order and Limit clauses are added outside of a Distinct Select On all feed items causing the query to fetch every feed_item. If you dont do this as a subquery and have the order and limit on the inner query it runs about much faster, from 365.784 ms to 2.906 ms on our dataset.

I am using:
postgres 9.0.3
declarative_authorization 0.5.2
rails 3.0.6

Owner

stffn commented Apr 20, 2011

From the way decl_auth works, I'd say that this is outside of the control of decl_auth. decl_auth just generates the query options. The order clause is then added by further ARel functionality.

If you have any insights into how decl_auth can optimize its query option generation to better suit ARel, we should definitely go for that, though.

lillq commented Apr 24, 2011

Hey stffn,

Rebuilding the Relation's includes_values as outer joins with the meta_where gem causes the query to be structured in much more efficient SQL.

module DeclarativeAuthorizationJoinsPatch

  def self.rebuild_joins(relation)
    includes_values = relation.includes_values
    relation.includes_values = []

    unless includes_values.blank?
      relation.joins_values = convert_of_joins_to_outer(includes_values)
    end

    relation
  end

  protected

  # This method uses the metawhere gem
  def self.convert_of_joins_to_outer(hash)
    case hash
      when Array
        hash.map! do |e|
          convert_of_joins_to_outer(e)
        end
      when Hash
        hash.inject({}){|result, (key, value)|
          new_key = case key
            when Symbol then key.outer
            else key
          end

          new_value = case value
            when Hash then convert_of_joins_to_outer(value)
            when Array then convert_of_joins_to_outer(value)
            else value.outer
          end

          result[new_key] = new_value
          result
        }
    end
  end

end

From my previous example:

me = User.find_by_name("me")
with_user(me) do
  DeclarativeAuthorizationJoinsPatch.rebuild_joins(FeedItem.with_permissions_to(:read)).order("feed_items.id DESC").limit(11)
end

This executes this SQL that runs in 2.9ms:

SELECT "feed_items".* 
FROM "feed_items" 
LEFT OUTER JOIN "users" ON "users"."id" = "feed_items"."user_id" 
LEFT OUTER JOIN "roles_users" ON "users"."id" = "roles_users"."user_id" 
LEFT OUTER JOIN "roles" ON "roles"."id" = "roles_users"."role_id" 
LEFT OUTER JOIN "roles_users" "roles_users_users" ON "roles_users_users"."user_id" = "users"."id" 
LEFT OUTER JOIN "users" "recipient_users_feed_items" ON "recipient_users_feed_items"."id" = "feed_items"."recipient_user_id" 
LEFT OUTER JOIN "roles_users" "roles_users_join" ON "recipient_users_feed_items"."id" = "roles_users_join"."user_id" LEFT OUTER JOIN "roles" "roles_users_2" ON "roles_users_2"."id" = "roles_users_join"."role_id" 
LEFT OUTER JOIN "roles_users" "roles_users_users_2" ON "roles_users_users_2"."user_id" = "recipient_users_feed_items"."id" 
WHERE (("roles"."id" = NULL) OR ("roles"."id" IS NULL) OR ("roles"."id" IN (1,3,4,6,2,8,5) AND 
             "users"."disabled" = 'f' AND "roles_users_2"."id" IN (1,3,4,6,2,8,5) AND "recipient_users_feed_items"."disabled" = 'f' AND 
             "feed_items"."recipient_user_id" IS NOT NULL) OR ("roles"."id" IN (1,3,4,6,2,8,5) AND "users"."disabled" = 'f' AND 
             "feed_items"."recipient_user_id" IS NULL) OR ("roles"."id" IN (1,3,4,6,2,8,5) AND "users"."disabled" = 'f' AND 
             "roles_users_2"."id" IN (1,3,4,6,2,8,5) AND "recipient_users_feed_items"."disabled" = 'f' AND 
             "feed_items"."recipient_user_id" IS NOT NULL) OR ("roles"."id" IN (1,3,4,6,2,8,5) AND "users"."disabled" = 'f' AND 
            "feed_items"."recipient_user_id" IS NULL)) 
ORDER BY feed_items.id DESC 
LIMIT 11

I hope this gives someone more familiar with this project insite into the query generation.

Adding this solution to my code gave me several big gains in performance.

Owner

stffn commented Apr 27, 2012

Sounds good, but we wouldn't want to depend on meta_where, I'd say. We'd need a patch that would use meta_where where available

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment