Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How do you paginate responses when using AssociationLoader? #114

Closed
anujbiyani opened this issue Oct 18, 2019 · 7 comments
Closed

How do you paginate responses when using AssociationLoader? #114

anujbiyani opened this issue Oct 18, 2019 · 7 comments

Comments

@anujbiyani
Copy link

What's the recommended way to paginate a response when using AssociationLoader?

I've tried the following:

field :posts_connection, Types::PostType.connection_type, null: false
def posts_connection
  AssociationLoader.for(model: User, association_name: :posts).load(object)
end

but this results in two SQL queries being generated, one with my relay-defined pagination queries and one without.

I read through #26 and rmosolgo/graphql-ruby#425 but am unclear on how to use the lazy execution API to solve this. Adding the following to my schema did not help:

lazy_resolve(AssociationLoader, :perform)
@rmosolgo
Copy link
Contributor

@anujbiyani and I looked over this just now. My takeaway is that the example AssociationLoader would work great for belongs_to or has_one relationships, where load-by-id operations can be batched up, but it doesn't work for has_many relationships. Is that right?

That is, I can't think of how has_many loads could really be batched. What kind of SQL would it emit in that case?

@eapache
Copy link
Contributor

eapache commented Nov 20, 2019

If you're on a sufficiently recent version of MySQL/Postgres/whatever you can use window functions, roughly similar to the explanation at https://www.the-art-of-web.com/sql/partition-over/.

@anujbiyani
Copy link
Author

Ah okay, thanks.

@fabn
Copy link

fabn commented Dec 2, 2020

@anujbiyani how did you solved this? Could you provide an example of has_many with pagination?

anujbiyani added a commit to anujbiyani/association-loader-has-many that referenced this issue Dec 2, 2020
i expected the test to generate SQL queries like:
  User Load (0.1ms)  SELECT "users".* FROM "users" LIMIT ?  [["LIMIT", 2]]
  Post Load (0.1ms)  SELECT "posts".* FROM "posts" WHERE "posts"."user_id" IN (?, ?)  [["user_id", 1], ["user_id", 2]]
  Post Load (0.1ms)  SELECT "posts".* FROM "posts" WHERE "posts"."user_id" = ?  [["user_id", 1]]
  Post Load (0.1ms)  SELECT "posts".* FROM "posts" WHERE "posts"."user_id" = ?  [["user_id", 2]]

which would reproduce the bug in Shopify/graphql-batch#114, but instead i got:
  User Load (0.1ms)  SELECT "users".* FROM "users" LIMIT ?  [["LIMIT", 2]]
  Post Load (0.1ms)  SELECT "posts".* FROM "posts" WHERE "posts"."user_id" IN (?, ?)  [["user_id", 1], ["user_id", 2]]

which is correct. going to try downgrading versions / changing the association loader to see why i can't reproduce in this test repo
@anujbiyani
Copy link
Author

@fabn The solution is to simply not use AssociationLoader with a has_many. Just

field :posts_connection, Types::PostType.connection_type, null: false
def posts_connection
  object.posts
end

is what you want.

If you're loading many posts for many users, it does introduce an N+1 but there's no way around that.
If you're loading many posts for one user, then there's nothing to batch up so AssociationLoader doesn't provide any benefit.

Let's walk through the two possible paths:

Loading paginated posts for many users

Here's a sample graphql query and corresponding sql for batch loading without pagination:

query {
  users {
    posts {
      id
    }
  }
}

select * from users
select * from posts where user_id in (1,2,...)

If you try and paginate that, the sql query doesn't really makes sense:

query {
  users {
    posts(first: 5) {
      nodes { id }
    }
  }
}

select * from users
select * from posts where user_id in (1,2,...) limit 5

The limit would get applied to the posts, but that would get you only 5 posts overall not 5 posts per user.

Loading paginated posts for one user

In this case, you could use AssociationLoader and paginate. If you do, you get the extra query that I mentioned in OP. BUT AssociationLoader isn't actually helpful because there's only one set of posts to fetch, so there's nothing to batch.

query {
  user { # singular, just loading one user here
    posts {
      id
    }
  }
}

select * from users where id = 1
select * from posts where user_id = 1

Now look at it paginated:

query {
  user { # still singular
    posts(first: 5) {
      nodes { id }
    }
  }
}

select * from users where user_id = 1
select * from posts where user_id = 1 # this is the extra query pointed out in my OP
select * from posts where user_id = 1 limit 5

There's no batching here since it's just loading posts for one user, so AssociationLoader doesn't get you anything.

I've rewritten this comment 5 times as I went through the scenarios over and over, so apologies if this isn't very clear. I went as far as proving this out in a sample app. There are tests you can run (setup the app with bundle install and rails db:setup) which, when you look at the logs, show that paginating and batch loading yields basically the same queries as paginating and not-batch loading, with the exception of the one errant extra query.

So the solution is to just not batch load a has_many association.

@grjones
Copy link

grjones commented Oct 19, 2022

@anujbiyani

select * from posts where user_id in (1,2,...) limit 5

This should just be transformed into UNIONs for each user_id including a limit 5 for each select. One SQL query really ought to be possible.

@letiesperon
Copy link
Contributor

letiesperon commented Apr 9, 2024

I would also like to see an example of how to solve this.
I couldn't figure it out so I've created this custom loader for this purpose, but I have the feeling it can be prettier and it currently does not support has_many through relations.

Sharing in case it helps anybody, and in case anybody spots anything that might be wrong with it:

# Loader for paginated associations without N+1s when called from a GraphQL field.
#
# DISCLAIMER: It does NOT work with polymorphic associations or has_many :through associations.
#
# Example usage in `BookType`:
#
# def kept_published_comments(page:, per_page:)
#   Loaders::PaginatedAssociationLoader.for(Book, :published_comments, :kept, order: 'created_at DESC', page: 2, per_page: 10).load(object)
# end

module Loaders
  class PaginatedAssociationLoader < GraphQL::Batch::Loader
    def initialize(model, association_name, *scope_names,
                   order:,
                   page: 1,
                   per_page: 25
      super()

      @model = model
      @association_name = association_name
      @scope_names = scope_names
      @order = order
      @page = page || 1
      @page = 1 if page <= 0
      @per_page = per_page

      validate_association_exists
      validate_scopes_exist
      validate_not_has_many_through
    end

    def perform(records)
      association_query = association_klass.where(join_field => records)

      association_query = association_query.merge(reflection.scope) if reflection.scope.present?

      association_query = apply_scopes(association_query)
      association_query = apply_pagination(association_query)

      association_query_grouped = association_query.group_by { |record| record[join_field] }

      records.each do |record|
        record_key = record[reflection.active_record_primary_key]
        associated_records = association_query_grouped[record_key] || []

        fulfill(record, associated_records)
      end
    end

    private

    attr_reader :model, :association_name, :scope_names, :order, :page, :per_page

    def reflection
      @reflection ||= model.reflect_on_association(association_name)
    end

    def validate_association_exists
      return if reflection

      raise ArgumentError, "No association #{association_name} on #{model}"
    end

    def validate_scopes_exist
      scope_names.each do |scope_name|
        unless reflection.klass.respond_to?(scope_name)
          raise ArgumentError, "The associated class does not respond to '#{scope_name}'"
        end
      end
    end

    def validate_not_has_many_through
      return unless reflection.is_a?(ActiveRecord::Reflection::ThroughReflection)

      raise ArgumentError, "The association #{association_name} on #{model} " \
                           "is a 'has_many :through' association which is not supported"
    end

    def apply_scopes(association_query)
      scope_names.each do |scope_name|
        association_query = association_query.merge(association_klass.send(scope_name))
      end

      association_query
    end

    def apply_pagination(association_query)
      records_sql = association_query
                    .select("#{association_table}.* AS #{association_table}, " \
                            "ROW_NUMBER() OVER (PARTITION BY #{join_field} " \
                            "ORDER BY #{order}) as row_number")
                    .to_sql

      min_record_index = ((page - 1) * per_page)
      max_record_index = min_record_index + per_page

      association_klass
        .from("(#{records_sql}) AS #{association_table}")
        .where('row_number > ? AND row_number <= ?', min_record_index, max_record_index)
    end

    def association_table
      reflection.klass.table_name
    end

    def association_klass
      reflection.klass
    end

    def join_field
      reflection.join_primary_key
    end
  end
end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants