Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double call of scope_items method before and after pagination #4253

Closed
andrey-lizunov opened this issue Nov 22, 2022 · 4 comments · Fixed by #4263
Closed

Double call of scope_items method before and after pagination #4253

andrey-lizunov opened this issue Nov 22, 2022 · 4 comments · Fixed by #4263

Comments

@andrey-lizunov
Copy link

andrey-lizunov commented Nov 22, 2022

Describe the bug

Hello.
I see weird behaviour in my project when resolver with cursor pagination calls scope_items method in type two times.
First time the scope in this method is a full set of data without pagination. Second time the scope with pagination.
It affects the performance because first call works with big amounts of data.
I use scope_items method to remove records that not available for current user and policy checks for thousands of records is not what I want :)

Versions

graphql version: 2.0.15
rails (or other framework): 6.1.5

GraphQL schema

class OfferingType < ::Domains::Types::Base::Object
  ...
  def self.scope_items(items, context)
    items.select do |item|
      ::OfferingPolicy.new(current_user, item).show?
    end
  end
end
class OfferingsResolver < ::Domains::Types::Base::Query
  include SearchObject.module(:graphql)
  type ::OfferingType.connection_type, null: false
  ...
end

GraphQL query

query Offerings {
  offerings(
    first: 3,
  ) {
    edges {
      cursor
    }
    nodes {
      id
    }
  }
}
{
  "data": {
    "offerings": {
      "edges": [
        {
          "cursor": "MQ"
        },
        {
          "cursor": "Mg"
        },
        {
          "cursor": "Mw"
        }
      ],
      "nodes": [
        {
          "id": "1407"
        },
        {
          "id": "486"
        },
        {
          "id": "1190"
        }
      ]
    }
  }
}

Steps to reproduce

Try to call any resolver with cursor pagination that works with type object with scope_items method. Set breakpoint in scope_items method.

Expected behavior

scope_items method call should be made once for paginated data set

Actual behavior

scope_items call made for non paginated and paginated data

Additional context

Add any other context about the problem here.

With these details, we can efficiently hunt down the bug!

@rmosolgo
Copy link
Owner

Hi, thanks for the detailed report! I agree the list should only be scoped once and that the current behavior is a bug.

I think the reason for this bug is that connection types make an extra call to .scope_items, then the connection's nodes (or edges) list also makes a call to scope items. (Field#scoped? returns true for both list-type fields and connection-type fields.) So, I think the fix is to somehow make Field#scoped? return false when the field is a list-type, but when it belongs to a Connection type.

@andrey-lizunov
Copy link
Author

@rmosolgo could you check the PR and provide your feedback? Does it look like valid fix?

@cheald
Copy link

cheald commented Jan 10, 2023

I think this fix ended up producing the opposite of the desired behavior; scope_items is now called only on the unpaginated connection, rather than on the final materialized paginated array. What I want is for scoped_items to not run for the connection, but to only run for the list that results from that connection.

I've got this in my BaseObject, which is how I dealt with the two-pass behavior before:

    def self.scope_items(items, context)
      if items.is_a? Array
        items.select { |i| authorized?(i, context) }
      else
        items
      end
    end

Right now, graphql-ruby's behavior when any item in a list is unauthorized is to null the entire list. This turns out to be a problem in a few cases in my application, where my field resolvers resolve lists of records and then count on type authorization to determine if it's visible or not. What I want to have happen is for a list of items to be loaded, and then any unauthorized items just silently removed. I did this rather than allowing for nullable lists, as nullable nodes connections end up being a pain clientside, where we have to handle T[] | null rather than just applying list operations everywhere.

In 2.0.15, the two-pass nature of scope_items allowed this in a performant manner when dealing with paginated ActiveRecord associations. Obviously, an association might represent many thousands of records, and it is highly inefficient to load them all and authorize each of them. The way this worked it that when the field initially resolves, items is an unpaginated connection, because the pagination constraints are imposed by ActiveRecordRelationConnection later in resolution. Therefore, we just passed the relation through unchanged. However, during the second pass, when the list has been coalesced into an actual limited array of results, we can then do a performant pass of the array to remove any unauthorized entries that would otherwise cause the entire connection to return null.

What's happening now is the worst of both: the ActiveRecord_Relation gets passed to scoped_items (where I can't actually do anything with it, because my authorization logic relies on things that I can't do entirely in the DB, and I need authorized records), so it gets ignored, but then the resulting array of my type does not get passed to scope_items.

@cheald
Copy link

cheald commented Jan 10, 2023

In my case, this does the trick: I do actually double-scope, but the scope on the connection turns into a no-op, then I re-enable scoping on the nodes/edges. This results in scope_items only being run on my arrays.

class Types::BaseConnection < Types::BaseObject 
    include GraphQL::Types::Relay::ConnectionBehaviors 

  def self.edge_type(*args, field_options: {}, **kwargs)
    field_options[:scope] = true # Add this to all nodes fields
    super(*args, field_options: field_options, **kwargs)
  end

  def self.nodes_field(field_options: {}, **kwargs)
    field_options[:scope] = true # Add this to all nodes fields
    super(field_options: field_options, **kwargs)
  end

  # Don't apply any actual scoping to connections. Instead, we're going to scope the nodes directly.
  def self.scope_items(items, context)
    items
  end
end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants