Support scalable pagination #613

Open
Arachnid opened this issue Nov 29, 2018 · 10 comments

Comments

@Arachnid

Presently, it's possible to query entities using a where clause, but this uses offsets from start or end, which likely won't scale well if paging over a large dataset. It'd be good to use the graphql connection pattern, or something similar, where result sets return an opaque cursor that can be passed in on subsequent calls to pick up where the previous query left off.

@leoyvens
Collaborator

Setting offsets is not the most convenient API, and we need good pagination support. Connections seem like a good model to follow.

offsets from start or end, which likely won't scale well if paging over a large dataset

Could you elaborate on what's the issue you're envisioning here?

@Arachnid
Author

Could you elaborate on what's the issue you're envisioning here?

In most database systems, a query like SELECT * FROM table LIMIT x OFFSET y involves the database internally iterating over and discarding the first y results. This results in the cost of paginating over a large dataset being O(n^2) instead of O(n). Using cursors, in contrast, doesn't suffer from this issue.
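
To make the cost difference concrete, here is a minimal TypeScript sketch that counts how many rows a scan touches while paging over a sorted in-memory array, which stands in for an indexed table. All names (`offsetPage`, `keysetPage`, and so on) are hypothetical, page sizes are assumed to be positive, and the model ignores what a real query planner might do.

```typescript
// Hypothetical model: a sorted array stands in for an indexed table.
interface Row {
  id: number;
}

function makeTable(n: number): Row[] {
  const table: Row[] = [];
  for (let id = 1; id <= n; id++) table.push({ id });
  return table;
}

interface Page {
  rows: Row[];
  touched: number; // rows the scan had to visit to produce this page
}

// LIMIT/OFFSET: the scan starts at the first row and discards `offset` rows.
function offsetPage(table: Row[], limit: number, offset: number): Page {
  const rows: Row[] = [];
  let touched = 0;
  for (const row of table) {
    if (rows.length === limit) break;
    touched++;
    if (touched > offset) rows.push(row);
  }
  return { rows, touched };
}

// Cursor (keyset): binary-search to the first id greater than `afterId`
// (modelling an index seek), then read only `limit` rows.
function keysetPage(table: Row[], limit: number, afterId: number): Page {
  let lo = 0;
  let hi = table.length;
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (table[mid].id <= afterId) lo = mid + 1;
    else hi = mid;
  }
  const rows = table.slice(lo, lo + limit);
  return { rows, touched: rows.length };
}

// Total rows touched while paging through all `n` rows with OFFSET.
function totalTouchedWithOffset(n: number, limit: number): number {
  const table = makeTable(n);
  let total = 0;
  for (let offset = 0; offset < n; offset += limit) {
    total += offsetPage(table, limit, offset).touched;
  }
  return total;
}

// Total rows touched while paging through all `n` rows with a cursor.
function totalTouchedWithKeyset(n: number, limit: number): number {
  const table = makeTable(n);
  let total = 0;
  let afterId = 0;
  for (;;) {
    const page = keysetPage(table, limit, afterId);
    if (page.rows.length === 0) break;
    total += page.touched;
    afterId = page.rows[page.rows.length - 1].id;
  }
  return total;
}
```

With 1,000 rows and pages of 100, the offset walk touches 100 + 200 + ... + 1,000 = 5,500 rows, while the cursor walk touches 1,000 (plus a logarithmic seek per page, which the sketch does not count).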

@leoyvens
Collaborator

@Arachnid I see. Though that seems to be more a concern of implementation than of graphql interface. We could do a good implementation of graphql offsets that doesn't use OFFSET, and it's also possible to do a bad implementation of cursors that does use OFFSET on the DB.
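
A small sketch may illustrate that distinction: the function below exposes an opaque-looking cursor, yet the cursor merely wraps an offset, so a backend built this way would still pay the OFFSET cost. The names and the `offset:` cursor format are made up for illustration.

```typescript
interface CursorPage<T> {
  items: T[];
  endCursor: string | null; // null when no further page exists
}

// Hypothetical cursor format: the cursor is just an offset in disguise.
function encodeOffsetCursor(offset: number): string {
  return "offset:" + offset;
}

function decodeOffsetCursor(cursor: string): number {
  const prefix = "offset:";
  const offset = Number(cursor.slice(prefix.length));
  if (cursor.indexOf(prefix) !== 0 || !(offset >= 0) || Math.floor(offset) !== offset) {
    throw new Error("invalid cursor: " + cursor);
  }
  return offset;
}

// Cursor-shaped API, offset-shaped implementation: `slice(offset, ...)`
// plays the role of `LIMIT first OFFSET offset` here.
function pageAfter<T>(items: T[], first: number, after?: string): CursorPage<T> {
  const offset = after === undefined ? 0 : decodeOffsetCursor(after);
  const page = items.slice(offset, offset + first);
  const next = offset + page.length;
  return {
    items: page,
    endCursor: next < items.length ? encodeOffsetCursor(next) : null,
  };
}
```

A scalable variant would instead encode the sort key of the last returned row, so the backend can seek to it directly.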

@Arachnid
Author

Arachnid commented Dec 17, 2018 via email

@fubhy
Member

fubhy commented May 28, 2019

I wrote a little utility hook that takes care of automatically scraping the endpoint for more results (using skip & limit parameters) until it's exhausted:

import { useQuery } from '@apollo/react-hooks';
import { useRef, useEffect } from 'react';
import { DocumentNode } from 'graphql';

type QueryPair = [DocumentNode, DocumentNode];
type ProceedOrNotFn = (result: any, expected: number) => boolean;

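export function useScrapingQuery([query, more]: QueryPair, proceed: ProceedOrNotFn, props?: any) {
  // `props` is optional, so guard every access to `props.variables`.
  const variables = (props && props.variables) || {};
  const limit = variables.limit || 100;
  const initialSkip = variables.skip || 0;
  // Tracks how far the scraper has advanced; the ref survives re-renders.
  const skip = useRef(initialSkip);
  const result = useQuery(query, {
    ...props,
    variables: {
      ...variables,
      limit,
      // Pass the number, not the ref object, as the query variable.
      skip: initialSkip,
    },
  });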

  useEffect(() => {
    if (!!result.loading || !!result.error || !proceed(result.data, skip.current + limit)) {
      return;
    }

    result.fetchMore({
      query: more,
      variables: {
        ...result.variables,
        skip: skip.current + limit,
      },
      updateQuery: (previous, options) => {
        skip.current = skip.current + limit;

        const moreResult = options.fetchMoreResult;
        const output = Object.keys(moreResult).reduce(
          (carry, current) => ({
            ...carry,
            [current]: carry[current].concat(moreResult[current] || []),
          }),
          previous,
        );

        return output;
      },
    });
  }, [result, skip.current]);

  return result;
}

Basically, you pass a query tuple: the first query is mandatory, and the second is optional and lets you provide a custom query for the "fetch more" logic (e.g. if the first query has other, non-paginated fields in it).

Example:

import gql from 'graphql-tag';

export const FundOverviewQuery = gql`
  query FundOverviewQuery($limit: Int!) {
    funds(orderBy: name, first: $limit) {
      id
      name
      gav
      grossSharePrice
      isShutdown
      creationTime
    }

    nonPaginatedQueryField(orderBy: timestamp) {
      ...
    }
  }
`;

export const FundOverviewContinueQuery = gql`
  query FundOverviewContinueQuery($limit: Int!, $skip: Int!) {
    funds(orderBy: name, first: $limit, skip: $skip) {
      id
      name
      gav
      grossSharePrice
      isShutdown
      creationTime
    }
  }
`;

It uses the "limit" and "skip" query variables. The hook automatically adds these by default.

Additionally, you need to provide a callback that checks if more needs to be fetched after each cycle.

Full usage example:

const FundList: React.FunctionComponent<FundListProps> = props => {
  const proceed = (current: any, expected: number) => {
    if (current.funds && current.funds.length === expected) {
      return true;
    }

    return false;
  };

  const result = useScrapingQuery([FundOverviewQuery, FundOverviewContinueQuery], proceed, {
    ssr: false,
  });

  return <div>{...}</div>; // Render the full fund list (keeps adding more items until the resource is exhausted).
};
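
Since the hook above still pages with `skip`, here is a hedged sketch of the same scrape-until-exhausted loop driven by the last seen `id` instead. It assumes the endpoint accepts `orderBy: id` and a `where: { id_gt: ... }` filter, which this thread does not confirm. `FundsAfterQuery`, `nextPageVars` and `drain` are hypothetical names, and `drain` takes a synchronous fetcher only to keep the sketch easy to test.

```typescript
// Assumed query shape; `id_gt` and the `$lastId: ID!` type are assumptions.
const FundsAfterQuery = `
  query FundsAfterQuery($limit: Int!, $lastId: ID!) {
    funds(orderBy: id, first: $limit, where: { id_gt: $lastId }) {
      id
      name
    }
  }
`;

interface HasId {
  id: string;
}

interface PageVars {
  limit: number; // assumed to be a positive integer
  lastId: string; // "" means "start from the beginning"
}

// Given the page just received, compute the variables for the next request,
// or null when a short page signals that the resource is exhausted.
function nextPageVars(page: HasId[], vars: PageVars): PageVars | null {
  if (page.length < vars.limit) return null;
  return { limit: vars.limit, lastId: page[page.length - 1].id };
}

// Drive the loop. `fetchPage` is a stand-in for the real network call,
// which would send `FundsAfterQuery` with the given variables.
function drain<T extends HasId>(fetchPage: (vars: PageVars) => T[], limit: number): T[] {
  const all: T[] = [];
  let vars: PageVars | null = { limit, lastId: "" };
  while (vars !== null) {
    const page = fetchPage(vars);
    all.push(...page);
    vars = nextPageVars(page, vars);
  }
  return all;
}
```

One trade-off of this approach: results arrive in `id` order rather than by `name`, so any other ordering would have to be applied client-side once everything is loaded.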

@softwaredev927

I have the same problem in our project.
Without a where clause, we could simply store the total count in the schema and use that, but we use complex where clauses, and it's impossible to store a count for every possible filter.

I'd like to request a feature along the following lines.

# assume I have an entity like this
type Token @entity {
  id: ID!
  price: BigInt!
}

# then we can query like this
query {
  tokens(where: { price_gt: "30" }) {
    id
  }
}

# in this case, could we query like this?
query {
  countOf: tokens(where: { price_gt: "30" }) {
    count
  }
  tokens(where: { price_gt: "30" }, first: 1000) {
    id
  }
}
# if we use a special alias like "countOf", could you return one entity that has a count field?

I don't think this feature would be too difficult for your dev team to add.
If you guys don't have time, I can work with you to add this feature.
Thanks

@dotansimha
Contributor

Just adding my thoughts here.

Today, pagination is implemented on the root of every Query type, and returns a ListType of an entity.

We can implement cursor-based pagination (see the spec here: https://relay.dev/graphql/connections.htm). It's supported in all popular clients, and it makes pagination easy and robust (a cursor gives a more reliable response than skip).

We can expose a Connection type on the root Query without changing the existing fields; the new field can coexist with the current API without breaking changes.

Here's an example:

type Query {
  purpose(id: ID): Purpose!
  purposes(filter: PurposeFilter): [Purpose!]!
  purposeConnection(filter: PurposeFilter, paginate: PaginationFilter): PurposeConnection!
}

input PaginationFilter {
  before: String
  after: String
  first: Int
  last: Int
}

type Purpose { ... }

type PageInfo {
  hasNextPage: Boolean!
  hasPreviousPage: Boolean!
  startCursor: String
  endCursor: String
}

type PurposeEdge {
  node: Purpose
  cursor: String!
}

type PurposeConnection {
  pageInfo: PageInfo!
  edges: [PurposeEdge]!
}
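
To show how a resolver might fill in these types, here is a minimal in-memory sketch of forward pagination (`first`/`after` only). The `Purpose` fields, the `cursor:` encoding, and the function names are assumptions for illustration, not part of any existing API.

```typescript
// Fields are assumed; the schema above elides them.
interface Purpose {
  id: string;
  name: string;
}

interface PageInfo {
  hasNextPage: boolean;
  hasPreviousPage: boolean;
  startCursor: string | null;
  endCursor: string | null;
}

interface PurposeEdge {
  node: Purpose;
  cursor: string;
}

interface PurposeConnection {
  pageInfo: PageInfo;
  edges: PurposeEdge[];
}

// Hypothetical cursor format: a prefix plus the URL-encoded id of the node.
const CURSOR_PREFIX = "cursor:";

function encodeCursor(id: string): string {
  return CURSOR_PREFIX + encodeURIComponent(id);
}

function decodeCursor(cursor: string): string {
  if (cursor.indexOf(CURSOR_PREFIX) !== 0) throw new Error("invalid cursor");
  return decodeURIComponent(cursor.slice(CURSOR_PREFIX.length));
}

// `all` must be sorted by id ascending. The linear scan below stands in for
// what would be an index seek (`WHERE id > afterId`) in a real store.
function purposeConnection(all: Purpose[], first: number, after?: string): PurposeConnection {
  let start = 0;
  if (after !== undefined) {
    const afterId = decodeCursor(after);
    while (start < all.length && all[start].id <= afterId) start++;
  }
  const slice = all.slice(start, start + first);
  const edges = slice.map((node) => ({ node, cursor: encodeCursor(node.id) }));
  return {
    edges,
    pageInfo: {
      hasNextPage: start + slice.length < all.length,
      hasPreviousPage: start > 0,
      startCursor: edges.length > 0 ? edges[0].cursor : null,
      endCursor: edges.length > 0 ? edges[edges.length - 1].cursor : null,
    },
  };
}
```

A client would pass `pageInfo.endCursor` from one response as `after` in the next request, and stop once `hasNextPage` is false.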

@montanaflynn

montanaflynn commented Mar 4, 2022

Having count aggregation would be very useful for pagination and for displaying information in UIs. For example, when filtering with where, you could also request a count aggregate with the same conditions and then build a paged UI something like:

Found 49 tokens

(show first 10 tokens)

[1] [2] [3] [4]

@dotansimha dotansimha assigned saihaj and unassigned dotansimha Mar 15, 2023
@0xJem

0xJem commented May 1, 2023

Is this still being worked on? Pagination with lots of historical data is a huge pain, and applying offsets really, really doesn't scale.

@eldimious

I think it would be useful to add count- or cursor-based pagination. Any idea if this feature will be supported?

Projects
Status: In Progress