Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Foolproof Relay Pagination API #113

Draft
wants to merge 83 commits into
base: master
Choose a base branch
from
Draft

Conversation

dillonkearns
Copy link
Owner

@dillonkearns dillonkearns commented Feb 22, 2019

Adding Relay Pagination to elm-graphql

In-Progress Design So Far

Here's a summary of how you do pagination using the design I have in this pull request so far. Note that this would be the design available in your generated code if your schema uses the Official Relay Modern Connections Spec to do pagination.

Note: Google gives the Relay Legacy Spec as the first search result, be sure to look at the Relay Modern Spec instead.

  1. Your data is wrapped with the type Paginator.

This type represents the following things (none of which are changed manually)

  • A list of data which it adds to over time
  • The cursor, which points to how far along in the pagination you are
  • Whether there are more pages of data to load (right now... if you check later, that could change, but it's still worth tracking this information)
type alias Model =
    { paginator : Paginator Forward Stargazer }

type alias Stargazer =
    { login : String
    , starredAt : Github.Scalar.DateTime
    }
  1. We initialize empty data, giving it a direction to paginate. Note that the direction cannot be changed once you've started paginating (though of course you could build a new Paginator with backward).
init : Flags -> ( Model, Cmd Msg )
init flags =
    Paginator.forward
        |> (\paginator ->
                ( { paginator = paginator }
                , makeRequest 50 paginator
                )
           )
  1. Repository.stargazers here follows the Relay Connections Spec, so we pass in a pageSize and a Paginator type as we see below. Note that you pass in a SelectionSet StarGazer Github.Object.StargazerEdge. Why StarGazerEdge, not a Node? Two reasons. 1. the edge can contain information not just about a node, but about its relationship through that connection, and 2. The Relay Connections Specification doesn't actually mention a nodes { ... } field, only edges { node { ... } }.
query : Paginator Forward Stargazer -> SelectionSet Response RootQuery
query paginator =
    Query.repository { owner = "dillonkearns", name = "elm-graphql" }
        (Repository.stargazers
            50 -- pageSize - can be hardcoded, or vary over time
            paginator
            identity  -- these are any additional optional arguments passed in
            stargazerSelection  -- here you define how to get a batch of items
        )
        |> SelectionSet.nonNullOrFail


stargazerSelection : SelectionSet Stargazer Github.Object.StargazerEdge
stargazerSelection =
    SelectionSet.map2 Stargazer
        -- you can grab core data from the Node like this
        (Github.Object.StargazerEdge.node Github.Object.User.login)
        -- plus you can grab meta-data from the "Edge" (represents the relationship)
        Github.Object.StargazerEdge.starredAt


-- you'll want to parameterize a function to make a request so you can
-- continue when you get your pagination response back
makeRequest : PaginatedData Stargazer String -> Cmd Msg
makeRequest paginator =
    query paginator
        |> Graphql.Http.queryRequest "https://api.github.com/graphql"
        |> Graphql.Http.withHeader "authorization" "Bearer <TOKEN>"
        |> Graphql.Http.send GotResponse

update : Msg -> Model -> ( Model, Cmd Msg )
update msg model =
    GotResponse response ->
        case response of
            Ok updatedPaginator ->
                -- in some cases, you might continue paginated automatically like here
                -- other times you might want to do it on scroll, or button click, etc.
                ( { model | paginator = updatedPaginator }, makeRequest model.paginator )

            Err error ->
                ( model, Cmd.none )  -- you might want to retry a few times here

Guiding Goals

  • Make it impossible to do the wrong thing

  • Make it easy to do the right thing - ideally, we can give just one nice API that disallows using the low-level pagination directly, and instead gives you a high-level interface. This will be less confusing and less error-prone. But having the low-level available could potentially be an escape hatch, so it might be an option to consider.

  • Make Impossible States Impossible - it would be awesome if it was impossible to make invalid pagination requests. It's common to get runtime exceptions with APIs if you 1) forget to pass in pagination arguments (but perhaps some APIs allow you to paginate or not... Github for example will fail if you're not explicit), 2) pass in both Forward and Backward pagination arguments (though it's not technically illegal in the Relay spec, it's strongly discouraged... let's just assume it's illegal).
    Let's also eliminate things like 3) changing directions midway through paginating. And 4) let's make it impossible to update just the new data, but not update the cursor. You update them all atomically. This can be accomplished by passing in a data structure that includes the list of data so far, plus the pagination data. The selection set will add on to that data structure and update the pagination data with its response.

References

Open questions

  • Some APIs require you to pass in either forward or backward pagination args and will throw an error if none are provided (like Github, for example). Should the Paginator argument be a Maybe to support this case? Or would some other design allow for it? Or would that cause some unwanted corner cases and more confusing API? TODO look for examples.
  • Can we make some sort of RemoteData equivalent that captures the possible pagination data states?

Notes

  • You could get back hasNextPage false, and then try it again later to check if any new data has been added since. To check if that's changed, you can simply make another request after Paginator.hasMoreData returns False and see whether you get more data, and whether Paginator.hasMoreData changes to True.
  • What should the API look like if there is only a first or only a last, but not both, we'll use a phantom type to constrain what type of paginator can be passed in (e.g. it will accept only Paginator Forward MyDataType).
  • The Relay Spec says you can technically pass in both first and last, but it is highly inadvisable. See this note in the official spec. Let's just disallow it completely.
  • Retry logic is best handled by wrapping the Paginator type in a way that models the users specific retry logic for their domain. Relay client has retry config built-in. But it's easy to build a wrapper that is custom tailored for your needs, and this will make the core API less bloated and more flexible.

Possible strategies

Allow CLI Flag to Skip Relay Generation

Pros

  • Allows you to opt-out of
  • Will most likely do the "right thing" for most users (unless I'm missing something?)
  • Allows elimination of impossible states because if you use the feature, there are no escape hatches. The escape hatch is to opt-out of the relay code generation.

Cons

  • All or nothing. Either you follow the Relay spec and can use it, or you don't and can't get any of the benefits. This is probably reasonable. It would be good to give users a warning if they don't follow the Relay specification despite similar naming... tell them exactly what is preventing it from being considered a Relay schema. If you supply a flag, like --fail-on-relay-schema-violation then it will exit the CLI if there is a violation with non-zero exit status.
  • The code generation code will be a little bit more complicated

Allow both low-level access and high-level access.

Pros

  • Simple in that it does the same thing no matter what for all schemas

Cons

  • Harder to discover the pagination helpers potentially
  • Larger generated API so more confusing
  • Doesn't eliminate possible states

@dillonkearns
Copy link
Owner Author

@tmattio, thank you so much for contributing to the conversation! It's great to have insights from you, especially since you're using Relay pagination already! And all the better that you have experience with paginating using other GraphQL clients 😄

How do you handle previously queried data? It would be very nice to have a way to keep them in the paginator, it would avoid having to add custom logic around the paginator. In reason-apollo (or apollo for this matter), you would merge the results when you receive them and the data passed in the views (render) would contains the merge result.

If I'm understanding your question correctly, you're asking about the standard use case where you need to take, for example, the 1st page of results, and then combine that with the 2nd page of results after getting that response. Correct me if I misunderstood the question!

The way this works is like so:

  1. Initialize a Paginator. It has an empty list of nodes.
  2. Pass your initialized Paginator into the helper function for a Connection field (like stargazers in the example code above).
  3. The SelectionSet for that Connection gives you back an updated Paginator. The merging is correctly handled for you under the hood, so as soon as you get the response data back you have the correct data!
  4. Repeat steps 2 and 3, except this time you are passing in a Paginator which contains a page of data already.

I hope that answers the question! I think it's a nice design because it's impossible to forget to add it back. The updated Paginator you get back from your SelectionSet contains the correct cursor and merged data. So you can't remember to update the cursor, but forget to merge the data, or vice versa.

Can you change the definition of the paginated node? (i.e. I often add a total field in the paginated node that provides the total number of edges)

I'm still trying to figure out the most elegant way to do this, but what I have in mind is that I will strip off everything related to low-level pagination, and then keep whatever is left. For example, let's take this definition:

type StargazerConnection {
  # these are all for pagination
  edges: [StargazerEdge]
  nodes: [User]
  pageInfo: PageInfo!

  # this is meta-data about the Connection itself
  # it is not needed for pagination, but the user might want to fetch it
  totalCount: Int!
}

type StargazerEdge {
  node: User!
  starredAt: DateTime!
  
  # this is to help with pagination, but it's not strictly required
  # the generated code will rely on PageInfo instead
  cursor: String!
}

Given these types in the schema, the generated code would effectively give you access to data for this schema (the pagination data would not be directly accessible, it would be used by the internals for the high-level pagination API):

type StargazerConnection {
  totalCount: Int!
}

type StargazerEdge {
  # the user creates a SelectionSet for StargazerEdge and
  # passes it in to the generated pagination helper function
  node: User!
  starredAt: DateTime!
}

Note that the edges is not available directly any more. You can only access that through the high-level Pagination API. So the only way you can access a StargazerEdge is with a Paginator. However, you can still make a request directly to get totalCount from the StargazerConnection without using a Paginator.

In the same way, can you change the parameters of the paginated node? (i.e. add a filter parameter alongside first, last, etc.)

Absolutely! See the code snippet in my comment above, it's the identity function there (I know it's a lot of details!):

        (Repository.stargazers
            50 -- pageSize - can be hardcoded, or vary over time
            paginator
            identity  -- these are any additional optional arguments passed in
            stargazerSelection  -- here you define how to get a batch of items
        )

I often had problem with reason-apollo when the node type was a union or an interface. Maybe it's not a problem with the current design, but I wanted to note it ;)

Indeed, I've considered this use case as well and it will work out quite nicely with this design!

Also, it is out of scope for this PR, but Relay opens the door to great caching optimizations with the ID of the nodes. Is this something you've been contemplating as well?

I have been thinking about that. I'm not sure what that would look like, or if it would be a good idea for the elm-graphql to handle. If anything, it could potentially be a separate layer that plugs into it. It's a little hard to say! I would love to collect use cases around that. I opened up a separate issue to discuss that: #114.

Thanks again for sharing your thoughts, I really appreciate it!

@Maxim-Filimonov
Copy link
Contributor

Great initial work on the concept. I'm a bit confused about the direction though. When a paginator with a number of pages like 1..2....3...4...5 is used would we end up constantly creating and destroying paginators depending on which page users try to go to?
I do understand pagination for infinite scrolling or "load more" case wouldn't change the direction. This is a bit different use case.

@dillonkearns
Copy link
Owner Author

When a paginator with a number of pages like 1..2....3...4...5 is used would we end up constantly creating and destroying paginators depending on which page users try to go to?

Hey @Maxim-Filimonov, thanks for the comment. It is an interesting use case, and it's a perfectly reasonable way to do pagination. But it's not supported by the Relay Connections Protocol. It's inherent to the cursor-based pagination that you can't jump to a specific page since cursors are opaque. So there's no way to say "give me page 5". If you wanted to go backwards or forwards and create pages, that's really a client-side concern at the point. You just need to be sure to load enough data to present the page you want to show. I hope that clarifies things!

@tmattio
Copy link

tmattio commented Feb 28, 2019

Hi @dillonkearns!

Thanks for the detailed answer 😄

3. The SelectionSet for that Connection gives you back an updated Paginator. The merging is correctly handled for you under the hood, so as soon as you get the response data back you have the correct data!

Yes! This is perfect, the views doesn't have to know the data is paginated, love it.

For the other items, it sounds great also, thanks again for taking the time to write back.

@dillonkearns
Copy link
Owner Author

@tmattio fantastic, I'm glad you like the design! Thank you for your input!

@AdrianRibao
Copy link

@dillonkearns thank you for this! it looks pretty good.

Do you think is going to be merged into master in the near future?

@dillonkearns
Copy link
Owner Author

@AdrianRibao this is still a ways off, there are a few cases I still need to consider and haven't quite figured out the right approach for. I'll keep this thread posted when I pick this back up, but for now this will still require some more brainstorming to push through.

@dillonkearns dillonkearns marked this pull request as draft February 4, 2021 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants