Use task scheduler to process optimistic responses and execute error rollback. #3

Open · wants to merge 4 commits into base: main-atl

Conversation

kyle-painter (Collaborator) commented:

Operations in Relay are flushed synchronously by default, which has been observed to cause race conditions in the client app. In practice these are very difficult to reproduce and are dependent on various factors that may affect the component render order of the application.

An effective method to mitigate these errors is to configure a task scheduler to batch updates. However, the OperationExecutor doesn't use the scheduler when processing optimistic responses, or during rollback after a network error, meaning these events are still susceptible to obscure race condition errors. This PR uses the configured scheduler in these scenarios to help alleviate this class of error.
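
For context, a minimal sketch of what configuring such a scheduler can look like. It assumes relay-runtime's `TaskScheduler` shape (`schedule`/`cancel`); the microtask-based batching strategy and names below are illustrative, not the actual app's setup:

```js
import { Environment, Network, RecordSource, Store } from 'relay-runtime';

// Placeholder fetch function; swap in your real network layer.
async function fetchFn(request, variables) {
  const response = await fetch('/graphql', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: request.text, variables }),
  });
  return response.json();
}

let nextId = 0;
const pending = new Map();
let flushQueued = false;

const scheduler = {
  schedule(task) {
    const id = String(nextId++);
    pending.set(id, task);
    if (!flushQueued) {
      flushQueued = true;
      // Flush every task queued in the same tick together, so store
      // updates land as one batch instead of triggering a render each.
      queueMicrotask(() => {
        flushQueued = false;
        for (const [taskId, queued] of Array.from(pending)) {
          pending.delete(taskId);
          queued();
        }
      });
    }
    return id;
  },
  cancel(id) {
    pending.delete(id);
  },
};

const environment = new Environment({
  network: Network.create(fetchFn),
  store: new Store(new RecordSource()),
  scheduler,
});
```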

Example of an invalid render due to such an error:

Warning: Relay: Expected to have been able to read non-null data for fragment `?` declared in `useFragment()`, since fragment reference was non-null. Make sure that that `useFragment()`'s parent isn't holding on to and/or passing a fragment reference for data that has been deleted. 

I intend to raise this PR upstream in the main repo, but I'm prioritising updates here initially for early feedback and validation.

tomgasson and others added 4 commits October 10, 2023 14:28
@kyle-painter kyle-painter self-assigned this Mar 7, 2024
tomgasson (Collaborator) commented Mar 7, 2024

This fork isn't operating on a pull-request model; it's just a snapshot that allows us to use GitHub Actions to build.

Want to:

In terms of this change itself, I'm not sure it's valid. As I understand it, most of the OperationExecutor runs synchronously (as does most of Relay internally), and it really tends to expect single-threaded, sequential execution.

For example, `this.cancel()` is intended to update the current state (`this._state = `), and deferring that means iteration could continue or fall into an infinite loop.

The idea of scheduling throughout Relay is not to defer any part of the computation, but to batch together things that are already going to be handled concurrently anyway: notably DOM updates and fetching.
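
A contrived sketch (not Relay's actual OperationExecutor) of the hazard: when `cancel()` merely schedules the state write, synchronous code that checks the state keeps iterating on the stale value.

```js
// Hypothetical executor, standing in for any loop that checks a state
// flag synchronously while processing events.
class Executor {
  constructor() {
    this._state = 'started';
  }

  cancelSync() {
    this._state = 'completed'; // observed immediately by process() below
  }

  cancelDeferred(schedule) {
    schedule(() => {
      this._state = 'completed'; // runs in a later task; this tick never sees it
    });
  }

  process(events, onEvent) {
    for (const event of events) {
      if (this._state === 'completed') return; // stop once cancelled
      onEvent(event);
    }
  }
}

const sync = new Executor();
sync.process([1, 2, 3], e => {
  if (e === 1) sync.cancelSync(); // events 2 and 3 are skipped
});

const deferred = new Executor();
deferred.process([1, 2, 3], e => {
  if (e === 1) deferred.cancelDeferred(queueMicrotask); // 2 and 3 still run
});
```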

tomgasson (Collaborator) commented:

Have you got a description of the problem or a reduced test case example (internal or external)?

I definitely think you know enough to have a valid case here, but it's likely something which could instead be solved by:

  • Working out the actual event ordering and fixing a bug (which would most likely rely on some other synchronous checks added to this file)
  • Working out how the different events into Relay eventually come out the other side, and changing how those are batched up for React

kyle-painter (Collaborator, Author) commented:

I've only observed this issue in a local app environment, and I've had very little success debugging the magic series of inputs/outputs that leads to the error. I also attempted to create a minimal reproduction from the relay-examples repo, again with no luck.

To briefly summarise: we have a connection of rows in a table, and we execute a mutation to create a new row. This row is optimistically appended to the connection using @appendNode, and when a network error occurs Relay attempts the rollback.
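
For reference, a hypothetical sketch of the mutation shape described above (the type, field, and variable names, the connection id, and the optimistic payload are all illustrative, not taken from the actual app):

```js
import { commitMutation, graphql } from 'react-relay';

const mutation = graphql`
  mutation CreateRowMutation($input: CreateRowInput!, $connections: [ID!]!) {
    createRow(input: $input) {
      row @appendNode(connections: $connections, edgeTypeName: "RowEdge") {
        id
        name
      }
    }
  }
`;

commitMutation(environment, { // environment: the app's Relay environment
  mutation,
  variables: {
    input: { name: 'New row' },
    connections: ['client:root:tableConnection'], // the connection's id
  },
  optimisticResponse: {
    createRow: {
      row: { id: 'client:new-row', name: 'New row' },
    },
  },
  onError() {
    // A network error lands here; Relay reverts the optimistic append,
    // which is the rollback path where the quoted warning is hit.
  },
});
```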

<Table ref="client:root:tableConnection">
  <Row ref="client:root:tableConnection:edges:0" />
  <Row ref="client:root:tableConnection:edges:1" />
  <Row ref="client:root:tableConnection:edges:2" />
</Table>

Mutation is executed

<Table ref="client:root:tableConnection">
  <Row ref="client:root:tableConnection:edges:0" />
  <Row ref="client:root:tableConnection:edges:1" />
  <Row ref="client:root:tableConnection:edges:2" />
  <Row ref="client:root:tableConnection:edges:3" />  // Row is appended after the optimistic update
</Table>

A network error occurs and the rollback begins. During the rollback a synchronous update is flushed, causing the table to be re-rendered. Interestingly, the table connection still includes 4 edges, but while trying to call useFragment from within the Row component, Relay cannot find the record and logs the warning:

Warning: Relay: Expected to have been able to read non-null data for fragment `?` declared in `useFragment()`, since fragment reference was non-null. Make sure that that `useFragment()`'s parent isn't holding on to and/or passing a fragment reference for data that has been deleted. 

A very obscure issue to try and debug. The rendering behaviour appeared to change based on which fields I specified within my Row fragment, e.g. with Row.fieldA the error occurs, and without Row.fieldA it behaves as expected.

What I did narrow down, however, is that the rollback works as expected when a successful GraphQL payload is returned, and that removing the configured task scheduler makes that case fail in the same fashion. This is how I landed on the above outcome. I'm happy to raise a PR upstream for further feedback. We could also consider moving the scheduler down specifically to where the rollback is executed during cancel, so the current state update can remain synchronous.
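
Roughly what that narrower alternative could look like. This is a sketch from my reading of the OperationExecutor, not the real code: member names are approximate, and `_schedule` stands in for running a task through the configured scheduler.

```js
class OperationExecutorSketch {
  cancel() {
    // The state transition stays synchronous, so in-flight iteration
    // still observes it immediately.
    this._state = 'completed';
    const optimisticUpdates = this._optimisticUpdates;
    if (optimisticUpdates !== null) {
      this._optimisticUpdates = null;
      // Only the store rollback is deferred, batching it with other
      // pending updates instead of flushing it mid-teardown.
      this._schedule(() => {
        optimisticUpdates.forEach(u => this._publishQueue.revertUpdate(u));
        this._publishQueue.run();
      });
    }
  }
}
```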
