GraphQL replication plugin fires exponentially increasing number of requests in case of server errors #2048
Comments
I am not sure if this is a bug. EDIT: Ok, so the network inspector shows that we run too many requests. I will investigate.
I played around a bit and also added a test. Everything looks fine to me.
From the code, it looks like _runQueueCount only prevents more than two instances of the run method from executing in parallel. But since the retries are scheduled using a timeout, they may fire at different times, so many of them execute one after the other?
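The guard behaviour described above can be sketched with a minimal simulation (illustrative names, not the real rxdb internals): the queue counter only rejects a run when more than two are queued *at the same moment*, so retries that arrive staggered in time each pass the check.

```typescript
// Hypothetical sketch: the guard only drops a run when 3+ runs overlap.
// Retries fired from separate setTimeout callbacks arrive one after the
// other, so none of them overlap and every one is executed.
let runQueueCount = 0;
let executed = 0;
let ignored = 0;

function run(): void {
  if (runQueueCount > 2) {
    ignored++; // only triggers when 3+ runs are queued simultaneously
    return;
  }
  runQueueCount++;
  executed++;
  runQueueCount--; // in the plugin this happens when the chained promise settles
}

// five staggered retries: none overlap, so the guard never rejects one
for (let i = 0; i < 5; i++) run();
console.log(`executed=${executed} ignored=${ignored}`); // executed=5 ignored=0
```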
See the test. There I call
I am trying to add logs to understand better what's happening. But off the top of my head, this test does not cover the scenario where runPush (or runPull) fails. I am observing this bug only when there is a failure in push or pull.
So I've added logs whenever the run method is scheduled (patch attached below). Here is what is happening:

Steps 2 & 3 above happen nearly at the same time. r3 & r4 will execute in parallel. r5 & r6 will execute in parallel at a different time, since the network response time (and Node.js scheduler latencies) lead to them being scheduled at different moments. Each of them will schedule 2 more requests to run. Occasionally _runQueueCount becomes > 3 and a request gets ignored.

Log output of the initial run invocation:

```
Running sync scheduled at  at 2020-04-21T09:02:08.540Z runQueueCount: 0
```

Patch used to produce the logs:

```diff
diff --git a/src/plugins/replication-graphql/index.ts b/src/plugins/replication-graphql/index.ts
index 0006ac31..cb4eef94 100644
--- a/src/plugins/replication-graphql/index.ts
+++ b/src/plugins/replication-graphql/index.ts
@@ -137,15 +137,18 @@ export class RxGraphQLReplicationState {
     }

     // ensures this._run() does not run in parallel
-    async run(): Promise<void> {
+    async run(t: string = ""): Promise<void> {
         if (this.isStopped()) {
             return;
         }
         if (this._runQueueCount > 2) {
+            console.log("Ignoring run request scheduled at ", t);
             return this._runningPromise;
         }
+        const startTime = new Date();
+        console.log(`Running sync scheduled at ${t} at ${startTime.toISOString()} runQueueCount: ${this._runQueueCount}`);
         this._runQueueCount++;
         this._runningPromise = this._runningPromise.then(async () => {
             this._subjects.active.next(true);
@@ -155,6 +158,8 @@ export class RxGraphQLReplicationState {
             if (!willRetry && this._subjects.initialReplicationComplete['_value'] === false)
                 this._subjects.initialReplicationComplete.next(true);
+
+            console.log(`Sync complete for run triggered at ${t} in ${new Date().valueOf() - startTime.valueOf()}ms runQueueCount: ${this._runQueueCount}`);
             this._runQueueCount--;
         });
         return this._runningPromise;
@@ -167,7 +172,9 @@
             const ok = await this.runPush();
             if (!ok) {
                 willRetry = true;
-                setTimeout(() => this.run(), this.retryTime);
+                const t = new Date().toISOString();
+                console.log("Retry from run after push", t);
+                setTimeout(() => this.run(t), this.retryTime);
             }
         }
@@ -175,6 +182,7 @@
             const ok = await this.runPull();
             if (!ok) {
                 willRetry = true;
+                console.log("Retry from run after pull", new Date().toISOString());
                 setTimeout(() => this.run(), this.retryTime);
             }
         }
@@ -203,6 +211,8 @@
             }
         } catch (err) {
             this._subjects.error.next(err);
+
+            console.log("Retry from runPull", new Date().toISOString());
             setTimeout(() => this.run(), this.retryTime);
             return false;
         }
@@ -308,7 +318,10 @@
             }
             this._subjects.error.next(err);
-            setTimeout(() => this.run(), this.retryTime);
+            const t = new Date().toISOString();
+            console.log("Retry from runPush", t);
+
+            setTimeout(() => this.run(t), this.retryTime);
             return false;
         }
```
I just realized the reason why the second request (r2) finishes at 52ms (almost exactly twice as long as r1) is that we chain the requests in the run method: rxdb/src/plugins/replication-graphql/index.ts, line 158 in 53e0fd5.
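The chaining pattern referenced above can be sketched in isolation (illustrative names and timings, not the real rxdb code): every call appends onto the same promise, so a second run only starts after the first completes, and its observed latency is roughly doubled.

```typescript
// Each run() appends to a single promise chain, serializing execution:
// r2 must wait for r1 to finish before its own ~25ms of work begins,
// so r2's total latency from t0 is roughly twice r1's.
const finishedAt: number[] = [];
const delay = (ms: number) => new Promise<void>(r => setTimeout(r, ms));

let runningPromise: Promise<void> = Promise.resolve();
const t0 = Date.now();

function run(label: string): Promise<void> {
  runningPromise = runningPromise.then(async () => {
    await delay(25); // stand-in for one network round-trip
    finishedAt.push(Date.now() - t0);
    console.log(`${label} finished after ~${Date.now() - t0}ms`);
  });
  return runningPromise;
}

run('r1'); // finishes after ~25ms
run('r2'); // finishes after ~50ms, because it had to wait for r1 first
```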
Yes, the
@pubkey Have added a test case. Please take a look |
@gautambt thank you, that helped. I added a fix and also changed the behavior on errors, please check the attached commit.
Awesome :) This looks good to me |
I am still seeing this issue on 9.0.0-beta.11 |
I'm closing this because I think the original issue is fixed. |
Issue
In scenarios where the server returns a GraphQL error, the number of retry requests being fired keeps increasing over time. This happens because the retry is scheduled from both the runPush method and the run method:
rxdb/src/plugins/replication-graphql/index.ts
Line 168 in 1c98328
rxdb/src/plugins/replication-graphql/index.ts
Line 302 in 1c98328
On an error, runPush method will schedule a retry and run method will also schedule a retry. When both the retries fire they will schedule two more retries each and so on.
The fix seems to be to remove the retries from runPush & runPull and do the retry once, in the _run method, when either method fails. I can raise a pull request if this looks like the correct fix.
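The growth described above can be sketched with a minimal simulation (illustrative, not the real plugin code): with the double scheduling, every failed cycle produces two retries, each of which fails and produces two more, so the number of requests per retry interval doubles; with a single retry in _run, it stays constant.

```typescript
// Per failed cycle: the buggy path schedules 2 retries (one from runPush,
// one from run), the fixed path schedules exactly 1.
const buggy = (pending: number) => pending * 2;
const fixed = (pending: number) => pending * 1;

let b = 1;
let f = 1;
const buggySeries: number[] = [];
const fixedSeries: number[] = [];
for (let i = 0; i < 5; i++) {
  buggySeries.push(b);
  fixedSeries.push(f);
  b = buggy(b); // each retry fails again and schedules two more
  f = fixed(f); // each retry fails again and schedules exactly one more
}
console.log(buggySeries.join(', ')); // 1, 2, 4, 8, 16
console.log(fixedSeries.join(', ')); // 1, 1, 1, 1, 1
```

This matches the doubling observed in the reproduction below, where the request count goes from 3 to 6 to 12 per retry interval.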
Info
Code
The issue can be reproduced by applying the following patch to the heros example and trying to insert a hero. In the network console you will see 3 requests (1 OPTIONS request, 1 POST for setHuman and 1 POST for feedForRxDBReplication). After 10 seconds you will see 6 requests, after 20 seconds you will see 12 requests, and so on.