Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new phone who dis #335

Closed
aboodman opened this issue Nov 18, 2020 · 5 comments · Fixed by #880
Closed

new phone who dis #335

aboodman opened this issue Nov 18, 2020 · 5 comments · Fixed by #880
Milestone

Comments

@aboodman
Copy link
Contributor

It is frequently the case, especially during development, but occasionally in production too, that a server just wants to drop a particular client's state. Just... forget it. rm -rf, start over.

Unfortunately, when that happens, the client is then wedged. It goes to sync, uses its existing client ID, and the data layer doesn't recognize them.

We need a way for the server and client to negotiate this, and end up with the client getting a fresh state too.

@aboodman aboodman assigned phritz and unassigned phritz Nov 18, 2020
@aboodman
Copy link
Contributor Author

cc @phritz . I'm not sure what the priority of this should be, but it does happen quite frequently. Probably will happen to almost every user at some point during development.

@aboodman
Copy link
Contributor Author

I am not sure if the right thing is for the client to always interpret this particular condition as it should start over, or if it should be an explicit action of some type that the server takes to tell clients to nuke themselves.

@phritz
Copy link
Contributor

phritz commented Nov 18, 2020

I wanna explore what we expect to happen in the following cases.

  • brand new client pushes to data layer: data layer authenticates the user and then creates a lmid record for the client while applying the changes
  • brand new client pulls (without first pushing): data layer auths the user and... i think we want it to return the clientview with lmid 0 even though a lmid record doesn't exist for the client. (i guess it could create one if it wanted to).
  • if data layer forgets the clientid
    • client pushes with old clientid: the data layer doesn't have the clientid but maybe can look at the mutation ids to tell if it is new or old. if the mutations being pushed start with mutation 1 i don't think we can tell whether this is an old client that hadn't pushed any data or a brand new client. but if the first mutation being pushed has id > 1 and we dont have the clientid then we know this is a old client pushing. in this case we can signal back to the app to start over.
    • client pulls with the old clientid: we said above that if the clientview gets a request for an unknown clientid it should assume it is new return the view with a lmid 0. however we want to reset it if it is old. i dont think there is any way of telling whether that is the case right now but if we passed the client's lmid in the clientview request then it could check whether the client's lmid is > 0. if so and we have no record of the clientid, it's an old client and should be reset. however again like for push i dont think we can distinguish between a brand new client and an old client that hasn't pushed any mutations.

seems like generally we can detect old clients who should be nuked so long as they have pushed at least one mutation. i'm not sure if it is important to be able to detect old clients who should be nuked who have not pushed a mutation.

regarding wether the client should detect or server signal for it to nuke itself. i suspect we probably want a signal from the server to the client to nuke itself so the server can correct its mistakes.

as for the reset itself, it seems like the app should drive the reset of repc because the app is also going to have to do its own reset/bookkeeping. to reset repc do we... drop the database? if so what happens to stuff that is mid-flight in another tab? or if one tab comes to life after another resets, can it detect that gracefully and cause its app instance to reset? what's required to reset the bindings -- do we tear everything down and start completely over eg with registering mutations etc?

@aboodman aboodman transferred this issue from rocicorp/replicache-old Mar 25, 2021
@arv arv mentioned this issue Nov 5, 2021
31 tasks
grgbkr added a commit that referenced this issue Nov 22, 2021
…ing bootstraping from existing client state. (#712)

Simplified Dueling Dags always creates a new Client for each new tab.  To enable fast startup of new tabs utilizing previous stored data Simplified Dueling Dags bootstraps new clients by forking an existing Client's state. 

When forking from another Client, the fork should be based on the existing Client's most recent base snapshot (which may not be its latest head).  This is necessary because pending mutations (LocalMutationCommits) cannot be forked as the last mutation id series is different per client.

It is important that the last mutation id for the new client be set to 0, since a replicache server implementation will start clients for which they do not have a last mutation id stored at last mutation id 0.  If the server receives a request from a client with a non-0 last mutation id, for which it does not have a last mutation id stored, it knows that it is unsafe for it to execute mutations form the client, as it could result in re-running mutations or otherwise failing to guarantee sequential execution of mutations.  This tells the server that this is an old client that it has GC'd (we need some way to signal this to the client so it can reset itself, see #335). 

When choosing a Client to bootstrap from, it is safe to pick any Client, but it is ideal to chose the Client with the most recent snapshot from the server.  Currently the age of snapshots is not stored, so this implementation uses a heuristic of choosing the base snapshot of the Client with the newest heartbeat timestamp. 

See larger design at https://www.notion.so/Simplified-DD1-1ed242a8c1094d9ca3734c46d65ffce4#64e4299105dd490a9ffbc6c9c771f5d2

Part of #671
@aboodman aboodman added this to the R3: Replicache Ready Release milestone Mar 7, 2022
@arv
Copy link
Contributor

arv commented Mar 18, 2022

Proposal

Expand PullResponse to be something like:

type PullResponseOK = {
  cookie?: ReadonlyJSONValue;
  lastMutationID: number;
  patch: PatchOperation[];
};

type ClientStateNotFoundResponse = {clientStateNotFound: true};

type PullResponse = PullResponse | ClientStateNotFoundResponse;

When we get a ClientStateNotFoundResponse we call onClientStateNotFound (yes, we reuse the existing callback)

@aboodman
Copy link
Contributor Author

aboodman commented Mar 18, 2022 via email

arv added a commit that referenced this issue Mar 18, 2022
The server can now tell the client that it does not know about a client.
When this happens the client calls `onClientStateNotFound`.

The server can return:

```json
{"error": "ClientStateNotFound"}
```

Fixes #335
arv added a commit that referenced this issue Mar 18, 2022
The server can now tell the client that it does not know about a client.
When this happens the client calls `onClientStateNotFound`.

The server can return:

```json
{"error": "ClientStateNotFound"}
```

Fixes #335
arv added a commit that referenced this issue Mar 25, 2022
The server can now tell the client that it does not know about a client.
When this happens the client calls `onClientStateNotFound`.

The server can return:

```json
{"error": "ClientStateNotFound"}
```

Fixes #335
@arv arv closed this as completed in #880 Mar 25, 2022
arv added a commit that referenced this issue Mar 25, 2022
The server can now tell the client that it does not know about a client.
When this happens the client calls `onClientStateNotFound`.

The server can return:

```json
{"error": "ClientStateNotFound"}
```

Fixes #335
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants