Possible fix for PredictionThreshold error dropping rollback requests causing desync #77
Here are some changes and comments that seem to fix #75
This is a draft PR: I am not necessarily suggesting we merge it as is (especially the test, which is included for reference to show what conditions cause this issue). I opened it as a PR mostly so it is easier to comment on specific pieces of the diff.
This fixes the test, and also seems to prevent any further desync in Jumpy.
Overview
The core of the problem is that when adding input at max prediction, we return a PredictionThreshold error, and the requests are then never returned to or processed by the client. By this point, we had already cleared first_incorrect_frame, updated the confirmed frame on the sync layer, and dropped inputs from before the confirmed frame. This is a problem because the pre-confirmed-frame inputs that were meant to be used in the correction were never actually processed and are now gone.
Now we speculatively check earlier in advance_frame whether the client will hit the prediction error. At this point, sync_layer.confirmed_frame has not been updated yet, so the check uses the local 'speculative' confirmed frame value in advance_frame.
We now reset first_incorrect_frame and commit the speculative confirmed frame to the sync layer only when we are certain we are not at the max prediction window. This means that if we do error, future frames will still determine that a rollback is needed because first_incorrect_frame is preserved, and our inputs and last confirmed frame are still valid.
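To make the ordering concrete, here is a minimal toy model of the "check first, commit after" idea. It is a sketch, not the code in this diff: `Session`, `AdvanceError`, `NULL_FRAME`, the field layout, and the exact threshold comparison are all hypothetical stand-ins, and the real advance_frame does considerably more work.

```rust
// Toy model only: every name here is hypothetical and is NOT the library's real API.
#[derive(Debug, PartialEq)]
enum AdvanceError {
    PredictionThreshold,
}

// Sentinel meaning "no rollback is pending" (assumed convention for this sketch).
const NULL_FRAME: i32 = -1;

struct Session {
    max_prediction: i32,
    current_frame: i32,
    // Confirmed frame as committed on the (imaginary) sync layer.
    confirmed_frame: i32,
    // Earliest frame known to have used a wrong prediction, or NULL_FRAME.
    first_incorrect_frame: i32,
}

impl Session {
    /// On success, returns the frame to roll back to, if a rollback is pending.
    /// `speculative_confirmed` is the confirmed frame computed locally for this
    /// call; it has not been written to `self.confirmed_frame` yet.
    fn advance_frame(&mut self, speculative_confirmed: i32) -> Result<Option<i32>, AdvanceError> {
        // 1. Check the threshold against the speculative value BEFORE mutating
        //    any state. The `>=` comparison is an assumption of this sketch.
        if self.current_frame - speculative_confirmed >= self.max_prediction {
            // Nothing was cleared or committed, so a later call can still
            // detect and perform the pending rollback.
            return Err(AdvanceError::PredictionThreshold);
        }

        // 2. We are certain we will not error, so it is safe to commit.
        self.confirmed_frame = speculative_confirmed;
        let rollback_to = if self.first_incorrect_frame != NULL_FRAME {
            Some(self.first_incorrect_frame)
        } else {
            None
        };
        self.first_incorrect_frame = NULL_FRAME;
        self.current_frame += 1;
        Ok(rollback_to)
    }
}
```

In this model, a call that fails with PredictionThreshold leaves first_incorrect_frame and confirmed_frame untouched, so the next successful call still reports the pending rollback frame.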
Other Notes:
I put a couple of TODOs around other spots that I expect will cause desync; I have not handled all failure cases related to this issue. I will investigate those further later and reproduce the desync for those cases in their own tests to verify.
I also have not reviewed the sync test session or any spectator code; there may be related issues there.