Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reset joinSession cache only on connect_document_error #19974

Merged
merged 7 commits into from
Mar 18, 2024

Conversation

pragya91
Copy link
Contributor

@pragya91 pragya91 commented Mar 6, 2024

The response of joinSession api is saved in the cache (for the duration it is valid) so that, in case there is a socket disconnect, we can use the same socket endpoint from the cache instead of making a new joinSession network request.
Up until now, the odsp-driver code was clearing the cache on any error that occurred on the socket, which is incorrect.

This PR changes the code such that it only clears the cache if the error is a fluid protocol error (connect_document_error), and does not clear the cache if the error is socket.io error (connect_error).

AB#7057

@github-actions github-actions bot added area: driver Driver related issues area: odsp-driver base: main PRs targeted against main branch and removed area: driver Driver related issues area: odsp-driver labels Mar 6, 2024
@pragya91 pragya91 requested review from a team and andre4i March 6, 2024 14:37
@pragya91 pragya91 marked this pull request as ready for review March 6, 2024 14:38
@pragya91 pragya91 force-pushed the praggarg/no-reset-joinsession-cache branch from ffddea8 to 91214b6 Compare March 7, 2024 18:44
@github-actions github-actions bot added area: driver Driver related issues area: odsp-driver area: tests Tests to add, test infrastructure improvements, etc public api change Changes to a public API labels Mar 7, 2024
@github-actions github-actions bot removed public api change Changes to a public API area: tests Tests to add, test infrastructure improvements, etc labels Mar 7, 2024
@msfluid-bot
Copy link
Collaborator

msfluid-bot commented Mar 11, 2024

@fluid-example/bundle-size-tests: +54 Bytes
Metric NameBaseline SizeCompare SizeSize Diff
aqueduct.js 514.59 KB 514.59 KB No change
connectionState.js 680 Bytes 680 Bytes No change
containerRuntime.js 249.02 KB 249.02 KB No change
loader.js 128.15 KB 128.15 KB No change
map.js 41.36 KB 41.36 KB No change
matrix.js 143.67 KB 143.67 KB No change
odspDriver.js 97.5 KB 97.5 KB No change
odspPrefetchSnapshot.js 41.87 KB 41.87 KB No change
sharedString.js 161.26 KB 161.26 KB No change
sharedTree.js 330.11 KB 330.11 KB No change
Total Size 1.81 MB 1.81 MB +54 Bytes

Baseline commit: 1d07902

Generated by 🚫 dangerJS against b6baf1a

@@ -763,6 +763,8 @@ export class DocumentDeltaConnection
details: JSON.stringify({
...this.getConnectionDetailsProps(),
}),
// We use this param to clear the joinSession cache if the error happens in connect_document flow.
isSocketError: handler === "connect_document_error" ? false : true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the comment where this is called from, it looks like this is not a socket error. So, maybe it should not be called socket error:
image

Consider adding the handler directly as errorDetails or something a property and the call site can decide based on that to clear cache or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, consider adding this new property to IAnyDriverError (or any of its base interfaces) to make it type safe. That way, the call site doesn't have to cast to any.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of adding the handler value into the error props itself. i.e. instead of having isSocketError flag, just adding a property like {errorFrom: handler} into the error props. That way all the possible values will be visible to the odspDelayLoadedDeltaStream to consume if needed in the future.

I am not too confident about adding the property into the IAnyDriverError interface (primarily because it is a more committing change :D). @vladsud If you have any insights to share about this, would really appreciate it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agarwal-navin Wanted to confirm with you if you wanted to do one last review on this PR before I merge. I have made the changes to add the complete handler property value to the error object, but not adding the prop to the IAnyDriverError because I don't see it adding value anytime soon for other drivers.

@pragya91 pragya91 merged commit 1f382d4 into microsoft:main Mar 18, 2024
31 checks passed
pragya91 added a commit that referenced this pull request Apr 22, 2024
Reverts #19974 PR which
was applied as a performance improvement. In the original PR, we checked
for the kind of error received on the socket to decide whether to clear
joinSession cache or not and reuse the existing session whenever
possible. However a bug was identified where one section of the code was
not update to stamp the errorFrom property which is used to perform such
check which resulted in the cache never clearing for and the code
getting stuck in a loop to continue connection retries.

Impact of revert: There is no compatibility-issue/regression that this
revert would introduce, as it was only a perf improvement.

Follow up: Captured in
[AB#7833](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/7833)
pragya91 added a commit to pragya91/FluidFramework-1 that referenced this pull request Apr 22, 2024
…soft#20784)

Reverts microsoft#19974 PR which
was applied as a performance improvement. In the original PR, we checked
for the kind of error received on the socket to decide whether to clear
joinSession cache or not and reuse the existing session whenever
possible. However a bug was identified where one section of the code was
not update to stamp the errorFrom property which is used to perform such
check which resulted in the cache never clearing for and the code
getting stuck in a loop to continue connection retries.

Impact of revert: There is no compatibility-issue/regression that this
revert would introduce, as it was only a perf improvement.

Follow up: Captured in
[AB#7833](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/7833)
pragya91 added a commit that referenced this pull request Apr 22, 2024
… (#20789)

Reverts #19974 PR which
was applied as a performance improvement. In the original PR, we checked
for the kind of error received on the socket to decide whether to clear
joinSession cache or not and reuse the existing session whenever
possible. However a bug was identified where one section of the code was
not update to stamp the errorFrom property which is used to perform such
check which resulted in the cache never clearing for and the code
getting stuck in a loop to continue connection retries.

Impact of revert: There is no compatibility-issue/regression that this
revert would introduce, as it was only a perf improvement.

Follow up: Captured in

[AB#7833](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/7833)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: driver Driver related issues area: odsp-driver base: main PRs targeted against main branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants