Make Container Attach Retriable for Non-Runtime Failures#19246
Merged
anthony-murphy merged 49 commits intoJan 30, 2024
Conversation
…to make-attach-retyable
…to make-attach-retyable
…to make-attach-retyable
…to make-attach-retyable
…to make-attach-retyable
dannimad
reviewed
Jan 16, 2024
dannimad
reviewed
Jan 16, 2024
dannimad
reviewed
Jan 16, 2024
…phy/FluidFramework-1 into make-attach-retyable
Co-authored-by: Matt Rakow <ChumpChief@users.noreply.github.com>
…phy/FluidFramework-1 into make-attach-retyable
dannimad
approved these changes
Jan 29, 2024
Contributor
dannimad
left a comment
There was a problem hiding this comment.
LGTM and I don't see any outstanding comments.
ChumpChief
reviewed
Jan 30, 2024
ChumpChief
approved these changes
Jan 30, 2024
Contributor
ChumpChief
left a comment
There was a problem hiding this comment.
I'm a little nervous about expanding the capabilities of attach (adding retriability) given that it's a bit convoluted already, but this seems like a fine approach.
One scary comment to consider, rest are more minor.
Co-authored-by: Matt Rakow <ChumpChief@users.noreply.github.com>
…to make-attach-retyable
…to make-attach-retyable
anthony-murphy
added a commit
that referenced
this pull request
Jan 31, 2024
Added some new tests in #19246 that are failing CI due to the increased the compat matrix. This moves them to the reduced compat section of the file to prevent failures. Co-authored-by: Tony Murphy <anthonm@microsoft.com>
alexvy86
pushed a commit
to alexvy86/FluidFramework
that referenced
this pull request
Feb 13, 2024
…9246) ### Background This change is in preparation for allowing consumers of fluid to call serialize on a container while a container is in an attaching state, which will include a failed attach where the failure didn't cause the container to close. This will aid in the case of offline, network, or server outage by allowing the application to capture the un-attached state of the container, preserve it locally, and later create a new file from it. AB#5502 ### Overview In order to make attachment retriable I've made explicit types that represent each stage of the attachment state machine and called it `AttachmentData`. The container tracks the current attachment data, gets the `AttachState`, and gets the base snapshot information in the case of offline for it. The biggest changes are to the `attach` method itself. Now that is retriable, we had to remove the `_attachStarted` member which gated the call to a single instance. I've replaced that with the `runSingle` helper method which ensure only one attach promise is running at a time and ensure that all the parameters for all parallel watchers of that promise used the same arguments. I've also encapsulated the management of the state machine in a new method `runRetriableAttachProcess `. The primary reason for this is testability, as the new method is easily unit testable, and I've added a number of unit test to validate its behavior. testability was achieved by limiting the number of dependencies the method needed, so it uses callbacks to perform the actual operations necessary to attach the container. The split also helped to segment the error handling for the container. Specifically, only the callbacks that engage with the runtime will close the container on failure, as if we can't create a summary, or propagate an attachment state, we can't safely retry. The container creates the callbacks for `runRetriableAttachProcess ` as follows: - `setAttachmentData` manages the attachment data member, which is exposed via AttachState, and performs eventing and runtime callbacks. - `createAttachmentSummary` creates the summary from the runtime, add the protocol tree, and the closes the container if any of that fails - `createOrGetStorageService` resolves the request, calls create container, and sets up the containers document services. it only returns the storage service, as that is all that might be needed to upload blobs. Lastly the container's attach method configures the delta connection, and logs. As was previously the case, we still error if attach is called for an invalid state, and we still limit serialize calls to only being supported in a detached state. AB#5911 --------- Co-authored-by: Tony Murphy <anthonm@microsoft.com> Co-authored-by: jzaffiro <110866475+jzaffiro@users.noreply.github.com> Co-authored-by: Matt Rakow <ChumpChief@users.noreply.github.com>
alexvy86
pushed a commit
to alexvy86/FluidFramework
that referenced
this pull request
Feb 13, 2024
…oft#19402) Added some new tests in microsoft#19246 that are failing CI due to the increased the compat matrix. This moves them to the reduced compat section of the file to prevent failures. Co-authored-by: Tony Murphy <anthonm@microsoft.com>
anthony-murphy
added a commit
that referenced
this pull request
Mar 1, 2024
…19893) ## Background This change is the final change for allowing consumers of fluid to call serialize on a container while a container is in an attaching state, which will include a failed attach where the failure didn't cause the container to close. This will aid in the case of offline, network, or server outage by allowing the application to capture the un-attached state of the container, preserve it locally, and later create a new file from it. [AB#5502](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/5502) ## Overview This change enables calling container.serialize while a container is in the attaching state. It is the culmination of many smaller changes that have gone in to enable this change. - #18829 - #19400 - #19246 - #19517 - #19518 - #19590 - #19634 - #19802 --------- Co-authored-by: Tony Murphy <anthonm@microsoft.com> Co-authored-by: Daniel Madrid <105010181+dannimad@users.noreply.github.com> Co-authored-by: jzaffiro <110866475+jzaffiro@users.noreply.github.com> Co-authored-by: Matt Rakow <ChumpChief@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Background
This change is in preparation for allowing consumers of fluid to call serialize on a container while a container is in an attaching state, which will include a failed attach where the failure didn't cause the container to close. This will aid in the case of offline, network, or server outage by allowing the application to capture the un-attached state of the container, preserve it locally, and later create a new file from it.
AB#5502
Overview
In order to make attachment retriable I've made explicit types that represent each stage of the attachment state machine and called it
AttachmentData. The container tracks the current attachment data, gets theAttachState, and gets the base snapshot information in the case of offline for it.The biggest changes are to the
attachmethod itself.Now that is retriable, we had to remove the
_attachStartedmember which gated the call to a single instance. I've replaced that with therunSinglehelper method which ensure only one attach promise is running at a time and ensure that all the parameters for all parallel watchers of that promise used the same arguments.I've also encapsulated the management of the state machine in a new method
runRetriableAttachProcess. The primary reason for this is testability, as the new method is easily unit testable, and I've added a number of unit test to validate its behavior. testability was achieved by limiting the number of dependencies the method needed, so it uses callbacks to perform the actual operations necessary to attach the container.The split also helped to segment the error handling for the container. Specifically, only the callbacks that engage with the runtime will close the container on failure, as if we can't create a summary, or propagate an attachment state, we can't safely retry.
The container creates the callbacks for
runRetriableAttachProcessas follows:setAttachmentDatamanages the attachment data member, which is exposed via AttachState, and performs eventing and runtime callbacks.createAttachmentSummarycreates the summary from the runtime, add the protocol tree, and the closes the container if any of that failscreateOrGetStorageServiceresolves the request, calls create container, and sets up the containers document services. it only returns the storage service, as that is all that might be needed to upload blobs.Lastly the container's attach method configures the delta connection, and logs. As was previously the case, we still error if attach is called for an invalid state, and we still limit serialize calls to only being supported in a detached state.
AB#5911