Proposal: Get rid of object attachment flow #9575
Comments
It's also worth mentioning that the existing state tracking and attach flow is not entirely correct; it somehow just works. This PR (#9548) attempts to fix a bunch of the issues and simplify the flow, but there are still issues. For instance, the state tracking in handles cannot be made consistent with the underlying object and with other handles to the same object. It will fall apart if someone creates additional handles to data stores / DDSes and starts using those instead of the ones FF creates: SharedObject has its own handle, and data store handles are created by PureDataObject in aqueduct. We have to fix these issues one way or another, and the proposal in this work item seems like the right path forward.
For 4, one additional consideration is that the DataObject associated with the data store can potentially have access to it after it is attached, even without the handle. We need to figure out how to communicate that "initialization" is completed and that the DataObject can safely load and rematerialize on the remote client.
Re # 2: PureDataObject.initializeInternal() is async, so orderSequentially() around creation will not work.
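To illustrate the point about async initialization, here is a minimal sketch. It assumes orderSequentially() takes a synchronous callback, as the comment above implies; `ToyRuntime`, `submit`, and the op names are hypothetical stand-ins, not the Fluid Framework API.

```typescript
// Toy model only: ToyRuntime and its members are hypothetical, not the real runtime.
type Op = { name: string; inBatch: boolean };

class ToyRuntime {
    public readonly ops: Op[] = [];
    private batching = false;

    // Runs a synchronous callback; ops submitted while it runs are marked as batched.
    public orderSequentially(callback: () => void): void {
        this.batching = true;
        try {
            callback();
        } finally {
            this.batching = false;
        }
    }

    public submit(name: string): void {
        this.ops.push({ name, inBatch: this.batching });
    }
}

// Stand-in for an async initializer such as PureDataObject.initializeInternal().
async function initializeInternal(runtime: ToyRuntime): Promise<void> {
    runtime.submit("create"); // runs synchronously, while the batch is still open
    await Promise.resolve(); // yields; the batch callback returns at this point
    runtime.submit("initialize"); // runs in a later microtask, after the batch closed
}

const runtime = new ToyRuntime();
let initialized: Promise<void> = Promise.resolve();
runtime.orderSequentially(() => {
    initialized = initializeInternal(runtime);
});

// Right after the batch closes, only the op submitted before the first await exists.
const opsAtBatchEnd = runtime.ops.map((op) => `${op.name}:${op.inBatch}`);

// Once initialization finishes, the second op is present but was not batched.
initialized.then(() => console.log(runtime.ops));
```

In this toy model, everything after the first `await` escapes the batch, which is why wrapping creation in a synchronous batch does not cover async initialization.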
Stepping back, it's basically two angles of concern:
@curtisman, I think it's the other way around (see # 1 above in this post).
I might be misunderstanding something here, but I thought you wanted things attached immediately when they are created. So the sequence of how data objects are initialized that you described above would need to change (i.e., instead of createDetachedDataStore as the first step, it would just be createDataStore). I believe your # 1 is the problem, where a remote client (the summarizer) will rematerialize a partially initialized data store and create a DataObject that doesn't expect things to be partially initialized.
I'm practical :) I want to remove any tracking (binding) and the logic around it; that's the main goal. In my view, this is still "attached immediately", in the sense that all the binding code is gone and the object is attached right away, before its visibility (handle or alias) changes. I'm happy to look at the next level (breaking it down even further), but we need to clearly spell out the goals. I'd say the reason we would look into a next round (if we do) is to simplify the summarization paths in code and have just one path (async), if that's possible. That would require those considerations of creating objects (in storage) right away and using ops throughout the creation flow. I think it's doable, but I'd rather not think about it now (and assess whether that's the right direction), unless we believe the state after the first phase (if it becomes permanent) is worse than what we have today.
Originally, I understood the proposal as completely removing delay attach, which has implications for existing patterns that are in use, even though the requirements are not explicitly spelled out at the moment. That is why I provided the list of concerns above. This is a fundamental change. While we want to take steps in the right direction, I am not yet clear that this is the direction that will address our long-term needs. That's why I am asking questions. I agree with you that we need to spell out the requirements for our scenarios and goals. My interpretation of your comment above is that basically you still have "delay attach" for data stores and DDSes, but only within the duration of creation and initialization for the corresponding
I am more at ease with this solution because it only reduces flexibility rather than removing the capability entirely, so the scope of consideration is smaller:
The caveat is that all this reasoning assumes nested channels will be here, but we don't have a plan or design for them yet. Nevertheless, can you amend your proposal to add more detail and specifics about what the change will be? It might be easier to talk about specifics and avoid misunderstanding.
I've updated the description. I'd use this analogy in assessing correctness. But the moment there is any kind of asynchrony in the system, continuation / timer callbacks can be scheduled and inspect the data model (starting with roots). The data model should be consistent (in some definition of consistency) to support the correctness of these processes. Some objects might be in a partially created state at that moment, but they should not be globally accessible (from roots). Basically, I think there should be no unique affordances for remote clients.
Coming back to your comment, I'd phrase it slightly differently. From the runtime's perspective, attachment happens immediately when a data store runtime is attached to a context. The runtime does not specify how this flow is used and leaves it up to users to decide what part of initialization is done at channel (data store runtime) creation and what part happens after the object is attached. Correctness of the system is controlled by the visibility of this object in the system (the ability to reach it from roots) and is the same for remote and local clients.
I believe the only substantial / meaningful change in behavior is the creation of a swarm of objects with cycles. While it has some issues in the current implementation (as described in the description), they are minor and easy to fix. With the new flow, if such objects depend on each other when performing runtime activities (like summarization, i.e., activities not directly invoked by developer actions), then we will run into issues. I believe current scenarios do not have such dependencies (as the current flow does not address correctness in this case either). Channel unification work should address such scenarios (a bit more on that in the description).
Yes, I believe it solves #9127; there should be no difference between an attached and a detached container (other than the existing notification telling DDSs when to start sending ops). I'll point out that it's possible that perf for some future scenarios will be worse with the proposal.
I'd prefer that we collectively move away from init props toward objects exposing a proper object model, with users using it to initialize objects after they are created. In the current model, clients have more control over when an object is attached. I'm not sure this is a runtime problem to solve, though. I.e., we expose detached data store runtime creation (which we use today, as in DataObject), and users can leverage that to build alternative solutions that manually control the attachment process. I'd personally not go that route (at least not for complex interactions involving many objects created at once), and would rely more on the future unification of channels, which would allow all objects under a parent channel to be attached in one go (similar to the cycles solution).
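A rough sketch of the visibility rule described in this comment: the object is attached as soon as it is created, and correctness depends only on whether it can be reached from roots. `ToyContainer`, `ToyStore`, and their members are hypothetical names for illustration, not Fluid Framework types.

```typescript
// Toy model only: hypothetical classes, far simpler than the real runtime.
class ToyStore {
    public readonly id: string;
    // Ids of stores whose handles are stored in this one.
    public readonly handles = new Set<string>();

    constructor(id: string) {
        this.id = id;
    }
}

class ToyContainer {
    private readonly stores = new Map<string, ToyStore>();
    private readonly roots = new Set<string>();

    // Creation registers the store right away: no separate detached state is tracked.
    public createDataStore(id: string, isRoot = false): ToyStore {
        const store = new ToyStore(id);
        this.stores.set(id, store);
        if (isRoot) this.roots.add(id);
        return store;
    }

    public isAttached(id: string): boolean {
        return this.stores.has(id);
    }

    // A store is visible only if a chain of stored handles leads to it from a root.
    public isVisible(id: string): boolean {
        const seen = new Set<string>();
        const pending: string[] = [];
        this.roots.forEach((rootId) => pending.push(rootId));
        while (pending.length > 0) {
            const current = pending.pop();
            if (current === undefined) break;
            if (current === id) return true;
            if (seen.has(current)) continue; // also guards against cycles
            seen.add(current);
            const store = this.stores.get(current);
            if (store !== undefined) store.handles.forEach((h) => pending.push(h));
        }
        return false;
    }
}

const container = new ToyContainer();
const rootStore = container.createDataStore("root", true);
const child = container.createDataStore("child");

// Attached, but possibly still initializing and not reachable from any root.
const beforePublish = [container.isAttached("child"), container.isVisible("child")];

rootStore.handles.add(child.id); // storing the handle is what makes it visible
const afterPublish = [container.isAttached("child"), container.isVisible("child")];
```

Under these assumptions, local and remote clients follow the same rule: a partially initialized object is harmless as long as nothing reachable from roots points to it.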
Is there an issue open where I can learn more about the "future unification of channels" which is mentioned several times in this discussion?
@markfields, I found #3469, but it's very sparse :(
This issue has been automatically marked as stale because it has had no activity for 180 days. It will be closed if no further activity occurs within 8 days of this comment. Thank you for your contributions to Fluid Framework!
Background
Objects (DDSs and data stores) are initially created detached. They (a swarm of detached but connected objects forming a DAG) are tracked and attached together to the attached container whenever a single object's handle is stored in an attached DDS.
There is a ton of code doing this logic and tracking states, and it's spread out across DDSs, Data stores, handles, etc.
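As a rough illustration of the flow described above, here is a toy model in which storing a handle in an attached object attaches the whole detached graph behind it. `ToyObject`, `storeHandle`, and `attachGraph` are hypothetical names; the real logic is spread across DDSs, data stores, and handles and is considerably more involved.

```typescript
// Toy model only: hypothetical class, not the real handle / bind code.
class ToyObject {
    public readonly id: string;
    public attached = false;
    // Objects whose handles are stored in this object (outgoing graph edges).
    private readonly referenced = new Set<ToyObject>();

    constructor(id: string) {
        this.id = id;
    }

    // Storing a handle: if this object is already attached,
    // the target's whole detached subgraph is attached as well.
    public storeHandle(target: ToyObject): void {
        this.referenced.add(target);
        if (this.attached) {
            target.attachGraph();
        }
    }

    // Attach this object and everything reachable from it.
    // The early return also stops the walk on already-attached objects.
    public attachGraph(): void {
        if (this.attached) return;
        this.attached = true;
        this.referenced.forEach((child) => child.attachGraph());
    }
}

const root = new ToyObject("root");
root.attached = true; // stands in for a DDS in an attached container

const a = new ToyObject("a");
const b = new ToyObject("b");
a.storeHandle(b); // both are detached, so nothing is attached yet
const attachedBefore = [a.attached, b.attached];

root.storeHandle(a); // a's handle lands in an attached object: a and b attach together
const attachedAfter = [a.attached, b.attached];
```

Even in this simplified form, every object has to carry attach state and propagate it through stored handles, which is the kind of tracking the proposal aims to remove.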
Correctness issues with existing code
Proposal
The proposal covers the actual changes in mechanics, as well as risks and side effects: