Stage removal of URL-based DDS types#26764
Conversation
There was a problem hiding this comment.
Pull request overview
Stages backward-compatible loading support for migrating DDS factory type identifiers away from URL-like strings (which can be rewritten by certain reverse proxies) toward stable short-name identifiers.
Changes:
- Added a legacy-type redirect table and a fallback lookup path when resolving DDS factories from stored channel attributes.
- Annotated various DDS factories with commented “future” short-name type strings to be enabled after a safe rollout window.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/runtime/datastore/src/channelContext.ts | Adds legacy URL-type → short-name redirect mapping and fallback registry lookup during channel load. |
| packages/dds/tree/src/sharedTreeAttributes.ts | Documents planned future non-URL type string for SharedTree (currently commented out). |
| packages/dds/task-manager/src/taskManagerFactory.ts | Documents planned future non-URL type string for TaskManager (currently commented out). |
| packages/dds/shared-summary-block/src/sharedSummaryBlockFactory.ts | Documents planned future non-URL type string for SharedSummaryBlock (currently commented out). |
| packages/dds/sequence/src/sequenceFactory.ts | Documents planned future non-URL type string for SharedString (currently commented out). |
| packages/dds/register-collection/src/consensusRegisterCollectionFactory.ts | Documents planned future non-URL type string for ConsensusRegisterCollection (currently commented out). |
| packages/dds/pact-map/src/pactMapFactory.ts | Documents planned future non-URL type string for PactMap (currently commented out). |
| packages/dds/ordered-collection/src/consensusOrderedCollectionFactory.ts | Documents planned future non-URL type string for ConsensusQueue (currently commented out). |
| packages/dds/matrix/src/runtime.ts | Documents planned future non-URL type string for SharedMatrix (currently commented out). |
| packages/dds/map/src/mapFactory.ts | Documents planned future non-URL type string for SharedMap (currently commented out). |
| packages/dds/map/src/directoryFactory.ts | Documents planned future non-URL type string for SharedDirectory (currently commented out). |
| packages/dds/legacy-dds/src/signal/sharedSignalFactory.ts | Documents planned future non-URL type string for SharedSignal (currently commented out). |
| packages/dds/legacy-dds/src/array/sharedArrayFactory.ts | Documents planned future non-URL type string for SharedArray (currently commented out). |
| packages/dds/ink/src/inkFactory.ts | Documents planned future non-URL type string for Ink (currently commented out). |
| packages/dds/counter/src/counterFactory.ts | Documents planned future non-URL type string for SharedCounter (currently commented out). |
| packages/dds/cell/src/cellFactory.ts | Documents planned future non-URL type string for SharedCell (currently commented out). |
| const legacyTypeRedirects: Readonly<Record<string, string>> = { | ||
| "https://graph.microsoft.com/types/directory": "SharedDirectory", | ||
| "https://graph.microsoft.com/types/map": "SharedMap", | ||
| "https://graph.microsoft.com/types/counter": "SharedCounter", | ||
| "https://graph.microsoft.com/types/cell": "SharedCell", | ||
| "https://graph.microsoft.com/types/mergeTree": "SharedString", | ||
| "https://graph.microsoft.com/types/sharedmatrix": "SharedMatrix", | ||
| "https://graph.microsoft.com/types/tree": "SharedTree2", | ||
| "https://graph.microsoft.com/types/ink": "Ink", | ||
| "https://graph.microsoft.com/types/pact-map": "PactMap", | ||
| "https://graph.microsoft.com/types/task-manager": "TaskManager", | ||
| "https://graph.microsoft.com/types/consensus-queue": "ConsensusQueue", | ||
| "https://graph.microsoft.com/types/consensus-register-collection": | ||
| "ConsensusRegisterCollection", | ||
| "https://graph.microsoft.com/types/shared-summary-block": "SharedSummaryBlock", | ||
| "https://graph.microsoft.com/types/SharedArray": "SharedArray", | ||
| "https://graph.microsoft.com/types/signal": "SharedSignal", | ||
| }; |
There was a problem hiding this comment.
Going to reply here since it lets us keep the discussion to a comment thread, but fyi this applies to @markfields's comment here
@Abe27342 is the shape of the rewritten URLs consistent or predictable enough that we could also include in this fallback codepath some optimistic mapping from rewritten URLs to the known type? To rescue those docs/tenants that have the reverse proxy problem.
It's hard to say. From the telemetry we observed, the reverse proxy solutions mostly were using subdomain techniques in practice, so rather than map along the lines of what I put in the description, they were more like map github.com/foo to github.my-reverse-proxy.com/foo. So... we could consider something like "regex match on /types/{old-type-name} and forward" and I think this would work in practice but limit us flexibility for defining future DDSes. That's a reasonable tradeoff worth considering (ex: we commit to never including "/" in future DDS type names).
There was a problem hiding this comment.
won't this table just get re-written by the proxy? i'm fine to remove url like going forward. i wonder if we need to encode the old names? base64? so they can't be easily re-written by a proxy
There was a problem hiding this comment.
i think all the customer problems are when they mix and match proxy vs not. the solution today works if all usages go though the same proxy. it only fail when some users of a document proxy, and some do not
There was a problem hiding this comment.
for the regex, couldn't we also only run that if we can find the type? so first is direct check? then encoded look up table? last we try a regex re-map. i think this doesn't limit the future, and the future will always pass direct check. regex re-map is just a last resort attempt and we would fail if we did nothing, so risk of being wrong isn't too bad
|
@Abe27342 is the shape of the rewritten URLs consistent or predictable enough that we could also include in this fallback codepath some optimistic mapping from rewritten URLs to the known type? To rescue those docs/tenants that have the reverse proxy problem. |
| ); | ||
| } | ||
| const factory = registry.get(channelFactoryType); | ||
| let factory = registry.get(channelFactoryType); |
There was a problem hiding this comment.
this redirect logic is actually not sufficient for rollout compat story. I'd fix that up before checking something like this in (and add some tests).
Specifically if the code here starts loading documents with the new Type fields, it doesn't map them to their correct DDSes, just errors. The code as presented is what needs to live long-term.
There was a problem hiding this comment.
(mark as draft if not ready to merge?)
There was a problem hiding this comment.
Probably fair on the first iteration, but seems like general response is pretty favorable and I'd like to go through with getting it or a variant of it merged. So I'll leave it as non-draft.
(this comment is now addressed in latest anyway)
ChumpChief
left a comment
There was a problem hiding this comment.
I'm ok with this in concept, certainly as a fix for the issue you describe.
The probably-harder fix I prefer is to deprecate the Type static on the factory entirely, and instead have container and data object authors name the entries in their own registries. I believe that last time I looked we depend on the static member internally though, so this is more a note for any future-thinking you're doing rather than a recommendation for this PR.
packages/dds/cell/src/cellFactory.ts
Outdated
| */ | ||
| // New type string, to be activated once the migration has been fully shipped dark and is safe to flip. | ||
| // See legacyTypeRedirects in packages/runtime/datastore/src/channelContext.ts. | ||
| // public static readonly Type = "SharedCell"; |
There was a problem hiding this comment.
One feature of a URL is its uniqueness, which is an important property here too. Maybe this would never be a problem worth worrying about, but we might consider:
- Ensure uniqueness by appending a precomputed GUID as part of the type?
- Include a versioning string of some sort?
I don't feel super-strongly about these at the moment though, it's not like these URLs do a great job of satisfying those requirements either.
There was a problem hiding this comment.
using GUIDs from the start would be reasonable but I think introducing them at this point is not really worth it considering we now have effectively a closed ecosystem of identifiers we need to support.
In terms of versioning, my gut is to build something like that we wouldn't want to overload the function of type here and instead lean more heavily on other information in the attributes blob (we already write a "version" field which is pointless today)
| */ | ||
| // New type string, to be activated once the migration has been fully shipped dark and is safe to flip. | ||
| // See legacyTypeRedirects in packages/runtime/datastore/src/channelContext.ts. | ||
| // IMPORTANT: "SharedTree" cannot be used here because experimental/dds/tree already uses that string |
There was a problem hiding this comment.
Maybe just call this "Tree"?
There was a problem hiding this comment.
Yeah--latest version opts instead to just drop all of the initial part of the URLs. That keeps the update rule / logic simple and avoids need for a full lookup path.
steffenloesch
left a comment
There was a problem hiding this comment.
Looks good to me. We should have a work item to track this, so that we'll switch to the non-url Type in a few weeks.
Description
Deployed Fluid-based solutions can run into problems when their customers have certain reverse-proxy solutions which rewrite URLs to go through predefined endpoint (this allows routing traffic through known domains which can be used for various interception purposes). The core problem in those cases is that JS bundles containing Fluid libraries have rewritten "Type" values for DDSes pointing to the intercepted reverse proxy domain. Concretely, the problem resembles:
Customer has a reverse proxy that's configured to route https://github.com to
my-reverse-proxy.com/?path=github.com, where the my-reverse-proxy.com domain serves assets from github.com (or whatever's in the path parameter) but replaces any links on that domain with an analogous link to my-reverse-proxy.com.Fluid DDS types resemble URLs but they are not actually used as such, they function as registry keys. Furthermore, they are persisted in the document under the channel attributes blob. So, such auto-rewriting of URLs causes issues where documents created on a device subject to such a reverse proxy solution would only be usable as long as the reverse proxy has the exact same sort of configuration (and documents created on a device without that sort of thing would not be openable on one that does have such redirection). All of this looks effectively like document corruption to the user.
Historically, I believe URL-like values for DDSes were chosen when Fluid had more ambition for loading DDSes dynamically. We are at this point very far from that space (instead opting to close DDS development to within the repository for API reasons). There's no reason these need to be URLs.
This stages a redirect table in the DDS lookup process that will allow us to later update the type values to be non-URL based. The change cannot be performed immediately for compatibility reasons at rollout time (risk of users with newer version of the code writing documents that older versions do not understand).