feat: add sessionAffinity setting for prefix-cache optimization #444
threepointone merged 2 commits into cloudflare:main on Mar 18, 2026
Add sessionAffinity option to WorkersAIChatSettings that sends an x-session-affinity header with inference requests. This routes requests with the same key to the same backend replica, improving prefix-cache hit rates across conversation turns.

- Binding path: sessionAffinity is passed as extraHeaders to binding.run()
- REST path: extraHeaders are now forwarded in fetch headers instead of being discarded
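To make the REST-path change concrete, here is a minimal sketch of forwarding extra headers into the fetch headers rather than dropping them. The names `RestRequestSketch` and `buildRestHeaders` are hypothetical and are not taken from the package source; the precedence rule (required headers win over extras) is also an assumption for illustration.

```typescript
// Hypothetical shape for the inputs of a REST inference call.
interface RestRequestSketch {
  apiKey: string;
  extraHeaders?: Record<string, string>;
}

// Build the headers for fetch(), keeping caller-supplied extras.
function buildRestHeaders(req: RestRequestSketch): Record<string, string> {
  return {
    // Extras are spread first so the required headers below take
    // precedence on an exact key match (an assumption of this sketch).
    ...(req.extraHeaders ?? {}),
    "Content-Type": "application/json",
    Authorization: `Bearer ${req.apiKey}`,
  };
}

const restHeaders = buildRestHeaders({
  apiKey: "test-token",
  extraHeaders: { "x-session-affinity": "chat-42" },
});
console.log(restHeaders);
```

Before this change, per the description above, the extras would not have reached the request at all, so an affinity key set on the REST path would have had no effect.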
🦋 Changeset detected. Latest commit: 414b4d5. The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages.
Add a sessionAffinity option across Workers AI adapters/providers to route requests with the same key to the same backend replica via the x-session-affinity header for prefix-cache optimization.

Implementation details:

- Extend WorkersAiAdapterConfig with an optional sessionAffinity string.
- Propagate sessionAffinity as x-session-affinity to binding.run() via createWorkersAiBindingFetch(extraHeaders), to REST requests via defaultHeaders, and to gateway mode via the createGatewayFetch call.
- Merge sessionAffinity with user-provided extraHeaders in the WorkersAI provider so both headers are forwarded together.

Other changes:

- Add and update tests covering binding.fetch, adapter behavior, and REST/binding header merging.
- Update README docs for tanstack-ai and workers-ai-provider to document sessionAffinity usage.
- Add changeset files to trigger a patch release for the relevant packages, plus minor formatting updates to demos.json.
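The merge step described above could look roughly like the following. This is a sketch only: `ChatSettingsSketch` and `buildExtraHeaders` are hypothetical names, and letting the dedicated sessionAffinity setting override an x-session-affinity entry in extraHeaders is an assumption, not something the commit message states.

```typescript
// Hypothetical settings shape; only the two field names come from the PR text.
interface ChatSettingsSketch {
  sessionAffinity?: string;
  extraHeaders?: Record<string, string>;
}

// Combine user-provided headers with the affinity header so that
// neither is silently dropped.
function buildExtraHeaders(settings: ChatSettingsSketch): Record<string, string> {
  const headers: Record<string, string> = { ...(settings.extraHeaders ?? {}) };
  if (settings.sessionAffinity) {
    // Assumption: the explicit setting wins over a same-named extra header.
    headers["x-session-affinity"] = settings.sessionAffinity;
  }
  return headers;
}

const mergedHeaders = buildExtraHeaders({
  sessionAffinity: "conversation-123",
  extraHeaders: { "x-request-source": "demo" },
});
console.log(mergedHeaders);
```

When sessionAffinity is unset, the sketch returns the user's headers unchanged, so existing callers would see no difference.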
threepointone (Collaborator) approved these changes on Mar 18, 2026 and left a comment:
added a version for tanstack-ai as well
Summary

Adds a sessionAffinity option to both workers-ai-provider and @cloudflare/tanstack-ai that sends an x-session-affinity header with inference requests, routing requests with the same key to the same backend replica for prefix-cache reuse.

Context

Workers AI already accepts the x-session-affinity header and routes accordingly. This PR adds the client-side plumbing so both packages can send it. A follow-up PR to the agents package will auto-set this using the Durable Object ID so AIChatAgent users get prefix-cache optimization for free.

Changes
workers-ai-provider

- workersai-chat-settings.ts: add sessionAffinity?: string field
- workersai-chat-language-model.ts: map sessionAffinity → extraHeaders, merged with any user-provided extraHeaders so neither is silently dropped
- utils.ts: forward extraHeaders in REST fetch headers instead of discarding them
- README.md: document settings (sessionAffinity, safePrompt)
- text-generation.test.ts

@cloudflare/tanstack-ai

- create-fetcher.ts: add sessionAffinity?: string to WorkersAiAdapterConfig; update createWorkersAiBindingFetch to forward extraHeaders to binding.run()
- workers-ai.ts: thread sessionAffinity through all three config modes (binding via extraHeaders, REST via defaultHeaders, gateway via createGatewayFetch headers)
- README.md: document sessionAffinity config option
- binding-fetch.test.ts
- workers-ai-adapter.test.ts

Test plan
- workers-ai-provider tests pass, 6 new tests added
- @cloudflare/tanstack-ai tests pass, 4 new tests added
- sessionAffinity merges with user-provided extraHeaders for both binding and REST paths
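For readers who want to check the binding path themselves, one way a test could verify that the affinity header reaches binding.run() is with a recording fake. Everything below is a hypothetical sketch: `FakeBinding`, `RunOptionsSketch`, `runWithAffinity`, and the model id are illustrative stand-ins, not the packages' actual test code.

```typescript
// Hypothetical options shape; only `extraHeaders` is taken from the PR text.
interface RunOptionsSketch {
  extraHeaders?: Record<string, string>;
}

// Stand-in for the Workers AI binding that records each run() call.
class FakeBinding {
  calls: Array<{ model: string; inputs: unknown; options?: RunOptionsSketch }> = [];

  async run(model: string, inputs: unknown, options?: RunOptionsSketch): Promise<{ response: string }> {
    this.calls.push({ model, inputs, options });
    return { response: "ok" };
  }
}

// Pass the affinity key to run() as an extra header when one is provided.
async function runWithAffinity(
  binding: FakeBinding,
  model: string,
  prompt: string,
  sessionAffinity?: string,
): Promise<{ response: string }> {
  const options: RunOptionsSketch | undefined = sessionAffinity
    ? { extraHeaders: { "x-session-affinity": sessionAffinity } }
    : undefined;
  return binding.run(model, { prompt }, options);
}

const demoBinding = new FakeBinding();
runWithAffinity(demoBinding, "@cf/example/model", "Hello", "conversation-123").then(() => {
  // The recorded call shows which headers the binding would have received.
  console.log(demoBinding.calls[0]?.options?.extraHeaders);
});
```

A real test would assert on the recorded options instead of logging them, and would cover the case where no affinity key is set.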