Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in high traffic sites on Cloudflare Workers with remote JWKS #355

Closed
2 tasks done
evanderkoogh opened this issue Feb 1, 2022 · 5 comments
Closed
2 tasks done

Comments

@evanderkoogh
Copy link

What happened?

On Cloudflare Workers it is not possible for one request to await an unsettled promise that was created by another request because that inverts the control flow and the runtime can't keep track of which task are associated with which request.

The reload function at:

jose/src/jwks/remote.ts

Lines 91 to 111 in bd7bf37

async reload() {
if (!this._pendingFetch) {
this._pendingFetch = fetchJwks(this._url, this._timeoutDuration, this._options)
.then((json) => {
if (!isJWKSLike(json)) {
throw new JWKSInvalid('JSON Web Key Set malformed')
}
this._jwks = { keys: json.keys }
this._cooldownStarted = Date.now()
this._pendingFetch = undefined
})
.catch((err: Error) => {
this._pendingFetch = undefined
throw err
})
}
await this._pendingFetch
}
}

uses a shared promise this._pendingFetch.

So if a second request comes in at the same isolate as another request is updating, this will lead to an exception in the second request as soon as it awaits the unsettled promise.

A work-around is instead of awaiting the promise, to do a setInterval that checks say every 5ms if a condition is changed.

I saw you already have a detection for Cloudflare Workers in /src/runtime/browser/env.ts https://github.com/panva/jose/blob/bd7bf3789c146d765bbee2db0c93ba035020b24c/src/runtime/browser/env.ts

My proposed solution would be to create a function pollingAwait that wraps the setInterval solution in another Promise so that you can await that. And pick to use either a straight await or a pollingAwait depending on the value in the browser/env check.

Version

4.3.9

Runtime

Cloudflare Workers

Runtime Details

latest version/compatability date

Code to reproduce

console.log('regular use of the JWKS feature')

Required

  • I have searched the issues tracker and discussions for similar topics and couldn't find anything related.
  • I agree to follow this project's Code of Conduct
@panva panva added the waiting for feedback The OP is asked for feedback or a proposal label Feb 1, 2022
@panva panva changed the title Race condition in high traffic sites on Cloudflare Workers with remote JW Race condition in high traffic sites on Cloudflare Workers with remote JWKS Feb 1, 2022
panva added a commit that referenced this issue Feb 1, 2022
@panva
Copy link
Owner

panva commented Feb 1, 2022

@evanderkoogh see 630d614 if that's roughly what you had in mind.

I'd also like your input on vercel/next.js#30739 - Edge Functions run on CF Workers, but the runtime is not detectable like pure CF Workers are (with WebSocketPair global) so until the aforementioned issue is somehow solved, Edge Functions will suffer the same problem.

@evanderkoogh
Copy link
Author

Hey @panva, that is pretty much what I had in mind. Although it did trigger me to write a small library that could be used for this instead to get around some of the more obscure edge cases.

Because there are actually 2 actual race conditions. One is the one described here, which is the way more likely one, where you await a promise that has not been settled yet. But the other one is that if the connection is closed on the request that is executing the promise, all outstanding Javascript Eventloop tasks for that request are removed, meaning that the promise might never be cleaned up.

Oh and I have asked the platform team about adding a proper identifier for Cloudflare Workers, which we will recommend Vercel to also have (although probably with a different value or something)

@panva
Copy link
Owner

panva commented Feb 2, 2022

How do you want to move on from here then?

panva added a commit that referenced this issue Feb 4, 2022
panva added a commit that referenced this issue Feb 4, 2022
panva added a commit that referenced this issue Feb 4, 2022
panva added a commit that referenced this issue Feb 4, 2022
@panva
Copy link
Owner

panva commented Feb 4, 2022

After a couple weird commits 630d614 still stands as the most recent.

@panva
Copy link
Owner

panva commented Feb 7, 2022

https://github.com/panva/jose/releases/tag/v4.5.0 resolves the more likely condition, please keep me posted when there's a proper identifier for the runtime and likewise let me know if there's anything more to be done. Thank you for bringing this to my attention.

@panva panva removed triage waiting for feedback The OP is asked for feedback or a proposal labels Feb 7, 2022
@github-actions github-actions bot locked and limited conversation to collaborators May 9, 2022
github-actions bot pushed a commit that referenced this issue Mar 2, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants