Description
Right now there is no way to call `Body#formData` on arbitrary user input without potentially causing an OOM. It would be great if we could constrain the formData parser so it stops parsing if a body exceeds a certain size.

Example API:

```js
const form = req.formData({ maxSize: 10_000_000 /* 10MB */ })
```

If the body size exceeds the max size, a `QuotaExceededError` `DOMException` should be thrown.

This might make sense for `.text()`, `.blob()`, `.arrayBuffer()`, and `.json()` also.
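Pending such an option, the proposed behaviour can be roughly approximated in userland by reading the body stream manually. A minimal sketch (the helper name `textWithMaxSize` is invented here, not part of any spec):

```javascript
// Userland approximation of the proposed { maxSize } option for .text():
// read the body stream chunk by chunk and throw a QuotaExceededError
// DOMException once more than maxSize bytes have arrived.
// `textWithMaxSize` is an invented helper, not part of any spec.
async function textWithMaxSize(body, maxSize) {
  const reader = body.body.getReader();
  const chunks = [];
  let total = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    total += value.byteLength;
    if (total > maxSize) {
      // Stop pulling from the network before buffering any further.
      await reader.cancel();
      throw new DOMException("Body exceeded maxSize", "QuotaExceededError");
    }
    chunks.push(value);
  }
  return new TextDecoder().decode(await new Blob(chunks).arrayBuffer());
}
```

The in-spec version would of course live inside the body-consume algorithm rather than on top of it, but the observable behaviour (partial read, then a `QuotaExceededError` rejection) would be similar.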
Activity
lucacasonato commented on Jan 12, 2023
The background on this is that `formData` and `json` are prevalent ways to parse the request body in HTTP servers that use fetch's `Request` and `Response`:

jasnell commented on Jan 13, 2023
I'd be interested in this for both Node.js and Workers. I think it makes the most sense to add to all of the body mixin methods. Defining it as an options object makes perfect sense as I've also considered the possibility of adding a signal option to these to allow cancellation, e.g.
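A userland sketch of how such a `signal` option might behave (the helper `textWithSignal` is a made-up name; a real option would be handled inside the body mixin methods themselves):

```javascript
// Sketch of the cancellation idea: a body-consuming helper that takes an
// AbortSignal, approximating a hypothetical `{ signal }` option on the
// body mixin methods. `textWithSignal` is an invented name.
async function textWithSignal(body, signal) {
  if (signal.aborted) throw signal.reason;
  const reader = body.body.getReader();
  const onAbort = () => reader.cancel(signal.reason);
  signal.addEventListener("abort", onAbort, { once: true });
  try {
    const chunks = [];
    for (;;) {
      if (signal.aborted) throw signal.reason;
      const { done, value } = await reader.read();
      if (done) break;
      chunks.push(value);
    }
    return new TextDecoder().decode(await new Blob(chunks).arrayBuffer());
  } finally {
    signal.removeEventListener("abort", onAbort);
  }
}
```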
lucacasonato commented on Feb 1, 2023
Is there implementer interest from any browsers for this?
I have a draft spec PR here: #1600. If we can get some implementer interest, I can write WPTs - the implementation here should be relatively trivial.
cc @annevk @saschanaz @ricea
ricea commented on Feb 2, 2023
Interesting. My impression is that these features are not very useful on the browser-side. I wonder if ServiceWorkers would benefit? It's hard to commit resources to something that doesn't directly benefit our users.
annevk commented on Feb 2, 2023
Is there evidence of use in libraries or requests for this functionality on Stack Overflow?
@jakearchibald perhaps you can comment on SW utility?
saschanaz commented on Feb 2, 2023
Logically it makes sense for the server use case, but I also think this is less useful for browsers. Could this be something like "a limit chosen by the user agent" instead of a user-given limit?
lucacasonato commented on Feb 4, 2023
@ricea Possibly. I agree that in browsers the need for this is less evident, because the worst thing a DOS in a service worker can do is take down one origin for one user momentarily. On servers however, this DOS vector can take down one origin for all users.
@annevk I cannot find requests on Stack Overflow, but there have definitely been CVEs related to request body parsing in server-side JS runtimes and libraries. Fastify, a popular Node.js server framework, has an option to specify a maximum body size for requests. Similar options exist for Express and many other frameworks.
@saschanaz We already imply this through this sentence in the infra spec:
The idea behind a specific limit option for us is that we'd like to set a reasonable default limit, but provide our users with an opt-out mechanism to specify a higher limit in case they need it. For us it is impractical to have a limit that the user cannot opt out of in any way.
There is a very strong need for this from both Deno and Cloudflare Workers, and it would be great if for parity's sake browsers matched behaviour here.
Hypothetically, if the implementation work in browsers is contributed by us, would you be more inclined to consider this? This is the path we are taking with #1346:
Something similar was also done for `Response.json` (the static one). It was implemented in Chromium by me.

saschanaz commented on Feb 4, 2023
Oh, so this is to opt out instead of to opt in for a stricter limit. Could such an opt-out be a command-line option instead, though?

Edit: Also, can the user-agent-defined limit be dynamic here (e.g. depend on the current system memory usage) so that users won't have to opt out because of a too-strict default limit?
jasnell commented on Feb 4, 2023
Not in our case in Cloudflare Workers. Users have no access to CLI options. An API option is ideal.
lucacasonato commented on Feb 4, 2023
Same for us - Deno Deploy does not have CLI flags. Even in the Deno CLI, we prefer letting users specify all configuration fully within code. This makes application portability much simpler.
andreubotella commented on Feb 4, 2023
Since Infra specifies that user agents can have implementation-defined limits, I think the way to specify this would be to have this user-set limit as an extra limit on top of the implementation-defined ones. That way, browsers could implement it as a stricter limit, and server-side runtimes would be free to raise their regular limits to match the user-set limit.
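That layering could be expressed roughly as follows (illustrative only; the 64 MiB implementation default and the helper name are invented for the example):

```javascript
// Illustrative sketch of layering a user-supplied limit on top of an
// implementation-defined one: a browser clamps to the smaller of the
// two, while a server runtime honors the user limit directly.
// The 64 MiB default is invented for this example.
const IMPLEMENTATION_LIMIT = 64 * 1024 * 1024;

function effectiveLimit(userLimit, { browser }) {
  if (userLimit === undefined) return IMPLEMENTATION_LIMIT;
  // Browsers may only tighten; servers may raise their regular limit.
  return browser ? Math.min(userLimit, IMPLEMENTATION_LIMIT) : userLimit;
}
```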
saschanaz commented on Feb 4, 2023
(I'm not strictly against this, just exploring some options here which you probably already considered outside this thread.)

But that API option can be in whatever form, like `Deno.foo` as in Deno, right? Of course that would be nonstandard and not pretty, but I'm not sure this option is pretty either if it's only useful on servers.

jimmywarting commented on Feb 6, 2023
I'm not particularly +1 or -1 on this idea, but there are other ways to battle this as well (to not hit OOM).

For starters, you could check the `content-length` header yourself already, and also reject requests that do not have one.
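That pre-check can be sketched like this (hypothetical helper; note the header can be absent or wrong, e.g. with chunked transfer encoding, so it is only a first line of defense, not a real limit):

```javascript
// Sketch of the content-length pre-check: reject requests whose
// declared body size is missing, malformed, or over the limit.
// `checkContentLength` is an invented helper name.
function checkContentLength(headers, maxSize) {
  const raw = headers.get("content-length");
  if (raw === null) return { ok: false, reason: "missing content-length" };
  const length = Number(raw);
  if (!Number.isInteger(length) || length < 0) {
    return { ok: false, reason: "invalid content-length" };
  }
  if (length > maxSize) return { ok: false, reason: "body too large" };
  return { ok: true, length };
}
```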
But then maybe you could do something cool/useful in combination with having a fetch observer that can track how much you have downloaded/uploaded, and if it has downloaded more than `x` bytes then you could abort the request/response with an `AbortSignal` or something. I have long been waiting for something like the progress monitoring solution that XHR has had for years but fetch doesn't.

Browsers avoid this OOM in `formData()` as files are often backed up by the filesystem anyways. I have proposed that large files decoded from `.formData()` also get saved directly to the disc and handed back to you as a `File` backed up by the file system, not kept in memory. The same could go for `Body.blob()` to also be backed up by the file system. If you need more control than `.formData()` gives you, maybe use a streaming solution like `WebTransport` instead, or send custom `x-` headers so you would know how large each file is before you even begin to parse the payload with `.formData()`.
The only value I see for this `maxSize` is within servers (not browsers). That's why I'm leaning more towards applying internal solutions that circumvent this OOM issue by doing the same thing browsers do to solve this problem, without adding a new syntactic feature on top that isn't "really" necessary or that can be solved in other ways with an extra feature/option that's only going to be useful for the server.

I know Bun.js uses parts of WebKit's built-in web APIs rather than trying to reimplement things.
I know that they can represent a file backed up by the file system. I hear the verdict is that they do not support `Body.formData()` yet cuz `HTMLFileElement` made it a bit tricky. But if I could guess, then their solution would be to write files to the disc and hand you a native `File` handle back.

Another cool (off-topic) thing would be if it were a two-way communication where you could send file descriptions (extra streams) within a stream, but such things only exist in HTTP/2. Then you could decide for yourself if you want to read file xyz, knowing beforehand how large the file is. But this would be something completely different.
Perhaps a better solution is just to post one individual file per request and not use `FormData` at all if you intend to send/accept large files.

wattroll commented on Nov 24, 2023
How about sinking a `clone()` of the request prior to calling `formData()`?

matthieusieben commented on Dec 21, 2023
@wattroll, your approach is great but it does load everything in memory.
The following does essentially the same without draining the stream:
Note: this can probably be made shorter
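A sketch in this spirit (the helper `limitBytes` and its shape are illustrative, not the code from the comment): wrap the body in a byte-counting `TransformStream` that errors once the limit is crossed, so downstream consumption rejects without buffering the whole body first.

```javascript
// Sketch: pipe the body through a TransformStream that counts bytes and
// errors with a QuotaExceededError DOMException once maxSize is exceeded.
// `limitBytes` is an invented helper name.
function limitBytes(stream, maxSize) {
  let total = 0;
  return stream.pipeThrough(new TransformStream({
    transform(chunk, controller) {
      total += chunk.byteLength;
      if (total > maxSize) {
        controller.error(new DOMException("Body exceeded maxSize", "QuotaExceededError"));
      } else {
        controller.enqueue(chunk);
      }
    },
  }));
}
```

A server handler could then build a new `Request` around the limited stream, e.g. `new Request(req, { body: limitBytes(req.body, 1_000_000), duplex: "half" })`, and call `.formData()` on that.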
wattroll commented on Dec 22, 2023

@matthieusieben Good point. Everything in this case would be at most `maxByteLength` bytes that will get buffered into the `tee()`-ed stream during `request.clone().body?.pipeTo(...)` and be held until the `await request.formData()` completes.

Buffering up to the limit could be acceptable for the use-cases where the limit is low, but not for larger limits with final consumers that stream the chunks efficiently to somewhere.
I used that approach for a login adapter that accepts email/password form which I wanted to limit very tightly, so that my auth service isn't exposed to OOM with large and slow payloads.
I think it should be possible to sink the cloned request into a byte counter and the application sink in parallel, so that when the byte counter tips over it cancels the other one. I haven't tried that yet. I also don't know whether any server-side runtimes feature a disk-backed `Body#formData()`. Per my understanding, `formData` has to receive the body fully before returning (in order to provide the `Blob#size` property for each of the `File`s in the form).
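The parallel byte-counter idea could look something like this (untested in the thread; `consumeWithCounter` and its shape are hypothetical, and the consumer callback must honor the signal for cancellation to take effect):

```javascript
// Sketch of the parallel-sink idea: tee the body, hand one branch to the
// real consumer and count bytes on the other, aborting the consumer via
// an AbortController once the count tips over the limit.
async function consumeWithCounter(body, maxSize, consume) {
  const [forConsumer, forCounter] = body.tee();
  const controller = new AbortController();
  const counting = (async () => {
    const reader = forCounter.getReader();
    let total = 0;
    for (;;) {
      const { done, value } = await reader.read();
      if (done) return total;
      total += value.byteLength;
      if (total > maxSize) {
        controller.abort(new DOMException("Body exceeded maxSize", "QuotaExceededError"));
        await reader.cancel();
        return total;
      }
    }
  })();
  // The consumer receives the signal and is expected to stop on abort.
  const result = await consume(forConsumer, controller.signal);
  return { result, total: await counting };
}
```

Note that `tee()` still buffers into the slower branch, so this shares the trade-off discussed above when the consumer is slower than the counter.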