tsuraan mentioned the following on reddit:
it would be really nice if yesod had equivalents to Snap's handleMultipart,
runRequestBody, and setResponseBody. These three functions make it
easy to write a site that can handle arbitrary amounts of data without doing
any writing to the local filesystem, which is a really nice thing to be able to do.
Right now, runRequestBody and setResponseBody can be done at the wai level,
but I couldn't find an equivalent to handleMultipart, and besides it would be cool
if they were exposed to the yesod level anyhow."
This might be an API omission. It also looks like it's not possible to customize the FileSink in the Yesod API.
It's all possible via waiRequest and sendWaiResponse, though I'm not at all opposed to providing higher-level functionality. We also might want to consider abstracting the details of where an uploaded file's contents are stored.
See #358 regarding the FileSink customization (now BackEnd). Do you have any thoughts on what a higher level API would be?
I believe FileUploadSource provides a relatively high-level streaming interface. Without specific use cases, I can't really try to improve it further. @luite Do you have any thoughts?
How does one use FileUploadSource? It’s only defined in the hidden module Yesod.Internal.Request, only reexported in the hidden module Yesod.Internal.Core, and is never constructed anywhere.
Export FileUpload (#262)
That was a mistake, it should be exported from Yesod.Core. The newest commit fixes that.
FileUploadSource :: Sink ByteString (ResourceT IO) (Source (ResourceT IO) ByteString) -> FileUpload
This API doesn’t seem to provide streaming uploads; it looks more like a generalized way to slurp in an entire file before spitting the entire thing out again. That doesn’t really help if you imagine that I want to do some kind of live processing on the file as it comes in—even something as simple as displaying an AJAXy progress meter for the uploaded file. And for some applications, I may not even want to keep the entire file around to spit back out at the end.
If you don't want to keep the contents of the file, then return a dummy Source. But what prevents you from being able to perform "live processing". I'd really like something more concrete for a design goal than "streaming uploads," as it's far too vague.
That’s why I provided AJAXy progress meters as an example. In the current design, the POST handler has no opportunity to communicate with the fileUpload function until after the entire file is uploaded, so the latter can’t know where to send progress updates to.
I conjecture that a good API would be a generalization of fileAFormReq that, instead of returning an AForm sub master FileInfo, takes any FileName -> ContentType -> Sink ByteString (ResourceT IO) t and returns an AForm sub master t.
AForm sub master FileInfo
FileName -> ContentType -> Sink ByteString (ResourceT IO) t
AForm sub master t
If we're talking about having an API that allows you to produce output to the user while consuming input, I think it has to sit at the WAI level. There's no way we could return a normal Yesod response after we've already started sending data to the client. Perhaps we can still provide something to help out at the Yesod level, but I'm still not sure what the API would look like.
I don’t think sending output back on the same connection while still consuming input is a useful thing to do. Snap does actually have an API for that (transformRequestBody), but it comes with a number of safety caveats, and based on my tests, neither Firefox nor Chrome accept more than a few hundred kilobytes of that output before sending all the input (presumably my TCP congestion window size or something).
But what is useful is the ability, while consuming input, to send output somewhere else, such as on a separate connection from the same user on which we want to send back progress data, or to another server on which we’re really going to store the file. My proposal allows that via IO side effects of the sink.
And why doesn't the FileUploadSource constructor fit the bill? You can perform any arbitrary IO action inside the Sink as well.
I have no problem adding an extra API for this feature, I just want to understand the motivation and exactly what it will look like.
Since the POST handler has no opportunity to communicate with the fileUpload function until after the entire file is uploaded, the fileUpload function has no way to associate the incoming file with the request. So it can’t take different actions based on the file name, content type, which form and which control on the form the file was submitted to, or which user submitted it. That’s critical in my AJAX progress bar example.
Another example of something one might want to do with this is refuse to store uploads from unauthenticated users (even temporarily), to prevent denial of service attacks.
OK, so it sounds to me like we could solve this with two changes:
Assuming we had all that, would that solve the problem?
Almost, but that still doesn’t let me do different things to files submitted in different controls to the same form. (Presumably that also falls into the category of things that need wai-extra changes.)
Meaning that the Sink needs the filename, content type, and parameter name, right?
Also, from an API cleanliness perspective (instead of a completeness perspective), I’m still not happy that FileUploadSource restricts me to returning things that can be transmitted via a Source (ResourceT IO) ByteString. I can of course return anything using an IORef and side effects, but that’s uglier.
Source (ResourceT IO) ByteString
My proposal above also lets me associate the upload action with the form control in a way that feels much more direct and safe than parsing the parameter name as a string, especially given that the name might be Yesod-generated.
Your proposal is very vague. Can you try implementing it and sending a pull request? I don't really understand fully what you're getting at.
I’m additionally slightly unhappy that, even if I know my fileUpload handler returns a FileUploadSource, I need to pattern-match away the other two cases FileUploadMemory and FileUploadDisk as impossible.
I’m willing to try implementing it. The one thing I’m not sure about is what to do with snd reqRequestBody, but I guess I’ll assume for now that I can just put  there when the new mechanism is used.
Provide more information to file upload sinks (yesodweb/yesod#262)
Reflect wai changes for #262
I've pushed changes to wai and yesod to provide the extra needed information to the Sinks. I believe that should be sufficient to implement this feature. If you need any other changes from wai, please let me know ASAP, as we're going to be making the next release soon (possibly in the next day).