Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support parsing multiple files for one form field #2165

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

timcraft
Copy link

@timcraft timcraft commented Apr 1, 2024

This implements support for parsing multipart requests that contain multiple files with the same name.

From RFC7578 Section 4.3:

To match widely deployed implementations, multiple files MUST be sent by supplying each file in a separate part but all with the same "name" parameter.

This isn't strictly backwards compatible because it affects Rack::Utils.parse_nested_query, however I would argue the existing behaviour there is a bug on the basis that a) it's not a feature to have data silently dropped/overwritten so not something many are likely to be relying upon, and b) parse_nested_query doesn't currently match the behaviour of parse_query which does the correct thing:

$ irb -r rack
irb(main):001> Rack::Utils.parse_query('foo=1&foo=2')
=> {"foo"=>["1", "2"]}
irb(main):002> Rack::Utils.parse_nested_query('foo=1&foo=2')
=> {"foo"=>"2"}

You can test out uploading multiple files from browsers using something like this:

require 'sinatra'
require 'json'

get '/' do
  <<-HTML
    <form action="/" method="post" enctype="multipart/form-data">
      <input type="file" name="field" multiple>

      <button type="submit">Upload</button>
    </form>
  HTML
end

post '/' do
  <<-HTML
    <pre>#{JSON.pretty_generate(request.POST)}</pre>
  HTML
end

@jeremyevans
Copy link
Contributor

Doesn't Rack already support this if the name ends with []? Then Rack will have the parameter store an array of uploaded files. My advice would be to use field[] as the name, then you can use that name to upload as many files as you want.

Personally, the parse_query behavior seems wrong to me. If we are going to change things in a backwards-incompatible manner, we should make parse_query behave like parse_nested_query, not vice-versa. However, I don't think it's worth breaking backwards compatibility here.

@timcraft
Copy link
Author

timcraft commented Apr 1, 2024

Doesn't Rack already support this if the name ends with []?

Yes but that's a Rails invention, not part of the spec.

My advice would be to use field[] as the name

What about when it's not possible to change the name of the field on the client side, or you need to support field names with and without the [] suffix?

Personally, the parse_query behavior seems wrong to me.

Browsers, the multipart-post gem, and the cgi gem all support fields with the same name. For example:

$ irb -r cgi
irb(main):001> CGI.parse('foo=1&foo=2')
=> {"foo"=>["1", "2"]}

It's also implemented that way in other languages. For example:

$ python3
...
>>> from urllib.parse import parse_qs
>>> parse_qs('foo=1&foo=2')
{'foo': ['1', '2']}

$ bun repl
...
> querystring.parse('foo=1&foo=2')
Object <[Object: null prototype] {}> { foo: [ '1', '2' ] }

@jeremyevans
Copy link
Contributor

Doesn't Rack already support this if the name ends with []?

Yes but that's a Rails invention, not part of the spec.

While it originally came from Rails, the behavior has been part of Rack for I think over 15 years.

My advice would be to use field[] as the name

What about when it's not possible to change the name of the field on the client side, or you need to support field names with and without the [] suffix?

The main advantage of the current behavior is that if you specify [] on the client side, you are sure the server will interpret as an array, and if you do not specify [], you are sure the server will interpret as a string/file. If you try to automatically convert from string to array at first duplicate key, then all code that deals with the parameter must be prepared to handle both string/file and array. That's a large backwards compatibility breakage across the entire ecosystem, which would affect almost all applications that use parse_nested_query for parsing (which I would guess is over 99% of Ruby web applications).

If we were going to support this behavior, it would have to be opt-in and off by default. Considering that the inability to control the client side parameter name is fairly rare, I'm not sure it is worth supporting at all.

@matthewd
Copy link
Contributor

matthewd commented Apr 1, 2024

What about when it's not possible to change the name of the field on the client side, or you need to support field names with and without the [] suffix?

Then you don't want the [custom, built-atop-the-underlying-protocol] Rack nested params parsing algorithm; you want raw access to the parameters as supplied. That's something Rack provides at a couple of levels, including via parse_query -- but parse_nested_query is a different thing.

@timcraft
Copy link
Author

timcraft commented Apr 1, 2024

Then you don't want the [custom, built-atop-the-underlying-protocol] Rack nested params parsing algorithm; you want raw access to the parameters as supplied. That's something Rack provides at a couple of levels, including via parse_query -- but parse_nested_query is a different thing.

Yes... I don't want the nested params parsing algorithm but that's what Rack::Multipart.parse_multipart and Rack::Multipart::Parser.parse use internally. Maybe I'm looking in the wrong place but I can't see how to get raw access to the parameters without the nested params parsing algorithm? Using Rack::Utils.parse_query or Rack::Utils.parse_nested_query directly wouldn't work, because those aren't for multipart/form-data.

Would implementing something custom similar to Rack::Multipart::ParamList and passing that to Rack::Multipart.parse_multipart be the supported way to do that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants