ArgumentError: invalid byte sequence in UTF-8 #673

Closed
mensfeld opened this issue Mar 30, 2014 · 17 comments
Comments

@mensfeld

Hey guys. Rack blows up when you pass it a string that is not valid UTF-8.

You can reproduce it by appending this to the app URL:

?%28t%B3odei%29

It will break the app with this error: ArgumentError: invalid byte sequence in UTF-8

More details also here: http://dev.mensfeld.pl/2014/03/rack-argument-error-invalid-byte-sequence-in-utf-8/

rack-1.5.2/lib/rack/utils.rb:104 normalize_params
rack-1.5.2/lib/rack/utils.rb:96 block in parse_nested_query
rack-1.5.2/lib/rack/utils.rb:93 each
rack-1.5.2/lib/rack/utils.rb:93 parse_nested_query
rack-1.5.2/lib/rack/request.rb:373 parse_query
actionpack-4.0.4/lib/action_dispatch/http/request.rb:321 parse_query
rack-1.5.2/lib/rack/request.rb:188 GET
actionpack-4.0.4/lib/action_dispatch/http/request.rb:274 GET
actionpack-4.0.4/lib/action_dispatch/http/parameters.rb:16 parameters
actionpack-4.0.4/lib/action_dispatch/http/filter_parameters.rb:37 filtered_parameters
actionpack-4.0.4/lib/action_controller/metal/instrumentation.rb:22 process_action
actionpack-4.0.4/lib/action_controller/metal/params_wrapper.rb:250 process_action
activerecord-4.0.4/lib/active_record/railties/controller_runtime.rb:18 process_action
actionpack-4.0.4/lib/abstract_controller/base.rb:136 process
actionpack-4.0.4/lib/abstract_controller/rendering.rb:44 process
actionpack-4.0.4/lib/action_controller/metal.rb:195 dispatch
actionpack-4.0.4/lib/action_controller/metal/rack_delegation.rb:13 dispatch
actionpack-4.0.4/lib/action_controller/metal.rb:231 block in action
actionpack-4.0.4/lib/action_dispatch/routing/route_set.rb:80 call
actionpack-4.0.4/lib/action_dispatch/routing/route_set.rb:80 dispatch
actionpack-4.0.4/lib/action_dispatch/routing/route_set.rb:48 call
actionpack-4.0.4/lib/action_dispatch/journey/router.rb:71 block in call
actionpack-4.0.4/lib/action_dispatch/journey/router.rb:59 each
actionpack-4.0.4/lib/action_dispatch/journey/router.rb:59 call
actionpack-4.0.4/lib/action_dispatch/routing/route_set.rb:674 call
omniauth-1.2.1/lib/omniauth/strategy.rb:186 call!
omniauth-1.2.1/lib/omniauth/strategy.rb:164 call
rack-canonical-host-0.0.9/lib/rack/canonical_host.rb:19 call
warden-1.2.3/lib/warden/manager.rb:35 block in call
warden-1.2.3/lib/warden/manager.rb:34 catch
warden-1.2.3/lib/warden/manager.rb:34 call
rack-1.5.2/lib/rack/etag.rb:23 call
rack-1.5.2/lib/rack/conditionalget.rb:25 call
rack-1.5.2/lib/rack/head.rb:11 call
actionpack-4.0.4/lib/action_dispatch/middleware/params_parser.rb:27 call
actionpack-4.0.4/lib/action_dispatch/middleware/flash.rb:241 call
rack-1.5.2/lib/rack/session/abstract/id.rb:225 context
rack-1.5.2/lib/rack/session/abstract/id.rb:220 call
actionpack-4.0.4/lib/action_dispatch/middleware/cookies.rb:486 call
activerecord-4.0.4/lib/active_record/query_cache.rb:36 call
activerecord-4.0.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:626 call
actionpack-4.0.4/lib/action_dispatch/middleware/callbacks.rb:29 block in call
activesupport-4.0.4/lib/active_support/callbacks.rb:373 _run__FRAGMENT__call__callbacks
activesupport-4.0.4/lib/active_support/callbacks.rb:80 run_callbacks
actionpack-4.0.4/lib/action_dispatch/middleware/callbacks.rb:27 call
actionpack-4.0.4/lib/action_dispatch/middleware/remote_ip.rb:76 call
actionpack-4.0.4/lib/action_dispatch/middleware/debug_exceptions.rb:17 call
actionpack-4.0.4/lib/action_dispatch/middleware/show_exceptions.rb:30 call
railties-4.0.4/lib/rails/rack/logger.rb:38 call_app
railties-4.0.4/lib/rails/rack/logger.rb:20 block in call
activesupport-4.0.4/lib/active_support/tagged_logging.rb:68 block in tagged
activesupport-4.0.4/lib/active_support/tagged_logging.rb:26 tagged
activesupport-4.0.4/lib/active_support/tagged_logging.rb:68 tagged
railties-4.0.4/lib/rails/rack/logger.rb:20 call
actionpack-4.0.4/lib/action_dispatch/middleware/request_id.rb:21 call
rack-1.5.2/lib/rack/methodoverride.rb:21 call
rack-1.5.2/lib/rack/runtime.rb:17 call
activesupport-4.0.4/lib/active_support/cache/strategy/local_cache.rb:83 call
rack-1.5.2/lib/rack/sendfile.rb:112 call
railties-4.0.4/lib/rails/engine.rb:511 call
railties-4.0.4/lib/rails/application.rb:97 call
railties-4.0.4/lib/rails/railtie/configurable.rb:30 method_missing
puma-2.7.1/lib/puma/configuration.rb:68 call
puma-2.7.1/lib/puma/server.rb:486 handle_request
puma-2.7.1/lib/puma/server.rb:357 process_client
puma-2.7.1/lib/puma/server.rb:250 block in run
puma-2.7.1/lib/puma/thread_pool.rb:92 call
puma-2.7.1/lib/puma/thread_pool.rb:92 block in spawn_thread
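
For reference, the input from this report can be inspected without a running app. The sketch below is an illustration only: it uses Ruby's standard-library `URI.decode_www_form_component` rather than Rack's own parser (an assumption made to keep it dependency-free), and `QUERY` and `decode_query` are hypothetical names. It decodes the query string above and shows that the result is tagged as UTF-8 even though it contains the lone byte 0xB3, which is not a valid UTF-8 sequence.

```ruby
require "uri"

# Query string from the report, without the leading "?".
QUERY = "%28t%B3odei%29"

# Percent-decode the query. The stdlib decoder tags its result as UTF-8
# without checking that the decoded bytes are actually valid UTF-8.
def decode_query(raw)
  URI.decode_www_form_component(raw)
end

decoded = decode_query(QUERY)

puts decoded.encoding         # the encoding tag on the decoded string
puts decoded.valid_encoding?  # whether the bytes are valid for that tag
puts decoded.bytes.map { |b| format("%02X", b) }.join(" ")
```

Any later operation that needs valid characters (a non-binary regular expression match, for example) is where the ArgumentError is likely to surface.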
@nijikon

nijikon commented Mar 30, 2014

👍

@karol-blaszczyk

+1

@raggi
Member

raggi commented Apr 1, 2014

It is a web server's responsibility to translate IO to valid binary representations for the application layer. This isn't the whole picture though: in this case, the webserver has done that - the webserver does not know the encoding of the URI...

It is the responsibility of the IETF to define the validity of URI data in various encodings (not done), and so it is not entirely valid for web servers to make no assumptions for this field for the above...

Rack itself uses a binary regular expression here, which expects binary input strings. This is our response to the above subtleties. In normal operation (say, WEBrick + Rack), this error is not raised...

The reason that this error is raised in your application is:

You have middleware in your stack that is forcing this string to UTF-8, even when it is not valid UTF-8. The code that is doing this is bugged.

Observe:

s = "a=\xff"
# => "a=\xFF"
s.force_encoding("binary")
# => "a=\xFF"
s.valid_encoding?
# => true
Rack::Utils.parse_nested_query(s)
# => {"a"=>"\xFF"}
s.force_encoding("utf-8")
# => "a=\xFF"
s.valid_encoding?
# => false
Rack::Utils.parse_nested_query(s)
ArgumentError: invalid byte sequence in UTF-8
        from /usr/local/google/home/raggi/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `split'
        from /usr/local/google/home/raggi/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `parse_nested_query'
        from (irb):21
        from /usr/local/google/home/raggi/.rbenv/versions/2.0.0-p247/bin/irb:12:in `<main>'

This is a Rails bug. Calls to force_encoding should always assert that their output is valid.
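
A minimal sketch of what "assert that the output is valid" could look like in plain Ruby. The helper `force_encoding_checked` and the `InvalidEncodingError` class are hypothetical names invented for this illustration; they are not part of Ruby, Rack, or Rails.

```ruby
# Hypothetical error class for this sketch only.
class InvalidEncodingError < ArgumentError; end

# Hypothetical helper: tag a copy of the string with the given encoding and
# fail loudly if the bytes are not valid in that encoding.
def force_encoding_checked(string, encoding)
  # dup so the caller's string keeps its original encoding tag
  tagged = string.dup.force_encoding(encoding)
  unless tagged.valid_encoding?
    raise InvalidEncodingError, "bytes are not valid #{tagged.encoding.name}"
  end
  tagged
end

# "\xC5\x82" is a well-formed two-byte UTF-8 sequence, so this passes.
ok = force_encoding_checked("a=\xC5\x82".b, "UTF-8")
puts ok.valid_encoding?

# A lone 0xFF byte is never valid UTF-8, so this is rejected.
begin
  force_encoding_checked("a=\xFF".b, "UTF-8")
rescue InvalidEncodingError => e
  puts "rejected: #{e.message}"
end
```

Whether to raise, return nil, or scrub the string is a design choice for the caller; the point is only that the validity check happens at the place where the encoding is forced.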

@raggi raggi closed this as completed Apr 1, 2014
@mensfeld
Author

mensfeld commented Apr 2, 2014

@raggi Thx :-) sorry for that - cheers!

@nanaya
Contributor

nanaya commented May 18, 2014

You have middleware in your stack that is forcing this string to UTF-8, even when it is not valid UTF-8. The code that is doing this is bugged.

This part does default to UTF-8, though. Because of that, calling parse_nested_query with an invalid key string raises ArgumentError, as I mentioned in issue #610, and there is no obvious way to change it.

irb(main):003:0> s="\xFF=a"
=> "\xFF=a"
irb(main):004:0> s.force_encoding("binary")
=> "\xFF=a"
irb(main):005:0> s.valid_encoding?
=> true
irb(main):006:0> Rack::Utils.parse_nested_query(s)
ArgumentError: invalid byte sequence in UTF-8
        from /home/edho/app/ruby21/lib/ruby/gems/2.1.0/gems/rack-1.5.2/lib/rack/utils.rb:104:in `normalize_params'
        from /home/edho/app/ruby21/lib/ruby/gems/2.1.0/gems/rack-1.5.2/lib/rack/utils.rb:96:in `block in parse_nested_query'
        from /home/edho/app/ruby21/lib/ruby/gems/2.1.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `each'
        from /home/edho/app/ruby21/lib/ruby/gems/2.1.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `parse_nested_query'
        from (irb):6
        from /home/edho/app/ruby21/bin/irb:11:in `<main>'

@kennym

kennym commented Jul 17, 2014

Just leaving this here in case anyone is having trouble with Rails: https://github.com/whitequark/rack-utf8_sanitizer/

❤️

Hamms added a commit to code-dot-org/code-dot-org that referenced this issue Mar 17, 2017
Doing so provides us with no advantages, and has the side effect of
triggering [this bug](rack/rack#673) in Rack
when said form is submitted in browsers that do not support UTF-8

Note that that enforcement is intended as a protection against
users submitting Latin-1 values when we expect unicode; since this form
supports no user-provided values and doesn't persist any data, that's
not an issue we need to worry about.

Note also that the only browsers for which this error is likely to be
generated are browsers that we do not officially support, so an argument
could be made for simply ignoring this error. But, I figure, I already
got the fix right here.
Hamms added a commit to code-dot-org/code-dot-org that referenced this issue Mar 21, 2017
@rgaufman

rgaufman commented Jan 6, 2022

I occasionally get this exception in production. What is the correct solution for this?

gems/rack-2.2.3/lib/rack/query_parser.rb:86:in `normalize_params': invalid byte sequence in UTF-8 (Rack::QueryParser::InvalidParameterError)
        from gems/rack-2.2.3/lib/rack/query_parser.rb:71:in `block in parse_nested_query'
        from gems/rack-2.2.3/lib/rack/query_parser.rb:68:in `each'
        from gems/rack-2.2.3/lib/rack/query_parser.rb:68:in `parse_nested_query'
        from gems/rack-2.2.3/lib/rack/request.rb:590:in `parse_query'
        from gems/rack-2.2.3/lib/rack/request.rb:454:in `POST'
        from gems/rack-2.2.3/lib/rack/request.rb:469:in `params'
        from gems/rack-2.2.3/lib/rack/request.rb:32:in `params'
        from /data/deployer/.bundle/ruby/3.0.0/remotipart-53f5556aaf42/lib/remotipart/middleware.rb:16:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/rack-2.2.3/lib/rack/tempfile_reaper.rb:15:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/rack-2.2.3/lib/rack/etag.rb:27:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/rack-2.2.3/lib/rack/conditional_get.rb:40:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/rack-2.2.3/lib/rack/head.rb:12:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/actionpack-6.1.4.4/lib/action_dispatch/http/permissions_policy.rb:22:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/actionpack-6.1.4.4/lib/action_dispatch/http/content_security_policy.rb:18:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/rack-2.2.3/lib/rack/session/abstract/id.rb:266:in `context'
        from gems/rack-2.2.3/lib/rack/session/abstract/id.rb:260:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/actionpack-6.1.4.4/lib/action_dispatch/middleware/cookies.rb:689:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/actionpack-6.1.4.4/lib/action_dispatch/middleware/callbacks.rb:27:in `block in call'
        from gems/activesupport-6.1.4.4/lib/active_support/callbacks.rb:98:in `run_callbacks'
        from gems/actionpack-6.1.4.4/lib/action_dispatch/middleware/callbacks.rb:26:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/actionpack-6.1.4.4/lib/action_dispatch/middleware/actionable_exceptions.rb:18:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/bugsnag-6.24.1/lib/bugsnag/integrations/rack.rb:51:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/actionpack-6.1.4.4/lib/action_dispatch/middleware/debug_exceptions.rb:29:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/actionpack-6.1.4.4/lib/action_dispatch/middleware/show_exceptions.rb:33:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/lograge-0.11.2/lib/lograge/rails_ext/rack/logger.rb:15:in `call_app'
        from gems/railties-6.1.4.4/lib/rails/rack/logger.rb:26:in `block in call'
        from gems/activesupport-6.1.4.4/lib/active_support/tagged_logging.rb:99:in `block in tagged'
        from gems/activesupport-6.1.4.4/lib/active_support/tagged_logging.rb:37:in `tagged'
        from gems/activesupport-6.1.4.4/lib/active_support/tagged_logging.rb:99:in `tagged'
        from gems/railties-6.1.4.4/lib/rails/rack/logger.rb:26:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/actionpack-6.1.4.4/lib/action_dispatch/middleware/remote_ip.rb:81:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/request_store-1.5.0/lib/request_store/middleware.rb:19:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/actionpack-6.1.4.4/lib/action_dispatch/middleware/request_id.rb:26:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/rack-2.2.3/lib/rack/method_override.rb:24:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/rack-2.2.3/lib/rack/runtime.rb:22:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/activesupport-6.1.4.4/lib/active_support/cache/strategy/local_cache_middleware.rb:29:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/actionpack-6.1.4.4/lib/action_dispatch/middleware/executor.rb:14:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from lib/rack_with_quiet_assets.rb:15:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/actionpack-6.1.4.4/lib/action_dispatch/middleware/static.rb:24:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/rack-2.2.3/lib/rack/sendfile.rb:110:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/actionpack-6.1.4.4/lib/action_dispatch/middleware/host_authorization.rb:113:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/railties-6.1.4.4/lib/rails/engine.rb:539:in `call'
        from gems/newrelic_rpm-8.2.0/lib/new_relic/agent/instrumentation/middleware_tracing.rb:101:in `call'
        from gems/puma-5.5.2/lib/puma/configuration.rb:249:in `call'
        from gems/puma-5.5.2/lib/puma/request.rb:77:in `block in handle_request'
        from gems/puma-5.5.2/lib/puma/thread_pool.rb:340:in `with_force_shutdown'
        from gems/puma-5.5.2/lib/puma/request.rb:76:in `handle_request'
        from gems/puma-5.5.2/lib/puma/server.rb:447:in `process_client'
        from gems/puma-5.5.2/lib/puma/thread_pool.rb:147:in `block in spawn_thread'

@raggi
Member

raggi commented Jan 6, 2022

@rgaufman in your question you ask about what is the correct solution, but I'm not clear on the problem. You reference an exception that occurs due to malformed input. What is the problem you want to solve?

@rgaufman

rgaufman commented Jan 7, 2022

I suppose https://github.com/whitequark/rack-utf8_sanitizer/ does what I want; I'm trying it now to see if these exceptions go away. I'm guessing the exceptions come from random hacking attempts? What is the right way to handle these bad requests? Should I use fail2ban or something similar? Are there examples of what to do about these?

@csuhta

csuhta commented Jan 7, 2022

@rgaufman in your question you ask about what is the correct solution, but I'm not clear on the problem. You reference an exception that occurs due to malformed input. What is the problem you want to solve?

The main issue here is that (with many common setups) Rack will raise this exception whenever anyone in the world (bots, people testing, search engines) drives by your Rack-powered site and submits a URL with invalid params. This is super annoying if you use an exception collation service.

I'm not sure if the issue stems from other parts of the Ruby application sending Rack invalid text, but I've mostly seen it occur with both Puma + Rails and Unicorn + Rails. One or several of those gem authors may not know that they are responsible for sending Rack cleaned-up input or for being the guardian process, as you specified in this comment #673 (comment)

This error also takes other forms such as exception messages containing:

invalid %-encoding
string contains null byte
input string invalid

@csuhta

csuhta commented Jan 7, 2022

Another way to state the problem would be:

Given that there are usually three "layers" to a Ruby web app:

  1. A web server (Puma, WEBrick, etc)
  2. Rack
  3. The application (Rails, Roda, etc)

Which interface is officially responsible for handling invalid UTF-8 or incoming malformed text in URLs or HTTP headers is not well agreed upon among the different gem authors, which leads to encoding problems sometimes blowing up in the Rack layer (and people creating middleware to just catch/sanitize/discard/422 the requests "manually").

@jeremyevans
Contributor

IMO, there are really only two layers, webserver and application. The rack layer I think refers to middleware, but middleware are just rack applications that (usually) delegate to other rack applications. Middleware can be written using the same frameworks used to write applications, though I don't think Rails supports operating as middleware. So IMO, there is no difference between the rack and application layers.

Also IMO, the application layer is the proper layer for such exception handling. Most applications try to show nice error pages even for invalid input, and in general that is only possible at the application layer. Some applications might only want to deal with UTF-8, and reject input in other encodings (or transcode to UTF-8), while there are other applications (less frequent these days) that deal with non-UTF-8 data, and may want to reject UTF-8 or convert it to another encoding (if possible).

I think classes the rack library exposes (e.g. Rack::Request), as well as utility methods (e.g. Rack::Utils#unescape), could probably do a better job of error handling, such as raising more specific exception classes for encoding issues/invalid input. This would allow applications using those parts of the rack library to globally rescue such exceptions and centrally handle them. Maybe such nicer error handling is something we could ship in Rack 3.
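
To make the idea of more specific exception classes concrete, here is a rough sketch in plain Ruby. Every name in it (`InputErrors`, `BadInput`, `decode_component`, `handle`) is hypothetical and chosen for illustration; none of this is Rack's actual API. It shows a toy decoder raising specific subclasses and a single rescue clause handling all of them centrally.

```ruby
# Hypothetical names for illustration; these are not Rack's classes.
module InputErrors
  # Common ancestor so callers can rescue every bad-input case at once.
  class BadInput < ArgumentError; end
  class InvalidEncoding < BadInput; end
  class InvalidPercentEncoding < BadInput; end
end

# Toy decoder standing in for a library helper. It raises a specific
# exception for each failure mode instead of a bare ArgumentError.
def decode_component(raw)
  bytes = raw.b # work on a binary copy so matching never trips on encoding
  if bytes.match?(/%(?!\h\h)/)
    raise InputErrors::InvalidPercentEncoding, "invalid %-encoding"
  end
  decoded = bytes.gsub(/%(\h\h)/) { $1.hex.chr }.force_encoding(Encoding::UTF_8)
  unless decoded.valid_encoding?
    raise InputErrors::InvalidEncoding, "invalid byte sequence in UTF-8"
  end
  decoded
end

# Central handling: one rescue clause maps all bad input to a 400-style
# result, while unrelated ArgumentErrors still propagate as bugs.
def handle(raw)
  [200, decode_component(raw)]
rescue InputErrors::BadInput => e
  [400, e.message]
end

p handle("%28ok%29")
p handle("%B3")
p handle("%zz")
```

Keeping `ArgumentError` as the ancestor in this sketch means existing `rescue ArgumentError` code would keep working, which is one possible compatibility choice rather than a requirement.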

@raggi
Member

raggi commented Jan 7, 2022

In terms of the call stack in @rgaufman's recent first comment, the multipart middleware could catch this exception and handle it gracefully, returning a 400 Bad Request instead. A patch would need to be introduced here: https://github.com/rack/rack/blob/master/lib/rack/multipart.rb#L45. As @jeremyevans rightly notes, a more specific error would be helpful - in the past there's been discussion about requesting that the Stdlib itself produces a more specific error, too.

In the more general case, the issue with this exception is that it can come from any use of query parsing in any middleware that uses either the Rack helper libraries that delegate to the stdlib parser, or from use of the stdlib parser itself. The helper libraries cannot return a 400 themselves - the middleware / app callers must do so, so they must catch the exception.

@csuhta your response implies a lot of blame, suggesting that the issue is as simple as a lack of agreement. Many of the code bases involved are themselves encoding agnostic at large; it's once you start making use of encoding-specific features that you run into more interesting challenges. When using a framework like Rails, being sure to use its helpers to generate forms and so on will add explicit declarations to improve the situation. Using, say, part of Rails, but then writing raw form tags yourself or with an entirely separate template system may lead to these situations, as does serving an API and having customers using older HTTP client libraries, or say, defaults on a lot of more enterprise stacks on Windows. "We" the authors have actually discussed encodings at length; I believe Yehuda even wrote quite a bit about those discussions, back in the day. We've also done some relatively extensive browser behavior research such as https://github.com/rack/multifail.

A more general solution to the "please don't raise exceptions" problem: if you know your application is going to be using these helpers, perform this parse first (hit Request#params) handling the exception. Rack::Request#params caches the parse results, so this should not add significant overhead to your application.

Such a middleware would look like so:

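require "rack"

# Parses the request parameters up front so that malformed input becomes a
# 400 response here instead of an exception deeper in the stack.
class ParameterParser
  def initialize(app)
    @app = app
  end

  def call(env)
    req = Rack::Request.new(env)
    begin
      # Trigger the parse now; later callers reuse the cached result.
      req.params
    rescue ArgumentError
      # Malformed query string or form body: reject the request.
      return [400, {}, []]
    end
    @app.call(env)
  end
end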
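
For experimenting without the rack gem installed, the same pattern can be sketched with only the standard library. This is a simplified variant, not a replacement for the middleware above: it assumes that only `QUERY_STRING` needs checking and that the application expects UTF-8, and `QueryEncodingGuard` is a hypothetical name.

```ruby
require "uri"

# Hypothetical, dependency-free variant: rejects requests whose query string
# has malformed percent-escapes or does not decode to valid UTF-8.
class QueryEncodingGuard
  BAD_REQUEST = [400, { "content-type" => "text/plain" }, ["Bad Request\n"]].freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    return BAD_REQUEST unless valid_query?(env["QUERY_STRING"].to_s)

    @app.call(env)
  end

  private

  def valid_query?(query)
    # The stdlib decoder raises ArgumentError on malformed percent-escapes
    # and tags its result as UTF-8 without validating the bytes, so both
    # failure modes are checked here.
    URI.decode_www_form_component(query).valid_encoding?
  rescue ArgumentError
    false
  end
end

# Usage with a trivial downstream app.
app = ->(_env) { [200, { "content-type" => "text/plain" }, ["ok\n"]] }
guarded = QueryEncodingGuard.new(app)

status, = guarded.call("QUERY_STRING" => "%28t%B3odei%29")
puts status
```

Form bodies, cookies, and headers are deliberately left out of this sketch; a full solution would need to cover whichever of those the application actually parses.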

@csuhta

csuhta commented Jan 7, 2022

My teams currently use a middleware very much like that, and I agree that a much more specific exception being raised would be really helpful (rescuing all ArgumentError and checking the exception messages is a smell)

I didn't mean to imply that anyone in particular is to blame or is dropping the ball, just that the solution in this scenario is often unclear to the application author or the responsibilities are not being explicitly set.

For example, if you are starting out with Rails development and you choose Puma + Rails (and Rack implicitly), and your web app starts getting even moderate traffic, you will run into these kinds of encoding exceptions just from drive-by visitors, and from there it makes it seem like the error is coming from Rack and the proper pattern to solve it is really only discussed in GitHub threads like these.

I agree it should probably be handled by the application layer or middleware, so that you can serve a nice error page or you can attempt to sanitize the string if you really need to be kind to certain incoming requests.

A good way to approach it in Rack 3 (at least for my team's use cases) would be a suite of more specific exceptions being raised, and also documentation stating that the application is responsible for deciding how it deals with the exceptions/strings.

@ioquatix
Member

I would accept a PR for a more specific exception.

@ioquatix ioquatix reopened this Jan 26, 2022
@raggi
Member

raggi commented Jan 26, 2022 via email

@jeremyevans
Contributor

This was fixed by c1e5fbb

snickell pushed a commit to snickell/code-dot-org that referenced this issue Jul 13, 2023