ArgumentError: invalid byte sequence in UTF-8 #673

Closed
mensfeld opened this Issue Mar 30, 2014 · 6 comments

mensfeld commented Mar 30, 2014

Hey guys. Rack raises an exception when the query string contains bytes that are not valid UTF-8.

You can reproduce it by appending this to the app URL:

?%28t%B3odei%29

It will break the app with this error: ArgumentError: invalid byte sequence in UTF-8
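
To see why this particular query string is a problem, here is a minimal sketch using only Ruby's standard library (not Rack itself). It percent-decodes the string from the report and inspects the result; the variable names are illustrative.

```ruby
require "uri"

# The query string from the report, without the leading "?".
raw = "%28t%B3odei%29"

# decode_www_form_component tags its result as UTF-8 by default,
# without checking that the decoded bytes are valid UTF-8.
decoded = URI.decode_www_form_component(raw)

puts decoded.encoding         # UTF-8
puts decoded.valid_encoding?  # false: 0xB3 is a lone continuation byte
puts decoded.bytes.map { |b| format("%02X", b) }.join(" ")
# 28 74 B3 6F 64 65 69 29
```

Any later operation that needs a valid UTF-8 string, such as a regular expression match, can then fail on this value.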

More details also here: http://dev.mensfeld.pl/2014/03/rack-argument-error-invalid-byte-sequence-in-utf-8/

rack-1.5.2/lib/rack/utils.rb:104→ normalize_params
rack-1.5.2/lib/rack/utils.rb:96→ block in parse_nested_query
rack-1.5.2/lib/rack/utils.rb:93→ each
rack-1.5.2/lib/rack/utils.rb:93→ parse_nested_query
rack-1.5.2/lib/rack/request.rb:373→ parse_query
actionpack-4.0.4/lib/action_dispatch/http/request.rb:321→ parse_query
rack-1.5.2/lib/rack/request.rb:188→ GET
actionpack-4.0.4/lib/action_dispatch/http/request.rb:274→ GET
actionpack-4.0.4/lib/action_dispatch/http/parameters.rb:16→ parameters
actionpack-4.0.4/lib/action_dispatch/http/filter_parameters.rb:37→ filtered_parameters
actionpack-4.0.4/lib/action_controller/metal/instrumentation.rb:22→ process_action
actionpack-4.0.4/lib/action_controller/metal/params_wrapper.rb:250→ process_action
activerecord-4.0.4/lib/active_record/railties/controller_runtime.rb:18→ process_action
actionpack-4.0.4/lib/abstract_controller/base.rb:136→ process
actionpack-4.0.4/lib/abstract_controller/rendering.rb:44→ process
actionpack-4.0.4/lib/action_controller/metal.rb:195→ dispatch
actionpack-4.0.4/lib/action_controller/metal/rack_delegation.rb:13→ dispatch
actionpack-4.0.4/lib/action_controller/metal.rb:231→ block in action
actionpack-4.0.4/lib/action_dispatch/routing/route_set.rb:80→ call
actionpack-4.0.4/lib/action_dispatch/routing/route_set.rb:80→ dispatch
actionpack-4.0.4/lib/action_dispatch/routing/route_set.rb:48→ call
actionpack-4.0.4/lib/action_dispatch/journey/router.rb:71→ block in call
actionpack-4.0.4/lib/action_dispatch/journey/router.rb:59→ each
actionpack-4.0.4/lib/action_dispatch/journey/router.rb:59→ call
actionpack-4.0.4/lib/action_dispatch/routing/route_set.rb:674→ call
omniauth-1.2.1/lib/omniauth/strategy.rb:186→ call!
omniauth-1.2.1/lib/omniauth/strategy.rb:164→ call
rack-canonical-host-0.0.9/lib/rack/canonical_host.rb:19→ call
warden-1.2.3/lib/warden/manager.rb:35→ block in call
warden-1.2.3/lib/warden/manager.rb:34→ catch
warden-1.2.3/lib/warden/manager.rb:34→ call
rack-1.5.2/lib/rack/etag.rb:23→ call
rack-1.5.2/lib/rack/conditionalget.rb:25→ call
rack-1.5.2/lib/rack/head.rb:11→ call
actionpack-4.0.4/lib/action_dispatch/middleware/params_parser.rb:27→ call
actionpack-4.0.4/lib/action_dispatch/middleware/flash.rb:241→ call
rack-1.5.2/lib/rack/session/abstract/id.rb:225→ context
rack-1.5.2/lib/rack/session/abstract/id.rb:220→ call
actionpack-4.0.4/lib/action_dispatch/middleware/cookies.rb:486→ call
activerecord-4.0.4/lib/active_record/query_cache.rb:36→ call
activerecord-4.0.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:626→ call
actionpack-4.0.4/lib/action_dispatch/middleware/callbacks.rb:29→ block in call
activesupport-4.0.4/lib/active_support/callbacks.rb:373→ _run__FRAGMENT__call__callbacks
activesupport-4.0.4/lib/active_support/callbacks.rb:80→ run_callbacks
actionpack-4.0.4/lib/action_dispatch/middleware/callbacks.rb:27→ call
actionpack-4.0.4/lib/action_dispatch/middleware/remote_ip.rb:76→ call
actionpack-4.0.4/lib/action_dispatch/middleware/debug_exceptions.rb:17→ call
actionpack-4.0.4/lib/action_dispatch/middleware/show_exceptions.rb:30→ call
railties-4.0.4/lib/rails/rack/logger.rb:38→ call_app
railties-4.0.4/lib/rails/rack/logger.rb:20→ block in call
activesupport-4.0.4/lib/active_support/tagged_logging.rb:68→ block in tagged
activesupport-4.0.4/lib/active_support/tagged_logging.rb:26→ tagged
activesupport-4.0.4/lib/active_support/tagged_logging.rb:68→ tagged
railties-4.0.4/lib/rails/rack/logger.rb:20→ call
actionpack-4.0.4/lib/action_dispatch/middleware/request_id.rb:21→ call
rack-1.5.2/lib/rack/methodoverride.rb:21→ call
rack-1.5.2/lib/rack/runtime.rb:17→ call
activesupport-4.0.4/lib/active_support/cache/strategy/local_cache.rb:83→ call
rack-1.5.2/lib/rack/sendfile.rb:112→ call
railties-4.0.4/lib/rails/engine.rb:511→ call
railties-4.0.4/lib/rails/application.rb:97→ call
railties-4.0.4/lib/rails/railtie/configurable.rb:30→ method_missing
puma-2.7.1/lib/puma/configuration.rb:68→ call
puma-2.7.1/lib/puma/server.rb:486→ handle_request
puma-2.7.1/lib/puma/server.rb:357→ process_client
puma-2.7.1/lib/puma/server.rb:250→ block in run
puma-2.7.1/lib/puma/thread_pool.rb:92→ call
puma-2.7.1/lib/puma/thread_pool.rb:92→ block in spawn_thread

nijikon commented Mar 30, 2014

👍

karol-blaszczyk commented Mar 30, 2014

+1

Member

raggi commented Apr 1, 2014

It is a web server's responsibility to translate IO to valid binary representations for the application layer. This isn't the whole picture, though: in this case the web server has done that, but it does not know the encoding of the URI...

It is the responsibility of the IETF to define the validity of URI data in various encodings (not done), and so it is not entirely valid for web servers to make no assumptions for this field for the above...

Rack itself uses a binary regular expression here, which expects binary input strings. This is our response to the above subtleties. In normal operation (say, Webrick + Rack), this error is not raised...

The reason that this error is raised in your application is:

You have middleware in your stack that is forcing this string to UTF-8, even when it is not valid UTF-8. The code that is doing this is bugged.

Observe:

s = "a=\xff"
# => "a=\xFF"
s.force_encoding("binary")
# => "a=\xFF"
s.valid_encoding?
# => true
Rack::Utils.parse_nested_query(s)
# => {"a"=>"\xFF"}
s.force_encoding("utf-8")
# => "a=\xFF"
s.valid_encoding?
# => false
Rack::Utils.parse_nested_query(s)
ArgumentError: invalid byte sequence in UTF-8
        from /usr/local/google/home/raggi/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `split'
        from /usr/local/google/home/raggi/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `parse_nested_query'
        from (irb):21
        from /usr/local/google/home/raggi/.rbenv/versions/2.0.0-p247/bin/irb:12:in `<main>'

This is a Rails bug. Calls to force_encoding should always assert that their output is valid.
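
A minimal sketch of what such an assertion could look like. `force_encoding_checked` and `InvalidInputEncoding` are hypothetical names, not part of Rack or Rails; only `force_encoding` and `valid_encoding?` are real String methods.

```ruby
# Hypothetical error class for input that does not match the claimed encoding.
class InvalidInputEncoding < EncodingError; end

# Hypothetical helper: tag a copy of `bytes` with `encoding`, then verify
# that the bytes really are valid in that encoding before handing them on.
def force_encoding_checked(bytes, encoding = Encoding::UTF_8)
  tagged = bytes.dup.force_encoding(encoding)
  return tagged if tagged.valid_encoding?

  raise InvalidInputEncoding, "input is not valid #{encoding}"
end

good = force_encoding_checked("a=b".b)
puts good.encoding  # UTF-8

begin
  force_encoding_checked("a=\xFF".b)
rescue InvalidInputEncoding => e
  puts e.message  # input is not valid UTF-8
end
```

Failing at the point where the encoding is asserted keeps the error close to its cause, instead of surfacing later inside a parser.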

@raggi raggi closed this Apr 1, 2014

mensfeld commented Apr 2, 2014

@raggi Thx :-) sorry for that - cheers!

Contributor

nanaya commented May 18, 2014

> You have middleware in your stack that is forcing this string to UTF-8, even when it is not valid UTF-8. The code that is doing this is bugged.

This part does default to UTF-8, though. Because of that, calling parse_nested_query with an invalid key string raises ArgumentError, as I mentioned in issue #610, and there is no obvious way to change it.

irb(main):003:0> s="\xFF=a"
=> "\xFF=a"
irb(main):004:0> s.force_encoding("binary")
=> "\xFF=a"
irb(main):005:0> s.valid_encoding?
=> true
irb(main):006:0> Rack::Utils.parse_nested_query(s)
ArgumentError: invalid byte sequence in UTF-8
        from /home/edho/app/ruby21/lib/ruby/gems/2.1.0/gems/rack-1.5.2/lib/rack/utils.rb:104:in `normalize_params'
        from /home/edho/app/ruby21/lib/ruby/gems/2.1.0/gems/rack-1.5.2/lib/rack/utils.rb:96:in `block in parse_nested_query'
        from /home/edho/app/ruby21/lib/ruby/gems/2.1.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `each'
        from /home/edho/app/ruby21/lib/ruby/gems/2.1.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `parse_nested_query'
        from (irb):6
        from /home/edho/app/ruby21/bin/irb:11:in `<main>'
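
The mismatch in this session comes from the unescape step rather than from the input string. As a sketch, assuming Rack's unescape behaves like Ruby's `URI.decode_www_form_component` (which tags its result as UTF-8 unless given another encoding), a binary-tagged key comes back as an invalid UTF-8 string:

```ruby
require "uri"

key = "\xFF".b                  # binary-tagged, so any byte is valid
puts key.valid_encoding?        # true

# Default second argument is Encoding::UTF_8.
unescaped = URI.decode_www_form_component(key)
puts unescaped.encoding         # UTF-8
puts unescaped.valid_encoding?  # false

# Passing the encoding explicitly keeps the result binary.
as_binary = URI.decode_www_form_component(key, Encoding::BINARY)
puts as_binary.encoding         # ASCII-8BIT
puts as_binary.valid_encoding?  # true
```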

kennym commented Jul 17, 2014

Just leaving this in case anyone is having trouble with Rails: https://github.com/whitequark/rack-utf8_sanitizer/

❤️
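
For readers who want to see the general idea of guarding the stack, here is a minimal sketch of a Rack-style middleware that rejects query strings that do not decode to valid UTF-8. It is not the implementation or API of rack-utf8_sanitizer; the class name `RejectInvalidUtf8Query` and the choice to answer with 400 are assumptions made for illustration, and it uses only the standard library.

```ruby
require "uri"

# Hypothetical middleware: answers 400 before the app sees a query string
# whose percent-decoded bytes are not valid UTF-8.
class RejectInvalidUtf8Query
  def initialize(app)
    @app = app
  end

  def call(env)
    query = env["QUERY_STRING"].to_s
    unless decodes_to_valid_utf8?(query)
      return [400, { "content-type" => "text/plain" }, ["Bad Request\n"]]
    end

    @app.call(env)
  end

  private

  def decodes_to_valid_utf8?(query)
    URI.decode_www_form_component(query).valid_encoding?
  rescue ArgumentError
    false # malformed percent-escape such as "%zz"
  end
end

# Usage with a trivial app (a lambda that follows the Rack calling convention).
app = ->(_env) { [200, { "content-type" => "text/plain" }, ["ok\n"]] }
stack = RejectInvalidUtf8Query.new(app)

puts stack.call("QUERY_STRING" => "%28t%B3odei%29").first  # 400
puts stack.call("QUERY_STRING" => "q=%C5%82").first        # 200
```

Rejecting the request is only one option; a sanitizing middleware could instead rewrite the offending bytes before passing the request on.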

Hamms added a commit to code-dot-org/code-dot-org that referenced this issue Mar 17, 2017

Stop (unnecessarily) enforcing UTF8 for set_locale form
Doing so provides us with no advantages, and has the side effect of
triggering [this bug](rack/rack#673) in Rack
when said form is submitted in browsers that do not support UTF-8

Note that that enforcement is intended as a protection against
users submitting Latin-1 values when we expect unicode; since this form
supports no user-provided values and doesn't persist any data, that's
not an issue we need to worry about.

Note also that the only browsers for which this error is likely to be
generated are browsers that we do not officially support, so an argument
could be made for simply ignoring this error. But, I figure, I already
got the fix right here.

Hamms added a commit to code-dot-org/code-dot-org that referenced this issue Mar 21, 2017

Stop (unnecessarily) enforcing UTF8 for set_locale form