ArgumentError: invalid byte sequence in UTF-8 #673

Closed
mensfeld opened this Issue Mar 30, 2014 · 6 comments


@mensfeld

Hey guys. Rack goes crazy when you pass it an invalid UTF-8 string.

You can try it by putting this into the app URL:

?%28t%B3odei%29

It will break the app with this error: ArgumentError: invalid byte sequence in UTF-8

More details also here: http://dev.mensfeld.pl/2014/03/rack-argument-error-invalid-byte-sequence-in-utf-8/

rack-1.5.2/lib/rack/utils.rb:104→ normalize_params
rack-1.5.2/lib/rack/utils.rb:96→ block in parse_nested_query
rack-1.5.2/lib/rack/utils.rb:93→ each
rack-1.5.2/lib/rack/utils.rb:93→ parse_nested_query
rack-1.5.2/lib/rack/request.rb:373→ parse_query
actionpack-4.0.4/lib/action_dispatch/http/request.rb:321→ parse_query
rack-1.5.2/lib/rack/request.rb:188→ GET
actionpack-4.0.4/lib/action_dispatch/http/request.rb:274→ GET
actionpack-4.0.4/lib/action_dispatch/http/parameters.rb:16→ parameters
actionpack-4.0.4/lib/action_dispatch/http/filter_parameters.rb:37→ filtered_parameters
actionpack-4.0.4/lib/action_controller/metal/instrumentation.rb:22→ process_action
actionpack-4.0.4/lib/action_controller/metal/params_wrapper.rb:250→ process_action
activerecord-4.0.4/lib/active_record/railties/controller_runtime.rb:18→ process_action
actionpack-4.0.4/lib/abstract_controller/base.rb:136→ process
actionpack-4.0.4/lib/abstract_controller/rendering.rb:44→ process
actionpack-4.0.4/lib/action_controller/metal.rb:195→ dispatch
actionpack-4.0.4/lib/action_controller/metal/rack_delegation.rb:13→ dispatch
actionpack-4.0.4/lib/action_controller/metal.rb:231→ block in action
actionpack-4.0.4/lib/action_dispatch/routing/route_set.rb:80→ call
actionpack-4.0.4/lib/action_dispatch/routing/route_set.rb:80→ dispatch
actionpack-4.0.4/lib/action_dispatch/routing/route_set.rb:48→ call
actionpack-4.0.4/lib/action_dispatch/journey/router.rb:71→ block in call
actionpack-4.0.4/lib/action_dispatch/journey/router.rb:59→ each
actionpack-4.0.4/lib/action_dispatch/journey/router.rb:59→ call
actionpack-4.0.4/lib/action_dispatch/routing/route_set.rb:674→ call
omniauth-1.2.1/lib/omniauth/strategy.rb:186→ call!
omniauth-1.2.1/lib/omniauth/strategy.rb:164→ call
rack-canonical-host-0.0.9/lib/rack/canonical_host.rb:19→ call
warden-1.2.3/lib/warden/manager.rb:35→ block in call
warden-1.2.3/lib/warden/manager.rb:34→ catch
warden-1.2.3/lib/warden/manager.rb:34→ call
rack-1.5.2/lib/rack/etag.rb:23→ call
rack-1.5.2/lib/rack/conditionalget.rb:25→ call
rack-1.5.2/lib/rack/head.rb:11→ call
actionpack-4.0.4/lib/action_dispatch/middleware/params_parser.rb:27→ call
actionpack-4.0.4/lib/action_dispatch/middleware/flash.rb:241→ call
rack-1.5.2/lib/rack/session/abstract/id.rb:225→ context
rack-1.5.2/lib/rack/session/abstract/id.rb:220→ call
actionpack-4.0.4/lib/action_dispatch/middleware/cookies.rb:486→ call
activerecord-4.0.4/lib/active_record/query_cache.rb:36→ call
activerecord-4.0.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:626→ call
actionpack-4.0.4/lib/action_dispatch/middleware/callbacks.rb:29→ block in call
activesupport-4.0.4/lib/active_support/callbacks.rb:373→ _run__FRAGMENT__call__callbacks
activesupport-4.0.4/lib/active_support/callbacks.rb:80→ run_callbacks
actionpack-4.0.4/lib/action_dispatch/middleware/callbacks.rb:27→ call
actionpack-4.0.4/lib/action_dispatch/middleware/remote_ip.rb:76→ call
actionpack-4.0.4/lib/action_dispatch/middleware/debug_exceptions.rb:17→ call
actionpack-4.0.4/lib/action_dispatch/middleware/show_exceptions.rb:30→ call
railties-4.0.4/lib/rails/rack/logger.rb:38→ call_app
railties-4.0.4/lib/rails/rack/logger.rb:20→ block in call
activesupport-4.0.4/lib/active_support/tagged_logging.rb:68→ block in tagged
activesupport-4.0.4/lib/active_support/tagged_logging.rb:26→ tagged
activesupport-4.0.4/lib/active_support/tagged_logging.rb:68→ tagged
railties-4.0.4/lib/rails/rack/logger.rb:20→ call
actionpack-4.0.4/lib/action_dispatch/middleware/request_id.rb:21→ call
rack-1.5.2/lib/rack/methodoverride.rb:21→ call
rack-1.5.2/lib/rack/runtime.rb:17→ call
activesupport-4.0.4/lib/active_support/cache/strategy/local_cache.rb:83→ call
rack-1.5.2/lib/rack/sendfile.rb:112→ call
railties-4.0.4/lib/rails/engine.rb:511→ call
railties-4.0.4/lib/rails/application.rb:97→ call
railties-4.0.4/lib/rails/railtie/configurable.rb:30→ method_missing
puma-2.7.1/lib/puma/configuration.rb:68→ call
puma-2.7.1/lib/puma/server.rb:486→ handle_request
puma-2.7.1/lib/puma/server.rb:357→ process_client
puma-2.7.1/lib/puma/server.rb:250→ block in run
puma-2.7.1/lib/puma/thread_pool.rb:92→ call
puma-2.7.1/lib/puma/thread_pool.rb:92→ block in spawn_thread
@nijikon
nijikon commented Mar 30, 2014

👍

@raggi
Member
raggi commented Apr 1, 2014

It is a web server's responsibility to translate IO to valid binary representations for the application layer. This isn't the whole picture though: in this case, the web server has done that, but the web server does not know the encoding of the URI...

It is the responsibility of the IETF to define the validity of URI data in various encodings (not done), and so it is not entirely valid for web servers to make no assumptions for this field for the above...

Rack itself uses a binary regular expression here, which expects binary input strings. This is our response to the above subtleties. In normal operation (say, WEBrick + Rack), this error is not raised...

The reason that this error is raised in your application is:

You have middleware in your stack that is forcing this string to UTF-8, even when it is not valid UTF-8. The code that is doing this is bugged.

Observe:

s = "a=\xff"
# => "a=\xFF"
s.force_encoding("binary")
# => "a=\xFF"
s.valid_encoding?
# => true
Rack::Utils.parse_nested_query(s)
# => {"a"=>"\xFF"}
s.force_encoding("utf-8")
# => "a=\xFF"
s.valid_encoding?
# => false
Rack::Utils.parse_nested_query(s)
ArgumentError: invalid byte sequence in UTF-8
        from /usr/local/google/home/raggi/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `split'
        from /usr/local/google/home/raggi/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `parse_nested_query'
        from (irb):21
        from /usr/local/google/home/raggi/.rbenv/versions/2.0.0-p247/bin/irb:12:in `<main>'

This is a Rails bug. Calls to force_encoding should always assert that their output is valid.
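
A minimal sketch of that advice, assuming the caller wants to fall back to binary when the bytes are not valid UTF-8. The helper name utf8_or_binary is hypothetical and is not part of Rack or Rails:

```ruby
# Hypothetical helper (not part of Rack or Rails): tag a string as UTF-8
# only when its bytes really are valid UTF-8, otherwise keep it binary.
def utf8_or_binary(str)
  candidate = str.dup.force_encoding(Encoding::UTF_8)
  # force_encoding only relabels the bytes, so validity must be checked.
  return candidate if candidate.valid_encoding?

  str.dup.force_encoding(Encoding::BINARY)
end

ok  = utf8_or_binary("a=\xC5\x82".b) # valid two-byte UTF-8 sequence
bad = utf8_or_binary("a=\xFF".b)     # 0xFF never appears in valid UTF-8

puts ok.encoding
puts bad.encoding
```

Whether to fall back to binary, scrub the bytes, or raise is a design choice for the calling code; the point is only that the result of force_encoding is checked before it is passed on.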

@raggi raggi closed this Apr 1, 2014
@mensfeld
mensfeld commented Apr 2, 2014

@raggi Thx :-) sorry for that - cheers!

@greysteil greysteil referenced this issue in rails/rails May 18, 2014
Closed

Utf8 params key #11795

@nanaya
nanaya commented May 18, 2014

You have middleware in your stack that is forcing this string to UTF-8, even when it is not valid UTF-8. The code that is doing this is bugged.

This part does default to UTF-8, though. Because of that, calling parse_nested_query with an invalid key string raises ArgumentError, as I mentioned in issue #610, with no obvious way to change it.

irb(main):003:0> s="\xFF=a"
=> "\xFF=a"
irb(main):004:0> s.force_encoding("binary")
=> "\xFF=a"
irb(main):005:0> s.valid_encoding?
=> true
irb(main):006:0> Rack::Utils.parse_nested_query(s)
ArgumentError: invalid byte sequence in UTF-8
        from /home/edho/app/ruby21/lib/ruby/gems/2.1.0/gems/rack-1.5.2/lib/rack/utils.rb:104:in `normalize_params'
        from /home/edho/app/ruby21/lib/ruby/gems/2.1.0/gems/rack-1.5.2/lib/rack/utils.rb:96:in `block in parse_nested_query'
        from /home/edho/app/ruby21/lib/ruby/gems/2.1.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `each'
        from /home/edho/app/ruby21/lib/ruby/gems/2.1.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `parse_nested_query'
        from (irb):6
        from /home/edho/app/ruby21/bin/irb:11:in `<main>'
@kennym
kennym commented Jul 17, 2014

Just leaving this here in case anyone is having trouble with Rails: https://github.com/whitequark/rack-utf8_sanitizer/
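
For illustration only, the general idea of guarding the stack can be sketched as a small Rack-style middleware that answers 400 when the percent-decoded query string is not valid UTF-8. This is an assumption-laden sketch, not how rack-utf8_sanitizer works; the class name RejectInvalidUtf8Query and the 400 response are hypothetical choices:

```ruby
# Hypothetical middleware, for illustration only; not part of Rack, Rails,
# or rack-utf8_sanitizer.
class RejectInvalidUtf8Query
  def initialize(app)
    @app = app
  end

  def call(env)
    query = env["QUERY_STRING"].to_s
    return @app.call(env) if valid_utf8_after_decoding?(query)

    [400, { "Content-Type" => "text/plain" },
     ["Bad Request: query string is not valid UTF-8\n"]]
  end

  private

  # Percent-decode on a binary copy, then check the bytes as UTF-8.
  def valid_utf8_after_decoding?(query)
    decoded = query.b.gsub(/%(\h\h)/) { [$1].pack("H2") }
    decoded.force_encoding(Encoding::UTF_8).valid_encoding?
  end
end

# Stand-in downstream app that always answers 200.
inner = ->(_env) { [200, { "Content-Type" => "text/plain" }, ["ok\n"]] }
app = RejectInvalidUtf8Query.new(inner)

good_status, = app.call({ "QUERY_STRING" => "q=%C5%82" })
bad_status,  = app.call({ "QUERY_STRING" => "%28t%B3odei%29" })

puts good_status
puts bad_status
```

The second call uses the query string from the original report; its %B3 byte is a lone continuation byte, so the request is rejected before any parameter parsing runs.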

❤️
