Why are route parameters encoded as "ASCII-8BIT"? #1362

Closed
sinclairtarget opened this Issue Nov 9, 2017 · 6 comments


sinclairtarget commented Nov 9, 2017

Using Ruby 2.4.1 and Sinatra 2.0.0.

Using the following get invocation:

get '/entries/:slug' do |slug|
  logger.info slug.encoding
end

Making a GET request to /entries/foo will log the encoding as ASCII-8BIT.

Why is the encoding not UTF-8? This plays havoc with sqlite3 at the very least, which interprets ASCII-8BIT strings as BLOB data instead of string data.

Member

burningTyger commented Nov 9, 2017

I was having the same problem, with tons of errors. I just enforced UTF-8 on everything that comes in via params. It works perfectly, though it looks a bit awkward.
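
For readers wondering what "enforcing UTF-8 on everything in params" might look like, here is a minimal sketch in plain Ruby. `force_utf8` is a hypothetical helper name, not a Sinatra setting or API, and the sketch assumes the incoming bytes really are UTF-8 (it only re-tags strings, it does not transcode them).

```ruby
# Hypothetical helper (not part of Sinatra): walk a params-like structure
# and return a copy in which every String value is tagged as UTF-8.
def force_utf8(value)
  case value
  when String
    # dup first so frozen or shared strings are never mutated in place
    value.dup.force_encoding(Encoding::UTF_8)
  when Hash
    value.each_with_object({}) { |(k, v), out| out[k] = force_utf8(v) }
  when Array
    value.map { |v| force_utf8(v) }
  else
    value # numbers, nil, etc. pass through unchanged
  end
end

# String#b returns a copy tagged ASCII-8BIT, mimicking the route parameter
raw   = { 'slug' => 'foo'.b, 'tags' => ['a'.b, 'b'.b], 'page' => 1 }
clean = force_utf8(raw)

puts clean['slug'].encoding == Encoding::UTF_8
puts clean['tags'].all? { |t| t.encoding == Encoding::UTF_8 }
```

In a route this could be called as `slug = force_utf8(slug)` before handing the value to the database layer.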

Contributor

lvonk commented Nov 10, 2017

@burningTyger how did you enforce this encoding? Is there a setting for this?

sinclairtarget commented Nov 10, 2017

@lvonk "hello".force_encoding("utf-8")

That won't change the actual bytes of the string, just how Ruby interprets them.
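
A short illustration of that point, using only core `String` methods. The sample string is made up for the example; `Encoding::BINARY` is Ruby's alias for ASCII-8BIT.

```ruby
# The UTF-8 bytes for "café", tagged as binary (ASCII-8BIT) via String#b
bytes  = "caf\xC3\xA9".b
# Same bytes, re-tagged as UTF-8; dup so the original stays binary
tagged = bytes.dup.force_encoding('UTF-8')

puts bytes.encoding == Encoding::BINARY   # true
puts tagged.encoding == Encoding::UTF_8   # true
puts bytes.bytes == tagged.bytes          # true: the bytes are identical
puts bytes.length                         # 5 (one "character" per byte)
puts tagged.length                        # 4 (the last two bytes form "é")
puts tagged.valid_encoding?               # true
```

Because nothing is transcoded, `force_encoding` is only safe when the bytes already are valid UTF-8; `valid_encoding?` is a cheap way to check.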

Contributor

lvonk commented Nov 10, 2017

@sinclairtarget Thanks, I was kind of hoping that there was a single setting in sinatra that would apply to all params.

sinclairtarget commented Nov 11, 2017

So the params hash pulled from the request body is converted to UTF-8 by default here. But the parameters pulled from the route are added to the params hash after #force_encoding has been called, here.

If Sinatra is already changing the encoding of POST bodies, and if it has a :default_encoding setting, then the route parameters should be converted too, IMO.
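
The ordering problem described above can be mimicked with a toy model in plain Ruby. This is not Sinatra's code: `tag_all`, `DEFAULT_ENCODING`, and the two hashes are hypothetical stand-ins, chosen only to show why merging route captures after the encoding pass leaves them as ASCII-8BIT.

```ruby
# Toy model only; none of these names come from Sinatra's source.
DEFAULT_ENCODING = Encoding::UTF_8

# Return a copy of the hash with every value re-tagged as DEFAULT_ENCODING
def tag_all(params)
  params.transform_values { |v| v.dup.force_encoding(DEFAULT_ENCODING) }
end

body_params  = { 'title' => 'hello'.b } # stands in for a parsed POST body
route_params = { 'slug'  => 'foo'.b }   # stands in for a '/entries/:slug' capture

# Order described above: tag first, merge the route captures afterwards
current = tag_all(body_params).merge(route_params)

# Order being proposed: merge first, then tag everything
proposed = tag_all(body_params.merge(route_params))

puts current['slug'].encoding == Encoding::BINARY  # true: the capture was missed
puts proposed['slug'].encoding == Encoding::UTF_8  # true
```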

sinclairtarget commented Nov 11, 2017

@lvonk You could always just stick something like the following in front of Sinatra in your middleware stack:

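class UnicodeOrBust
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(force_path_encoding(env))
  end

  # Re-tag PATH_INFO as UTF-8 before the app sees it. The bytes are left
  # untouched; dup avoids raising if the server handed us a frozen string.
  def force_path_encoding(env)
    env['PATH_INFO'] = env['PATH_INFO'].dup.force_encoding('UTF-8')
    env
  end
end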
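
One caveat with the middleware above: `force_encoding` will happily tag bytes that are not valid UTF-8. Below is a hedged variant that rejects such paths instead of passing them on. The class name `Utf8PathGuard` and the choice of a 400 response are illustrative assumptions, not something Sinatra or Rack provides, and the lambda is only a stand-in for a real app so the sketch runs without any gems.

```ruby
# Hypothetical variant of the middleware above; name and 400 response
# are illustrative choices.
class Utf8PathGuard
  def initialize(app)
    @app = app
  end

  def call(env)
    # Re-tag a copy of the path; to_s guards against a missing PATH_INFO
    path = env['PATH_INFO'].to_s.dup.force_encoding(Encoding::UTF_8)
    unless path.valid_encoding?
      # The bytes are not valid UTF-8, so refuse rather than pass them on
      return [400, { 'content-type' => 'text/plain' }, ["Bad Request\n"]]
    end
    env['PATH_INFO'] = path
    @app.call(env)
  end
end

# Stand-in Rack app: reports the encoding of the path it receives
app   = ->(env) { [200, { 'content-type' => 'text/plain' }, [env['PATH_INFO'].encoding.name]] }
guard = Utf8PathGuard.new(app)

ok  = guard.call('PATH_INFO' => '/entries/foo'.b)
bad = guard.call('PATH_INFO' => "/entries/\xFF".b)

puts ok[0]   # 200
puts bad[0]  # 400
```

With Rack available, either class would typically be placed ahead of the app via `use` in `config.ru` or in the Sinatra application itself.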

@namusyaka namusyaka added this to the v2.0.2 milestone Feb 6, 2018

bk2204 added a commit to bk2204/sinatra that referenced this issue Mar 25, 2018

bk2204 added a commit to bk2204/sinatra that referenced this issue Mar 31, 2018

bk2204 added a commit to bk2204/sinatra that referenced this issue Apr 1, 2018
