Skip to content

URI::Generic#to_s encoding incompatible #1188

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

koic
Copy link
Contributor

@koic koic commented Jan 6, 2016

Is this intentionally changed?

Ruby 2.2.4

> require 'uri'; URI::Generic.build(host: 'localhost').to_s.encoding.to_s
=> "UTF-8"

Ruby 2.3.0

> require 'uri'; URI::Generic.build(host: 'localhost').to_s.encoding.to_s
=> "ASCII-8BIT"

This patch is compatible with Ruby 2.2.

Thanks.

@nobu
Copy link
Member

nobu commented Jan 7, 2016

What problem do you have by this difference?

@koic
Copy link
Contributor Author

koic commented Jan 8, 2016

Thank you for your reply.

In the deprecated method URI.decode, multi-byte characters will NOT be able to decode.

Example Code

I expect that multi-byte characters ('日本語') will be decoded.

require 'uri'

p RUBY_VERSION

str = URI::Generic.build(scheme: 'http', host: 'localhost', path: '/test', query: 'language=日本語').to_s

p URI.decode(str)

Run with Ruby 2.2.4

It's just as I expected.

"2.2.4"
"http://localhost/test?language=日本語"

Run with Ruby 2.3.0

I did not expect this.

"2.3.0"
"http://localhost/test?language=\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E"

I prefer to use non-deprecated method, for example CGI.unescape, and so on.

On the one hand, I think the problem with this change is incompatibility with existing libraries.
(e.g. https://github.com/carrierwaveuploader/carrierwave/blob/0.10-stable/lib/carrierwave/storage/fog.rb#L342)

Thanks.

@nurse nurse self-assigned this Jan 13, 2016
@nurse
Copy link
Member

nurse commented Jan 13, 2016

The encoding of escaped URI should be US-ASCII, and maybe URI.decode should set the resulted encoding as UTF-8. Is this solve the issue?

Anyway fog uses the resulted string as filename. What your purpose of the decoded string?

@koic
Copy link
Contributor Author

koic commented Jan 13, 2016

That makes sense. It will be no problem if URI.decode should set the resulted encoding as UTF-8.

@koic
Copy link
Contributor Author

koic commented Jan 14, 2016

@nurse May I close this PR?

@hsbt hsbt closed this in aa90e3b Jan 14, 2016
@nurse
Copy link
Member

nurse commented Jan 14, 2016

After some additional thought, I merged it to backport 2.3.
trunk will be changed described above.

@koic
Copy link
Contributor Author

koic commented Jan 14, 2016

Appreciate it, many thanks.

@koic koic deleted the uri_generic_encoding_incompatibe branch January 14, 2016 12:40
hsbt pushed a commit that referenced this pull request Jan 14, 2016
	* lib/uri/generic.rb (URI::Generic#to_s): change encoding to
	  UTF-8 as Ruby 2.2/ by Koichi ITO <koic.ito@gmail.com>
	  #1188 fix GH-1188


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@53536 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
mrkn pushed a commit to mrkn/ruby that referenced this pull request Jan 20, 2016
  UTF-8 as Ruby 2.2/ by Koichi ITO <koic.ito@gmail.com>
  ruby#1188 fix rubyGH-1188

git-svn-id: svn+ssh://svn.ruby-lang.org/ruby/trunk@53535 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
mrkn pushed a commit to mrkn/ruby that referenced this pull request Apr 17, 2016
  UTF-8 as Ruby 2.2/ by Koichi ITO <koic.ito@gmail.com>
  ruby#1188 fix rubyGH-1188

git-svn-id: svn+ssh://svn.ruby-lang.org/ruby/trunk@53535 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
hsbt pushed a commit to ruby/uri that referenced this pull request Aug 6, 2019
  UTF-8 as Ruby 2.2/ by Koichi ITO <koic.ito@gmail.com>
  ruby/ruby#1188 fix GH-1188

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53535 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants