Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not blow up on broken punycode URLs #48

Closed
dentarg opened this issue Oct 23, 2015 · 11 comments
Closed

Do not blow up on broken punycode URLs #48

dentarg opened this issue Oct 23, 2015 · 11 comments
Labels

Comments

@dentarg
Copy link
Collaborator

dentarg commented Oct 23, 2015

The broken URLs

$ git diff
diff --git a/spec/lib/twingly/url_spec.rb b/spec/lib/twingly/url_spec.rb
index 85c2903..5786279 100644
--- a/spec/lib/twingly/url_spec.rb
+++ b/spec/lib/twingly/url_spec.rb
@@ -21,6 +21,8 @@ def invalid_urls
     "blablahttp://blog.twingly.com/",
     "gopher://blog.twingly.com/",
     "\n",
+    "http://xn--...-/",
+    "http://xn--t...-/",
   ]
 end

The failures

Failures:

  1) Twingly::URL#valid? returns false for an invalid URL (http://xn--...-/)
     Failure/Error: expect(described_class.parse(invalid_url).valid?).to be false
     NoMethodError:
       undefined method `size' for nil:NilClass
     # /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/idna/pure.rb:506:in `punycode_decode'
     # /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/idna/pure.rb:97:in `block in to_unicode'
     # /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/idna/pure.rb:95:in `map!'
     # /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/idna/pure.rb:95:in `to_unicode'
     # /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/uri.rb:2050:in `display_uri'
     # ./lib/twingly/url.rb:35:in `internal_parse'
     # ./lib/twingly/url.rb:21:in `parse'
     # ./spec/lib/twingly/url_spec.rb:175:in `block (4 levels) in <top (required)>'

  2) Twingly::URL#valid? returns false for an invalid URL (http://xn--t...-/)
     Failure/Error: expect(described_class.parse(invalid_url).valid?).to be false
     Addressable::IDNA::PunycodeBadInput:
       Input is invalid.
     # /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/idna/pure.rb:563:in `punycode_decode'
     # /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/idna/pure.rb:97:in `block in to_unicode'
     # /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/idna/pure.rb:95:in `map!'
     # /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/idna/pure.rb:95:in `to_unicode'
     # /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/uri.rb:2050:in `display_uri'
     # ./lib/twingly/url.rb:35:in `internal_parse'
     # ./lib/twingly/url.rb:21:in `parse'
     # ./spec/lib/twingly/url_spec.rb:175:in `block (4 levels) in <top (required)>'

pushed an temporary branch with the specs, https://github.com/twingly/twingly-url/tree/tmp/issue/48

@dentarg
Copy link
Collaborator Author

dentarg commented Oct 23, 2015

Notes

http://xn--...-/

[25] pry(main)> Addressable::URI.heuristic_parse("http://xn--...-/")
=> #<Addressable::URI:0x3fca3d522eec URI:http://xn--...-/>

[21] pry(main)> Addressable::URI.heuristic_parse("http://xn--...-/").display_uri
NoMethodError: undefined method `size' for nil:NilClass
from /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/idna/pure.rb:506:in `punycode_decode'

http://xn--t...-/

[23] pry(main)> Addressable::URI.heuristic_parse("http://xn--t...-/")
=> #<Addressable::URI:0x3fca3d53ae48 URI:http://xn--t...-/>

[22] pry(main)> Addressable::URI.heuristic_parse("http://xn--t...-/").display_uri
Addressable::IDNA::PunycodeBadInput: Input is invalid.
from /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/idna/pure.rb:563:in `punycode_decode'

display_uri https://github.com/sporkmonger/addressable/blob/addressable-2.3.8/lib/addressable/uri.rb#L2048-L2052

[27] pry(main)> Addressable::URI.heuristic_parse("http://xn--...-/").normalize.host
=> "xn--...-"
[28] pry(main)> Addressable::URI.heuristic_parse("http://xn--t...-/").normalize.host
=> "xn--t...-"

to_unicode https://github.com/sporkmonger/addressable/blob/addressable-2.3.8/lib/addressable/idna/pure.rb#L90-L107

[32] pry(main)> Addressable::IDNA.to_unicode("xn--...-")
NoMethodError: undefined method `size' for nil:NilClass
from /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/idna/pure.rb:506:in `punycode_decode'
[33] pry(main)> Addressable::IDNA.to_unicode("xn--t...-")
Addressable::IDNA::PunycodeBadInput: Input is invalid.
from /Users/dentarg/.gem/ruby/2.2.3/gems/addressable-2.3.8/lib/addressable/idna/pure.rb:563:in `punycode_decode'

punycode_decode https://github.com/sporkmonger/addressable/blob/addressable-2.3.8/lib/addressable/idna/pure.rb#L502-L621

NoMethodError: undefined method `size' for nil:NilClass 

comes from: https://github.com/sporkmonger/addressable/blob/addressable-2.3.8/lib/addressable/idna/pure.rb#L506

@dentarg
Copy link
Collaborator Author

dentarg commented Oct 23, 2015

http://xn--t...-/

For this one I suggest that we rescue Addressable::IDNA::PunycodeBadInput

@dentarg
Copy link
Collaborator Author

dentarg commented Oct 23, 2015

Alternative solution

From https://github.com/sporkmonger/addressable#install

You may optionally turn on native IDN support by installing libidn and the idn gem:

$ sudo apt-get install idn # Debian/Ubuntu
$ brew install libidn # OS X
$ gem install idn-ruby

See https://github.com/sporkmonger/addressable/blob/addressable-2.3.8/lib/addressable/idna.rb#L19-L25

[1] pry(main)> require "addressable/idna/native"
=> false
[2] pry(main)> Addressable::IDNA.to_unicode("xn--...-")
=> "xn--...-"
[3] pry(main)> Addressable::IDNA.to_unicode("xn--t...-")
=> "xn--t...-"
[4] pry(main)> Addressable::URI.heuristic_parse("http://xn--t...-/").display_uri
=> #<Addressable::URI:0x3fd6c24adaac URI:http://xn--t...-/>
[5] pry(main)> Addressable::URI.heuristic_parse("http://xn--t...-/").display_uri.host
=> "xn--t...-"
[6] pry(main)> Addressable::URI.heuristic_parse("http://xn--...-/").display_uri.host
=> "xn--...-"

The specs pass with this diff

diff --git a/twingly-url.gemspec b/twingly-url.gemspec
index 326fdaa..cbe1c12 100644
--- a/twingly-url.gemspec
+++ b/twingly-url.gemspec
@@ -14,6 +14,7 @@ Gem::Specification.new do |s|

   s.add_dependency "addressable", "~> 2"
   s.add_dependency "public_suffix", "~> 1.4"
+  s.add_dependency "idn-ruby", "~> 0.1"

   s.add_development_dependency "rake", "~> 10"
   s.add_development_dependency "rspec", "~> 3"

@dentarg
Copy link
Collaborator Author

dentarg commented Oct 23, 2015

http://xn--...-/

Reported upstream: sporkmonger/addressable#215

@dentarg
Copy link
Collaborator Author

dentarg commented Oct 23, 2015

Alternative solution

Heroku has libidn11 and libidn11-dev: https://devcenter.heroku.com/articles/cedar-ubuntu-packages

@dentarg
Copy link
Collaborator Author

dentarg commented Oct 23, 2015

https://github.com/deepfryed/idn-ruby nothing new since 2011, but maybe that's not a problem

@dentarg
Copy link
Collaborator Author

dentarg commented Oct 29, 2015

Using libidn11 is not great for potential JRuby support, but I'm sure Java has something similar that could be used.

@walro
Copy link
Contributor

walro commented Oct 29, 2015

Using libidn11 is not great for potential JRuby support, but I'm sure Java has something similar that could be used.

https://docs.oracle.com/javase/7/docs/api/java/net/IDN.html

@walro
Copy link
Contributor

walro commented Nov 2, 2015

It blew up for me on: http://xn--zckp1cyg1.sblo.jp/

@walro
Copy link
Contributor

walro commented Nov 3, 2015

For this one I suggest that we rescue Addressable::IDNA::PunycodeBadInput

This sounds like a good idea

@walro
Copy link
Contributor

walro commented Nov 5, 2015

Heroku has libidn11 and libidn11-dev: https://devcenter.heroku.com/articles/cedar-ubuntu-packages

Those are not the same as "idn", maybe it works anyway. idn depends upon libidn11. So I think we need to make a test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants