Removed dependency on domainatrix and replaced with public_suffix gem #7

Merged
merged 3 commits into from

2 participants

@dparis

Domainatrix recently bumped its gem version and fixed some buggy behavior that postrank-uri was relying on. Specifically, when passed an invalid URI, Domainatrix.parse used to raise a method_missing error (a NoMethodError) because of an unchecked nil when the domain was not found in its list of valid TLDs. Now that this has been fixed, the originally intended behavior occurs: the invalid domain is passed through as if it were valid.

Domainatrix does not otherwise provide a stand-alone way to check the validity of a URI or domain, so I have replaced it with the seemingly better-maintained public_suffix gem: https://github.com/weppos/publicsuffix-ruby
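To illustrate the shift from "parse and rescue whatever blows up" to an explicit validity check, here is a minimal, stdlib-only sketch. It is not the code in this pull request and does not use public_suffix: `KNOWN_SUFFIXES`, `known_domain?`, and `valid_uri?` are hypothetical names, and the tiny suffix set stands in for a real public suffix list.

```ruby
require 'uri'
require 'set'

# Hypothetical stand-in for a real public suffix list (assumption, not gem data).
KNOWN_SUFFIXES = Set.new(%w[com org net co.uk]).freeze

# True when the host ends in a known suffix AND has at least one label before it.
def known_domain?(host)
  return false if host.nil? || host.empty?

  labels = host.downcase.split('.')
  # Start at index 1 so that a bare suffix such as "com" is never accepted.
  (1...labels.length).any? do |i|
    KNOWN_SUFFIXES.include?(labels[i..-1].join('.'))
  end
end

# Explicit check: nil input, an unparsable URI, or a missing host are all invalid.
def valid_uri?(uri)
  return false if uri.nil?

  known_domain?(URI.parse(uri.to_s).host)
rescue URI::InvalidURIError
  false
end
```

The point of the sketch is that validity is decided by an explicit lookup rather than by whether a parser happens to raise, so a change in the parser's error behavior cannot silently turn invalid domains into valid ones.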

I can't guarantee that all previous behavior has been preserved, but all of the specs pass cleanly, so at least there are no known regressions.

@igrigorik If this looks good, I'd appreciate it if this could get bumped out to rubygems. Thanks!

@dparis

Hang on, found a minor regression in the valid? method that wasn't tested. New commit forthcoming shortly.

@dparis

The untested regression has been fixed, and a spec was added to catch that case. I got bitten by Ruby's handling of nil cascading through chained boolean operations.
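For readers unfamiliar with that pitfall, the following is a generic illustration of nil propagating through `&&`. It is an assumption about the general class of bug, not the actual regression from this pull request; `host_check_leaky` and `host_check_strict` are hypothetical helpers.

```ruby
# Hypothetical helpers for illustration only; not code from postrank-uri.

# `nil && anything` short-circuits and evaluates to nil, so this method
# returns nil (not false) when host is nil.
def host_check_leaky(host)
  host && host.end_with?('.com')
end

# Guarding the result with !! forces a strict true/false return value.
def host_check_strict(host)
  !!(host && host.end_with?('.com'))
end
```

Both versions are falsy for a nil host, but only the strict one returns a real `false`, which matters to any caller that compares the result with `== false` or passes it along as a boolean.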

The specs are all still passing cleanly. My app's spec suite is also passing cleanly now, so things look good on this end.

@igrigorik igrigorik merged commit f0b524b into from
@igrigorik
Owner

lgtm - thanks! 1.0.17 should be up on rubygems.

3  Rakefile
@@ -2,5 +2,4 @@ require 'bundler'
Bundler::GemHelper.install_tasks
require 'rspec/core/rake_task'
-
-Rspec::Core::RakeTask.new(:spec)
+RSpec::Core::RakeTask.new(:spec)
33 lib/postrank-uri.rb
@@ -1,22 +1,16 @@
 # -*- encoding: utf-8 -*-
 require 'addressable/uri'
-require 'domainatrix'
 require 'digest/md5'
 require 'nokogiri'
+require 'public_suffix'
 require 'yaml'
 module Addressable
   class URI
     def domain
-      begin
-        dp = Domainatrix.parse(self)
-      rescue
-        return nil
-      end
-
-      dom = dp.public_suffix
-      dom = dp.domain.downcase + "." + dom unless dp.domain.empty?
+      host = self.host
+      (host && PublicSuffix.valid?(host)) ? PublicSuffix.parse(host).domain : nil
     end
     def normalized_query
@@ -103,11 +97,10 @@ def extract(text)
       return [] if !text
       urls = []
       text.to_s.scan(URIREGEX[:valid_url]) do |all, before, url, protocol, domain, path, query|
-        begin
+        # Only extract the URL if the domain is valid
+        if PublicSuffix.valid?(domain)
           url = clean(url)
-          Domainatrix.parse(url)
           urls.push url.to_s
-        rescue NoMethodError
         end
       end
@@ -223,10 +216,18 @@ def parse(uri, opts = {})
     end
     def valid?(uri)
-      Domainatrix.parse(uri)
-      true
-    rescue
-      false
+      # URI is only valid if it is not nil, parses cleanly as a URI,
+      # and the domain has a recognized, valid TLD component
+      return false if uri.nil?
+
+      is_valid = false
+      cleaned_uri = clean(uri, :raw => true)
+
+      if host = cleaned_uri.host
+        is_valid = PublicSuffix.valid?(host)
+      end
+
+      is_valid
     end
   end
 end
2  lib/postrank-uri/version.rb
@@ -1,5 +1,5 @@
 module PostRank
   module URI
-    VERSION = "1.0.16"
+    VERSION = "1.0.17"
   end
 end
8 postrank-uri.gemspec
@@ -14,11 +14,11 @@ Gem::Specification.new do |s|
   s.rubyforge_project = "postrank-uri"
-  s.add_dependency "addressable", ">= 2.3.0"
-  s.add_dependency "domainatrix"
-  s.add_dependency "nokogiri"
+  s.add_dependency "addressable", "~> 2.3.0"
+  s.add_dependency "public_suffix", "~> 1.1.3"
+  s.add_dependency "nokogiri", "~> 1.5.5"
+
   s.add_development_dependency "rspec"
-  #s.add_development_dependency "idn" # test with idn
   s.files = `git ls-files`.split("\n")
   s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
4 spec/postrank-uri_spec.rb
@@ -339,6 +339,10 @@ def e(text)
   end
   context 'valid?' do
+    it 'marks incomplete URI string as invalid' do
+      PostRank::URI.valid?('/path/page.html').should be_false
+    end
+
     it 'marks www.test.c as invalid' do
       PostRank::URI.valid?('http://www.test.c').should be_false
     end