Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Work with Twingly::URL objects instead of strings #42

Merged
merged 13 commits into from
Oct 20, 2015

Conversation

twingly-mob
Copy link
Member

@jage
Copy link
Contributor

jage commented Oct 13, 2015

I'd like to look at the method_missing implementation, should we implement respond_to? as well?
Would also like to add more tests with spaces and newlines around valid URLs.

EDIT: Handled in #44 and #45

@jage
Copy link
Contributor

jage commented Oct 14, 2015

I'd like a more dense inspect output.

$ bundle console
irb(main):001:0> Twingly::URL.parse("http://duh.se")
=> #<Twingly::URL:0x007fb3221e0fc8 @addressable_uri=#<Addressable::URI:0x3fd9910f0488 URI:http://duh.se>, @public_suffix_domain=#<PublicSuffix::Domain:0x007fb3222c5a88 @tld="se", @sld="duh", @trd=nil>>

EDIT: Handled in #43

@jage
Copy link
Contributor

jage commented Oct 14, 2015

Should this be true or false?

irb(main):007:0> url = "http://duh.se/"
=> "http://duh.se/"
irb(main):008:0> Twingly::URL.parse(url) == Twingly::URL.parse(url)

@walro
Copy link
Contributor

walro commented Oct 14, 2015

Should this be true or false?

[7] pry(main)> uri1 = URI.parse("http://duh.se")
=> #<URI::HTTP http://duh.se>
[8] pry(main)> uri2 = URI.parse("http://duh.se")
=> #<URI::HTTP http://duh.se>
[9] pry(main)> uri1 == uri2
=> true

[12] pry(main)> uri1 = Addressable::URI.parse("http://duh.se")
=> #<Addressable::URI:0x3fc35a260c0c URI:http://duh.se>
[13] pry(main)> uri2 = Addressable::URI.parse("http://duh.se")
=> #<Addressable::URI:0x3fc35b3ae180 URI:http://duh.se>
[14] pry(main)> uri1 == uri2
=> true

I'd say true :)

@jage
Copy link
Contributor

jage commented Oct 14, 2015

http://ruby-doc.org/core-2.2.3/Comparable.html might be interesting to use.

I'm thinking:

include Comparable

def <=>(anotherUrl)
  self.to_s <=> anotherUrl.to_s
end

@twingly-mob twingly-mob changed the title [WIP] Work with Twingly::URL objects instead of strings Work with Twingly::URL objects instead of strings Oct 14, 2015
@twingly-mob
Copy link
Member Author

What can we do about this:

irb(main):010:0> Twingly::URL.new.domain
NoMethodError: undefined method `domain' for nil:NilClass
    from /Users/twingly/repos/twingly-url/lib/twingly/url.rb:42:in `domain'

@dentarg
Copy link
Collaborator

dentarg commented Oct 14, 2015

Actually, we haven't solved #11

Would be nice if the normalizer could extract the normalized host:

http://domain.com.br/some_path -> http://www.domain.com.br/

I think the original title of that issue confused us, "Capability for extracting just the host part of an URL", we have #host but that doesn't include the scheme.

What we want is called "origin" in Addressable. We want to use it in Norrstrom ping_message.rb (specs + examples).

@dentarg
Copy link
Collaborator

dentarg commented Oct 14, 2015

We want to use it in Norrstrom

We don't want to use normalization there

@walro
Copy link
Contributor

walro commented Oct 15, 2015

new

I am thinking we implement initialize and let it accept Addressable::URI and PublicSuffix::Domain. Rename .setup to .internal_parse. I opted to make it public so we wouldn't have to #send, added the internal prefix instead to denote that you shouldn't use this in clients.

WDYT @twingly/dev ?

diff --git a/lib/twingly/url.rb b/lib/twingly/url.rb
index a31fc66..2924bff 100644
--- a/lib/twingly/url.rb
+++ b/lib/twingly/url.rb
@@ -16,12 +16,41 @@ module Twingly
     def self.parse(potential_url)
       potential_url = String(potential_url)

-      # TODO: Can we make this less send-y?
-      self.new.send(:setup, potential_url)
+      internal_parse(potential_url)
     rescue Twingly::URL::Error, Twingly::URL::Error::ParseError => error
       NullURL.new
     end

+    def self.internal_parse(potential_url)
+      if potential_url.is_a?(Addressable::URI)
+        addressable_uri = potential_url
+      else
+        addressable_uri = Addressable::URI.heuristic_parse(potential_url)
+      end
+
+      raise Twingly::Error::ParseError if addressable_uri.nil?
+
+      public_suffix_domain = PublicSuffix.parse(addressable_uri.display_uri.host)
+
+      self.new(addressable_uri, public_suffix_domain)
+    rescue Addressable::URI::InvalidURIError, PublicSuffix::DomainInvalid => error
+      error.extend(Twingly::URL::Error)
+      raise
+    end
+
+    def initialize(addressable_uri, public_suffix_domain)
+      unless addressable_uri.is_a?(Addressable::URI)
+        raise ArgumentError, "First parameter must be an Addressable::URI"
+      end
+
+      unless public_suffix_domain.is_a?(PublicSuffix::Domain)
+        raise ArgumentError, "Second parameter must be a PublicSuffix::Domain"
+      end
+
+      @addressable_uri      = addressable_uri
+      @public_suffix_domain = public_suffix_domain
+    end
+
     def scheme
       addressable_uri.scheme
     end
@@ -65,7 +94,7 @@ module Twingly
       normalized_url.host   = normalized_host
       normalized_url.path   = normalized_path

-      setup(normalized_url)
+      self.class.internal_parse(normalized_url)
     end

     def normalized_scheme
@@ -107,23 +136,6 @@ module Twingly

     attr_reader :addressable_uri, :public_suffix_domain

-    def setup(potential_url)
-      if potential_url.is_a?(Addressable::URI)
-        @addressable_uri = potential_url
-      else
-        @addressable_uri = Addressable::URI.heuristic_parse(potential_url)
-      end
-
-      raise Twingly::Error::ParseError if addressable_uri.nil?
-
-      @public_suffix_domain = PublicSuffix.parse(addressable_uri.display_uri.host)
-
-      self
-    rescue Addressable::URI::InvalidURIError, PublicSuffix::DomainInvalid => error
-      error.extend(Twingly::URL::Error)
-      raise
-    end
-
     def normalize_blogspot(host, domain)
       if domain.sld.downcase == "blogspot"
         host.sub(/\Awww\./i, "").sub(/#{domain.tld}\z/i, "com")
diff --git a/spec/lib/twingly/url_spec.rb b/spec/lib/twingly/url_spec.rb
index d654330..13c4e83 100644
--- a/spec/lib/twingly/url_spec.rb
+++ b/spec/lib/twingly/url_spec.rb
@@ -45,6 +45,14 @@ describe Twingly::URL do
     end
   end

+  describe "#initialize" do
+    context "when given input parameters of wrong types" do
+      it "raises an error" do
+        expect { described_class.new("a", "b") }.to raise_error
+      end
+    end
+  end
+
   describe "#scheme" do
     subject { url.scheme }
     it { is_expected.to eq("http") }

@dentarg
Copy link
Collaborator

dentarg commented Oct 15, 2015

Sounds good to me

@roback
Copy link
Member

roback commented Oct 15, 2015

👍 Better than send :)

@jage
Copy link
Contributor

jage commented Oct 15, 2015

👍

Rename #setup to .internal_parse
@walro
Copy link
Contributor

walro commented Oct 15, 2015

Are we done?

module Twingly
class URL
class NullURL
include Comparable
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation missing here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed

@dentarg
Copy link
Collaborator

dentarg commented Oct 15, 2015

This could be bad

$ git d
diff --git a/spec/lib/twingly/url_spec.rb b/spec/lib/twingly/url_spec.rb
index 13c4e83..b54c467 100644
--- a/spec/lib/twingly/url_spec.rb
+++ b/spec/lib/twingly/url_spec.rb
@@ -128,6 +128,17 @@ describe Twingly::URL do
     end
   end

+  describe "#normalized.origin" do
+    subject { described_class.parse(url).normalized.origin }
+
+    context "adds a trailing slash if missing in origin" do
+      let(:url)      { "http://twingly.com/" }
+      let(:expected) { "http://www.twingly.com/" }
+
+      it { is_expected.to eq(expected) }
+    end
+  end
+
   describe "#normalized" do
     subject { described_class.parse(url).normalized.to_s }
Failures:

  1) Twingly::URL#normalized.origin adds a trailing slash if missing in origin should eq "http://www.twingly.com/"
     Failure/Error: it { is_expected.to eq(expected) }

       expected: "http://www.twingly.com/"
            got: "http://www.twingly.com"

       (compared using ==)
     # ./spec/lib/twingly/url_spec.rb:138:in `block (4 levels) in <top (required)>'

@walro
Copy link
Contributor

walro commented Oct 15, 2015

This could be bad

As per discussion IRL, I don't think "/" should be part of any origin, normalized or not. We should view trailing slash as "we always require a starting / in our paths".

it { is_expected.to eq(url) }
end

context "adds a trailing slash if missing in origin" do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/ is not part of origin (which is scheme, host and port), so the wording is wrong here

@dentarg
Copy link
Collaborator

dentarg commented Oct 15, 2015

As per discussion IRL, I don't think "/" should be part of any origin, normalized or not. We should view trailing slash as "we always require a starting / in our paths".

👍

@twingly-mob
Copy link
Member Author

@twingly/dev What about now?

@jage
Copy link
Contributor

jage commented Oct 16, 2015

@twingly/dev What about now?

I'm a bit worried about encodings, otherwise this looks great! (Could be handled in separate issue)

twingly-mob and others added 5 commits October 16, 2015 11:44
Also add url that Ruby's stdlib URI couldn't parse.
Just ensure that .parse does a nice and clean parse without changing
anything. We have #normalized for that
Fix #45.

Pair-programmed with @dentarg.
walro added a commit that referenced this pull request Oct 20, 2015
Work with Twingly::URL objects instead of strings
@walro walro merged commit 409b0db into master Oct 20, 2015
@walro walro deleted the mob-session/2015-10-12 branch October 20, 2015 08:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants