Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trailing period causes DomainNotAllowed exception #7

Closed
paulhammond opened this issue Nov 15, 2010 · 6 comments
Closed

Trailing period causes DomainNotAllowed exception #7

paulhammond opened this issue Nov 15, 2010 · 6 comments
Labels

Comments

@paulhammond
Copy link

With public_suffix_service 0.7.0 parsing a standard domain works well:

ruby-1.8.7-p249 > PublicSuffixService.parse('example.com')
 => #<PublicSuffixService::Domain:0x101905cc0 @tld="com", @trd=nil, @sld="example"> 

Adding trailing punctuation correctly returns a DomainInvalid error:

ruby-1.8.7-p249 > PublicSuffixService.parse('example.com,')
PublicSuffixService::DomainInvalid: `example.com,' is not a valid domain
ruby-1.8.7-p249 > PublicSuffixService.parse('example.com:')
PublicSuffixService::DomainInvalid: `example.com:' is not a valid domain

But if the trailing punctuation is a period, the error returned is instead DomainNotAllowed

ruby-1.8.7-p249 > PublicSuffixService.parse('example.com.')
PublicSuffixService::DomainNotAllowed: `example.com.' is not allowed according to Registry policy
ruby-1.8.7-p249 > PublicSuffixService.parse('*.example.com.')
PublicSuffixService::DomainNotAllowed: `*.example.com.' is not allowed according to Registry policy

It's sometimes useful to handle miskeyed input data (DomainInvalid) differently to domains that shouldn't exist (DomainInvalid). For example, in an application I'm working on we ignore DomainInvalid because some of the hostnames are on private networks (eg: host.bigcompany). Without extra error checking code our application will fail to handle a common data input mistake.

Also, example.com. is actually a valid hostname - the trailing . implies a fully qualified domain name.

@weppos
Copy link
Owner

weppos commented Nov 15, 2010

It's sometimes useful to handle miskeyed input data (DomainInvalid) differently to domains that shouldn't exist (DomainInvalid).

I'm not sure I understood your point. Could you please add a more concrete example?

Also, example.com. is actually a valid hostname - the trailing . implies a fully qualified domain name.

You're right, but this kind of validation goes beyond the purpose of the public suffix list.

@paulhammond
Copy link
Author

It's sometimes useful to handle miskeyed input data (DomainInvalid) differently to domains that shouldn't exist (DomainInvalid).

I'm not sure I understood your point. Could you please add a more concrete example?

I'm working on improving some code that validates domains where we want to exclude known public suffixes like 'co.uk' but allow things like 'department.bigcompany' for internal-only TLDs inside a company. To do this we're pre-filtering domains, sending it to PublicSuffixService.parse() and catching some of the exceptions raised. Thanks to your code it handles everything we've thrown at it, except for trailing periods.

In debugging the problem I got the exceptions confused, forgetting that PublicSuffixService.parse('host.nonexistant') will throw DomainInvalid not DomainNotAllowed - as a result that whole paragraph doesn't make sense. Sorry for the confusion.

Still, I think it's a bug that this code raises DomainNotAllowed:

PublicSuffixService.parse('example.com.')

Doing this suggests that '.com.' is a known TLD, but 'example' isn't allowed to be registered underneath it, which isn't true.

Tracing the code shows that RuleList.default.find() is finding the com rule, but calling allow('example.com.') on that rule returns false. One piece of the code is flexible about trailing periods, the other isn't.

It seems to me it should either raise DomainInvalid or be valid, depending on whether you think parsing 'example.com.' is within the scope of this library. Raising DomainNotAllowed seems inconsistent and confusing.

@weppos
Copy link
Owner

weppos commented Nov 18, 2010

Raising DomainNotAllowed seems inconsistent and confusing.

I totally agree.

It seems to me it should either raise DomainInvalid or be valid, depending on whether you think parsing 'example.com.'

This is the key point. I need to understand whether the library should be clever enough to handle this, or if I prefer to follow the basic principles of the Public Suffix List.

It seems to me you believe PublicSuffixService should handle it. Is it true?
My only concern is to avoid users' confusion.

@paulhammond
Copy link
Author

I thought about this and I think PublicSuffixService should not handle example.com. but this should be noted somewhere in the documentation.

I think this is the behavior most people would expect - either because they're unaware that domains can in some cases have trailing periods, or because they're aware of it and want to normalize their domains before checking them against the Public Suffix List (which is what the browsers appear to do).

Also, as you mentioned above, doing domain validation is beyond the scope of the PublicSuffixService, just as you'd expect a developer to remove a trailing comma before passing a domain to the library, it's reasonable for them to remove a trailing period too.

@weppos
Copy link
Owner

weppos commented Dec 5, 2010

Add support for Fully Qualified Domain Names (closed by 4b3f695)

@weppos
Copy link
Owner

weppos commented Dec 5, 2010

Due to the way how the library works, it was much more flexible to add the support for FQDN other than raising some kind of error. The release 0.8.0 correctly detects and parses FQDN.

Thanks for your feedback.

camilo pushed a commit to camilo/public_suffix_service that referenced this issue May 26, 2011
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants