This repository has been archived by the owner on Nov 20, 2019. It is now read-only.

TLDExtract not properly parsing hostname #47

Open
leem32 opened this issue Oct 3, 2019 · 1 comment

leem32 commented Oct 3, 2019

I'm running some domain names through TLDExtract and came across a domain not being properly parsed.

The domain is blogspot.com:

$url = 'blogspot.com';
$domain = tld_extract($url);
var_dump($domain);

Returns: 
object(LayerShifter\TLDExtract\Result)[9]
  private 'subdomain' => null
  private 'hostname' => string 'blogspot.com' (length=12)
  private 'suffix' => null

Weirdly, the domain 'flogspot.com' works fine and returns:

object(LayerShifter\TLDExtract\Result)[9]
  private 'subdomain' => null
  private 'hostname' => string 'flogspot' (length=8)
  private 'suffix' => string 'com' (length=3)

The domain 'logspot.com' also works and returns:

object(LayerShifter\TLDExtract\Result)[9]
  private 'subdomain' => null
  private 'hostname' => string 'logspot' (length=7)
  private 'suffix' => string 'com' (length=3)

Any idea why the TLD in 'blogspot.com' is not being added to the suffix? Is this a bug?

leem32 commented Oct 3, 2019

I see blogspot.com is in the public_suffix_list.dat. What's going on here? Can't TLDExtract parse any of the domains in that list? Are there any workarounds?

https://github.com/publicsuffix/list/blob/6f2b9e75eaf65bb75da83677655a59110088ebc5/public_suffix_list.dat#L5884
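One possible explanation, sketched under assumptions: since blogspot.com is itself listed as a suffix, a longest-match lookup would treat the whole input as the suffix, which leaves no registrable name in front of it. The sketch below uses a hypothetical helper, `split_host`, and a two-entry stand-in for the real list. It is not TLDExtract's code, and whether the library works this way internally is an assumption.

```php
<?php
// Hypothetical helper, NOT part of TLDExtract: splits a host against a small
// hard-coded suffix list, preferring the longest matching suffix.
function split_host(string $host, array $suffixes): array
{
    $labels = explode('.', strtolower($host));
    $count = count($labels);

    // Try the longest candidate first (the whole host), then drop labels
    // from the left one at a time.
    for ($i = 0; $i < $count; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (!in_array($candidate, $suffixes, true)) {
            continue;
        }
        if ($i === 0) {
            // The whole host is itself a suffix: no registrable name remains.
            return ['subdomain' => null, 'hostname' => null, 'suffix' => $candidate];
        }
        $rest = array_slice($labels, 0, $i);
        $hostname = array_pop($rest); // label directly left of the suffix
        return [
            'subdomain' => $rest ? implode('.', $rest) : null,
            'hostname'  => $hostname,
            'suffix'    => $candidate,
        ];
    }

    // No known suffix matched at all.
    return ['subdomain' => null, 'hostname' => $host, 'suffix' => null];
}

// Stand-in for public_suffix_list.dat (assumption: only these two entries).
$suffixes = ['com', 'blogspot.com'];

var_dump(split_host('blogspot.com', $suffixes));        // whole host is the suffix
var_dump(split_host('flogspot.com', $suffixes));        // falls back to 'com'
var_dump(split_host('myblog.blogspot.com', $suffixes)); // 'myblog' + 'blogspot.com'
```

In this sketch the bare blogspot.com has no hostname part at all, whereas the output above shows TLDExtract putting the whole string in hostname and leaving suffix null. Either way, the input yields no separate registrable name, which would be consistent with the result reported here.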

leem32 changed the title from "Layershifter not properly parsing hostname" to "TLDExtract not properly parsing hostname" on Oct 3, 2019