Purl cannot parse some Blogspot subdomains. #38

Closed
Aristona opened this issue Oct 25, 2014 · 11 comments

@Aristona

Hi,

These work fine:

http://anil.blogspot.com.tr :: $purl->subdomain is "anil"
http://anil.blogspot.com :: $purl->subdomain is "anil"

But these domain extensions don't work:

http://anil.blogspot.co.uk :: $purl->subdomain is NULL
http://anil.blogspot.nl :: $purl->subdomain is NULL
http://anil.blogspot.in :: $purl->subdomain is NULL

Could be a bug.

Is there any way I can extend the list of domain extensions for Blogspot so that these are supported too?

@peter279k
Contributor

This issue is caused by this line, which calls into the php-domain-parser package.
You should report the issue to php-domain-parser.

After upgrading that package to version 3.0, the issue still exists.
Here is my code I run:

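<?php

require_once 'vendor/autoload.php';

// Load the cached Public Suffix List and build a parser from it
$pslManager = new Pdp\PublicSuffixListManager();
$parser = new Pdp\Parser($pslManager->getList());

// One of the URLs from the report whose subdomain comes back as NULL
$host = 'http://anil.blogspot.co.uk';
$url = $parser->parseUrl($host);
var_dump($url);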

@jwage jwage closed this as completed May 13, 2018
@jwage
Owner

jwage commented May 13, 2018

Should we stop using that package? I don't think it is being maintained.

@peter279k
Contributor

@jwage, do you mean we should stop using the php-domain-parser package?

@jwage
Owner

jwage commented May 13, 2018

Yes, we would have to bring what it does in house. I'm not sure if it makes sense.

@peter279k
Contributor

peter279k commented May 13, 2018

We can use another parser package to replace the php-domain-parser package.
How about referring to this one?
What do you think?

Thanks.

@jwage
Owner

jwage commented May 14, 2018

@peter279k Is that package fully featured? I think we only use php-domain-parser for a very small piece of functionality, so I don't think it makes sense to pull in a big library like that. What if we just pull the small bit of functionality we depend on php-domain-parser for back into Purl?

@jeremykendall
Collaborator

PHP Domain Parser is under active maintenance. If you’re having issues with parsing please make sure the public suffix list cache is up to date. If that doesn’t solve the issue please open an issue against the parser so we can get it taken care of. Thanks!

@jeremykendall
Collaborator

By parsing, I mean that the domain parser only helps identify the component parts of a URL: subdomain, domain, and registrable domain. It is definitely not a full-featured URL parser. It’s intended as a complement to purl.
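
To make the subdomain / registrable domain split concrete, here is a minimal sketch of longest-suffix matching against a suffix list. It is not php-domain-parser's implementation: `splitHost()` and the tiny hand-picked suffix lists are hypothetical, and the sketch ignores the wildcard and exception rules that the real Public Suffix List also contains.

```php
<?php

/**
 * Hypothetical helper: split a host into subdomain, registrable domain,
 * and public suffix by matching the longest suffix found in $suffixes.
 *
 * @param string   $host     bare host name, e.g. "anil.blogspot.co.uk"
 * @param string[] $suffixes illustrative suffix list (an assumption, not the real PSL)
 *
 * @return array{subdomain: ?string, registrableDomain: ?string, publicSuffix: ?string}
 */
function splitHost(string $host, array $suffixes): array
{
    $labels = explode('.', strtolower($host));
    $count = count($labels);

    // Start at 1 so at least one label is left over for the domain itself;
    // earlier iterations test longer candidate suffixes, so the longest match wins.
    for ($i = 1; $i < $count; $i++) {
        $candidate = implode('.', array_slice($labels, $i));

        if (in_array($candidate, $suffixes, true)) {
            return [
                // Everything left of the registrable domain, or null if nothing is left
                'subdomain' => $i > 1 ? implode('.', array_slice($labels, 0, $i - 1)) : null,
                // One label plus the matched suffix
                'registrableDomain' => $labels[$i - 1] . '.' . $candidate,
                'publicSuffix' => $candidate,
            ];
        }
    }

    // No suffix in the list matched
    return ['subdomain' => null, 'registrableDomain' => null, 'publicSuffix' => null];
}
```

With the list `['com', 'uk', 'co.uk']`, `splitHost('anil.blogspot.co.uk', ...)` gives the subdomain `anil` and the registrable domain `blogspot.co.uk`. If the list also contains `blogspot.co.uk` as a suffix, the whole host becomes the registrable domain and the subdomain is `null`. So the result depends entirely on the contents of the suffix list, which is one reason an out-of-date or differing list can change what comes back for the same host.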

@peter279k
Contributor

@jeremykendall, thank you for your reply.
I appreciate the explanation, and I think domain and subdomain parsing should be implemented by purl itself.
How about opening another issue about this, @jwage ?

@jeremykendall
Collaborator

@peter279k I highly recommend doing a bit of research about URL parsing before deciding to attempt to implement it yourself. Regex won't cut it. I'm not saying you shouldn't, and switching from php-domain-parser might be the best for the project, but it's a topic that's fairly complex and should be well considered before making a decision to reimplement the functionality.

@jwage
Owner

jwage commented May 14, 2018

@jeremykendall Thanks for chiming in. @peter279k Let's open an issue on php-domain-parser and see if we can get it fixed there.
