Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump jeremykendall/php-domain-parser to version 6 #79

Closed
tacman opened this issue Jul 14, 2022 · 10 comments
Closed

Bump jeremykendall/php-domain-parser to version 6 #79

tacman opened this issue Jul 14, 2022 · 10 comments

Comments

@tacman
Copy link
Contributor

tacman commented Jul 14, 2022

I'm trying to integrate this into an app that is already locked to psr/log:^3.

jeremykendall/php-domain-parser version 5 is locked to psr/log:^1

There are 2 tests that fail when bumping to the new version. I can fix and submit a PR.

@tacman
Copy link
Contributor Author

tacman commented Jul 14, 2022

Sigh. Version 6 no longer including a manager, so using this requires a cache and mechanism for fetching and storing the rules.

It's not too difficult, but it's not a trivial syntax change either (though there are some namespace changes).

Thoughts on how to proceed?

@spekulatius
Copy link
Owner

Hey @tacman

yeah, I remember there were some structural changes in the package. Do you think you can get it done? Namespaces can usually be replaced easily most of the time.

Cheers,
Peter

@tacman
Copy link
Contributor Author

tacman commented Jul 15, 2022

OK. There are 2 approaches. The easiest is to download the rules file, add it to the repo, and then load it. Of course, the rules will become stale.

The better approach require a dependency on a cache. Then we can fetch the rules like this, which will update the rules every 24 hours:

    public function getTldCollection(): Rules
    {
        $cache = new FilesystemAdapter(); // or some other cache.

        $rules = $cache->get('pdp_rules', function (ItemInterface $item) {
            // The callable will only be executed on a cache miss.
            $item->expiresAfter(3600 * 24);
            $response = $this->client->request(
                'GET',
                PsrStorageFactory::PUBLIC_SUFFIX_LIST_URI
            );
            return $response->getContent();
        });

        $publicSuffixList = Rules::fromString($rules);
        return $publicSuffixList;

I'll go ahead and implement this to make it functional, but I'm not sure how to code it so that the user can inject whatever cache they already have in their application.

@tacman
Copy link
Contributor Author

tacman commented Jul 15, 2022

Since you've tagged this as a new version, can we also bump to PHP8?

@tacman
Copy link
Contributor Author

tacman commented Jul 15, 2022

Also, I'd like to migrate this to psr-4, and separate the classes into their own files. Or perhaps you should do that, it's likely a BC.

@tacman
Copy link
Contributor Author

tacman commented Jul 15, 2022

I started down the rabbit hole...

If phpscraper needs a cache for the domain parse, a CacheInterface cache should probably be injected. But that means the phpscaper should itself be a service that's injected, rather than called with new phpscaper().

Alternatively, we can add a CacheAwareInterface, and add the cache via a method call.

Alas, I'm not as expert in this as I'd like to be!

@spekulatius
Copy link
Owner

The question of the cache was stopped me too. I was actually thinking of storing a file/set of files somewhere to avoid handling the questions of integration, especially with simple VanillaPHP projects (where PHPscraper comes in handy for me most).

@spekulatius
Copy link
Owner

Hey @tacman,

Have you made progress implementing a cache? I've seen this commit and was wondering if we can get a framework agnostic-solution working. I'd still try to avoid injecting a CacheInterface as it is framework dependent. Happy to hear your thoughts!

Cheers,
Peter

@spekulatius
Copy link
Owner

Alternatively either spatie/url or thephpleague/uri could replace jeremykendall/php-domain-parser. I still need to confirm if the libs are suitable for the job tho.

@spekulatius spekulatius added this to the 1.0 milestone Oct 14, 2022
spekulatius added a commit that referenced this issue Oct 28, 2022
@spekulatius
Copy link
Owner

For now we are using league/uri for URL processing, with this the subdomain-specific filtering has been dropped.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants