A fast and memory-efficient PHP library to query the Mozilla public suffix list.
composer require souplette/fusbup
use Souplette\FusBup\PublicSuffixList;
$psl = new PublicSuffixList();
// get the eTLD (short for Effective Top-Level Domain) of a domain
assert($psl->getEffectiveTLD('foo.co.uk') === 'co.uk');
// check if a domain is an eTLD
assert($psl->isEffectiveTLD('fukushima.jp'));
// split a domain into its private and eTLD parts
assert($psl->splitEffectiveTLD('www.foo.co.uk') === ['www.foo', 'co.uk']);
use Souplette\FusBup\PublicSuffixList;
$psl = new PublicSuffixList();
// get the registrable part (eTLD+1) of a domain
assert($psl->getRegistrableDomain('www.foo.co.uk') === 'foo.co.uk');
// split a domain into its private and registrable parts
assert($psl->splitRegistrableDomain('www.foo.co.uk') === ['www', 'foo.co.uk']);
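To make the eTLD and eTLD+1 terminology concrete, here is a minimal sketch of public suffix matching against a tiny, hand-picked rule set. It is not how this library works internally: `TOY_RULES`, `toyEffectiveTLD()` and `toyRegistrableDomain()` are hypothetical names made up for this illustration, and the rule handling is simplified (lower-case ASCII input only, no validation).

```php
<?php
// Hypothetical, hand-picked subset of rules; the real list has thousands of entries.
const TOY_RULES = ['com', 'uk', 'co.uk', 'jp', '*.kawasaki.jp', '!city.kawasaki.jp'];

/** Returns the public suffix of a lower-case ASCII domain according to TOY_RULES. */
function toyEffectiveTLD(string $domain): string
{
    $rules = array_flip(TOY_RULES);
    $labels = explode('.', $domain);
    // Try candidate suffixes from longest to shortest, so the longest rule wins.
    for ($i = 0, $n = count($labels); $i < $n; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        $parent = implode('.', array_slice($labels, $i + 1));
        // Exception rule ("!"): the suffix is the rule minus its leftmost label.
        if (isset($rules['!' . $candidate])) {
            return $parent;
        }
        // Exact rule, or wildcard rule ("*.") covering the leftmost label.
        if (isset($rules[$candidate]) || ($parent !== '' && isset($rules['*.' . $parent]))) {
            return $candidate;
        }
    }
    // No rule matched: the implicit default rule "*" makes the last label the suffix.
    return $labels[count($labels) - 1];
}

/** Returns the eTLD plus one label, or null when the domain is itself a suffix. */
function toyRegistrableDomain(string $domain): ?string
{
    $suffix = toyEffectiveTLD($domain);
    if ($domain === $suffix) {
        return null;
    }
    // Everything left of the suffix, without the separating dot.
    $private = substr($domain, 0, -strlen($suffix) - 1);
    $labels = explode('.', $private);
    return end($labels) . '.' . $suffix;
}
```

With these toy rules, `toyEffectiveTLD('www.foo.co.uk')` yields `'co.uk'` and `toyRegistrableDomain('www.foo.co.uk')` yields `'foo.co.uk'`, mirroring the library calls above.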
The PublicSuffixList class implements the RFC 6265 algorithm for matching a cookie domain against a request domain.
use Souplette\FusBup\PublicSuffixList;
$psl = new PublicSuffixList();
// check if a cookie domain is applicable to a hostname
$requestDomain = 'my.domain.com';
$cookieDomain = '.domain.com';
assert($psl->isCookieDomainAcceptable($requestDomain, $cookieDomain));
// cookies are rejected if their domain is an eTLD:
assert(false === $psl->isCookieDomainAcceptable('foo.com', '.com'));
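For readers curious about the rule being applied, the following is a rough sketch of RFC 6265 domain matching combined with the public suffix check. It is an illustration under assumptions, not this library's implementation: `toyDomainMatches()` and `toyIsCookieDomainAcceptable()` are hypothetical names, the public suffix lookup is passed in as a callable, and inputs are assumed to be ASCII host names. It requires PHP 8 for `str_starts_with()` and `str_ends_with()`.

```php
<?php
/** Strips the single leading dot a Domain attribute may carry, and lower-cases it. */
function toyCookieDomain(string $cookieDomain): string
{
    $domain = str_starts_with($cookieDomain, '.') ? substr($cookieDomain, 1) : $cookieDomain;
    return strtolower($domain);
}

/** Sketch of the "domain-match" rule from RFC 6265, section 5.1.3. */
function toyDomainMatches(string $requestHost, string $cookieDomain): bool
{
    $host = strtolower($requestHost);
    $domain = toyCookieDomain($cookieDomain);
    if ($domain === '') {
        return false;
    }
    if ($host === $domain) {
        return true;
    }
    // IP addresses only match when they are identical to the cookie domain.
    if (filter_var($host, FILTER_VALIDATE_IP) !== false) {
        return false;
    }
    // The cookie domain must be a suffix of the host, preceded by a dot.
    return str_ends_with($host, '.' . $domain);
}

/** Rejects cookie domains that are public suffixes, then applies domain matching. */
function toyIsCookieDomainAcceptable(string $requestHost, string $cookieDomain, callable $isPublicSuffix): bool
{
    $domain = toyCookieDomain($cookieDomain);
    // A public suffix is only tolerated when it is the request host itself.
    if ($isPublicSuffix($domain) && $domain !== strtolower($requestHost)) {
        return false;
    }
    return toyDomainMatches($requestHost, $cookieDomain);
}
```

In real code the callable would be backed by the list itself, for example `fn (string $d) => $psl->isEffectiveTLD($d)`.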
All PublicSuffixList methods that return domains return them in their normalized ASCII form.
use Souplette\FusBup\PublicSuffixList;
use Souplette\FusBup\Utils\Idn;
$psl = new PublicSuffixList();
assert($psl->getRegistrableDomain('☕.example') === 'xn--53h.example');
// use Idn::toUnicode() to convert them back to Unicode if needed:
assert(Idn::toUnicode('xn--53h.example') === '☕.example');
The public suffix list contains about 10 000 rules as of 2023. In order to be maximally efficient for all use cases, the PublicSuffixList class can use two search algorithms with different performance characteristics.
The first one (the default) uses a DAFSA (deterministic acyclic finite state automaton) compiled to a binary string; this is the algorithm used in the Gecko and Chromium engines. The second one uses a compressed suffix tree compiled to PHP code.
Here is a summary of their respective pros and cons:
- DAFSA
  - 👍 more memory efficient (just a 50 KB string in memory)
  - 👍 faster to load (around 20μs on an SSD)
  - 👎 slower to search (on the order of 100 000 ops/sec)
- Suffix tree
  - 👎 less memory efficient (about 4 MB in memory)
  - 👎 slower to load (around 4ms without opcache, 500μs when using opcache preloading)
  - 👍 faster to search (on the order of 1 000 000 ops/sec)
Note that in both cases, the database will be lazily loaded.
Which one you should use depends on your use case, but based on the characteristics above I would say: stick to the default (DAFSA) algorithm unless your app is going to make more than a few hundred searches per second.
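If you would rather measure than guess, a small timing helper along these lines can give a rough throughput figure on your own hardware. This is only a sketch: `toyOpsPerSecond()` is a hypothetical helper, not part of this library, and a serious benchmark would need more care (multiple runs, varied inputs, a realistic opcache setup).

```php
<?php
/**
 * Hypothetical helper: estimates how many times per second $operation can run.
 */
function toyOpsPerSecond(callable $operation, int $iterations = 10_000): float
{
    // Warm-up call, so that one-time costs such as lazy loading are not measured.
    $operation();

    $start = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $operation();
    }
    // Guard against a zero elapsed time on very coarse clocks.
    $elapsedNs = max(1, hrtime(true) - $start);

    return $iterations / ($elapsedNs / 1e9);
}

// Possible usage with this library (not executed here):
// $psl = new Souplette\FusBup\PublicSuffixList();
// echo toyOpsPerSecond(fn () => $psl->getRegistrableDomain('www.foo.co.uk'));
```

Comparing the figure for each algorithm against your expected number of lookups per second should make the trade-off clearer.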
Both algorithms can be used by passing the appropriate loader to the PublicSuffixList constructor.
use Souplette\FusBup\Loader\DafsaLoader;
use Souplette\FusBup\PublicSuffixList;
$psl = new PublicSuffixList(new DafsaLoader());
// since DafsaLoader is the default, the following is equivalent:
$psl = new PublicSuffixList();
use Souplette\FusBup\Loader\SuffixTreeLoader;
use Souplette\FusBup\PublicSuffixList;
$psl = new PublicSuffixList(new SuffixTreeLoader());
When using the suffix tree, you should also configure opcache to preload the database:
In your php.ini:
opcache.enable=1
opcache.preload=/path/to/my/preload-script.php
In your preload script:
opcache_compile_file('/path/to/vendor/souplette/fusbup/Resources/psl.php');