Commit
Improve performance of SessionHandler::getSpiderID()
99f2805 already optimized this to avoid the need to call ->getSpiderID() for logged-in users, but guest sessions still call ->getSpiderID() on every request to look up the legacy session.

This commit massively improves the performance of ->getSpiderID() in all cases, but especially for requests where no spider can be matched. The latter previously required a full O(n) search across the spider list and was thus the worst case. This worst case likely occurred for the vast majority of guest requests. Cases where a spider can be matched benefit as well.

The improvements are achieved by two things:

1. The size of the cache that needs to be read and unserialized is reduced from 87k to 17k.
2. Instead of searching linearly through the list of spiders, needing to implicitly call ->__get() twice for each, the matching is performed by an optimized regular expression that effectively implements a prefix tree. If this regular expression matches, the spiderID is looked up efficiently in an array that is keyed by the matched string.

Numbers for 10,000 calls to ->getSpiderID() on my computer running PHP 8.1:

- Google Bot: From 0.44s down to 0.14s.
- Firefox 98: From 1.05s down to 0.07s.
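
The prefix-tree regular expression and the keyed lookup array can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: the spider identifiers, the IDs, and the helper names buildTrie(), trieToRegex() and findSpiderId() are all hypothetical, and the sketch assumes identifiers are matched as lowercase substrings of the user agent.

```php
<?php
// Hypothetical spider list: lowercase identifier => spiderID (illustrative values only).
$spiders = [
    'googlebot' => 1,
    'google-site-verification' => 2,
    'bingbot' => 3,
    'baiduspider' => 4,
];

/** Builds a nested-array trie from the identifiers; the '' key marks the end of a word. */
function buildTrie(array $words): array
{
    $trie = [];
    foreach ($words as $word) {
        $node = &$trie;
        foreach (str_split((string) $word) as $char) {
            if (!isset($node[$char])) {
                $node[$char] = [];
            }
            $node = &$node[$char];
        }
        $node[''] = true;
        unset($node); // break the reference before handling the next word
    }

    return $trie;
}

/** Turns the trie into a pattern in which shared prefixes appear only once. */
function trieToRegex(array $node): string
{
    $alternatives = [];
    $isTerminal = false;
    foreach ($node as $char => $child) {
        if ($char === '') {
            $isTerminal = true;
            continue;
        }
        // Digit keys are converted to int by PHP, hence the cast back to string.
        $alternatives[] = preg_quote((string) $char, '/') . trieToRegex($child);
    }
    if ($alternatives === []) {
        return '';
    }
    $group = '(?:' . implode('|', $alternatives) . ')';

    // A word that is also a prefix of a longer word makes the continuation optional.
    return $isTerminal ? $group . '?' : $group;
}

/** Returns the spiderID for the user agent, or null if no spider matches. */
function findSpiderId(string $userAgent, string $regex, array $spiders): ?int
{
    if (preg_match($regex, strtolower($userAgent), $matches) === 1) {
        // The matched string is always a complete identifier, so it can be used as the key.
        return $spiders[$matches[0]] ?? null;
    }

    return null;
}

$regex = '/' . trieToRegex(buildTrie(array_keys($spiders))) . '/';

var_dump(findSpiderId('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', $regex, $spiders));
var_dump(findSpiderId('Mozilla/5.0 (X11; Linux x86_64; rv:98.0) Gecko/20100101 Firefox/98.0', $regex, $spiders));
```

In this sketch the non-matching case costs a single preg_match() call instead of one comparison per spider, which is the case the commit message describes as the previous worst case. The regular expression and the lookup array would presumably be built once and stored in the cache rather than rebuilt on every request.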