Improve Transliteration::toASCII / Measure Benchmarks? #5212

Open
andreas-gruenwald opened this issue Nov 7, 2019 · 2 comments

Contributor

@andreas-gruenwald andreas-gruenwald commented Nov 7, 2019

Feature Request

When building URLs, \Pimcore\Tool\Transliteration::toASCII() is used extensively. The omnipresent File::getValidFilename() also relies on this method.

In our organisation we noticed that building several hundred URLs can take seconds because of \Pimcore\Tool\Transliteration::toASCII(). We are aware that the encoding process is quite complex and involves many steps, but we think the algorithm deserves a second look, since improvements could noticeably benefit the performance of Pimcore and its applications.

A good approach would be to gather some benchmarks for the algorithm first, e.g. how long it takes to build 5,000 URLs, and then tweak the algorithm and compare the test runs.
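As a rough illustration of such a benchmark, here is a minimal sketch in plain PHP (7.3+ for hrtime()). The helper names benchmarkTransliteration() and buildSampleInputs(), the sample strings, and the strtr()-based stand-in are all assumptions made up for this example so that it runs without Pimcore; they are not part of Pimcore.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: time how long $transliterate needs to process all
 * $inputs. Runs the loop $repeat times and returns the best run in seconds.
 */
function benchmarkTransliteration(callable $transliterate, array $inputs, int $repeat = 3): float
{
    $best = INF;
    for ($run = 0; $run < $repeat; $run++) {
        $start = hrtime(true); // monotonic clock, nanoseconds
        foreach ($inputs as $input) {
            $transliterate($input);
        }
        $elapsed = (hrtime(true) - $start) / 1e9;
        $best = min($best, $elapsed);
    }

    return $best;
}

/**
 * Hypothetical helper: build $count sample strings with non-ASCII characters.
 * Real URL keys from your own project would be more representative.
 */
function buildSampleInputs(int $count): array
{
    $samples = ['Größe und Maße', 'Crème brûlée', 'Übersicht für Käufer', 'plain ascii title'];
    $inputs = [];
    for ($i = 0; $i < $count; $i++) {
        $inputs[] = $samples[$i % count($samples)] . ' ' . $i;
    }

    return $inputs;
}

// Stand-in transliterator (an assumption, NOT the Pimcore algorithm),
// used only so the sketch is runnable on its own.
$standIn = static function (string $value): string {
    return strtr($value, [
        'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'Ü' => 'Ue', 'ß' => 'ss',
        'è' => 'e', 'é' => 'e', 'û' => 'u',
    ]);
};

$inputs = buildSampleInputs(5000);
$seconds = benchmarkTransliteration($standIn, $inputs);
printf("%d strings: %.4f s (best of 3 runs)\n", count($inputs), $seconds);
```

Inside a bootstrapped Pimcore environment, the stand-in could presumably be swapped for the callable `[\Pimcore\Tool\Transliteration::class, 'toASCII']`, so that the same inputs are timed before and after each tweak to the algorithm.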

Member

@brusch brusch commented Nov 7, 2019

Pimcore itself doesn't use \Pimcore\Tool\Transliteration::toASCII() for generating URLs in the core, at least since version 5 (which added UTF-8 support for URLs). File::getValidFilename() is used very rarely in the core and, as far as I've seen, only where performance doesn't matter much (not for frontend functionality).

But yes, this function is probably not the fastest. It depends on https://www.drupal.org/project/transliteration, where you can also find further details about the background of this library as well as the advantages it has over using transliterator_transliterate().
In the long run it would definitely make sense to get rid of this library, but currently there are some issues with

For details on this topic please have a look at #3175

Member

@brusch brusch commented Nov 7, 2019

Note: it turned out that transliterator_transliterate() is the performance bottleneck.
