osml10n_get_streetname_from_tags seems to be slower since v2.5.7 #40

otbutz · 2020-02-04T15:38:41Z

The following query takes about 3.3 seconds when executed with version v.2.5.7 or v2.5.8 in PostgreSQL 12.1 / Postigs 3.0:

SELECT localized_streetname
FROM planet_osm_line
WHERE "way" && ST_SetSRID('BOX3D(1222380.956336539 6339381.37785938,1233387.888409604 6350388.309932444)'::box3d, 3857)

If i drop the extension and reinstall v2.5.6 instead, the same query finishes within 400ms.

Edit: Discussion on the psql-bugs mailinglist: https://www.postgresql.org/message-id/20200204162901.s5hbfrl2ylb3jjsq%40alap3.anarazel.de (BUG #16241)

giggls · 2020-02-04T17:17:54Z

There are no big changes since v2.5.6. I just replaced the Thai transcription library and merged some miner street abbreviation stuff.

See v2.5.6...HEAD

otbutz · 2020-02-05T10:21:43Z

Did a quick bisect and 251b39b seems to be the cause of the performance regression.

otbutz · 2020-02-05T10:54:02Z

I think the number of regexes might be too high now: https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/regexp.c#L70-L97

giggls · 2020-02-05T11:18:52Z

So the problem is likely in osml10n_street_abbrev_fr. Try removing line 55 and 74 and see what happens. Still strange because this function adds only a few more regular expressions.

otbutz · 2020-02-05T12:24:26Z

osml10n_street_abbrev_en got extended by 12 regexp_replace calls as well. This is a total of 22 additional calls.

Reverting the changes to osml10n_street_abbrev_en reduces the query duration by 1.4 seconds and after restoring the previous version of osml10n_street_abbrev_fr the query is back to about ~400ms.

giggls · 2020-02-05T14:11:51Z

I would expect a moderate increase in runtime not something this extreme!
I have the suspicion that these look ahead regular expression is the root of this evil behavior.
Please try if street_abbrv.sql from commit 6f62e4e works faster. This should do more or less the same, but with a simple if then else logic instead of RE.

otbutz · 2020-02-05T14:56:28Z

I would expect a moderate increase in runtime not something this extreme!

My understanding of what is happening here:

Postgres is able to cache the last 32 regular expressions with a LRU cache eviction strategy. As the function needs to create more regular expression than the cache is able to keep, we end up with a cache hitrate of 0%.

Quoting regexp.c

With move-to-front, a reusable pattern is guaranteed to stay in the cache as long as it's used at least once in every MAX_CACHED_RES uses.

Example scenario:

LRU cache with 5 slots
The following regular expressions are used in a function:
A B C D E F G H

After compiling H the cache looks like this:
H G F E D

If the function is executed again, A is not cached anymore and needs to be recompiled:
A H G F E

The same goes for B:
B A H G F

And this goes on for all other regular expressions in the function...

My example query yields ~11k result rows which means that the number of recompilations should be somewhere near 11000 * x with x being the number of regular expressions.

If the function uses less than 32 regular expressions, everything stays cached.

giggls · 2020-02-05T16:22:07Z

Hm I am somewhat unwilling to reduce functionality and recompiling PostgreSQL is not an option.
However it might be possible to replace regexp_replace by generic replace function in some cases and merge multiple calls to regexp_replace into one in others.

otbutz · 2020-02-06T08:10:30Z

Hm I am somewhat unwilling to reduce functionality and recompiling PostgreSQL is not an option.

Perfectly understandable.

However it might be possible to replace regexp_replace by generic replace function in some cases and merge multiple calls to regexp_replace into one in others.

That sounds like a good plan. IMHO the biggest problem is the try-all approach in functions like osml10n_street_abbrev_all or osml10n_street_abbrev_latin. Reaching the regex cache limit is inevitable here if further cases are added in future releases.

Would it be possible to leverage the country_grid_osm table to derive the necessary osml10n_street_abbrev_xy calls? This would greatly increase the number of cache hits as streets in a single tile are very likely part of a single country. It would require coordinates being passed as parameters though.

giggls · 2020-02-06T09:45:52Z

Jepp unfortunately not a sustainable solution but a workaround (which might not even work).

I actually avoided using country_osm_grid for more obvious performance reasons than those we are discussion here.

Doing this would require an additional spatial Operation for the Generation of almost every single Object Name thus I seriously doubt that doing this would really be faster.

I wonder if it is possible to store pre-compiled regular expressions inside a database table and using them from there.

In PL/Python there is a global Object which would be able to hold such stuff thus moving to PL/Python could be an Option if PL/pgSQL does not provide something like this.

Unfortunately my current approach for l10n seems to have a couple of limitations. In the meantime I think that doing l10n during the import phase would probably be the better place.

chatelao · 2020-02-06T10:47:13Z

Do we have an estimation how fast Python would be in this case? Will porting the logic result in a speed gain?

giggls · 2020-02-06T10:49:08Z

It would be possible to save pre-compiled regular expression in the global dictionary thus caching will at least work in this case.

otbutz · 2020-02-06T10:53:23Z

Doing this would require an additional spatial Operation for the Generation of almost every single Object Name thus I seriously doubt that doing this would really be faster.

My idea would have been to use the bounding box applied by Mapnik to obtain the country codes in a subquery. This way we would only need a single additional spatial operation per tile. But i have no idea how fine grained the bounding box injection in Mapnik is.

I wonder if it is possible to store pre-compiled regular expressions inside a database table and using them from there.

I doubt that Postgres has a (de)serialization mechanism for compiled regular expressions but it would definitely solve the problem.

In PL/Python there is a global Object which would be able to hold such stuff thus moving to PL/Python could be an Option if PL/pgSQ does not provide something like this.

The thai transcript extension already requires python support. I don't see a problem if we would make it a requirement for osml10n as well. IMHO this is the best approach as extension users(or the german osm style do be more precise) don't have to deal with API changes.

Unfortunately my current approach for l10n seems to have a couple of limitations. In the meantime I think that doing l10n during the import phase would probably be the better place.

Could this be done with an extended tag transform script? But this would make testing a lot harder.

otbutz · 2020-02-06T10:56:13Z

Do we have an estimation how fast Python would be in this case?

The regex implementation of CPython should be on-par with the speed of its Postgres equivalent.

Will porting the logic result in a speed gain?

If all regular expressions stay cached: yes

giggls · 2020-02-06T11:37:31Z

OK, lets try this approach then.

I might be able to work on this at the upcoming Karlsruhe Hack Weekend
https://wiki.openstreetmap.org/wiki/Karlsruhe_Hack_Weekend_February_2020

Likely some code can be extracted from this Repo:
https://github.com/giggls/osml10n-python

otbutz · 2020-02-06T11:43:24Z

If we end up switching to Python for osml10n, it would probably be a good idea to drop osml10n_thai_transcript and include its logic into osml10n.

giggls · 2020-02-06T11:50:35Z

Certainly is. I would also like to add cantonese transcription but unfortunately this library seems to be not fast enough.

#35

otbutz · 2020-02-06T11:52:34Z

unfortunately this library seems to be not fast enough.

20 seconds startup time. Yikes.

chatelao · 2020-02-06T14:01:43Z

Does anybody have written performance tests we could add and use to compare before / after?

otbutz · 2020-02-07T09:02:13Z

Does anybody have written performance tests we could add and use to compare before / after?

SELECT osml10n_street_abbrev_all(concat('Some street avenue ', i::text)) AS abbrev
FROM generate_series(0, 10000) s(i);

Note: i had to use concat otherwise Postgres was smart enough to reuse the value for all rows.

v2.5.6: ~400-500ms
v2.5.8: 13s(!)

This is a lot slower than my initial values because we don't have any NULL values which can be skipped.

giggls · 2020-02-07T09:55:15Z

Sorry I had the other repository private, should be public now. Was a proof of concept Reimplementation in python of an older Version of the 2.x code.

So the first thing which should be done is updating
https://github.com/giggls/osml10n-python/blob/master/osml10n/street_abbrev.py

otbutz · 2020-02-07T11:40:57Z

I guess that osml10n_translit.cpp and kanjitranscript.c could be replaced with Python code?

giggls · 2020-02-07T12:47:21Z

Yes, see https://github.com/giggls/osml10n-python/blob/master/README.md

giggls · 2020-02-07T12:48:23Z

Sorry https://github.com/giggls/osml10n-python/blob/master/INSTALL.md

otbutz · 2020-02-14T15:58:20Z

I might be able to work on this at the upcoming Karlsruhe Hack Weekend
https://wiki.openstreetmap.org/wiki/Karlsruhe_Hack_Weekend_February_2020

Happy hacking 😉

I am not a Python developer, but I might be able to help out with smaller PRs and testing.

frodrigo · 2020-05-14T09:55:49Z

A split may be applied between Latin and non Latin rule set to reduce the number of regex to apply. Eg. by looking at the alphabet of the first char of the string.
In the same idea regex can be set under IF on first letter to avoid running regex if it does not match.

No linked to this issue, but speed up can also be done by switching from plpgsql to plain SQL function. (about 2x in speed from previous speed from quick test).

giggls · 2020-05-14T10:14:31Z

Well in the meantime I think, that all of this is curing the symptoms not cause.

Doing l10b inside osm2pgsql instead of PostgreSQL queries is probably the way to go.

Using the new Flex backend I would need to translate my mechanisms to lua code. What I already have is a proof of concept daemon which does Latin transcription written in python.

The current plan is to connect to this daemon from a lua script running inside osm2pgsql.

The main disadvantage of this approach is that the target languages are then kind of hard-coded into the resulting database but this is not usually a problem because most applications do not need more than a handful of them.

If anybody of wants to help implementing this approach please do contact me! Somebody more familiar to Lua than me would be nice, because I so have next to no experience using this language.

giggls · 2020-05-14T10:16:40Z

Aah, there is something I forgot. I would be happy to apply pull requests "curing the symptoms" of the current code :)

frodrigo · 2020-05-14T10:22:01Z

Thank I will try to make a PR.
Note I'm not using this from osm2pgsql but from OpenMapTiles.

giggls · 2020-05-14T10:24:30Z

Hm AFAIR OpenMapTiles is using imposm which has no included script-language engine for tag translation.

otbutz · 2020-05-14T10:52:42Z

The main disadvantage of this approach is that the target languages are then kind of hard-coded into the resulting database but this is not usually a problem because most applications do not need more than a handful of them.

I can see two additional disadvantages:

testing and minor updates would require a complete reimport to take effect
lua is not that popular which might hit project collaboration

The pl/python approach still seems like the best choice because it solves the regex caching issues which are the root of the performance problem here.

…to overpass the limit of cached regex giggls#40

…to overpass the limit of cached regex giggls#40 Rewrite osml10n_street_abbrev_fr and osml10n_street_abbrev_en to use less regex Remove regex from osml10n_street_abbrev_ru and osml10n_street_abbrev_uk

Split osml10n_street_abbrev_all in function of the detected alphabet to overpass the limit of cached regex #40

giggls · 2020-05-25T10:22:50Z

Can you please verify that the current master branch fixes this issue?

otbutz · 2020-06-02T08:16:53Z

SELECT localized_streetname
FROM planet_osm_line
WHERE "way" && ST_SetSRID('BOX3D(1222380.956336539 6339381.37785938,1233387.888409604 6350388.309932444)'::box3d, 3857)

v2.5.6: 400ms
v2.5.8: 3.3s
v2.5.9: 90ms

SELECT osml10n_street_abbrev_all(concat('Some street avenue ', i::text)) AS abbrev
FROM generate_series(0, 10000) s(i);

v2.5.6: 400-500ms
v2.5.8: 13s
v2.5.9: 30ms

giggls · 2020-06-02T08:21:20Z

For QA and just to make shure, that my tests catch all important issues. Did you check if the output is identical?

otbutz · 2020-06-02T08:31:01Z

Hard to say as the bbox query yields about 6.4k rows with german road names mixed with NULL values. But i will try to compare the output with a server which uses v2.5.6

giggls · 2020-06-02T08:37:36Z

SELECT distinct localized_streetname                                                                                                   
FROM planet_osm_line
WHERE "way" && ST_SetSRID('BOX3D(1222380.956336539 6339381.37785938,1233387.888409604 6350388.309932444)'::box3d, 3857)
AND name is not NULL;

Not usable for measurement of performance any more, but for comparing output.

otbutz · 2020-06-02T08:43:32Z

Looks good to me judging from a few samples.

frodrigo added a commit to frodrigo/mapnik-german-l10n that referenced this issue May 14, 2020

Split osml10n_street_abbrev_all in function of the detected alphabet …

02dd30a

…to overpass the limit of cached regex giggls#40

frodrigo added a commit to frodrigo/mapnik-german-l10n that referenced this issue May 14, 2020

Split osml10n_street_abbrev_all in function of the detected alphabet …

d1d53de

…to overpass the limit of cached regex giggls#40

giggls added a commit that referenced this issue May 14, 2020

Merge pull request #41 from frodrigo/split_osml10n_street_abbrev_all

909ba16

Split osml10n_street_abbrev_all in function of the detected alphabet to overpass the limit of cached regex #40

giggls mentioned this issue May 25, 2020

Some thoughts about the future of this project #53

Open

otbutz closed this as completed Jun 3, 2020

osml10n_get_streetname_from_tags seems to be slower since v2.5.7 #40

osml10n_get_streetname_from_tags seems to be slower since v2.5.7 #40

Comments

otbutz commented Feb 4, 2020 • edited

giggls commented Feb 4, 2020

otbutz commented Feb 5, 2020

otbutz commented Feb 5, 2020

giggls commented Feb 5, 2020

otbutz commented Feb 5, 2020

giggls commented Feb 5, 2020

otbutz commented Feb 5, 2020 • edited

giggls commented Feb 5, 2020

otbutz commented Feb 6, 2020

giggls commented Feb 6, 2020 • edited

chatelao commented Feb 6, 2020

giggls commented Feb 6, 2020

otbutz commented Feb 6, 2020 • edited

otbutz commented Feb 6, 2020 • edited

giggls commented Feb 6, 2020 • edited

otbutz commented Feb 6, 2020 • edited

giggls commented Feb 6, 2020

otbutz commented Feb 6, 2020

chatelao commented Feb 6, 2020

otbutz commented Feb 7, 2020 • edited

giggls commented Feb 7, 2020 • edited

otbutz commented Feb 7, 2020

giggls commented Feb 7, 2020

giggls commented Feb 7, 2020

otbutz commented Feb 14, 2020

frodrigo commented May 14, 2020

giggls commented May 14, 2020

giggls commented May 14, 2020

frodrigo commented May 14, 2020 • edited

giggls commented May 14, 2020

otbutz commented May 14, 2020

giggls commented May 25, 2020 • edited

otbutz commented Jun 2, 2020

giggls commented Jun 2, 2020

otbutz commented Jun 2, 2020

giggls commented Jun 2, 2020

otbutz commented Jun 2, 2020

otbutz commented Feb 4, 2020 •

edited

otbutz commented Feb 5, 2020 •

edited

giggls commented Feb 6, 2020 •

edited

otbutz commented Feb 6, 2020 •

edited

otbutz commented Feb 6, 2020 •

edited

giggls commented Feb 6, 2020 •

edited

otbutz commented Feb 6, 2020 •

edited

otbutz commented Feb 7, 2020 •

edited

giggls commented Feb 7, 2020 •

edited

frodrigo commented May 14, 2020 •

edited

giggls commented May 25, 2020 •

edited