Skip to content
This repository has been archived by the owner on Jul 14, 2022. It is now read-only.

osml10n_get_streetname_from_tags seems to be slower since v2.5.7 #40

Closed
otbutz opened this issue Feb 4, 2020 · 37 comments
Closed

osml10n_get_streetname_from_tags seems to be slower since v2.5.7 #40

otbutz opened this issue Feb 4, 2020 · 37 comments

Comments

@otbutz
Copy link

otbutz commented Feb 4, 2020

The following query takes about 3.3 seconds when executed with version v.2.5.7 or v2.5.8 in PostgreSQL 12.1 / Postigs 3.0:

SELECT localized_streetname
FROM planet_osm_line
WHERE "way" && ST_SetSRID('BOX3D(1222380.956336539 6339381.37785938,1233387.888409604 6350388.309932444)'::box3d, 3857)

If i drop the extension and reinstall v2.5.6 instead, the same query finishes within 400ms.

Edit: Discussion on the psql-bugs mailinglist: https://www.postgresql.org/message-id/20200204162901.s5hbfrl2ylb3jjsq%40alap3.anarazel.de (BUG #16241)

@giggls
Copy link
Owner

giggls commented Feb 4, 2020

There are no big changes since v2.5.6. I just replaced the Thai transcription library and merged some miner street abbreviation stuff.

See v2.5.6...HEAD

@otbutz
Copy link
Author

otbutz commented Feb 5, 2020

Did a quick bisect and 251b39b seems to be the cause of the performance regression.

@otbutz
Copy link
Author

otbutz commented Feb 5, 2020

I think the number of regexes might be too high now: https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/regexp.c#L70-L97

@giggls
Copy link
Owner

giggls commented Feb 5, 2020

So the problem is likely in osml10n_street_abbrev_fr. Try removing line 55 and 74 and see what happens. Still strange because this function adds only a few more regular expressions.

@otbutz
Copy link
Author

otbutz commented Feb 5, 2020

osml10n_street_abbrev_en got extended by 12 regexp_replace calls as well. This is a total of 22 additional calls.

Reverting the changes to osml10n_street_abbrev_en reduces the query duration by 1.4 seconds and after restoring the previous version of osml10n_street_abbrev_fr the query is back to about ~400ms.

@giggls
Copy link
Owner

giggls commented Feb 5, 2020

I would expect a moderate increase in runtime not something this extreme!
I have the suspicion that these look ahead regular expression is the root of this evil behavior.
Please try if street_abbrv.sql from commit 6f62e4e works faster. This should do more or less the same, but with a simple if then else logic instead of RE.

@otbutz
Copy link
Author

otbutz commented Feb 5, 2020

I would expect a moderate increase in runtime not something this extreme!

My understanding of what is happening here:

Postgres is able to cache the last 32 regular expressions with a LRU cache eviction strategy. As the function needs to create more regular expression than the cache is able to keep, we end up with a cache hitrate of 0%.

Quoting regexp.c

With move-to-front, a reusable pattern is guaranteed to stay in the cache as long as it's used at least once in every MAX_CACHED_RES uses.

Example scenario:

LRU cache with 5 slots
The following regular expressions are used in a function:
A B C D E F G H

After compiling H the cache looks like this:
H G F E D

If the function is executed again, A is not cached anymore and needs to be recompiled:
A H G F E

The same goes for B:
B A H G F

And this goes on for all other regular expressions in the function...

My example query yields ~11k result rows which means that the number of recompilations should be somewhere near 11000 * x with x being the number of regular expressions.

If the function uses less than 32 regular expressions, everything stays cached.

@giggls
Copy link
Owner

giggls commented Feb 5, 2020

Hm I am somewhat unwilling to reduce functionality and recompiling PostgreSQL is not an option.
However it might be possible to replace regexp_replace by generic replace function in some cases and merge multiple calls to regexp_replace into one in others.

@otbutz
Copy link
Author

otbutz commented Feb 6, 2020

Hm I am somewhat unwilling to reduce functionality and recompiling PostgreSQL is not an option.

Perfectly understandable.

However it might be possible to replace regexp_replace by generic replace function in some cases and merge multiple calls to regexp_replace into one in others.

That sounds like a good plan. IMHO the biggest problem is the try-all approach in functions like osml10n_street_abbrev_all or osml10n_street_abbrev_latin. Reaching the regex cache limit is inevitable here if further cases are added in future releases.

Would it be possible to leverage the country_grid_osm table to derive the necessary osml10n_street_abbrev_xy calls? This would greatly increase the number of cache hits as streets in a single tile are very likely part of a single country. It would require coordinates being passed as parameters though.

@giggls
Copy link
Owner

giggls commented Feb 6, 2020

Jepp unfortunately not a sustainable solution but a workaround (which might not even work).

I actually avoided using country_osm_grid for more obvious performance reasons than those we are discussion here.

Doing this would require an additional spatial Operation for the Generation of almost every single Object Name thus I seriously doubt that doing this would really be faster.

I wonder if it is possible to store pre-compiled regular expressions inside a database table and using them from there.

In PL/Python there is a global Object which would be able to hold such stuff thus moving to PL/Python could be an Option if PL/pgSQL does not provide something like this.

Unfortunately my current approach for l10n seems to have a couple of limitations. In the meantime I think that doing l10n during the import phase would probably be the better place.

@chatelao
Copy link
Contributor

chatelao commented Feb 6, 2020

Do we have an estimation how fast Python would be in this case? Will porting the logic result in a speed gain?

@giggls
Copy link
Owner

giggls commented Feb 6, 2020

It would be possible to save pre-compiled regular expression in the global dictionary thus caching will at least work in this case.

@otbutz
Copy link
Author

otbutz commented Feb 6, 2020

Doing this would require an additional spatial Operation for the Generation of almost every single Object Name thus I seriously doubt that doing this would really be faster.

My idea would have been to use the bounding box applied by Mapnik to obtain the country codes in a subquery. This way we would only need a single additional spatial operation per tile. But i have no idea how fine grained the bounding box injection in Mapnik is.

I wonder if it is possible to store pre-compiled regular expressions inside a database table and using them from there.

I doubt that Postgres has a (de)serialization mechanism for compiled regular expressions but it would definitely solve the problem.

In PL/Python there is a global Object which would be able to hold such stuff thus moving to PL/Python could be an Option if PL/pgSQ does not provide something like this.

The thai transcript extension already requires python support. I don't see a problem if we would make it a requirement for osml10n as well. IMHO this is the best approach as extension users(or the german osm style do be more precise) don't have to deal with API changes.

Unfortunately my current approach for l10n seems to have a couple of limitations. In the meantime I think that doing l10n during the import phase would probably be the better place.

Could this be done with an extended tag transform script? But this would make testing a lot harder.

@otbutz
Copy link
Author

otbutz commented Feb 6, 2020

Do we have an estimation how fast Python would be in this case?

The regex implementation of CPython should be on-par with the speed of its Postgres equivalent.

Will porting the logic result in a speed gain?

If all regular expressions stay cached: yes

@giggls
Copy link
Owner

giggls commented Feb 6, 2020

OK, lets try this approach then.

I might be able to work on this at the upcoming Karlsruhe Hack Weekend
https://wiki.openstreetmap.org/wiki/Karlsruhe_Hack_Weekend_February_2020

Likely some code can be extracted from this Repo:
https://github.com/giggls/osml10n-python

@otbutz
Copy link
Author

otbutz commented Feb 6, 2020

If we end up switching to Python for osml10n, it would probably be a good idea to drop osml10n_thai_transcript and include its logic into osml10n.

@giggls
Copy link
Owner

giggls commented Feb 6, 2020

Certainly is. I would also like to add cantonese transcription but unfortunately this library seems to be not fast enough.

#35

@otbutz
Copy link
Author

otbutz commented Feb 6, 2020

unfortunately this library seems to be not fast enough.

20 seconds startup time. Yikes.

@chatelao
Copy link
Contributor

chatelao commented Feb 6, 2020

Does anybody have written performance tests we could add and use to compare before / after?

@otbutz
Copy link
Author

otbutz commented Feb 7, 2020

Does anybody have written performance tests we could add and use to compare before / after?

SELECT osml10n_street_abbrev_all(concat('Some street avenue ', i::text)) AS abbrev
FROM generate_series(0, 10000) s(i);

Note: i had to use concat otherwise Postgres was smart enough to reuse the value for all rows.

  • v2.5.6: ~400-500ms
  • v2.5.8: 13s(!)

This is a lot slower than my initial values because we don't have any NULL values which can be skipped.

@giggls
Copy link
Owner

giggls commented Feb 7, 2020

Sorry I had the other repository private, should be public now. Was a proof of concept Reimplementation in python of an older Version of the 2.x code.

So the first thing which should be done is updating
https://github.com/giggls/osml10n-python/blob/master/osml10n/street_abbrev.py

@otbutz
Copy link
Author

otbutz commented Feb 7, 2020

I guess that osml10n_translit.cpp and kanjitranscript.c could be replaced with Python code?

@giggls
Copy link
Owner

giggls commented Feb 7, 2020

@giggls
Copy link
Owner

giggls commented Feb 7, 2020

@otbutz
Copy link
Author

otbutz commented Feb 14, 2020

I might be able to work on this at the upcoming Karlsruhe Hack Weekend
https://wiki.openstreetmap.org/wiki/Karlsruhe_Hack_Weekend_February_2020

Happy hacking 😉

I am not a Python developer, but I might be able to help out with smaller PRs and testing.

@frodrigo
Copy link
Contributor

A split may be applied between Latin and non Latin rule set to reduce the number of regex to apply. Eg. by looking at the alphabet of the first char of the string.
In the same idea regex can be set under IF on first letter to avoid running regex if it does not match.

No linked to this issue, but speed up can also be done by switching from plpgsql to plain SQL function. (about 2x in speed from previous speed from quick test).

@giggls
Copy link
Owner

giggls commented May 14, 2020

Well in the meantime I think, that all of this is curing the symptoms not cause.

Doing l10b inside osm2pgsql instead of PostgreSQL queries is probably the way to go.

Using the new Flex backend I would need to translate my mechanisms to lua code. What I already have is a proof of concept daemon which does Latin transcription written in python.

The current plan is to connect to this daemon from a lua script running inside osm2pgsql.

The main disadvantage of this approach is that the target languages are then kind of hard-coded into the resulting database but this is not usually a problem because most applications do not need more than a handful of them.

If anybody of wants to help implementing this approach please do contact me! Somebody more familiar to Lua than me would be nice, because I so have next to no experience using this language.

@giggls
Copy link
Owner

giggls commented May 14, 2020

Aah, there is something I forgot. I would be happy to apply pull requests "curing the symptoms" of the current code :)

@frodrigo
Copy link
Contributor

frodrigo commented May 14, 2020

Thank I will try to make a PR.
Note I'm not using this from osm2pgsql but from OpenMapTiles.

@giggls
Copy link
Owner

giggls commented May 14, 2020

Hm AFAIR OpenMapTiles is using imposm which has no included script-language engine for tag translation.

@otbutz
Copy link
Author

otbutz commented May 14, 2020

The main disadvantage of this approach is that the target languages are then kind of hard-coded into the resulting database but this is not usually a problem because most applications do not need more than a handful of them.

I can see two additional disadvantages:

  • testing and minor updates would require a complete reimport to take effect
  • lua is not that popular which might hit project collaboration

The pl/python approach still seems like the best choice because it solves the regex caching issues which are the root of the performance problem here.

frodrigo added a commit to frodrigo/mapnik-german-l10n that referenced this issue May 14, 2020
frodrigo added a commit to frodrigo/mapnik-german-l10n that referenced this issue May 14, 2020
frodrigo added a commit to frodrigo/mapnik-german-l10n that referenced this issue May 14, 2020
…to overpass the limit of cached regex giggls#40

Rewrite osml10n_street_abbrev_fr and osml10n_street_abbrev_en to use less regex
Remove regex from osml10n_street_abbrev_ru and osml10n_street_abbrev_uk
giggls added a commit that referenced this issue May 14, 2020
Split osml10n_street_abbrev_all in function of the detected alphabet to overpass the limit of cached regex #40
@giggls
Copy link
Owner

giggls commented May 25, 2020

Can you please verify that the current master branch fixes this issue?

@otbutz
Copy link
Author

otbutz commented Jun 2, 2020

SELECT localized_streetname
FROM planet_osm_line
WHERE "way" && ST_SetSRID('BOX3D(1222380.956336539 6339381.37785938,1233387.888409604 6350388.309932444)'::box3d, 3857)
  • v2.5.6: 400ms
  • v2.5.8: 3.3s
  • v2.5.9: 90ms
SELECT osml10n_street_abbrev_all(concat('Some street avenue ', i::text)) AS abbrev
FROM generate_series(0, 10000) s(i);
  • v2.5.6: 400-500ms
  • v2.5.8: 13s
  • v2.5.9: 30ms

@giggls
Copy link
Owner

giggls commented Jun 2, 2020

For QA and just to make shure, that my tests catch all important issues. Did you check if the output is identical?

@otbutz
Copy link
Author

otbutz commented Jun 2, 2020

Hard to say as the bbox query yields about 6.4k rows with german road names mixed with NULL values. But i will try to compare the output with a server which uses v2.5.6

@giggls
Copy link
Owner

giggls commented Jun 2, 2020

SELECT distinct localized_streetname                                                                                                   
FROM planet_osm_line
WHERE "way" && ST_SetSRID('BOX3D(1222380.956336539 6339381.37785938,1233387.888409604 6350388.309932444)'::box3d, 3857)
AND name is not NULL;

Not usable for measurement of performance any more, but for comparing output.

@otbutz
Copy link
Author

otbutz commented Jun 2, 2020

Looks good to me judging from a few samples.

@otbutz otbutz closed this as completed Jun 3, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants