Expansion fails with some addresses #351
Comments
Correction: it fails when using a country-based database as well. It fails in some cases and not in others. So, while the issue is real, it may not be resolved by using a smaller database for the address parser.
Just confirmed it on Ubuntu 16.04 as well. From further experimentation and looking at the log (LOG_LEVEL_DEBUG), it looks to be locked in
Since I didn't have this issue earlier, I looked into recent commits to transliteration.c and found that reverting 2290b09 (for that file) fixes the problem. Without fix:
With "fix":
The same issue occurs with the expansion of "Sopruse sild / Нарвскии мост Дружбы", which was hanging before the "fix". Hence the question: what was the reason for adding
I don't really understand this code, but working backwards from the logs "at&t", "at&t" and "Хозтовары" I came up with the following patch: With it, the test suite passes and the infinite loop is broken. Beyond that, I have no idea whether this is correct, as I barely understand the code I just changed.
@kidmeier - thank you very much! I have the same problem as you: I don't really understand the code. Let's see if @albarrentine can take a look at it and tell us whether it's the right approach.
Hi,
I am working on incorporating the latest libpostal version into my geocoder. The API changes from .37 to 1.0 are accounted for, and it seems to work on mobile as well after splitting the address parser into country-based datasets (#132).
As part of the geocoder data import, I run all address parts through libpostal_expand_address, as in https://github.com/rinigus/geocoder-nlp/blob/master/importer/src/main.cpp#L537. As a result, all addresses from the planet are expanded and later used in the search. You could also consider it a test of libpostal's expansion code. In the process, I am hitting some examples that fail. Let's start with the first one:
Somewhere in Cyprus, a Russian word is used, which causes the expansion function to hang when used with the datasets provided by libpostal (not the country-based ones):
Backtrace:
Running it further, I get similar backtraces. More and more memory gets allocated (it can reach 10+ GB of RAM, and I have also seen 100+ GB) until the process is killed by the kernel.
When using country-specific datasets (CY or RU), everything works nicely and the address gets normalized as expected.
I am using version 43795a3, but I think the issue would be similar for 1.0.0 as well. Running on Linux, Gentoo.
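For anyone running a similar bulk import, here is a minimal sketch of a watchdog that keeps a single hanging input from taking down the whole run. It is only an illustration, not part of libpostal or of my importer: `expand_with_timeout` and `WORKER_CODE` are hypothetical names, and the worker merely simulates an expansion (it lowercases the input and hangs on a marker string) instead of calling libpostal.

```python
import os
import subprocess
import sys

# Hypothetical stand-in worker (assumption): a real worker would call the
# expansion routine on the text read from stdin and print one result per line.
# Here the hang is simulated for inputs containing the marker "HANG".
WORKER_CODE = r"""
import sys, time
text = sys.stdin.read()
if "HANG" in text:
    time.sleep(3600)  # simulate an expansion that never returns
print(text.strip().lower())
"""


def expand_with_timeout(text, timeout_s=5.0, worker_code=WORKER_CODE):
    """Run the worker in a child process.

    Returns the worker's output lines, or None if the worker timed out
    or exited with a non-zero status.
    """
    # Force UTF-8 pipes in the child so non-ASCII addresses survive.
    env = dict(os.environ, PYTHONIOENCODING="utf-8")
    try:
        result = subprocess.run(
            [sys.executable, "-c", worker_code],
            input=text,
            capture_output=True,
            encoding="utf-8",
            timeout=timeout_s,  # the child is killed when this expires
            env=env,
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        return None
    return result.stdout.splitlines()


if __name__ == "__main__":
    for address in ["Main Street", "HANG forever"]:
        print(repr(address), "->", expand_with_timeout(address, timeout_s=1.0))
```

Inputs that come back as `None` can be logged and skipped, which also gives a list of reproducers. Starting one process per input is slow for a planet-sized import, so a real setup would more likely batch inputs per worker. On Linux, a memory cap could be added inside the worker with `resource.setrlimit(resource.RLIMIT_AS, ...)`.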
PS: I was checking out the awesome libpostal, and saw something that could be improved <--- could not agree more :)