Demo assumes "U" is abbreviation for unit #163

Open
orangejulius opened this issue Jul 11, 2018 · 5 comments

@orangejulius
Member

orangejulius commented Jul 11, 2018

Here's a fun one. It looks like the abbreviation handling is too naive, at least in the demo. Washington, D.C.'s U Street is incorrectly rendered as "unit street" in the upper left.

image

@missinglink
Member

missinglink commented Jul 11, 2018

libpostal seems to expand it this way:

./libpostal "U Street Northwest"
unit street northwest
u street northwest

I'm not sure why both variants are not being saved in the database:

sqlite3 /data/interpolation/street.db 'SELECT * FROM names WHERE id = 21452588'
34489668|21452588|unit street northwest

If we were to save both, I'm not sure how we could know to pick the second one for label generation.
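One possible approach to the label question, as a minimal sketch: keep every variant for matching, and flag the row that holds the untouched input so label generation can pick it. The `isOriginal` field and both function names are hypothetical; they are not part of the existing `names` table or the interpolation code.

```javascript
// Hypothetical sketch, not the project's actual schema or code.
// Build one row per distinct name variant, flagging the original input.
function buildNameRows(streetId, originalName, expansions) {
  const rows = [{ id: streetId, name: originalName, isOriginal: true }];
  const seen = new Set([originalName.toLowerCase()]);
  for (const expansion of expansions) {
    const key = expansion.toLowerCase();
    if (seen.has(key)) continue; // skip variants that only differ by case
    seen.add(key);
    rows.push({ id: streetId, name: expansion, isOriginal: false });
  }
  return rows;
}

// Prefer the flagged original for display; fall back to the first row.
function pickLabel(rows) {
  const original = rows.find((row) => row.isOriginal);
  if (original) return original.name;
  return rows.length > 0 ? rows[0].name : null;
}

const exampleRows = buildNameRows(21452588, 'U Street Northwest', [
  'unit street northwest',
  'u street northwest',
]);
console.log(pickLabel(exampleRows));
```

With this shape, all variants stay searchable while the label never comes from an expansion.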

The good news is the conflation matching is working great! :) ... so searching for U st returns the correct result.

The bad news is the result is returning the wrong label :(

For Pelias specifically, this issue can be avoided by taking the street name from the layer=street Elasticsearch hit, rather than from the interpolation database, when generating the label.

@missinglink
Member

missinglink commented Jul 11, 2018

Looks like we already do this for Pelias:

source_result.name.default = `${interpolation_result.properties.number} ${source_result.name.default}`;

https://github.com/pelias/api/blob/master/middleware/interpolate.js
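To show why that line sidesteps the bad label, here is a small standalone sketch. Only the `properties.number` and `name.default` paths come from the line above; the `applyHouseNumber` wrapper, the `street` property, and the sample values are assumptions made for illustration.

```javascript
// Hypothetical wrapper around the one-liner quoted above.
// The house number comes from the interpolation result; the street name
// comes from the source (Elasticsearch) hit, so an expanded name stored
// in the interpolation database never reaches the label.
function applyHouseNumber(sourceResult, interpolationResult) {
  const number = interpolationResult.properties.number;
  return {
    ...sourceResult,
    name: {
      ...sourceResult.name,
      default: `${number} ${sourceResult.name.default}`,
    },
  };
}

// Sample data, invented for this example.
const sampleHit = { name: { default: 'U Street Northwest' } };
const sampleInterpolation = {
  properties: { number: '1300', street: 'unit street northwest' },
};

console.log(applyHouseNumber(sampleHit, sampleInterpolation).name.default);
```

The sketch returns a new object instead of mutating the hit, which is a design choice for the example only.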

@orangejulius
Member Author

Just found another funny and confusing case of this: apparently libpostal always expands "SE" to "european company".

In the Portland metro area, lots of street names start with Southeast, and I was very confused about why every street seemed to be a "european company"...

@missinglink
Member

missinglink commented Mar 7, 2020

This is the same issue as #234

I'm seeing this in Germany too: compound street names such as foostraße are being expanded, and then can't be found using the original form.

We should revisit this and make sure all versions are indexed and the original form is preserved for display.
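A rough sketch of what "all versions indexed, original preserved" could look like as an in-memory lookup. Everything here is hypothetical, including the function names and the `foo strasse` expansion, which is a made-up stand-in rather than real libpostal output.

```javascript
// Hypothetical sketch: a Map from each normalized variant to the streets
// it belongs to. Each street record keeps its original name for display.
const normalize = (text) => text.trim().toLowerCase();

function indexStreet(index, street, expansions) {
  // Always include the original form alongside the expansions.
  const variants = new Set([street.name, ...expansions].map(normalize));
  for (const variant of variants) {
    if (!index.has(variant)) index.set(variant, []);
    index.get(variant).push(street);
  }
}

// Look up by any variant; return the original names for display.
function lookupDisplayNames(index, query) {
  const matches = index.get(normalize(query)) || [];
  return matches.map((street) => street.name);
}

const streetIndex = new Map();
indexStreet(streetIndex, { id: 1, name: 'Foostraße' }, ['foo strasse']);

console.log(lookupDisplayNames(streetIndex, 'foostraße'));
console.log(lookupDisplayNames(streetIndex, 'Foo Strasse'));
```

Both the original and the expanded spelling resolve to the same record, and the display name is always the original.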

@missinglink
Member

IIRC the original assumption was that this would be OK because the analysis is symmetrical (i.e. libpostal is used for both indexing and search). It seems that assumption might not hold, or are we not using libpostal at search time?
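To illustrate the symmetry assumption, here is a toy example. The `analyze` function and its tiny expansion table are invented stand-ins for libpostal, not its API or its real dictionaries.

```javascript
// Toy analyzer standing in for libpostal (assumption: NOT libpostal's API).
const TOY_EXPANSIONS = new Map([
  ['u', 'unit'],
  ['st', 'street'],
  ['nw', 'northwest'],
]);

function analyze(text) {
  return text
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean)
    .map((token) => TOY_EXPANSIONS.get(token) || token)
    .join(' ');
}

// Index time: only the analyzed (expanded) form is stored.
const indexedNames = new Set([analyze('U Street Northwest')]);

// Symmetric search: the query goes through the same analyzer, so it matches.
const symmetricMatch = indexedNames.has(analyze('U St NW'));

// Asymmetric search: the query is only lowercased, so it misses.
const asymmetricMatch = indexedNames.has('U St NW'.toLowerCase());

console.log({ symmetricMatch, asymmetricMatch });
```

If search skips the analyzer, or produces a different expansion than indexing did, a lookup against expanded-only names fails even though the street is in the index.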
