This repository has been archived by the owner. It is now read-only.

Search street is case sensitive in Russian #2758

Closed
openstreetmap-trac opened this issue Jul 23, 2021 · 8 comments

@openstreetmap-trac openstreetmap-trac commented Jul 23, 2021

Reporter: Dmitriy.Ovdienko[at]gmail.com
[Submitted to the original trac issue database at 6.12am, Monday, 1st March 2010]

Try searching for "" and "". It is a street in Kyiv. The full name is " ."

@openstreetmap-trac openstreetmap-trac commented Jul 23, 2021

Author: Dmitriy.Ovdienko[at]gmail.com
[Added to the original trac issue at 4.16pm, Friday, 17th May 2013]

"6, , " does not work.
However "6, , " does work.

@openstreetmap-trac openstreetmap-trac commented Jul 23, 2021

Author: Dmitriy.Ovdienko[at]gmail.com
[Added to the original trac issue at 11.20am, Thursday, 2nd January 2014]

A 4-year-old bug... I was sure it had been fixed. I believe this is a core component, and search should be as tolerant of user typos as possible.

@openstreetmap-trac openstreetmap-trac commented Jul 23, 2021

Author: saintam1
[Added to the original trac issue at 11.46am, Friday, 14th November 2014]

It seems to revolve specifically around the handling of the letter Г (the Cyrillic "G"). It looks like Nominatim does not know that "г" (Unicode 0x0433) is the lowercase of "Г" (0x0413).

Searching for William Gladstone St. in Sofia ([[http://www.openstreetmap.org/way/230377106|way 230377106]]), all of these variations work correctly:

  • [[http://nominatim.openstreetmap.org/search.php?q=.++%2C+|. , ]] (title case, matches the way name exactly)
  • [[http://nominatim.openstreetmap.org/search.php?q=%D1%83%D0%BB.+%D1%83%D0%B8%D0%BB%D1%8F%D0%BC+%D0%93%D0%BB%D0%B0%D0%B4%D1%81%D1%82%D0%BE%D0%BD%2C+%D1%81%D0%BE%D1%84%D0%B8%D1%8F|ул. уилям Гладстон, софия]] (all lower except the Г)
  • [[http://nominatim.openstreetmap.org/search.php?q=%D0%A3%D0%9B.+%D0%A3%D0%98%D0%9B%D0%AF%D0%9C+%D0%93%D0%9B%D0%90%D0%94%D0%A1%D0%A2%D0%9E%D0%9D%2C+%D0%A1%D0%9E%D0%A4%D0%98%D0%AF|УЛ. УИЛЯМ ГЛАДСТОН, СОФИЯ]] (all upper)

Note that the above use a mixture of upper and lower case, but they all have an uppercase "Г". If you take any of them, however, and simply replace the uppercase "Г" with a lowercase "г", they all fail:

  • [[http://nominatim.openstreetmap.org/search.php?q=.++%2C+|. , ]]
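  • [[http://nominatim.openstreetmap.org/search.php?q=%D1%83%D0%BB.+%D1%83%D0%B8%D0%BB%D1%8F%D0%BC+%D0%B3%D0%BB%D0%B0%D0%B4%D1%81%D1%82%D0%BE%D0%BD%2C+%D1%81%D0%BE%D1%84%D0%B8%D1%8F|ул. уилям гладстон, софия]]
  • [[http://nominatim.openstreetmap.org/search.php?q=%D0%A3%D0%9B.+%D0%A3%D0%98%D0%9B%D0%AF%D0%9C+%D0%B3%D0%9B%D0%90%D0%94%D0%A1%D0%A2%D0%9E%D0%9D%2C+%D0%A1%D0%9E%D0%A4%D0%98%D0%AF|УЛ. УИЛЯМ гЛАДСТОН, СОФИЯ]]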

I searched through https://github.com/twain47/Nominatim and, at a cursory glance, couldn't find where the casing is handled. Is there a manually defined, hardcoded list of upper/lower character mappings somewhere that perhaps has a typo in it?
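
For reference, Unicode itself does define U+0433 as the lowercase of U+0413, so a generic case fold should treat both spellings as the same query. The following is a minimal Python sketch of that check, not Nominatim code: `normalize_query` is a hypothetical helper, and the query string is the one decoded from the second working link above.

```python
from urllib.parse import quote_plus

UPPER_GHE = "\u0413"  # Г, CYRILLIC CAPITAL LETTER GHE
LOWER_GHE = "\u0433"  # г, CYRILLIC SMALL LETTER GHE


def normalize_query(query: str) -> str:
    """Hypothetical normalizer: fold case so spellings compare equal."""
    return query.casefold()


# Query decoded from the second working link (all lower except the Г).
upper_query = "ул. уилям Гладстон, софия"
# The failing variant: same query with the Г replaced by г.
lower_query = upper_query.replace(UPPER_GHE, LOWER_GHE)

# Unicode case mapping knows the pair.
print(UPPER_GHE.lower() == LOWER_GHE)

# After case folding, both variants are the same string.
print(normalize_query(upper_query) == normalize_query(lower_query))

# Percent-encoded form, as it appears in the q= parameter of the URL.
encoded_upper = quote_plus(upper_query)
encoded_lower = quote_plus(lower_query)
print(encoded_upper)
print(encoded_lower)
```

The two encoded strings differ only in the bytes for that one letter (`%D0%93` versus `%D0%B3`), which is consistent with the failure being tied to a single character rather than to the query as a whole.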

@openstreetmap-trac openstreetmap-trac commented Jul 23, 2021

Author: saintam1
[Added to the original trac issue at 1.05pm, Friday, 14th November 2014]

FWIW I had a look at [[https://github.com/twain47/Nominatim/blob/master/module/utfasciitable.h|utfasciitable.h]], and it looks OK to me. If I understand correctly how it works (look up the Unicode codepoint in UTFASCIILOOKUP, and use the value there as an index into UTFASCII), all Cyrillic characters in the 0x0410-0x044F range, which includes both upper and lower case, map to lowercase ASCII transliterations. So there's no anomaly around the "Г" character here: both "Г" and "г" map to "g".
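
The two-step lookup described here can be sketched in Python as follows. This is only an illustration of the idea under stated assumptions: the names `UTFASCIILOOKUP` and `UTFASCII` are borrowed from the header, but the table contents below are a tiny made-up subset, a dict stands in for the C array, and passing unmapped characters through unchanged is an assumption, not the behaviour of the real module.

```python
# Illustrative subset only, NOT the contents of utfasciitable.h.
# Second level: index -> lowercase ASCII transliteration.
UTFASCII = ["", "a", "b", "v", "g", "d"]

# First level: Unicode codepoint -> index into UTFASCII.
# Upper and lower case share the same index, so case is lost here.
UTFASCIILOOKUP = {
    0x0410: 1, 0x0430: 1,  # А, а
    0x0411: 2, 0x0431: 2,  # Б, б
    0x0412: 3, 0x0432: 3,  # В, в
    0x0413: 4, 0x0433: 4,  # Г, г
    0x0414: 5, 0x0434: 5,  # Д, д
}


def transliterate(text: str) -> str:
    """Map each character through both tables; keep unmapped ones as-is."""
    out = []
    for ch in text:
        index = UTFASCIILOOKUP.get(ord(ch))
        out.append(ch if index is None else UTFASCII[index])
    return "".join(out)


print(transliterate("\u0413"))  # Г
print(transliterate("\u0433"))  # г
print(transliterate("Гад") == transliterate("гад"))
```

With a table shaped like this, both forms of the letter end up as the same ASCII string, so a table of this kind would not by itself explain the case sensitivity.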

@openstreetmap-trac openstreetmap-trac commented Jul 23, 2021

Author: Dmitriy.Ovdienko[at]gmail.com
[Added to the original trac issue at 11.04pm, Tuesday, 7th July 2015]

I guess mapping of the "" and "" is wrong.
I've attached a corrected file.

@openstreetmap-trac openstreetmap-trac commented Jul 23, 2021

Author: Dmitriy.Ovdienko[at]gmail.com
[Added to the original trac issue at 11.33pm, Tuesday, 7th July 2015]

Fixed i->I transition. See attached v2 file.

@openstreetmap-trac openstreetmap-trac commented Jul 23, 2021

Author: lonvia
[Added to the original trac issue at 6.21pm, Tuesday, 4th October 2016]

Fixed by osm-search/Nominatim#219

@openstreetmap-trac openstreetmap-trac commented Jul 23, 2021

Author: Dmitriy.Ovdienko[at]gmail.com
[Added to the original trac issue at 9.54am, Wednesday, 5th October 2016]

The next step is to make search more typo-friendly (like Google) :)
