Added Slovenia countrywide addresses #2980
Can someone please help me with the reported error? It says:
Is position 3609 decimal or hex? If I interpret the position 3609 as decimal it is e19 in hex:
There is no 0xc8 byte there, but 0x39 (the digit "9"). Nor is there such a byte at hex position 3609:
There is 0x8d in that position, the second byte of the letter "č", which is encoded in UTF-8 as 0xc4 0x8d.
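The arithmetic in this comment can be double-checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative and not taken from the preprocessing scripts.

```python
# Position 3609 read as a decimal number, expressed in hex.
pos_as_hex = hex(3609)            # '0xe19'

# Position 3609 read as a hex number, expressed in decimal.
hex_as_dec = int("3609", 16)      # 13833

# UTF-8 bytes of the lowercase letter "č" (U+010D), written as an escape
# so the snippet does not depend on the source file's encoding.
c_caron_bytes = "\u010d".encode("utf-8")

print(pos_as_hex, hex_as_dec, c_caron_bytes)
```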
@stefanb the scripts look great! The issue seems to be that the CSV in the zip file is not valid UTF-8 (my text editor complains that it is not). All the cases I was able to spot concern house numbers (e.g. lines 440, 2122, 2129). I am not sure, but house numbers that include "Č" seem the most likely cause, since all the addresses preceding the offending one have "C" as the letter.
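One way to locate such lines without relying on a text editor is to try decoding each line as UTF-8 and collect the ones that fail. The sketch below assumes the file content is already available as bytes; `invalid_utf8_lines` and the sample data are hypothetical and only illustrate the idea.

```python
def invalid_utf8_lines(data: bytes) -> list:
    """Return the 1-based numbers of lines that do not decode as UTF-8."""
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad


# Hypothetical sample: line 2 contains a properly UTF-8 encoded "č",
# line 3 contains a raw 0xc8 byte as in the "77\xc8" house number.
sample = (
    b"Cesta,77C\n"
    + "Ulica \u010d,5\n".encode("utf-8")
    + b"Cesta,77\xc8\n"
)

print(invalid_utf8_lines(sample))
```

For a real file, the bytes could be read with `open(path, "rb").read()`; very large files would be better processed line by line.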
Hey @stefanb, it looks like the problem lines have two different encodings.
The house number field "77\xc8" is not UTF-8 (ISO-8859-2 will decode that to "77Č" if that's what was intended), but the rest of the line can be decoded as UTF-8. OpenAddresses only allows one encoding per file, so you might need to convert the house number to UTF-8 in the preprocessing script with e.g.
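A conversion of this kind could look roughly like the following. This is a minimal sketch, not the snippet originally posted; `decode_field` is a hypothetical helper, and falling back to ISO-8859-2 is an assumption based on the byte discussed above.

```python
def decode_field(raw: bytes, fallback: str = "iso-8859-2") -> str:
    """Decode a field as UTF-8, falling back to a legacy encoding on failure."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: fields that are not valid UTF-8 are in the legacy encoding.
        return raw.decode(fallback)


house_number = decode_field(b"77\xc8")      # "77" followed by U+010C ("Č")
utf8_bytes = house_number.encode("utf-8")   # re-encoded consistently as UTF-8

print(house_number, utf8_bytes)
```

A fallback like this is a heuristic: some legacy byte sequences happen to be valid UTF-8 too, so converting the field explicitly from its known source encoding is the more reliable option when that encoding is known.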
I already use iconv to convert the source Windows-1250 encoding to UTF-8. But yes, I did not anticipate non-ASCII characters in house numbers, so I didn't apply the conversion there as I did for all the other text fields.
Yes, there does seem to be a house number "77Č" :)
I love it when whole new countries res in. Thank you @stefanb!
Looks great, thanks @stefanb!
Most current data can be checked at:
http://raba.openstreetmap.si/openaddresses/si-addresses-2017-05-07.zip (CC-BY Geodetska uprava Republike Slovenije, adapted for openaddresses.io)
fixes #2926