Leave leading 0's in US housenumbers #524

Open
dianashk opened this Issue Mar 23, 2017 · 8 comments

@dianashk
Contributor

dianashk commented Mar 23, 2017

We started removing these leading 0's a while back, but it looks like that may have been a bad idea because our friends in Portland just reported the following issue.

This line from the OpenAddresses importer looks like a bug for Portland: record.NUMBER = _.trimStart(record.NUMBER, '0');
https://github.com/pelias/openaddresses/blob/master/lib/streams/cleanupStream.js#L22

In Portland, there are Zero-Leading Addresses that are different from non-Zero-Leading addresses. Said another way, there could be two separate residences ~1 mile apart, that have almost the exact same address, with the only differentiation being a leading-zero on the number (not really a number in that case). Trimming off leading zeros from address numbers is not a valid practice for addresses in Portland.

see: https://en.wikipedia.org/wiki/Portland,_Oregon#Neighborhoods “East-West addresses in this area are denoted with a leading zero (instead of a minus sign). This means 0246 SW California St. is not the same as 246 SW California St. Many mapping programs are unable to distinguish...”
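
A minimal sketch of the collision described above, in plain JavaScript. `trimLeadingZeros` is a hypothetical stand-in for the `_.trimStart(record.NUMBER, '0')` call, and the two records are illustrative, built from the Wikipedia example rather than taken from real OpenAddresses data.

```javascript
// Hypothetical stand-in for _.trimStart(value, '0'): drop every leading "0".
function trimLeadingZeros(value) {
  let i = 0;
  while (i < value.length && value[i] === '0') i++;
  return value.slice(i);
}

// Illustrative records based on the Wikipedia example; not real importer data.
const records = [
  { NUMBER: '0246', STREET: 'SW California St' },
  { NUMBER: '246', STREET: 'SW California St' }
];

// Build one lookup key per record, with and without trimming.
const keyOf = (r, clean) => `${clean(r.NUMBER)} ${r.STREET}`;
const trimmedKeys = new Set(records.map(r => keyOf(r, trimLeadingZeros)));
const rawKeys = new Set(records.map(r => keyOf(r, n => n)));

console.log(trimmedKeys.size); // 1: both addresses collapse into "246 SW California St"
console.log(rawKeys.size);     // 2: the leading zero keeps them distinct
```

With trimming, the two residences become indistinguishable; leaving NUMBER untouched keeps both.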

[Image: screen shot 2017-03-23 at 2 01 45 pm]

@dianashk dianashk added this to the MOD Sandbox milestone Mar 23, 2017

@dianashk dianashk added the bug label Mar 23, 2017

zgoda commented Sep 25, 2017

Could #651 be related to this? There it's not the house number but the postal code that gets stripped of its leading "0" in search.

Member

missinglink commented Sep 25, 2017

It's possible that this is related.

Note: we also have an Elasticsearch token filter named removeAllZeroNumericPrefix, which also strips leading zeros:

"removeAllZeroNumericPrefix" :{
  "type" : "pattern_replace",
  "pattern" : "^(0*)",
  "replacement" : ""
},
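
To see what that pattern does to a token, the same regular expression can be tried in plain JavaScript. This is only an approximation: Elasticsearch applies the filter with Java regex semantics inside its analysis chain, and `stripZeroPrefix` is a hypothetical helper name.

```javascript
// Approximates the removeAllZeroNumericPrefix filter:
// pattern "^(0*)", replacement "".
function stripZeroPrefix(token) {
  return token.replace(/^(0*)/, '');
}

console.log(stripZeroPrefix('0246'));  // "246"
console.log(stripZeroPrefix('246'));   // "246"
console.log(stripZeroPrefix('07300')); // "7300"
console.log(stripZeroPrefix('000'));   // "" (an all-zero token becomes empty)
```

Because the pattern is anchored at the start and matches any run of zeros, "0246" and "246" end up as the same token.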

zgoda commented Sep 25, 2017

Well, that's quite unfortunate, but Pelias does find some postal codes with leading zeros. For example, searching for "07-300" returns 07300, Saint-Barthelemy-Le-Plain, France:

https://mapzen.com/search/explorer/?query=search&text=07-300 and https://mapzen.com/search/explorer/?query=search/structured&postalcode=07-300

What's missing is the "07-300" postal code: http://www.geonames.org/postalcode-search.html?q=07-300&country=

Member

missinglink commented Sep 25, 2017

Yes, it seemed logical from the mathematical perspective that additional preceding 0's have no semantic value.

It turns out, although uncommon, that some countries and cities assign a semantic value to additional zeros, treating them more like strings than numbers.

Some work will need to be done to remove the 0 trimming code and test the effect of that change internationally.
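
The difference between treating these values as numbers and as strings can be illustrated with a short sketch. The values are examples only (the Portland house numbers quoted earlier in this thread and a five-digit postal code), and nothing here is Pelias code.

```javascript
// As numbers, a leading zero carries no information.
const asNumbersEqual = Number('0246') === Number('246');
console.log(asNumbersEqual); // true

// As strings, the two identifiers remain distinct.
const asStringsEqual = '0246' === '246';
console.log(asStringsEqual); // false

// Round-tripping through a number silently drops the zero.
const roundTripped = String(parseInt('07300', 10));
console.log(roundTripped); // "7300"
```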

zgoda commented Sep 25, 2017

It's not that uncommon, at least in Poland. The country's capital has postal codes from "00-" to "04-", and some "05-".

Member

missinglink commented Sep 25, 2017

@zgoda let's not conflate the street number and postalcode issues; I don't believe we strip 0's from postalcodes.

zgoda commented Sep 25, 2017

OK, so I assume this is not related. Thank you.

Member

orangejulius commented Feb 14, 2018

This change has made it to our production branches and will be reflected in any new builds.

Here's a comparison of Mapzen Search (not updated, going away), geocode.earth (change will be reflected in the next build), TriMet's Pelias (the next build should get this change), and a build on my local machine from the other day. Note the additional OA record with the correct address on the localhost build all the way to the right.
[Image: comparison screenshot]
