Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improved support for streets with no suffix, such as "broadway" #141

Merged
merged 1 commit into from
Aug 10, 2021

Conversation

missinglink
Copy link
Member

@missinglink missinglink commented Aug 9, 2021

this PR resolves the issue discussed in #140

it works by:

  • modifying the dictionaries as discussed in the issue description
  • adding a new classification for street proper names
  • updating the street composite classifier config to re-classify street_proper_name+street_suffix and street_proper_name as street
  • added a bunch of tests and tuned the dictionaries accordingly

no regressions, cc @kochis
resolves #140

@missinglink missinglink requested a review from Joxit August 9, 2021 12:26
@missinglink
Copy link
Member Author

missinglink commented Aug 9, 2021

The build is was failing because Circle-CI has been removed from this project but it still thinks it's active.
I've just removed the webhooks for CircleCI and TravisCI, so it's fixed for next time, this one will require a manual admin override.

@missinglink missinglink force-pushed the streets_with_no_suffix branch 2 times, most recently from c5b5b6d to 479e4d4 Compare August 9, 2021 12:37
@missinglink
Copy link
Member Author

missinglink commented Aug 9, 2021

@Joxit are "Esplanade/Esplanades" common street types in French?
If not we could remove these too:

git grep -i esplanade
...
resources/libpostal/dictionaries/fr/street_types.txt:esplanade|esp
resources/libpostal/dictionaries/fr/street_types.txt:esplanades|esps

another one we might consider removing?

resources/whosonfirst/dictionaries/locality/name:fra_x_preferred.txt:broadway

@Joxit
Copy link
Member

Joxit commented Aug 9, 2021

Yes, Eslplanade is a common street/place type in French, always in the singular (without s)
Some usage examples:

Esplanade de la Liberté
Esplanade du Géneral de Gaulle
Esplanade Méditerranée

Broadway can be safely removed 😄

@missinglink
Copy link
Member Author

missinglink commented Aug 9, 2021

Removed locality/name:fra_x_preferred = Broadway via rebase.

@missinglink
Copy link
Member Author

Removed libpostal/dictionaries/fr/street_types.txt = esplanades|esps via rebase.

@@ -112,6 +112,11 @@ const testcase = (test, common) => {
assert(`Paris 75000, France`, [
{ locality: 'Paris' }, { postcode: '75000' }, { country: 'France' }
])

// https://github.com/pelias/parser/pull/141#issuecomment-895230721
assert(`Esplanade de la Liberté`, [{ street: 'Esplanade de la Liberté' }])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for those test cases, I had actually broken the street prefix Esplanade for French addresses 😱, but managed to fix them again after adding the examples 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅 nice !

@missinglink
Copy link
Member Author

I think this is good to go @Joxit? It comes with a fair few test cases.

Copy link
Member

@Joxit Joxit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@missinglink missinglink merged commit 90481c5 into master Aug 10, 2021
@missinglink missinglink deleted the streets_with_no_suffix branch August 10, 2021 12:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Special handling of streets with no suffix
2 participants