Standardize city names #10

Closed
martj42 opened this issue Jun 16, 2021 · 3 comments

Owner

martj42 commented Jun 16, 2021

Currently, some cities appear under different spellings in the dataset, e.g. Kiev and Kyïv, or Cádiz and Cadiz. One spelling needs to be picked for each city, possibly whichever one English Wikipedia uses. As a reminder to myself: the preferred spelling for the capital of Ukraine seems to be Kyiv.
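
One possible way to surface candidate duplicates like these is to group names that become identical once accents are stripped. This is only a rough sketch: the `cities` list is made-up sample input rather than the real dataset, and the helper names `strip_accents` and `find_variant_groups` are hypothetical. It would flag Cádiz/Cadiz, but transliteration variants such as Kiev/Kyïv differ in their base letters and would still need manual review.

```python
import unicodedata
from collections import defaultdict


def strip_accents(name):
    """Return name with combining marks (accents, diaereses) removed."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def find_variant_groups(cities):
    """Group spellings that differ only by accents or letter case."""
    groups = defaultdict(set)
    for city in cities:
        key = strip_accents(city).casefold()
        groups[key].add(city)
    # Keep only keys that have more than one distinct spelling.
    return [sorted(variants) for variants in groups.values() if len(variants) > 1]


# Hypothetical sample values; in practice these would come from the dataset.
cities = ["Kiev", "Kyïv", "Cádiz", "Cadiz", "London"]
variant_groups = find_variant_groups(cities)
print(variant_groups)
```

With this sample input only the Cádiz/Cadiz pair is reported, which shows why accent stripping alone is not enough to catch every variant.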

Owner Author

martj42 commented Feb 13, 2022

A bunch were fixed in 1b49e17.


kossoff commented Nov 15, 2022

Hi!

English Wikipedia is a good reference, but it has problems for some languages. The best way is to keep a vocabulary that maps frequently used city names to the base name used in your dataset.
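
A vocabulary of this kind could be sketched as a plain lookup table from known variants to the base name. The `ALIASES` table and the `standardize_city` helper below are hypothetical and not part of the repository. Kyiv is the spelling preferred earlier in this thread, while choosing Cádiz over Cadiz is only an illustrative assumption.

```python
# Hypothetical alias vocabulary: variant spelling -> base name used in the dataset.
ALIASES = {
    "Kiev": "Kyiv",
    "Kyïv": "Kyiv",
    "Cadiz": "Cádiz",  # assumption: the accented form is treated as the base name
}


def standardize_city(name, aliases=ALIASES):
    """Return the base name for a known variant, or the trimmed input otherwise."""
    cleaned = name.strip()
    return aliases.get(cleaned, cleaned)


# Hypothetical sample values standing in for a city column.
raw_cities = ["Kiev", "Kyïv", "Cadiz", "Cádiz", " London "]
standardized = [standardize_city(city) for city in raw_cities]
print(standardized)
```

Names missing from the vocabulary pass through unchanged apart from whitespace trimming, so the table can grow gradually as new variants turn up.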

Owner Author

martj42 commented Nov 15, 2022

All cities should now have unique names.

martj42 closed this as completed Nov 15, 2022