Address normalization issues #68

Closed
VladimirAlexiev opened this issue Feb 22, 2022 · 1 comment
Labels
semantics This is an issue inherited from the source CCTS model

Comments

@VladimirAlexiev

(Follow-up from #43)

Address has some fields that suffer from normalization issues:

  • cityName, countrySubDivisionName, countryName are perhaps redundant given the respective Id fields in Address
  • The implicit relation between these Ids should be made explicit in a gazetteer, not repeated in every address.
    Otherwise it's possible for two addresses to hold inconsistent info.
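
A minimal sketch of the inconsistency described above, assuming addresses are plain dictionaries. The field names come from this issue; the ID values, the sample records and the helper `find_inconsistent_names` are hypothetical and not part of the ontology.

```python
from collections import defaultdict

# Hypothetical address records: both use the same cityId,
# but spell the city name differently.
addresses = [
    {"cityId": "C1", "cityName": "Sofia", "countryId": "BG", "countryName": "Bulgaria"},
    {"cityId": "C1", "cityName": "Sofiya", "countryId": "BG", "countryName": "Bulgaria"},
]


def find_inconsistent_names(records, id_field, name_field):
    """Return every ID that appears with more than one distinct name."""
    names_by_id = defaultdict(set)
    for record in records:
        identifier = record.get(id_field)
        name = record.get(name_field)
        # Records missing either the ID or the name cannot conflict.
        if identifier is not None and name is not None:
            names_by_id[identifier].add(name)
    return {
        identifier: sorted(names)
        for identifier, names in names_by_id.items()
        if len(names) > 1
    }


conflicts = find_inconsistent_names(addresses, "cityId", "cityName")
print(conflicts)
```

With a gazetteer holding one name per ID, this kind of check would be unnecessary, because the name would be stored in exactly one place.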

Counter-arguments:

  • UNCEFACT perhaps cannot assume the existence and use of a global gazetteer of all cities (although there are several: Geonames, Wikidata, OSM).
  • If such an assumption/choice cannot be made, then cityId has no global meaning, so two different UNCEFACT databases/messages can use it to refer to different datasets.
  • E.g. cityName can be filled while cityId is missing: then cityName should stay in Address. The same applies to the other 2 levels.
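
The last counter-argument can be sketched as a lookup with a fallback: use the gazetteer when the ID is present and known, otherwise keep the name stored in the address. This is only an illustration; `GAZETTEER`, its contents and `resolve_city_name` are hypothetical, and no particular gazetteer is implied.

```python
# Hypothetical local gazetteer mapping city IDs to canonical names.
GAZETTEER = {"C1": "Sofia"}


def resolve_city_name(address, gazetteer):
    """Prefer the gazetteer name for a known cityId, else fall back to cityName."""
    city_id = address.get("cityId")
    if city_id is not None and city_id in gazetteer:
        return gazetteer[city_id]
    # cityId is missing or unknown to this gazetteer:
    # the literal cityName is the only information available.
    return address.get("cityName")


with_id = resolve_city_name({"cityId": "C1", "cityName": "Sofiya"}, GAZETTEER)
without_id = resolve_city_name({"cityName": "Plovdiv"}, GAZETTEER)
print(with_id, without_id)
```

The fallback branch is the reason the name fields cannot simply be dropped: without a shared gazetteer, an address with only cityName would otherwise lose its city entirely.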
@nissimsan added the semantics label on Mar 4, 2022
@VladimirAlexiev
Author

For the record: as of today, no changes have been made to the ontology. The redundancy / potential inconsistency issue remains.
UNCEFACT has a geographic gazetteer: UNLOCODE (although it doesn't have all possible cities / inhabited places).

I understand the business issues behind the redundancy, so I won't plead for reopening this issue.
