Wikidata translations lead to troublesome labels #1547
Comments
This is perhaps more workable for POIs than for other things like places and natural features, which may have very different naming conventions across languages. Moreover, parts of the OSM community insist on only tagging I think it’s fair to say that OpenMapTiles is a little further on the spectrum of data consumers that care about the end user experience, which would implement these fallbacks, whereas the other featured tile layers on osm.org have traditionally been focused solely on mapper feedback.
You would be well-justified in changing the English label to just “Leeuwarden”. The only reason it has the “railway station” suffix is that the label was imported from the English Wikipedia, where article titles have to be unique, and hasn’t been cleaned up yet. But Wikidata does welcome mechanical edits to clean up its labels. (A similar example came up in a forum discussion where someone had proposed requiring data consumers to rely on Wikidata labels in the first instance, which would’ve been a step too far in my opinion.)
If they do they'll be tagged on OpenStreetMap correctly. For places, using
For the Netherlands I am not aware of these issues. Is this really a big problem internationally? International names for the bigger places are maintained accurately, and for smaller places often don't exist (which is natural). It doesn't say アムステルダム anywhere on the signage for Amsterdam, but
I'm not convinced importing missing names with lots of errors is significantly improving the user experience. It will also lead to cases where the origin of the name is unclear in case of errors ("is OpenStreetMap wrong or is the name magically imported from some other source?"). Using names from Wikidata might improve Wikidata, but does it benefit OpenStreetMap and its users?
The problem is that mappers now have to maintain two sets of name tags in order to prevent faulty information from leaking through. That is quite an additional burden! In this case it would also be wrong: there is no real It will also inevitably lead to conflicts with other Wikidata consumers, like sister project Wikipedia. These often have different policies for naming things. One specific case which has bugged me for years is how the Dutch Wikipedia refers to the village of Grou near me as Dutch Wikipedia refuses to change the name used on their lemma, because they have a policy (set in stone as these are on Wikipedia) which points to a specific language institution as the arbiter for such names. That institute publishes a list of Dutch and Frisian names, and the village is referred to by its archaic name there without noting its obsolescence, so any attempt to change the Wikipedia page gets reverted. The only reason this name is correct on Wikidata is because the editors concerned haven't looked there yet. Another example closer to you: how would the Dutch refer to San José? With or without the acute accent? Orthographically, the This goes even further for Frisian, the minority language spoken in my Dutch province. It too has the
If you’re right, that would be wonderful, although there’s still the issue of not being able to determine the language of
This is exactly my point. It would be profoundly confusing and ill-advised for a purely mapper-oriented tile layer/stylesheet like openstreetmap-carto to pull in Wikidata labels, especially if there’s no indication of the source.1 But from the perspective of a user of a consumer application that depends on OpenMapTiles, does it really matter where the incorrect name comes from, as long as it can be fixed easily? Even more OSM-oriented clients of OpenMapTiles, such as OSM Americana, have decided that a rising tide lifts all boats and use the Putting an OpenMapTiles-powered, openstreetmap-carto-inspired style in openstreetmap.org, as in openstreetmap/openstreetmap-website#4042, does blur the line between the two audiences. I think it would be reasonable to expect this particular style to not use Wikidata labels, for the sake of mapper feedback. MapTiler could do that by skipping this step when generating tiles specifically for openstreetmap.org. But enforcing that decision on other styles for other audiences would be a lot less reasonable in my opinion.
English isn’t even consistent about it. 😅 Wikidata does a great job of clarifying the situation, if I may say so myself, although OpenMapTiles is only using labels, not name statements. Preferring name statements over labels would improve the translation quality in some cases.
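To make the distinction between labels and name statements concrete, here is a rough sketch of what preferring name statements might look like when reading an item in Wikidata's JSON entity format. It is only an illustration under assumptions: the function names are hypothetical, the choice of properties (P1448 "official name", P2561 "name") is mine, the sample item is invented, and none of this reflects what OpenMapTiles actually does.

```python
# Assumed choice of monolingual-text properties: official name, name.
NAME_PROPERTIES = ("P1448", "P2561")


def name_from_statements(entity, lang):
    """Return the first non-deprecated name statement in `lang`, or None."""
    for prop in NAME_PROPERTIES:
        for statement in entity.get("claims", {}).get(prop, []):
            if statement.get("rank") == "deprecated":
                continue
            snak = statement.get("mainsnak", {})
            if snak.get("snaktype") != "value":
                continue  # skip "no value" / "unknown value" statements
            value = snak["datavalue"]["value"]
            if value.get("language") == lang:
                return value["text"]
    return None


def wikidata_name(entity, lang):
    """Prefer a name statement; fall back to the item's label (hypothetical)."""
    statement_name = name_from_statements(entity, lang)
    if statement_name:
        return statement_name
    label = entity.get("labels", {}).get(lang)
    return label["value"] if label else None


# Invented sample item, shaped like the Wikidata JSON entity format.
sample = {
    "labels": {"en": {"language": "en", "value": "Leeuwarden railway station"}},
    "claims": {
        "P1448": [{
            "rank": "normal",
            "mainsnak": {
                "snaktype": "value",
                "datavalue": {
                    "type": "monolingualtext",
                    "value": {"text": "Leeuwarden", "language": "en"},
                },
            },
        }]
    },
}

print(wikidata_name(sample, "en"))  # Leeuwarden
```

If the item has no name statement in the requested language, the sketch simply falls back to the label, so it would only help where someone has actually added such statements.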
That’s probably true, but not necessarily because Wikidata prefers Wikipedia article titles. I’m unfamiliar with the Frisian Wikipedia community, but I can say that there’s no love lost between the English Wikipedia and Wikidata over the issue of labels and descriptions. (Wikipedia essentially forked Wikidata in that regard.) The best practice on Wikidata would be to set the Frisian label to something, even if it’s the same as the native name. To affirm there’s no distinct Frisian name, you’d add a “no value” Frisian-language name statement to the item, though that’s pretty rare.

Footnotes

1. openmaptiles/openmaptiles-tools#437 onthegomap/planetiler#679
I noticed in openstreetmap/openstreetmap-website#4042 that OpenMapTiles is being considered for inclusion on the main openstreetmap.org website. Great! It desperately needs a new high quality general purpose layer.
One jarring thing I've noticed browsing https://osm.openmaptiles.org/#map=17/53.21006/5.77774&layers=V is that when the user's preferred language is missing in the OpenStreetMap entities, it gets pulled in through its Wikidata link if it has one. In my case my browser is set to prefer English, so browsing the map of my home town where Frisian and Dutch are used for most names, I've noticed some worrying discrepancies.
For example, the railway station which should just be called `Leeuwarden` is now labelled as `Leeuwarden railway station`, and indeed, that is what Wikidata lists as its English name. This is wrong of course; the station's name does not include the descriptive 'railway station' suffix, and the OpenStreetMap entity omits this in all tagged languages. The correct behaviour for such local features is to use `name` in the absence of `name:en`, which is what anyone writing in English would do if the scripts are the same and no transliteration is needed.

This is just one example of a type I am seeing a lot just scanning the map, and it makes me worry about a fundamental choice being made here which has a potentially huge impact.
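The fallback argued for above (use `name` when `name:en` is absent and no transliteration is needed) could be sketched roughly like this. It is a minimal illustration, not how OpenMapTiles builds its labels: `pick_label`, `is_latin` and the `wikidata_labels` argument are hypothetical names, and the script check is a crude assumption that only makes sense when the requested language is written in Latin script.

```python
import unicodedata


def is_latin(text):
    """Crude check: every alphabetic character has a LATIN Unicode name."""
    letters = [ch for ch in text if ch.isalpha()]
    return bool(letters) and all(
        unicodedata.name(ch, "").startswith("LATIN") for ch in letters
    )


def pick_label(tags, lang="en", wikidata_labels=None):
    """Choose a label for an OSM feature (hypothetical helper).

    Order of preference:
      1. name:<lang> as tagged by mappers
      2. plain name, if it is already in Latin script
      3. a Wikidata label, only as a last resort
      4. plain name in whatever script it is in
    """
    localized = tags.get(f"name:{lang}")
    if localized:
        return localized
    name = tags.get("name")
    if name and is_latin(name):
        return name
    if wikidata_labels and wikidata_labels.get(lang):
        return wikidata_labels[lang]
    return name


station = {"name": "Leeuwarden", "name:fy": "Ljouwert"}
print(pick_label(station, "en", {"en": "Leeuwarden railway station"}))
# Leeuwarden
```

With this ordering the Wikidata label would only be consulted for features whose `name` is in another script, which is where a missing `name:en` hurts an English reader the most.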
One aspect is that OpenStreetMap mappers make a significant effort to correctly label things on the map, especially names. For points-of-interest with international appeal this invariably means a large list of translations. For local entities however, translated names often don't exist, and that's fine. The problem with drawing missing translations from Wikidata is that it is a different project, with different rules (and policies) regarding naming things, different user accounts, and different priorities. So as a mapper who cares about correct names, I am now faced with a dilemma.
I could go and edit Wikidata and remove such non-translations like `Leeuwarden railway station` or `Leeuwarden Northern General Cemetery` (the latter is literally made up based on the local Dutch name by someone with no knowledge of Dutch or the local names). This will likely get me in conflict with people who desire these translations for other projects. This doesn't feel right, and I've noticed in discussions about the use of Wikidata in the OpenStreetMap community that I am not alone in this. I.e., linking to Wikidata via the `wikidata` key is desirable, but pulling in metadata perhaps not so much.

I see that it is nice to be able to pull in translations where these are missing, but I wonder if doing this at the level of the renderer is the right place.