Load Languages from Wikidata #111

Closed
wants to merge 5 commits into from


EdJoPaTo
Contributor

Currently, WmLanguageCode is based on the sites. I updated the scripts to load both the sites and the languages from Wikidata. As Wikidata probably knows all the languages from the other Wikimedia projects, it seems like a good source of languages to me.
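To illustrate how a generated language list can feed a type, here is a minimal sketch. The array contents and the names `languageCodes`, `GeneratedLanguageCode` and `isGeneratedLanguageCode` are hypothetical and are not what the scripts in this PR actually emit.

```typescript
// Hypothetical excerpt of a generated list; a real list would be written by the update script.
const languageCodes = ['de', 'de-ch', 'en', 'fr', 'zh-min-nan'] as const

// Union of the literal codes above, derived from the array so both stay in sync.
type GeneratedLanguageCode = typeof languageCodes[number]

// Runtime guard for strings that come from outside the type system (e.g. user input).
function isGeneratedLanguageCode(value: string): value is GeneratedLanguageCode {
  return (languageCodes as readonly string[]).includes(value)
}
```

Deriving the type from the array means regenerating the list updates the type and the runtime check together.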

I also changed some language-related code and removed the shortLang method. shortLang reduced the options usable for methods like getEntities: you couldn't specify de-ch, because it was automatically reduced to de. This seems like a bug to me, but I might be missing something.
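A small sketch of the problem described above. The `shortLang` body here is a hypothetical reconstruction based only on the behaviour mentioned in this comment (de-ch becoming de), not the library's actual code.

```typescript
// Hypothetical reconstruction: drop everything from the first '-' onwards.
function shortLang(language: string): string {
  return language.replace(/-.*$/, '')
}

// The caller explicitly asks for Swiss German first, then German, then English.
const requested = ['de-ch', 'de', 'en']

// After the reduction the variant is gone and 'de' appears twice.
const reduced = requested.map(shortLang)
```

With such a reduction applied inside a function like getEntities, a caller has no way to ask for the de-ch variant specifically.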

All the test cases still pass except one, which I think was a bug: getSitelinkData().lang included _, which only sites include. The language, as it's used in URLs and in lang arguments, uses -. getSitelinkUrl also depends on it, so I think this was another bug and the test case was wrong as well.
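To make the `_` versus `-` distinction concrete, here is a minimal sketch. `languageFromSiteKey` is a hypothetical helper, and it assumes a site key of the shape `<language with underscores>wiki`; it is not the implementation of getSitelinkData.

```typescript
// Hypothetical helper: derive the language code used in URLs from a site key.
// Assumes keys shaped like 'zh_min_nanwiki' (language with underscores + 'wiki').
function languageFromSiteKey(siteKey: string): string {
  return siteKey
    .replace(/wiki$/, '') // strip the project suffix
    .replace(/_/g, '-') // site keys use '_', language codes use '-'
}

const lang = languageFromSiteKey('zh_min_nanwiki')

// The language code is what ends up in the hostname, so it must use '-'.
const url = `https://${lang}.wikipedia.org/wiki/Main_Page`
```

If lang kept the underscores, a URL built from it would contain an invalid hostname, which is why the old test expectation looks wrong.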

Fixes #107

Uses a helper from #106 so it should be merged first. Only the last commit is unique to this PR.

It's currently exported, so removing it would be a breaking change.

This method cannot make any type guarantees other than string.
So fix the types but don't remove it.
Removing it would be breaking, which this shouldn't be.

Mark it as deprecated to be removed with the next major release.
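For reference, a deprecation like the one described in these commit messages is usually expressed with a JSDoc tag. This is only a sketch: `legacyHelper` is a placeholder name and its pass-through body is an assumption, since the commit messages don't show the method itself.

```typescript
/**
 * Placeholder for the exported method that is kept for compatibility.
 * The types promise nothing beyond string in, string out.
 *
 * @deprecated Will be removed with the next major release.
 */
function legacyHelper(input: string): string {
  // Placeholder body; the real logic is not shown in this PR conversation.
  return input
}
```

Editors that understand the `@deprecated` tag flag call sites, while existing callers keep compiling until the major release.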
@EdJoPaTo
Contributor Author

Hm… maybe the type Language should be named LanguageCode, as it represents a short language code rather than a language.

I'm not sure if the Wm… (=Wikimedia?) prefix is good.
At first thought I would remove it, as it's already from a library related to that.

But thinking further, this library is generic to Wikibase, while the languages are specific to Wikimedia or Wikidata. Other Wikibase instances could have their own independent set of languages, which would mean that the language shouldn't be a specific type at all but rather a string. (Probably an internal type Language = string, like Url is currently, for easier code readability.)
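A minimal sketch of the alias idea. The `Labels` shape and the `getLabel` function are hypothetical illustrations, not existing wikibase-sdk exports.

```typescript
// Plain aliases: no compile-time restriction, only better readability in signatures.
type Url = string
type LanguageCode = string

// Hypothetical label map keyed by whatever codes a Wikibase instance uses.
type Labels = Record<LanguageCode, string>

// Return the first label available in the caller's order of preference.
function getLabel(labels: Labels, preferred: LanguageCode[]): string | undefined {
  for (const code of preferred) {
    const label = labels[code]
    if (label !== undefined) return label
  }
  return undefined
}
```

Because the alias is just string, codes from any Wikibase instance are accepted, including ones Wikidata has never heard of.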

Exporting the list / type would still be useful. Maybe it's even a future thing to have a generic language type, with wdk for example specifying it explicitly with the actual Language type. But that's probably a bigger refactoring again.
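One way the generic variant could look, as a sketch only. `Labels`, `WikidataLanguageCode` and the sample values are hypothetical names for illustration and do not describe an existing or planned API.

```typescript
// Generic over the language code, defaulting to plain string for any Wikibase instance.
type Labels<L extends string = string> = Partial<Record<L, string>>

// Hypothetical narrowed union, as a Wikidata-specific consumer could supply it.
type WikidataLanguageCode = 'de' | 'de-ch' | 'en'

// Default: any code is accepted.
const genericLabels: Labels = { 'any-code': 'x' }

// Narrowed: only the listed codes are accepted as keys.
const strictLabels: Labels<WikidataLanguageCode> = { 'de-ch': 'Beispiel' }

// An object literal with an unlisted key would be rejected at compile time:
// const bad: Labels<WikidataLanguageCode> = { xx: 'nope' }
```

The default keeps the library generic, while a Wikidata-specific preset can opt into the stricter list.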

Any thoughts on this? (Maybe from @maxlath or @mshd?)

@EdJoPaTo
Contributor Author

wikidata-lang provides way more language information. Not sure if wikibase-sdk should contain the basic Wikidata information too?

Conflicts:
	package.json
	tests/sitelinks_helpers.ts
@maxlath
Owner

maxlath commented Jun 10, 2023

Hm… maybe the type Language should be named LanguageCode, as it represents a short language code rather than a language.

+1 for LanguageCode

@maxlath
Owner

maxlath commented Jun 10, 2023

Having a hard-coded language shortlist was a nice-to-have from the time this lib was Wikidata-specific, but from what I can gather it doesn't provide that much value. Let's try type LanguageCode = string?

@EdJoPaTo
Contributor Author

superseded by #120

@EdJoPaTo EdJoPaTo closed this Jul 18, 2023
@EdJoPaTo EdJoPaTo deleted the language branch July 31, 2023 20:57
Development

Successfully merging this pull request may close these issues.

Entity Language Type is incorrect
2 participants