Glottocode errors in a few lexicore datasets #35

Closed
XachaB opened this issue Nov 6, 2021 · 7 comments


@XachaB

XachaB commented Nov 6, 2021

Hi,

I spotted a few Glottocodes that seem to be unknown to Glottolog, listed below. I am opening a single issue here rather than eight separate issues, one on each dataset's bug tracker.

When it was clear which Glottocode was correct (exact same variety name and a similar Glottocode, which makes it look like a typo), I provide the correct code too:

| dataset | given_glottocode | correct_glottocode | language_name |
| --- | --- | --- | --- |
| dunnaslian | sema1250 | semn1250 | Semnam Malau |
| dunnaslian | teim1246 | temi1246 | Temiar Perak |
| dunnaslian | monn1258 | monn1253 | Mon |
| kesslersignificance | nucl1201 | nucl1301 | Turkish |
| polyglottaafricana | maka1261 | makh1261 | Makhuwa-Meetto |
| saenkoromance | vall1248 | | Vallader_Romansh |
| saenkoromance | cagl1238 | | Campidanese |
| servamalagasy | meri1291 | meri1243 | Merina |
| sidwellbahnaric | kass1248 | | Kasseng |
| transnewguineaorg | cent2257 | | Proto-Central-Sogeram |
| zgraggenmadang | sali1249 | saki1249 | maia-saki |
| zgraggenmadang | para1207 | para1307 | parawen |

I'll open PRs for the cases where I found a correction.

Shouldn't cldfbench have checked this automatically and flagged the codes as incorrect? If not, that would probably be a useful check to add.
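
To illustrate the kind of check meant here, below is a minimal sketch, not cldfbench's actual behaviour or API. It assumes the set of valid Glottocodes has already been loaded from a Glottolog release into `known_codes`, and that Glottocodes follow the shape of four lowercase alphanumeric characters followed by four digits. The function name `check_glottocodes`, the `(dataset, glottocode, language_name)` row layout, and the similarity cutoff are all hypothetical choices made for the example.

```python
import re
from difflib import get_close_matches

# Assumed Glottocode shape: 4 lowercase alphanumerics + 4 digits.
GLOTTOCODE_PATTERN = re.compile(r"[a-z0-9]{4}[0-9]{4}")


def check_glottocodes(rows, known_codes, cutoff=0.8):
    """Report Glottocodes that are not in `known_codes`.

    rows: iterable of (dataset, glottocode, language_name) tuples
          (hypothetical layout, e.g. read from a languages file).
    known_codes: collection of valid Glottocodes, assumed to come
          from a Glottolog release.
    Returns one dict per unknown code, with near-miss suggestions
    that may point to a typo.
    """
    candidates = sorted(known_codes)  # sorted for deterministic output
    problems = []
    for dataset, code, name in rows:
        if code in known_codes:
            continue
        problems.append({
            "dataset": dataset,
            "glottocode": code,
            "language_name": name,
            # False means the code is malformed, not just unknown.
            "well_formed": GLOTTOCODE_PATTERN.fullmatch(code) is not None,
            # Similar known codes, useful for spotting typos.
            "suggestions": get_close_matches(code, candidates, n=3, cutoff=cutoff),
        })
    return problems


if __name__ == "__main__":
    # Tiny stand-in for the full list of valid Glottocodes.
    known = {"semn1250", "temi1246", "nucl1301"}
    rows = [
        ("dunnaslian", "sema1250", "Semnam Malau"),
        ("dunnaslian", "temi1246", "Temiar Perak"),
    ]
    for problem in check_glottocodes(rows, known):
        print(problem)
```

A suggestion is only a hint: the variety name still has to be compared by hand before a code is replaced, as was done for the table above.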

XachaB added a commit to XachaB/dunnaslian that referenced this issue Nov 6, 2021
Suggested corrections for incorrect glottocodes. See lexibank/lexibank-analysed#35
XachaB added a commit to XachaB/kesslersignificance that referenced this issue Nov 6, 2021
XachaB added a commit to XachaB/polyglottaafricana that referenced this issue Nov 6, 2021
XachaB added a commit to XachaB/servamalagasy-1 that referenced this issue Nov 6, 2021
XachaB added a commit to XachaB/zgraggenmadang that referenced this issue Nov 6, 2021
@XachaB
Author

XachaB commented Nov 6, 2021

Note: I am only making changes to the languages files, so that I can do this quickly in the web interface. If the PRs are accepted, the CLDF datasets need to be regenerated by someone who has the whole cldfbench environment ready :)

@SimonGreenhill

Thanks @XachaB -- I've handled dunnaslian, zgraggenmadang, servamalagasy.

@LinguList
Contributor

Thanks. Of course, all datasets will need to be re-released later.

@MuffinLinwist

Thanks, @XachaB, for pointing this out. I'm checking all of the cases listed here and preparing a PR for the new release.

@MuffinLinwist

I have already addressed these wrong Glottocodes and am preparing a PR with the fixes. The only one pending is the Glottocode for Proto-Central-Sogeram, which needs a different fix because the lexibank script downloads the language metadata from a URL. I opened an issue about it on the repo, which can be accessed here.

I'll reference this issue once the PR is open and close it once the PR is merged.

@MuffinLinwist

All of these datasets are fixed now. @chrzyki, perhaps we can close this issue if everything looks fine. I can then go ahead and proceed with #47.

@chrzyki
Contributor

chrzyki commented Mar 1, 2024

Awesome, thanks for taking care of this @MuffinLinwist. Double-checked the repos again and everything looks good!

chrzyki closed this as completed Mar 1, 2024

5 participants