Glottocode errors in a few lexicore datasets #35

XachaB · 2021-11-06T18:19:50Z

Hi,

I spotted a few Glottocodes that seem to be unknown to Glottolog, listed below. I am opening a single issue here rather than eight separate issues, one on each dataset's bug tracker.

Where it was clear which Glottocode was correct (exact same variety name, and a similar Glottocode that makes the given one look like a typo), I provide the correct code too:

| dataset | given_glottocode | correct_glottocode | language_name |
| --- | --- | --- | --- |
| dunnaslian | sema1250 | semn1250 | Semnam Malau |
| dunnaslian | teim1246 | temi1246 | Temiar Perak |
| dunnaslian | monn1258 | monn1253 | Mon |
| kesslersignificance | nucl1201 | nucl1301 | Turkish |
| polyglottaafricana | maka1261 | makh1261 | Makhuwa-Meetto |
| saenkoromance | vall1248 | | Vallader_Romansh |
| saenkoromance | cagl1238 | | Campidanese |
| servamalagasy | meri1291 | meri1243 | Merina |
| ~~sidwellbahnaric~~ | ~~kass1248~~ | | ~~Kasseng~~ |
| transnewguineaorg | cent2257 | | Proto-Central-Sogeram |
| ~~zgraggenmadang~~ | ~~sali1249~~ | ~~saki1249~~ | ~~maia-saki~~ |
| ~~zgraggenmadang~~ | ~~para1207~~ | ~~para1307~~ | ~~parawen~~ |

I'll open PRs for the cases where I found a correction.

Shouldn't cldfbench have checked this automatically and flagged the codes as incorrect? If not, that would probably be a useful check to add.
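To illustrate the kind of check meant here, a minimal sketch in plain Python follows. It is not cldfbench's or pyglottolog's API: the function name `check_glottocodes` is hypothetical, and the `known` set is a tiny stand-in for the full list of valid Glottocodes, filled with corrected codes from the table above. The sketch assumes the usual Glottocode shape (four alphanumeric characters followed by four digits) and uses `difflib` to suggest a likely intended code when the given one looks like a typo.

```python
import difflib
import re

# Assumed Glottocode shape: four alphanumeric characters, then four digits.
GLOTTOCODE_PATTERN = re.compile(r"^[a-z0-9]{4}[0-9]{4}$")


def check_glottocodes(languages, known_codes):
    """Return (name, code, suggestion) for every code not in known_codes.

    `languages` is an iterable of (language_name, glottocode) pairs.
    `suggestion` is the closest known code, or None if nothing is close.
    """
    problems = []
    candidates = sorted(known_codes)
    for name, code in languages:
        if code in known_codes:
            continue  # valid code, nothing to report
        suggestion = None
        if GLOTTOCODE_PATTERN.match(code):
            # Only well-formed codes are worth matching against known ones.
            close = difflib.get_close_matches(code, candidates, n=1, cutoff=0.8)
            suggestion = close[0] if close else None
        problems.append((name, code, suggestion))
    return problems


# Hypothetical stand-in for the full Glottolog list (corrected codes from the table).
known = {"semn1250", "temi1246", "nucl1301"}
languages = [
    ("Semnam Malau", "sema1250"),
    ("Temiar Perak", "teim1246"),
    ("Turkish", "nucl1301"),
]

for name, code, suggestion in check_glottocodes(languages, known):
    hint = f" (did you mean {suggestion}?)" if suggestion else ""
    print(f"{name}: unknown Glottocode {code}{hint}")
```

In a real setup the set of known codes would come from a Glottolog release rather than being hard-coded, and codes with no close match (such as the two saenkoromance entries) would still need a manual lookup.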

XachaB · 2021-11-06T18:30:45Z

Note: I am only changing the languages files, so that I can do this quickly in the web interface. If the PRs are accepted, the CLDF datasets will need to be regenerated by someone who has the whole cldfbench environment ready :)

SimonGreenhill · 2021-11-07T09:52:36Z

Thanks @XachaB -- I've handled dunnaslian, zgraggenmadang, servamalagasy.

LinguList · 2021-11-07T09:57:02Z

Thanks. Of course, all datasets will need to be re-released later.

MuffinLinwist · 2024-02-20T15:34:52Z

Thanks, @XachaB, for pointing this out. I'm checking all of the cases listed here and preparing a PR for the new release.

MuffinLinwist · 2024-02-20T17:41:15Z

I have already addressed these wrong Glottocodes and am preparing a PR with the fixes. The only one still pending is the Glottocode for Proto-Central-Sogeram, which needs a different fix because the lexibank script downloads the language metadata from a URL. I opened an issue about it on the repo; it can be accessed here.

I'll reference this issue once the PR is open and close it once the PR is merged.

MuffinLinwist · 2024-03-01T14:54:47Z

All of these datasets are fixed now. @chrzyki, perhaps we can close this issue if everything looks fine. I can then go ahead and proceed with #47.

chrzyki · 2024-03-01T15:44:52Z

Awesome, thanks for taking care of this @MuffinLinwist. Double-checked the repos again and everything looks good!

XachaB added a commit to XachaB/dunnaslian that referenced this issue Nov 6, 2021

Corrects incorrect glottocodes

cbb0c48

Suggested corrections for incorrect glottocodes. See lexibank/lexibank-analysed#35

XachaB mentioned this issue Nov 6, 2021

Corrects incorrect glottocodes lexibank/dunnaslian#5

Merged

XachaB added a commit to XachaB/kesslersignificance that referenced this issue Nov 6, 2021

Correct incorrect glottocode for Turkish

d32ddd4

see lexibank/lexibank-analysed#35

XachaB mentioned this issue Nov 6, 2021

Correct incorrect glottocode for Turkish SequenceComparison/kesslersignificance#1

Merged

XachaB added a commit to XachaB/polyglottaafricana that referenced this issue Nov 6, 2021

Correcting glottocode for Makhuwa-Meetto

70fa420

see lexibank/lexibank-analysed#35

XachaB mentioned this issue Nov 6, 2021

Correcting glottocode for Makhuwa-Meetto lexibank/polyglottaafricana#9

Merged

XachaB added a commit to XachaB/servamalagasy-1 that referenced this issue Nov 6, 2021

Change Merina glottocode from meri1291 to meri1243

5327064

see lexibank/lexibank-analysed#35

XachaB mentioned this issue Nov 6, 2021

Change Merina glottocode from meri1291 to meri1243 digling/servamalagasy#6

Merged

XachaB added a commit to XachaB/zgraggenmadang that referenced this issue Nov 6, 2021

Change Parawen glottocode para1207 -> para1307

99629f0

see lexibank/lexibank-analysed#35

XachaB mentioned this issue Nov 6, 2021

Change Parawen glottocode para1207 -> para1307 lexibank/zgraggenmadang#8

Merged

XachaB mentioned this issue Nov 12, 2021

Sound correspondences: using Lexstat lexibank/lexitools#19

Open

FredericBlum mentioned this issue Feb 19, 2024

Wrong Language Families or Glottocodes in CLDF Datasets #47

Closed

MuffinLinwist mentioned this issue Feb 20, 2024

Wrong Glottocode for Proto-Central-Sogeram lexibank/transnewguineaorg#23

Closed

This was referenced Feb 26, 2024

Fixing Glottocode lexibank/saenkoromance#1

Closed

Fixing glottocode lexibank/sidwellbahnaric#1

Closed

Fixing glottocode lexibank/zgraggenmadang#9

Closed

chrzyki closed this as completed Mar 1, 2024
