GlotScript Resource

Current Version = V0.1

Check history for other versions.

How to load

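# ! wget https://raw.githubusercontent.com/cisnlp/GlotScript/main/metadata/GlotScript.tsv
import pandas as pd

# na_filter=False keeps empty cells as empty strings and stops codes such as 'nan' from being read as missing values
df = pd.read_csv('GlotScript.tsv', na_filter=False, sep='\t')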
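The `na_filter=False` argument matters because pandas, by default, treats strings such as `nan` as missing values, and `nan` is also a valid ISO 639-3 code (Min Nan Chinese). The sketch below illustrates the difference on a small in-memory sample. The column names and cell values are placeholders chosen for illustration and are not the real header of GlotScript.tsv.

```python
import io

import pandas as pd

# Hypothetical two-row sample: the column names and values are placeholders,
# not the actual contents of GlotScript.tsv.
sample = "iso639\tscripts\taux\nnan\tHans\t\neng\tLatn\t\n"

# Default parsing: the code "nan" and the empty cells become NaN.
default_df = pd.read_csv(io.StringIO(sample), sep="\t")

# With na_filter=False every cell is kept as the literal string in the file.
safe_df = pd.read_csv(io.StringIO(sample), sep="\t", na_filter=False)

print(default_df["iso639"].isna().tolist())
print(safe_df["iso639"].tolist())
```

With the default settings the first language code is lost, while the `na_filter=False` frame keeps it as the string `nan` and keeps empty cells as empty strings.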

Format

  • MAIN or CORE: Given a language l identified by an ISO 639 code, we categorize a script for l as MAIN if at least two of the three metadata sources (Wikipedia, LREC_2800, and SIL) list it.

  • AUXILIARY (aux): If only one of the three metadata sources lists a script, the script is placed in the auxiliary category specific to that source: Wiki-aux for Wikipedia, LREC2800-aux for LREC_2800, and SIL-aux for SIL. SIL2-aux is used exclusively for discrepancies between ScriptSource and LangTag.
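The MAIN and aux rules above can be sketched as a small voting function. This is a minimal illustration of the rule as described here, not the code used to build the dataset: the function name `categorize`, the input layout, and the example script sets are assumptions, and the SIL2-aux case (ScriptSource versus LangTag) is not modelled.

```python
from collections import Counter

# Source labels taken from the aux category names described above.
SOURCES = ("Wiki", "LREC2800", "SIL")


def categorize(scripts_by_source):
    """Label each script as "MAIN" or "<source>-aux" for one language.

    scripts_by_source maps a source label to the set of script codes
    that source lists for the language (hypothetical input layout).
    """
    # Count how many sources list each script.
    counts = Counter(
        script
        for source in SOURCES
        for script in scripts_by_source.get(source, set())
    )
    labels = {}
    for source in SOURCES:
        for script in scripts_by_source.get(source, set()):
            if counts[script] >= 2:
                # Supported by at least two of the three sources.
                labels[script] = "MAIN"
            else:
                # Listed by this source only.
                labels[script] = f"{source}-aux"
    return labels


# Made-up example: Latn is listed by all sources, the others by one each.
example = {
    "Wiki": {"Latn", "Cyrl"},
    "LREC2800": {"Latn"},
    "SIL": {"Latn", "Arab"},
}
labels = categorize(example)
print(labels)
```

In this made-up example Latn comes out as MAIN, Cyrl as Wiki-aux, and Arab as SIL-aux.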

License

This dataset is available under the CC BY-SA 4.0 license, permitting modification and redistribution.

Sources