Skip to content

thunderpoot/isogloss

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Python

🗺️ isogloss

ISO 639 and IETF Language Code Lookup Tool

isogloss is a Python-based command-line tool designed for looking up language details based on ISO 639 codes and IETF (BCP-47) language tags. It provides comprehensive information about languages, including their names, native names, and additional details associated with each code or tag.

Elsewhere, the word isogloss means a boundary line on a map denoting the regional use of a particular linguistic characteristic, but in this case it just seemed to fit.

Features

  • Lookup language details using ISO 639-1, 639-2/B, 639-2/T, or 639-3 codes.
  • Lookup language details by language name.
  • Lookup language details using IETF BCP-47 language tags (e.g., en-GB, en-US, sv-SE).

Installation

Clone the repository to your local machine:

git clone https://github.com/thunderpoot/isogloss.git

Usage

The script can be run directly from the command line. Below are some examples of how to use it:

To look up information by ISO 639 code:

$ isogloss -c swe
{
  "639-1": "sv",
  "Scope": "Individual",
  "Type": "Living",
  "Native name(s)": "svenska",
  "Other name(s)": "",
  "639-2/T": "swe",
  "639-2/B": "",
  "639-5": "",
  "639-3": "swe",
  "Name(s)": "Swedish"
}

To look up information by language name:

$ isogloss -n "egyptian arabic"
{
    "Egyptian Arabic": "arz"
}

Example of lookup via native name:

$ isogloss -n 日本語
{
    "\u65e5\u672c\u8a9e Nihongo": "jpn"
}

Example of multiple results being found:

$ isogloss -n norwegian
{
    "Norwegian Nynorsk": "nno",
    "Nynorsk, Norwegian": "nno",
    "Bokm\u00e5l, Norwegian": "nob",
    "Norwegian Bokm\u00e5l": "nob",
    "Norwegian": "nor",
    "Norwegian Sign Language": "nsl",
    "Traveller Norwegian": "rmg"
}

Language names are normalised, allowing for case-insensitive and accent-insensitive matching when searching:

$ isogloss -n espanol
{
    "Judeo-espa\u00f1ol": "lad",
    "espa\u00f1ol": "spa"
}

To look up information by IETF language tag:

$ isogloss -i fr-FR
{
    "Language": {
        "639-1": "fr",
        "Scope": "Individual",
        "Type": "Living",
        "Native name(s)": "fran\u00e7ais",
        "Other name(s)": "",
        "639-2/T": "fra",
        "639-2/B": "fre",
        "639-5": "",
        "639-3": "fra",
        "Name(s)": "French"
    },
    "Region": "France"
}

For help, refer to the manfile:

man isogloss

Files

  • data/consolidated_langs.json: Contains language data in JSON format used for the lookup.
  • data/region_names.json: Contains region data in JSON format used for the IETF tag lookup.

Contributing

Contributions, issues, and feature requests are welcome!

Author

Written by T E Vaughan

Standards

Language codes conform to the ISO 639 standard (introduced in RFC 4646), and country codes conform to the ISO 3166-1 alpha-2 standard (first defined in RFC 1766).

License

This project is MIT licensed.