mcyph/iso_tools

ISO 639 Tools

About

A set of utilities that provide information about languages, looked up by their <a href="https://en.wikipedia.org/wiki/ISO_639">ISO 639</a> codes.

Usage

The ISOTools class provides tools for the normalisation of ISO 639 codes.

There are several different standards for language codes in software.

When writing software such as LangLynx, I found myself needing to provide information about languages and dialects in a way that wasn't possible with any of these standards, so the codes used here are built from the following parts:

  • ISO code: supports normalisation
  • Script code:
  • Variant code:
  • Territory code:

I didn't add a codeset/character set part, as I think it is reasonable to expect Unicode to be used across all of my applications.
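The parts listed above could be modelled along the following lines. This is only a minimal sketch and not how ISOTools itself is implemented: the `Part` flags, the `LangCode` tuple, the `join` and `removed` helpers, and the underscore-separated lang_script_territory_variant ordering are all assumptions made for illustration.

```python
from enum import IntFlag
from typing import NamedTuple, Optional


class Part(IntFlag):
    """Hypothetical bit flags naming each part of a code."""
    LANG = 1
    SCRIPT = 2
    TERRITORY = 4
    VARIANT = 8


class LangCode(NamedTuple):
    """Hypothetical container for the four parts of a code."""
    lang: str
    script: Optional[str] = None
    territory: Optional[str] = None
    variant: Optional[str] = None


def join(code: LangCode) -> str:
    """Join the parts that are present with underscores (assumed format)."""
    return "_".join(part for part in code if part)


def removed(code: LangCode, parts: Part) -> LangCode:
    """Return a copy of `code` with every flagged part blanked out."""
    return LangCode(
        lang="" if parts & Part.LANG else code.lang,
        script=None if parts & Part.SCRIPT else code.script,
        territory=None if parts & Part.TERRITORY else code.territory,
        variant=None if parts & Part.VARIANT else code.variant,
    )
```

In this sketch, `join(removed(LangCode("ja", script="Jpan"), Part.SCRIPT))` gives `"ja"`, and flags can be combined with `|` to drop several parts at once.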

from iso_tools.ISOTools import ISOTools

print(ISOTools.verify_iso('en'))

# Remove specific properties
# (LANG and SCRIPT are property flags that must be in scope; their import is not shown here)
print(ISOTools.get_L_removed('en_Latn', [SCRIPT]))
print(ISOTools.removed('ja_Jpan', LANG | SCRIPT))

# Miscellaneous
print(ISOTools.get_lang_props('en'))
print(ISOTools.guess_omitted_info('en'))

# Locale->ISO
print(ISOTools.locale_to_iso('en-US'))
print(ISOTools.locale_to_iso('en-GB'))

# Remove implied/unnecessary information (or recover it)
print(ISOTools.pack_iso('ja_Jpan'))
print(ISOTools.remove_unneeded_info('ja_Jpan'))
print(ISOTools.unpack_iso('ja'))

# Split/join ISO code information
print(ISOTools.join(part3='en', script='Latn'))
print(ISOTools.split('ja'))
print(ISOTools.split_multiple('ja_-_en'))

# Filename escape/de-escape
print(ISOTools.filename_escape('en'))
print(ISOTools.filename_split('en'))

# URL escape/de-escape
print(ISOTools.url_join('ja', 'en'))
print(ISOTools.url_split('ja_-_en'))
print(ISOTools.url_escape('ja_Jpan'))
print(ISOTools.url_unescape('ja_Jpan'))
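To give a feel for what "packing" and "unpacking" a code might involve, here is a small standalone sketch. It does not use ISOTools and is not its implementation: the `DEFAULT_SCRIPTS` table, the `pack`, `unpack`, `pair_join` and `pair_split` helpers, and the reading of `_-_` as a separator between two codes are assumptions based only on the example strings above.

```python
from typing import Tuple

# Hypothetical table of each language's default script; a real
# implementation would need far more complete data than this.
DEFAULT_SCRIPTS = {"ja": "Jpan", "en": "Latn"}

# Separator between two codes, assumed from the 'ja_-_en' examples.
PAIR_SEPARATOR = "_-_"


def pack(code: str) -> str:
    """Drop the script when it is the assumed default for the language."""
    lang, _, script = code.partition("_")
    if script and DEFAULT_SCRIPTS.get(lang) == script:
        return lang
    return code


def unpack(code: str) -> str:
    """Add the assumed default script back when the code has no script."""
    lang, _, script = code.partition("_")
    if not script and lang in DEFAULT_SCRIPTS:
        return f"{lang}_{DEFAULT_SCRIPTS[lang]}"
    return code


def pair_join(from_code: str, to_code: str) -> str:
    """Combine two codes into a single string such as 'ja_-_en'."""
    return f"{from_code}{PAIR_SEPARATOR}{to_code}"


def pair_split(pair: str) -> Tuple[str, str]:
    """Split a joined pair back into its two codes."""
    from_code, _, to_code = pair.partition(PAIR_SEPARATOR)
    return from_code, to_code
```

The idea is that `ja_Jpan` carries no more information than `ja` when Jpan is the default script for Japanese, so the shorter form can be stored and the longer one recovered on demand.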

There is also access to the original ISO 639-3 language data, indexed by three-character codes, allowing both code conversions and lookups of language information:

from iso_tools.iso_codes.ISOCodes import ISOCodes

# Convert ISO 639-1/ISO 639-2 codes to ISO 639-3
ISOCodes.to_part3('en')

# Get information about a specific ISO code
ISOCodes.get_D_iso('eng')

# Get other names for an ISO code
ISOCodes.get_D_alternate_names('eng')

# ???
ISOCodes.get_D_iso_639('eng')

# ???
ISOCodes.get_D_lang_codes('eng')

# Look up macrolanguage relationships (in ISO 639-3, a macrolanguage such as "zho" groups individual languages such as "cmn" and "yue")
ISOCodes.get_L_macros('eng')
ISOCodes.get_L_rev_macros('eng')
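The conversions and macrolanguage lookups above can be pictured with a few dictionary lookups. The following is a standalone sketch, not the ISOCodes implementation: the tables are tiny hand-written excerpts of the ISO 639 data, and the `to_part3`, `members_of` and `macro_of` names are hypothetical.

```python
from typing import List, Optional

# Hand-written excerpts of the ISO 639 tables, for illustration only.
PART1_TO_PART3 = {"en": "eng", "ja": "jpn", "zh": "zho"}
PART2B_TO_PART3 = {"chi": "zho", "ger": "deu", "fre": "fra"}

# Macrolanguage -> a few of its individual member languages.
MACRO_MEMBERS = {"zho": ["cmn", "yue"], "ara": ["arb", "arz"]}

# Reverse index: individual language -> its macrolanguage.
MEMBER_TO_MACRO = {
    member: macro
    for macro, members in MACRO_MEMBERS.items()
    for member in members
}


def to_part3(code: str) -> str:
    """Map an ISO 639-1 or 639-2/B code to ISO 639-3.

    Codes not found in either excerpt are returned unchanged.
    """
    if code in PART1_TO_PART3:
        return PART1_TO_PART3[code]
    return PART2B_TO_PART3.get(code, code)


def members_of(macro: str) -> List[str]:
    """List the individual languages grouped under a macrolanguage."""
    return MACRO_MEMBERS.get(macro, [])


def macro_of(code: str) -> Optional[str]:
    """Return the macrolanguage a code belongs to, or None."""
    return MEMBER_TO_MACRO.get(code)
```

Here `to_part3("zh")` and `to_part3("chi")` both give `"zho"`, `members_of("zho")` gives `["cmn", "yue"]`, and `macro_of("eng")` gives `None` because English is not part of a macrolanguage.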
