highfestiva/bcp47.py

BCP47

Language tags are not your everyday ISO standard; instead they are composed of an ISO-639 language code and an ISO-3166 country/region code (and occasionally an ISO-15924 script tag for the written language).

The file is generated from Microsoft's seminal piece, [MS-LCID].pdf.

Easy installation

$ pip install bcp47

Example

>>> import bcp47

>>> 'dje' in bcp47.tags and 'es-DO' in bcp47.tags
True

>>> [v for k,v in bcp47.languages.items() if 'English' in k]
['en', 'en-AS', 'en-AI', 'en-AG', 'en-AU', 'en-AT', 'en-BS', 'en-BB', 'en-BE', 'en-BZ', 'en-BM', 'en-BW', 'en-IO', ...]
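Since the example above shows bcp47.languages mapping display names to tags, going the other way (from a tag to its name) takes a small inversion. The sketch below is only an illustration: names_by_tag and SAMPLE_LANGUAGES are hypothetical, and the exact format of the names in the real dictionary is an assumption, so a small stand-in dict is used rather than the package data.

```python
def names_by_tag(languages):
    """Invert a name -> tag mapping into tag -> list of names."""
    result = {}
    for name, tag in languages.items():
        # Several names could in principle share a tag, so collect a list.
        result.setdefault(tag, []).append(name)
    return result


# Hypothetical stand-in data; the real name strings may look different.
# In real use you would pass bcp47.languages instead.
SAMPLE_LANGUAGES = {
    'English': 'en',
    'English (Australia)': 'en-AU',
    'Faroese (Denmark)': 'fo-DK',
}

lookup = names_by_tag(SAMPLE_LANGUAGES)
print(lookup['en-AU'])
```

Building the inverted dict once and reusing it is cheaper than scanning languages.items() for every lookup.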

Discontentment

This package only lists the most common language codes. If you want a package to parse, validate and simplify full BCP47 language tags, have a look at langcodes or langtags.

The BCP47 standard runs to 84 pages and caters to highly specific tags (such as de-CH-1996 and zh-CN-a-myext-x-private), which this package currently does not. Instead, a highly pragmatic approach is used (some say overly simplified): only the 900 or so most common language codes are listed, such as fo-DK and iu-Cans-CA.
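If you receive one of the more specific tags and only need the nearest entry in a simple list like this one, one possible approach is to drop subtags from the right until something matches. This is a minimal sketch, not part of the package: closest_known_tag and the SAMPLE_TAGS set are hypothetical, and whether these particular tags appear in bcp47.tags is an assumption.

```python
def closest_known_tag(tag, known_tags):
    """Strip trailing subtags until the remainder is in known_tags.

    Returns the matching tag, or None if nothing matches.
    """
    parts = tag.split('-')
    while parts:
        candidate = '-'.join(parts)
        if candidate in known_tags:
            return candidate
        parts.pop()
        # A single-character subtag (such as 'a' or 'x') only introduces
        # what follows it, so drop it together with the subtag just removed.
        if parts and len(parts[-1]) == 1:
            parts.pop()
    return None


# Hypothetical stand-in for bcp47.tags.
SAMPLE_TAGS = {'de', 'de-CH', 'zh-CN'}

print(closest_known_tag('de-CH-1996', SAMPLE_TAGS))
print(closest_known_tag('zh-CN-a-myext-x-private', SAMPLE_TAGS))
```

This does not validate the input in any way; for that, langcodes or langtags remain the better choice.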

Microsoft's language codes are used to keep things pragmatic (KISS). Validation you will have to do yourself; see above for a trivial example.
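A do-it-yourself check could look roughly like the following: normalize the casing and separator, then test membership as in the example above. The helper names normalize_tag and is_known_tag are hypothetical, and SAMPLE_TAGS is a stand-in built from tags mentioned in this README; in real use you would pass bcp47.tags.

```python
def normalize_tag(tag):
    """Apply conventional casing: language lower, script Title, region UPPER."""
    parts = tag.strip().replace('_', '-').split('-')
    out = [parts[0].lower()]
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            out.append(part.title())   # script subtag, e.g. Cans
        elif len(part) == 2 and part.isalpha():
            out.append(part.upper())   # region subtag, e.g. CA
        else:
            out.append(part)           # anything else is left untouched
    return '-'.join(out)


def is_known_tag(tag, known_tags):
    """Return True if the normalized tag is in the given collection."""
    return normalize_tag(tag) in known_tags


# Stand-in for bcp47.tags, using tags quoted in this README.
SAMPLE_TAGS = {'dje', 'es-DO', 'fo-DK', 'iu-Cans-CA'}

print(is_known_tag('ES_do', SAMPLE_TAGS))
print(is_known_tag('de-CH-1996', SAMPLE_TAGS))
```

This only tells you whether a tag is in the list, not whether it is well-formed BCP47.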

Enjoy to the best of your ability!