Handling private subtags #15

Open
koenedaele opened this issue Aug 29, 2016 · 3 comments

@koenedaele
Member

Currently we don't support private-use subtags. E.g. x-flemish is a valid tag that we don't allow (although it's not a particularly good tag and should rather be nl-BE).
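
For reference, a minimal sketch of what accepting a private-use-only tag could look like, based on the `privateuse` production in RFC 5646 (`"x" 1*("-" (1*8alphanum))`). The function name `is_private_use_tag` is hypothetical and not part of this library's API.

```python
import re

# RFC 5646: privateuse = "x" 1*("-" (1*8alphanum))
# Tags are case-insensitive, hence re.IGNORECASE.
PRIVATE_USE = re.compile(r"x(-[a-z0-9]{1,8})+", re.IGNORECASE)


def is_private_use_tag(tag: str) -> bool:
    """Return True if the whole tag is a well-formed private-use tag.

    Hypothetical helper: it only checks syntax, it says nothing about
    whether the tag is a sensible choice.
    """
    return PRIVATE_USE.fullmatch(tag) is not None
```

With this check, x-flemish is accepted as well-formed, while nl-BE is simply not a private-use tag and would go through the normal registry lookup.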

This doesn't seem to be supported by the JS implementation either.

@amcgregor

I might suggest expanding this to not just mention "private subtags", but "any subtags".

In [1]: tags.tag('en-CA-u-tz-cator-cu-CAD').subtags                                       
Out[1]: 
[{"subtag": "en", "record": {"Type": "language", "Subtag": "en", "Description": ["English"], "Added": "2005-10-16", "Suppress-Script": "Latn"}, "type": "language"},
 {"subtag": "ca", "record": {"Type": "region", "Subtag": "CA", "Description": ["Canada"], "Added": "2005-10-16"}, "type": "region"}]

Where my Unicode extension timezone and currency at? (This is to standardize all localization concerns for a user account or active session into a single textual serialization, including preferences like "what day does the week start on", "are negative numbers prefixed by a - or wrapped in parentheses?", or "I'd like to use the Jewish lunar calendar, please", etc.)
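
To make the expectation concrete, here is a rough sketch of the kind of splitting being asked for. It is not how this library or the JS original works; `split_extensions` and `unicode_keywords` are hypothetical names, and the logic assumes a syntactically well-formed tag: any one-character subtag after the first subtag starts an extension, `x` starts the private-use part, and inside the `u` extension two-character subtags are keys followed by their type subtags.

```python
def split_extensions(tag):
    """Split a tag into (base subtags, {singleton: [subtags]}, private-use subtags).

    Hypothetical helper; performs no registry validation.
    """
    parts = tag.lower().split("-")
    base, extensions, private = [], {}, []
    current = None
    for i, part in enumerate(parts):
        if part == "x":
            # Everything after "x" is private use.
            private = parts[i + 1:]
            break
        if len(part) == 1 and base:
            # A singleton begins a new extension sequence.
            current = extensions.setdefault(part, [])
        elif current is not None:
            current.append(part)
        else:
            base.append(part)
    return base, extensions, private


def unicode_keywords(u_subtags):
    """Group the subtags of a "u" extension into (attributes, {key: [type subtags]})."""
    attributes, keywords = [], {}
    key = None
    for sub in u_subtags:
        if len(sub) == 2:
            # Keys are exactly two characters long.
            key = sub
            keywords[key] = []
        elif key is None:
            # Attributes come before the first key.
            attributes.append(sub)
        else:
            keywords[key].append(sub)
    return attributes, keywords


base, extensions, private = split_extensions("en-CA-u-tz-cator-cu-CAD")
attributes, keywords = unicode_keywords(extensions.get("u", []))
```

For the tag above this leaves `en` and `ca` in `base` and groups the rest as `tz` with `cator` and `cu` with `cad`, which is the information missing from the output shown.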

@koenedaele
Member Author

I assume this isn't supported by the JS lib we ported either (https://github.com/mattcg/language-tags)?

I know there's another Python lib (https://github.com/LuminosoInsight/langcodes) that deals with language codes. Based on a quick read of their docs, they handle extensions better.

We mainly use this library to validate language codes in RDF language-encoded strings, for which it has worked OK. But if the other library handles all the edge cases we don't, we might just retire this one and switch to the other library. Have you tried that one?

@amcgregor

amcgregor commented Oct 17, 2019

After evaluating about a dozen packages, some of which essentially implement nothing (yes, that's an entire package just for one dictionary), I'm resigned to the fact I'm going to have to write a proper BCP47 + CLDR implementation myself. Edited to add: yup, examined langcodes, it fails, too. The entire "Unicode extension" (everything from u- on) is treated as a single monolithic unknown subtag.
