Fails on 'es-419' #10
I've tested the tag 'es-419' and the library recognizes it.
from language_tags import tags
tag = tags.tag('es-419')
print(tag.valid)
> True
print(tag)
> es-419
print(tag.descriptions)
> [u'Latin American Spanish']
print(tag.data)
> {'record': {u'Added': u'2005-07-15', u'Tag': u'es-419', u'Type': u'redundant', u'Description': [u'Latin American Spanish']}, 'tag': 'es-419'}
Which operating system are you using? |
Thanks for the quick reply. I'm using this on both OS X and Ubuntu. So I think this is my mistake, but I am still a bit confused. The 'es-419' tag behaves very differently from the 'es-es' tag (for example), which is why I thought it was failing (see below). If you could shed any light that would be great. In the meantime I'll go and take a longer look at the docs for the JS version. Thanks again.
from language_tags import tags
tag = tags.tag('es-419')
tag.type
> u'redundant'
tag.descriptions
> [u'Latin American Spanish']
tag.data
> {'record': {u'Added': u'2005-07-15', u'Tag': u'es-419', u'Type': u'redundant', u'Description': [u'Latin American Spanish']}, 'tag': 'es-419'}
tag.language.description
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> AttributeError: 'NoneType' object has no attribute 'description'
tag = tags.tag('es-es')
tag.type
> 'tag'
tag.descriptions
> []
tag.data
> {'tag': 'es-es'}
tag.language.description
> [u'Spanish', u'Castilian'] |
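Until the library changes, the AttributeError above can be avoided by checking for a missing language before reading its description. The helper below is only a sketch: `language_descriptions` is a hypothetical name, not part of language_tags, and the two stand-in objects merely mimic the shapes shown in the session above (`tag.language` is None for 'es-419' and an object with a `description` list for 'es-es').

```python
from types import SimpleNamespace


def language_descriptions(tag):
    """Return the language subtag's descriptions, or [] if no language was parsed."""
    language = getattr(tag, "language", None)
    if language is None:
        # This is the 'es-419' case above: the tag is valid but has no subtags.
        return []
    return list(language.description)


# Stand-ins for the two tag objects in the session above (not library objects).
redundant_like = SimpleNamespace(language=None)
regular_like = SimpleNamespace(
    language=SimpleNamespace(description=["Spanish", "Castilian"])
)

print(language_descriptions(redundant_like))
print(language_descriptions(regular_like))
```

With a real tag object, the same call would be `language_descriptions(tags.tag('es-419'))`.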
Thanks for posting the code. I think it behaves differently because it's a redundant tag. But I do agree that it's not the cleanest solution at the moment. We'll have a further look to see how we can improve handling this case. |
Yeah, so I've done some more reading to make sure I understand the topic (http://www.w3.org/International/articles/language-tags/). It feels like the sub-tags should still be available, even if the type is set to redundant as some places are still using those tags. The example I have for es-419 is Google are using it in the rel-alternate-hreflang annotations in the source of pages on the Google Play store (e.g. https://play.google.com/store/apps/details?id=com.babbel.mobile.android.en&hl=en). |
@cahytinne How does this work in the original JS version? Are they also ignoring the subtags if a tag is redundant? |
@koenedaele @cahytinne I just checked this, and it seems they behave the same way as the Python version. I'm not sure it makes sense though, as it essentially prevents someone from parsing tags in the wild.
> tag = tags('es-es')
{ data: { tag: 'es-es' } }
> tag.subtags()
[ { data:
{ subtag: 'es',
record: [Object],
type: 'language' } },
{ data:
{ subtag: 'es',
record: [Object],
type: 'region' } } ]
>
>
> tag = tags('es-419')
{ data:
{ tag: 'es-419',
record:
{ Type: 'redundant',
Tag: 'es-419',
Description: [Object],
Added: '2005-07-15' } } }
> tag.subtags()
[] |
We did more or less port the JS version to Python as directly as we could, so it makes sense they're the same. Can anyone see a good reason why the subtags aren't being generated with redundant (and possibly other types of) tags? Might be interesting to involve the author of the original JS library as well. |
From RFC 5646:
If I read this correctly, it means that a redundant tag used to be a separate tag, but now it's just a regular tag that can be composed with its relevant subtags. Which would indicate that |
From looking at the data with @cahytinne it seems that some redundant tags do have a preferred value (eg. So, if a redundant tag has a preferred value, use that. If not, allow splitting of the tag into subtags? The grandfathered tags seem to be tags that are mostly invalid (eg. |
I feel like it should be possible to split into subtags even if they are redundant. Preventing splitting them may make sense for writing scenarios, but if I am trying to read tags (as per the Google Play store) not being able to parse the tag makes the library useless. |
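To make the "split even if redundant" idea concrete, here is a minimal sketch that splits a tag by position and shape only, without looking at the registry. It is not how either library is implemented: `split_subtags` is a hypothetical name, and the patterns are a simplified reading of the RFC 5646 grammar that covers only language, script and region (no extlang, variants, extensions or private use).

```python
import re

# Simplified subtag shapes from the RFC 5646 'langtag' grammar.
LANGUAGE = re.compile(r"[a-z]{2,3}")
SCRIPT = re.compile(r"[a-z]{4}")
REGION = re.compile(r"[a-z]{2}|[0-9]{3}")


def split_subtags(tag):
    """Split a tag into (type, subtag) pairs, ignoring its registry type."""
    parts = tag.lower().split("-")
    result = []
    # A well-formed tag must start with a language subtag.
    if not LANGUAGE.fullmatch(parts[0]):
        return result
    result.append(("language", parts.pop(0)))
    # An optional script comes next, followed by an optional region.
    if parts and SCRIPT.fullmatch(parts[0]):
        result.append(("script", parts.pop(0)))
    if parts and REGION.fullmatch(parts[0]):
        result.append(("region", parts.pop(0)))
    return result


print(split_subtags("es-419"))
print(split_subtags("es-es"))
```

Under these assumptions 'es-419' splits into a language and a region just like 'es-es' does, while a tag such as 'i-klingon' yields no subtags because 'i' is not a language-shaped subtag.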
Hey, interesting discussion! Is this the consensus:
If so, I don't think it would be too difficult for me to implement in the JS version. |
I think that makes sense, then you have the best of both worlds for both reading and writing. |
Seems like the most sensible option to me. So, looks like we all agree. Cool. I think we can implement this change later this week. |
Change looks ok to me. @TomAnthony or @mattcg Do either of you two have any comments? If not, I'll merge that branch with master. |
It looks good to me, I think. :) Awesome work guys, thanks a lot for the quick turnaround. |
You're welcome. I'm going to wait to hear from @mattcg and then I'll merge. |
Thanks, @cahytinne. The only thing I'm not so sure about is subtags() returning the subtags from the preferred value if it has one. Maybe the user actually does want the valid subtags from the redundant tag and not the preferred value subtags. Also, I think with this change calling format() after subtags() and then format() again will yield different results, as subtags() is changing the underlying data. |
I'd just eliminate the check for the preferred value and go with the logic from comment #10 (comment). |
Yeah, that is a good point. Automatically pulling the preferred value is probably a bad idea for parsing old tags. |
Ok, I removed the check for the preferred value. It is indeed an improvement. If the subtags of the preferred value are needed, then another tag can be created with the value of the preferred value of the redundant/grandfathered tag. |
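A rough sketch of that opt-in pattern, kept independent of the library: the caller decides whether to follow the preferred value, and the tag being read is never swapped silently. Everything here is an assumption for illustration. `REGISTRY` is a hand-copied excerpt of three registry records, and `preferred_value` and `tag_for_subtags` are hypothetical helpers, not functions of language_tags.

```python
# Hypothetical excerpt of registry records, keyed by lowercased tag.
REGISTRY = {
    "es-419": {"Type": "redundant", "Description": ["Latin American Spanish"]},
    "zh-cmn": {"Type": "redundant", "Preferred-Value": "cmn"},
    "i-klingon": {"Type": "grandfathered", "Preferred-Value": "tlh"},
}


def preferred_value(tag):
    """Return the Preferred-Value of a registered tag, or None if it has none."""
    record = REGISTRY.get(tag.lower())
    if record is None:
        return None
    return record.get("Preferred-Value")


def tag_for_subtags(tag, follow_preferred=False):
    """Pick which tag to split: the one given, or (only on request) its preferred value."""
    if follow_preferred:
        return preferred_value(tag) or tag
    return tag


# Reading a tag found in the wild keeps it as written.
print(tag_for_subtags("zh-cmn"))
# Writing or normalising can explicitly ask for the preferred value instead.
print(tag_for_subtags("zh-cmn", follow_preferred=True))
```

With the library itself, the second step would be a fresh `tags.tag(...)` call on the preferred value rather than on the original tag.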
If we go with this way, can we add an example of this usage to the docs as well? |
The library fails to recognise 'es-419' as a value, but it is an official value for IETF language tags. I don't entirely understand the difference between IETF and IANA, but feel like this value should be recognised?