proposal: Transliterated name forms #271

Merged
merged 3 commits into from
Jul 10, 2014

Conversation

jdsumsion
Contributor

Here is what I learned about transliteration of name forms, and a proposal for how to deal with them in the lang attribute.

@rbarrynay

When we don't know the language of a string, Standards (in StdLocale) has switched from using "i-default" to using "und" as the lang tag, which the BCP47 spec defines as meaning undetermined. The 'i-xxxx' tags are not guaranteed to be carried forward in future versions of BCP47.

I totally favor tacking on the script type to the original language after transliterating a string. That is definitely in the spirit of BCP47.
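
A minimal sketch of what "tacking on the script type" could look like, assuming simple well-formed tags (no private-use or grandfathered forms). The helper name `with_script` and the sample tags are hypothetical and are not taken from the proposal or from any FamilySearch code:

```python
def with_script(lang_tag: str, script: str) -> str:
    """Return lang_tag with its script subtag set to `script`.

    Hypothetical helper: handles only language[-extlang][-script][-rest] tags.
    """
    subtags = lang_tag.split("-")
    language, rest = subtags[0].lower(), subtags[1:]

    # Keep any extended language subtags (three letters) next to the language.
    extlangs = []
    while rest and len(rest[0]) == 3 and rest[0].isalpha():
        extlangs.append(rest.pop(0).lower())

    # Drop an existing script subtag (four letters); it is being replaced.
    if rest and len(rest[0]) == 4 and rest[0].isalpha():
        rest.pop(0)

    # BCP47 orders subtags as language, extlang, script, then region and others.
    return "-".join([language, *extlangs, script.title(), *rest])


# A name originally tagged "ja", after romanization:
print(with_script("ja", "Latn"))
# An existing script subtag is replaced rather than duplicated:
print(with_script("sr-Cyrl-RS", "Latn"))
```

The script subtag goes directly after the language (and any extlang) rather than at the end of the tag, because BCP47 fixes the subtag order.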

@jdsumsion
Contributor Author

@rbarrynay, does your comment imply that FamilySearch's platform api should now be returning "und" for name langtags when the script is unspecified?

I only saw one reference to "und" in the BCP47 spec, in section 4.1.5, and its use was discouraged there. Instead, the spec recommended using no lang tag at all. I know that "i-default" is in the grandfathered/irregular category, but its use seems to be more in line with "use this if you need to supply a langtag but it has not been specified".

Anyway, just looking for guidance and convergence on what is the unspecified langtag for gedcomx.

@stoicflame
Member

The proposed modifications look good to me, but I'm looking forward to hearing back from @rbarrynay regarding the use of "und".

@rbarrynay

Standards had decided not to use BCP47 tags (defined in http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry) that are grandfathered.

The special-scope language tags mis (uncoded languages), mul (multiple languages), und (undetermined), and zxx (no linguistic content) have superseded the use of i-default.

Concerning the use of the und tag, BCP47 section 4.1.5 states:

"This subtag SHOULD NOT be used unless a language tag is required and language information is not available or cannot be determined. Omitting the language tag (where permitted) is preferred."

Standards interprets that to mean that if the client knows neither the language nor the script, a simple empty string will do as the locale string. However, the locale string parser requires a language if any other information (like region or script) needs to be expressed. So if one doesn't know the language but DOES know the script, one would use the string "und-Latn".
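
The rule described here can be sketched roughly as follows. This is only an illustration of the interpretation above, not the actual Standards or StdLocale implementation; the function name `locale_string` and its parameters are assumptions made for the example:

```python
from typing import Optional


def locale_string(language: Optional[str] = None,
                  script: Optional[str] = None) -> str:
    """Build a locale string from whatever is known (hypothetical helper)."""
    if not language and not script:
        # Nothing is known: omit the tag, as BCP47 section 4.1.5 prefers.
        return ""
    if not script:
        # Only the language is known.
        return language
    # A script needs a language in front of it, so fall back to "und"
    # when the language itself is undetermined.
    return f"{language or 'und'}-{script}"


print(repr(locale_string()))               # neither language nor script known
print(repr(locale_string(script="Latn")))  # script known, language unknown
print(repr(locale_string("ja", "Latn")))   # both known
```

Under this sketch "und" appears only when something else has to follow it. A caller that knows nothing gets the empty string rather than a bare "und".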

@jdsumsion
Contributor Author

I see, that clears things up for me.

Does this imply that this is how the FamilySearch platform currently works for names?

@rbarrynay

Not sure about the FamilySearch Platform. I do know this is how Standards is supposed to be working, and Platform is calling through to Standards.

@jdsumsion changed the title from "Transliterated name forms" to "proposal: Transliterated name forms" on Jun 24, 2014
@stoicflame added a commit that referenced this pull request on Jul 10, 2014
@stoicflame merged commit 7c74ebc into FamilySearch:master on Jul 10, 2014
@stoicflame
Member

Closing this out; it has sat long enough.


3 participants