Parse language information #3

voxpelli opened this Issue Jul 13, 2016 · 5 comments




After discussion on IRC, opening an issue here for the language parsing brainstorming that's happened on the wiki:

voxpelli commented Aug 14, 2016 edited

Adding references to related issues to make discovery easier:

w3c/Micropub#34 (comment), about a similar syntax for parsing img alt text, as discussed in #2 in this repo

BigBlueHat commented Sep 22, 2016 edited

This is worth digging into for examples and variations on the use of the lang attribute in HTML5 (etc.), and for the fallback process used to figure out the containing document's language, which may be as "far away" as the HTTP header values.
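To make the fallback order concrete, here is a minimal sketch of how a parser might resolve an element's language. It assumes the order described in the HTML specification: the nearest lang attribute on the element or an ancestor, then a single-valued meta http-equiv="content-language" pragma, then the HTTP Content-Language header. The function name and parameters are hypothetical and not part of any existing parser.

```python
from typing import Optional, Sequence


def resolve_language(
    ancestor_langs: Sequence[Optional[str]],
    pragma_language: Optional[str] = None,
    http_content_language: Optional[str] = None,
) -> Optional[str]:
    """Return the effective language for an element, or None if unknown.

    ancestor_langs holds the lang attribute values from the element itself
    outward to the root; None means the attribute is absent on that element.
    All names here are illustrative assumptions.
    """
    for lang in ancestor_langs:
        if lang is not None:
            # The nearest lang attribute wins; lang="" explicitly means "unknown".
            return lang or None
    for fallback in (pragma_language, http_content_language):
        if fallback:
            # A comma-separated list cannot serve as a single default language.
            tags = [tag.strip() for tag in fallback.split(",") if tag.strip()]
            if len(tags) == 1:
                return tags[0]
    return None
```

For instance, `resolve_language([None, "es", "en"])` picks the nearest attribute, while `resolve_language([None, None], http_content_language="en")` falls all the way back to the header value.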

Here's a clear example that shows some of the "gotchas":

Bad example: <a lang="es" title="Spanish" href="">Español</a>


Good example: <span title="Spanish"><a lang="es" href="">Español</a></span>

Hope that's helpful. It's research I was doing while discussing w3c/webmention#57
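The difference between the bad and good examples above can be checked with a small sketch that tracks which language applies to each title attribute and text node. It uses Python's standard html.parser; the class and function names are hypothetical, and the void-element list is deliberately incomplete.

```python
from html.parser import HTMLParser

# Elements without end tags; a partial list, enough for this sketch.
VOID = {"br", "img", "input", "hr", "meta", "link"}


class LangTracker(HTMLParser):
    """Record the language that applies to each title attribute and text node."""

    def __init__(self, document_lang=None):
        super().__init__()
        self.stack = [document_lang]  # effective language per open element
        self.found = []               # (kind, text, language) tuples

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # An element's own lang attribute also governs its own title attribute.
        lang = attrs.get("lang", self.stack[-1])
        if "title" in attrs:
            self.found.append(("title", attrs["title"], lang))
        if tag not in VOID:
            self.stack.append(lang)

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.found.append(("text", data.strip(), self.stack[-1]))


def languages(markup, document_lang=None):
    tracker = LangTracker(document_lang)
    tracker.feed(markup)
    tracker.close()
    return tracker.found
```

With a document language of "en", the bad example reports the title "Spanish" as "es", because the title shares an element with lang="es". The good example reports the same title as "en" and only the link text as "es".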


tantek commented Sep 22, 2016 edited


Good example:

<span title="Spanish">
<a lang="es" href="">Español</a>
</span>

Seems like it could be improved with:

Better(?) example:

<span title="Spanish" lang="en">
<a lang="es" hreflang="es" href="">Español</a>
</span>

Assuming that the document at "" is also in Spanish.

BigBlueHat commented Sep 22, 2016 edited

@tantek could you code "fence" those so the markup's viewable?

What I'm seeing in the console, though, does clarify the URL's meaning, but doesn't deal with title if that was in English. For example:

<html lang="en">
Bestest(?) example:
<a title="Not actually in Spanish"
   hreflang="jp" href=""

That covers all the cases I know of...right 😜

tantek commented Sep 22, 2016

I think I did? Took a few edits. markdown-- :P
