assume /en-US/ on /docs links that might be broken #1461

@peterbe peterbe commented Oct 15, 2020

Fixes #1460
Fixes #1458

I think the new tests speak for themselves :)

// When checking it against disk, we'll have to assume a locale.
let hrefNormalized = href.split("#")[0];
if (hrefNormalized.startsWith("/docs/")) {
const thisDocumentLocale = doc.mdn_url.split("/")[1];
Contributor

I'm thinking this should always be en-US. If we're working with a non-English document that includes a link with one of these "locale-free" URLs, it seems that we should either assume the URL must at least exist in English (so we just check if it can be found with the en-US locale), or first check if it exists using the document's locale and, if that fails, also check if it exists in English. I would say, though, that if the URL doesn't include a locale, it had better always work in English, so we should just check that. What do you think about that?
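
To make the two options concrete, here is a minimal sketch of the "document locale first, then English" variant. The names `candidateUrls`, `isBrokenLink`, and the `exists` callback are hypothetical and not taken from this PR; only `DEFAULT_LOCALE` and the `href.split("#")[0]` normalization come from the diff above.

```javascript
// Hypothetical sketch, not the code in this PR.
const DEFAULT_LOCALE = "en-US";

// Return the URLs that would be looked up on disk for a given href.
// Locale-free "/docs/..." links get a locale prefix; other hrefs pass through.
function candidateUrls(href, documentLocale, { fallbackToDefault = true } = {}) {
  const hrefNormalized = href.split("#")[0];
  if (!hrefNormalized.startsWith("/docs/")) {
    return [hrefNormalized];
  }
  const locales = [documentLocale];
  if (fallbackToDefault && documentLocale !== DEFAULT_LOCALE) {
    locales.push(DEFAULT_LOCALE);
  }
  return locales.map((locale) => `/${locale}${hrefNormalized}`);
}

// A link counts as broken only if none of the candidates exist.
// `exists` is an assumed callback standing in for the on-disk lookup.
function isBrokenLink(href, documentLocale, exists) {
  return !candidateUrls(href, documentLocale).some(exists);
}
```

Calling `candidateUrls(href, DEFAULT_LOCALE)` gives the "always check English" variant, since only one candidate is produced in that case.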

Contributor Author

For active English documents, this value will always be en-US (i.e., the same as DEFAULT_LOCALE).
For translated or plain archived documents, we don't bother with flaw detection at all.
Some day, in the not-so-distant future, this text will be replicated for the top 15 locales we're going to submit to Smartling. Even then, only active English documents will ever be checked in flaw detection.
So, all that said, it makes sense to switch to DEFAULT_LOCALE instead of interrogating doc.mdn_url.split('/').

However, how should it work in the future? It's very unlikely that the machines will translate the insides of HTML attributes. They'll see the text as:

This is a paragraph text and <XXXX>here is a link</XXXX>.

So some day, that KS-HTML source text will be converted, by Smartling, to:

C'est-ci est un paragraphe et <a href="/en-US/docs/Foo">ici est un linque</a>.

And that's wrong. What, maybe, we should have done is to make all hyperlinks agnostic and fluid. If the original HTML source that Smartling has to translate is:

This is a paragraph text and <a href="../../Foo/Bar">here is a link</a>.

...or something, then it'll just work automatically in any language if the translation server only translates stuff that's not inside the tags.
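
A rough illustration of why that would work, using the standard `URL` constructor to resolve the same relative href against two different locales. The helper name `resolveRelativeHref`, the `example.org` origin, and the `/Web/API/Thing` document paths are all placeholders made up for this sketch.

```javascript
// Hypothetical sketch: resolve a relative href against a document's URL path.
// "https://example.org" is only a placeholder origin so that URL() can parse it.
function resolveRelativeHref(href, documentUrl) {
  return new URL(href, `https://example.org${documentUrl}`).pathname;
}

// The same untranslated attribute value lands in whichever locale the
// containing document lives in.
const fromEnglish = resolveRelativeHref("../../Foo/Bar", "/en-US/docs/Web/API/Thing");
const fromFrench = resolveRelativeHref("../../Foo/Bar", "/fr/docs/Web/API/Thing");

console.log(fromEnglish); // "/en-US/docs/Foo/Bar"
console.log(fromFrench); // "/fr/docs/Foo/Bar"
```

The catch is that the number of `../` segments depends on how deep the linking document is, so the href is only stable as long as the document doesn't move.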

Contributor Author

For what it's worth, I made the file attachment checker require that paths always be relative.

<!-- NOT OK -->
<img src="/en-US/docs/Foo/pic.png"/>
<!-- OK -->
<img src="./pic.png"/>
or
<img src="../shared-parent/cousin.png"/>

There's definitely some food for thought there. Not entirely sure where to start, but it's worth chewing on at night.
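
The rule in the examples above could be sketched roughly like this. `isRelativeSrc` is a hypothetical name, not the actual checker, and rejecting values that carry a URL scheme is an extra assumption of this sketch rather than something stated here.

```javascript
// Hypothetical sketch of a "must be relative" check for <img src="...">.
function isRelativeSrc(src) {
  // Root-relative paths such as "/en-US/docs/Foo/pic.png" (and
  // protocol-relative "//host/..." values) are not OK.
  if (src.startsWith("/")) return false;
  // Assumption of this sketch: anything with a scheme ("https:", "data:")
  // is not treated as a relative path either.
  if (/^[a-zA-Z][a-zA-Z0-9+.-]*:/.test(src)) return false;
  return true;
}

console.log(isRelativeSrc("/en-US/docs/Foo/pic.png")); // false
console.log(isRelativeSrc("./pic.png")); // true
console.log(isRelativeSrc("../shared-parent/cousin.png")); // true
```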

Contributor Author

Having said all of that, which is probably off-topic, I think changing from doc.mdn_url.split("/")[1] to DEFAULT_LOCALE is actually technically not better. And the current solution leaves a tiny window of opportunity if we ever flaw-check other documents that aren't the active English ones.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I often see template languages passing in arguments as another parameter.
Would those machine translations operate on the rendered HTML?
Or on the source code? In the latter case, it shouldn't matter when you pass in a variable that gets replaced / evaluated at build time.
Something like <img src="<% locale %>/docs/Foo/pic.png" alt="" /> (because we care about valid HTML, don't we ;-) )
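
As a rough sketch of what that build-time replacement could look like: `renderLocale` is a hypothetical helper, and having the placeholder expand to `/<locale>` (with the leading slash) is an assumption made here so that the result is a root-relative path.

```javascript
// Hypothetical sketch: expand a "<% locale %>" placeholder at build time.
// Assumption: the placeholder expands to "/<locale>", e.g. "/fr".
function renderLocale(source, locale) {
  return source.replace(/<%\s*locale\s*%>/g, () => `/${locale}`);
}

const source = '<img src="<% locale %>/docs/Foo/pic.png" alt="" />';
console.log(renderLocale(source, "fr"));
// <img src="/fr/docs/Foo/pic.png" alt="" />
```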

Contributor

@peterbe I think it's fine as is, since we only check for flaws in English documents. However, if we ever do check for flaws in non-English documents, my argument is that it still doesn't make sense (when we see a locale-free URL) to just check the document's locale. I think it still makes sense to only check the English locale, and this is because a locale-free URL should exist in all locales (i.e., it better not be for a URL where the slug has been translated), so we should be able to only check the English locale. All of this is most likely pointless, since when we start doing machine-translations, we'll no longer have any translated slugs for non-English documents, so using doc.mdn_url.split("/")[1] vs DEFAULT_LOCALE won't make any difference.

@escattone escattone merged commit b8304fd into mdn:master Oct 21, 2020
@peterbe peterbe deleted the 1460-assume-en-us-on-docs-links-that-might-be-broken branch October 22, 2020 00:10