Fix/i18n headers #67
* `absolute_url_regex` handles the matching of absolute urls that are not excluded or lang'd
* `relative_url_regex` will no longer regex match absolute urls
* `process_documents` will now instantiate the `relative_url_regex` and `absolute_url_regex`, and pass these into the processes responsible for `gsub!`ing the docs. Much faster.
* fixed i18n_headers: a leading whitespace keeps the default lang link from being matched (`href=" http://..."` won't match)
* cleaned up tests
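A minimal sketch of the "instantiate once, pass in" idea from the third bullet. `Doc`, `build_relative_regex`, and the simplified pattern are hypothetical stand-ins for illustration, not polyglot's actual code:

```ruby
# Hypothetical stand-in for a Jekyll document with rendered output.
Doc = Struct.new(:output)

# Simplified relative-URL matcher (an assumption, not the plugin's real regex):
# it skips hrefs that already start with a known language prefix.
def build_relative_regex(langs)
  guard = langs.map { |l| "(?!#{Regexp.quote(l)}/)" }.join
  %r{href="/(#{guard}[^"\s]*)"}
end

def process_documents(docs, langs, active_lang)
  # Built once per run and reused for every document,
  # instead of being rebuilt inside the per-document loop.
  regex = build_relative_regex(langs)
  docs.each do |doc|
    doc.output.gsub!(regex, %(href="/#{active_lang}/\\1"))
  end
end

docs = [
  Doc.new(+'<a href="/about">A</a>'),
  Doc.new(+'<a href="/es/about">B</a>')
]
process_documents(docs, %w[en es de fr], 'fr')
docs.each { |d| puts d.output }
```

The first link gains the `/fr/` prefix; the second already carries a language prefix and is left untouched.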
Absolute URL parsing will stay in
I think I came up with a better solution: the extra whitespace after the first quote avoids the regex matching, but is still valid html, and is properly rendered / linked by browsers. This fixes the

<meta http-equiv="Content-Language" content="de">
<link rel="alternate" i18n="en" href=" http://localhost:4000/about/"/>
<link rel="alternate" i18n="es" href="http://localhost:4000/es/about/"/>
<link rel="alternate" i18n="de" href="http://localhost:4000/de/about/"/>
<link rel="alternate" i18n="fr" href="http://localhost:4000/fr/about/"/>

Let me know if this works for you. Thanks!
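To illustrate why the leading space works, here is a rough sketch with a simplified matcher. The constants `SITE_URL`, `LANGS`, and `ABSOLUTE_HREF`, and the `localize` helper, are assumptions made for this example and are not the plugin's real regex or API:

```ruby
SITE_URL = 'http://localhost:4000' # assumed site url
LANGS = %w[en es de fr]            # assumed language list

# Negative lookaheads skip urls that already carry a language prefix.
LANG_GUARD = LANGS.map { |l| "(?!#{l}/)" }.join

# The pattern requires the url to start immediately after the opening quote,
# so href=" http://..." (with a leading space) never matches.
ABSOLUTE_HREF = %r{href="#{Regexp.quote(SITE_URL)}/(#{LANG_GUARD}[^"\s]*)"}

def localize(html, lang)
  html.gsub(ABSOLUTE_HREF) { %(href="#{SITE_URL}/#{lang}/#{Regexp.last_match(1)}") }
end

headers = <<~HTML
  <link rel="alternate" i18n="en" href=" http://localhost:4000/about/"/>
  <link rel="alternate" i18n="es" href="http://localhost:4000/es/about/"/>
  <a href="http://localhost:4000/about/">About</a>
HTML

puts localize(headers, 'de')
```

Only the plain `<a>` link is rewritten to `/de/about/`; the default-lang header (leading space) and the `es` header (language prefix) pass through unchanged.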
I think it is worth documenting.
By the way: have you considered removing the whitespace?
I added some documentation to the readme explaining the whitespace approach. I'm going to add further instructions to the v.1.3.0 blogpost.
I'd say we should leave it up to the user to install an html minifier. The generated html is still valid, and a jekyll minifier will certainly get the job done anyway.
If this PR looks good, mark it as reviewed and good
README.md
Outdated
@@ -90,7 +90,13 @@ becomes
Notice the link `<a href="/fr/menu/">...` directs to the french website.

Even if you are falling back to `default_lang` page, relative links built on the *french* site will
still link to *french* pages.
still link to *french* pages.v
The last v looks odd.
url_quoted = config['url']
url_quoted = Regexp.quote(url_quoted) unless url_quoted.nil?
%r{href=\"(?:#{url_quoted})?#{@baseurl}\/((?:#{regex}[^,'\"\s\/?\.#]+\.?)*(?:\/[^\]\[\)\(\"\'\s]*)?)\"}
%r{href=\"?#{@baseurl}\/((?:#{regex}[^,'\"\s\/?\.#]+\.?)*(?:\/[^\]\[\)\(\"\'\s]*)?)\"}
As far as I can tell, relative URLs like `/en/about` will still be processed, while absolute URLs that include `/:lang` are excluded. Is that difference between absolute/relative intentional?
Should this be documented/tested?
It should be intentional. The exclusion of the `/:lang` from absolute urls is needed to ensure that the i18n_headers are mostly left alone.
You are right, though: a relative url like `/en/about` will be processed into `/en/en/about`, which isn't right. Maybe the relative urls should get the same treatment.
they're getting the same treatment
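The `/en/en/about` problem discussed in this thread can be reproduced with a stripped-down pair of patterns. This is only a sketch: `unguarded` and `guarded` are simplified, hypothetical regexes, and the language list is assumed.

```ruby
langs = %w[en es de fr] # assumed language list

# One negative lookahead per language, in the spirit of "(?!#{l}\/)" from the diff.
guard = langs.map { |l| "(?!#{l}/)" }.join

unguarded = %r{href="/([^"\s]*)"}         # rewrites every relative href
guarded   = %r{href="/(#{guard}[^"\s]*)"} # skips hrefs that already have a lang prefix

html = '<a href="/en/about">About</a> <a href="/menu">Menu</a>'

doubled = html.gsub(unguarded, 'href="/en/\1"')
fixed   = html.gsub(guarded, 'href="/en/\1"')

puts doubled # the first link becomes /en/en/about
puts fixed   # the first link stays /en/about
```

In both cases `/menu` becomes `/en/menu`; only the guarded pattern leaves the already-prefixed link alone.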
README.md
Outdated
If you defined a site `url` in your `_config.yaml`, polyglot will automatically relativize links pointing to your absolute site url. If you don't want them relativized, adding a space explicitly to an href prevents the absolute url from being relativized.

processed: `href="http://mywebsite.com/about"`
unprocessed: `href=" http://mywebsite.com/about"`
Should the space trick work for absolute URLs only? Frankly speaking, I think it is easier to understand in a form of "in case you want to keep the URL as is, just prepend it with a whitespace and polyglot would ignore the link".
If whitespace trick works for all kinds of URLs, then the wording should probably be improved. Currently it sounds like the feature is absolute-URL-only.
On the other hand, the following versions build the site for me just fine (except #68).
end

def relativize_absolute_urls(doc, regex, url)
  doc.output.gsub!(regex, "href=\"#{url}/#{@active_lang}/" + '\1"')
should be
doc.output.gsub!(regex, "href=\"#{url}#{baseurl}/#{@active_lang}/" + '\1"')
@languages.each do |l|
  regex += "(?!#{l}\/)"
end
%r{href=\"?#{url}\/((?:#{regex}[^,'\"\s\/?\.#]+\.?)*(?:\/[^\]\[\)\(\"\'\s]*)?)\"}
should be
%r{href=\"?#{url}#{baseurl}\/((?:#{regex}[^,'\"\s\/?\.#]+\.?)*(?:\/[^\]\[\)\(\"\'\s]*)?)\"}
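A small sketch of why both suggestions add `baseurl`. The values of `url`, `baseurl`, and `active_lang` are made up, and the patterns are simplified versions of the ones in the diff. With a non-empty `baseurl`, a pattern built from `url` alone puts the language segment in front of the baseurl:

```ruby
url         = 'http://mywebsite.com' # assumed site url
baseurl     = '/blog'                # assumed non-empty baseurl
active_lang = 'fr'
langs       = %w[en es de fr]

guard = langs.map { |l| "(?!#{l}/)" }.join

# Without baseurl: the capture group swallows "blog/about".
without_baseurl = %r{href="#{Regexp.quote(url)}/(#{guard}[^"\s]*)"}
# With baseurl in both the pattern and the replacement, as the review suggests.
with_baseurl = %r{href="#{Regexp.quote(url)}#{Regexp.quote(baseurl)}/(#{guard}[^"\s]*)"}

doc = '<a href="http://mywebsite.com/blog/about">About</a>'

wrong = doc.gsub(without_baseurl, "href=\"#{url}/#{active_lang}/" + '\1"')
right = doc.gsub(with_baseurl, "href=\"#{url}#{baseurl}/#{active_lang}/" + '\1"')

puts wrong # lang lands before the baseurl: /fr/blog/about
puts right # lang lands after the baseurl:  /blog/fr/about
```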