Fix/i18n headers #67

untra · 2017-07-22T17:35:19Z

* `absolute_url_regex` handles the matching of absolute URLs that are not excluded or lang'd
* `relative_url_regex` will no longer match absolute URLs
* `process_documents` will now instantiate the `relative_url_regex` and `absolute_url_regex`, and pass these into the processes responsible for `gsub!`ing the docs. Much faster.
* fixed `i18n_headers` to use whitespace to avoid matching the default lang (`href=" http://..."` won't match)
* cleaned up tests

untra · 2017-07-22T17:43:45Z

Absolute URL parsing will stay in

@vlsi

> It makes sense to have some other way to mark unprocessed links.
> For instance: [about](http://mywebsite.com/fr/about/) or [about](http://mywebsite.com/fr/about/) or something else. That way polyglot could trim  and skip URL processing for the particular URL be it absolute or relative one.

I think I came up with a better solution:
processed: `href="http://mywebsite.com/about"`
unprocessed: `href=" http://mywebsite.com/about"`

the extra whitespace after the first quote avoids the regex matching, but is still valid html, and is properly rendered / linked by browsers. This fixes the i18n_headers, rendering them as:

<meta http-equiv="Content-Language" content="de">
<link rel="alternate" i18n="en" href=" http://localhost:4000/about/"/>
<link rel="alternate" i18n="es" href="http://localhost:4000/es/about/"/>
<link rel="alternate" i18n="de" href="http://localhost:4000/de/about/"/>
<link rel="alternate" i18n="fr" href="http://localhost:4000/fr/about/"/>
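
A minimal sketch of why the leading space works, assuming a simplified stand-in for polyglot's absolute URL pattern (the `localize` helper and the `absolute_href` regex are illustrative, not the plugin's actual code). The pattern requires the site URL to follow the opening quote directly, so a space in between leaves the link untouched:

```ruby
site_url = "http://localhost:4000"

# Simplified stand-in (an assumption): the site URL must come right after href=".
absolute_href = %r{href="#{Regexp.quote(site_url)}/([^"\s]*)"}

# Hypothetical helper: prefix the captured path with the active language.
def localize(html, regex, site_url, lang)
  html.gsub(regex, "href=\"#{site_url}/#{lang}/\\1\"")
end

processed   = '<a href="http://localhost:4000/about/">'
unprocessed = '<a href=" http://localhost:4000/about/">'

puts localize(processed, absolute_href, site_url, "de")
# prints <a href="http://localhost:4000/de/about/">
puts localize(unprocessed, absolute_href, site_url, "de")
# prints <a href=" http://localhost:4000/about/"> (unchanged)
```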

Let me know if this works for you. Thanks! 👍

vlsi · 2017-07-22T17:58:29Z

> the extra whitespace after the first quote avoids the regex matching, but is still valid html

I think it is worth documenting.

vlsi · 2017-07-22T18:03:16Z

By the way: have you considered removing the whitespace?
That is, treat the leading whitespace as a "nolocalize" flag, and strip it afterwards to keep the final markup clean.
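
For illustration only, such a cleanup pass could look like the sketch below. This is not something the PR implements; `strip_nolocalize_marker` is a hypothetical helper, and it would have to run after the language prefixes have been applied, otherwise the marker is gone before it can do its job:

```ruby
# Hypothetical post-processing step: drop whitespace directly after href=".
def strip_nolocalize_marker(html)
  html.gsub(/href="\s+/, 'href="')
end

marked = '<link rel="alternate" href=" http://localhost:4000/about/"/>'
puts strip_nolocalize_marker(marked)
# prints <link rel="alternate" href="http://localhost:4000/about/"/>
```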

untra · 2017-07-22T18:13:11Z

I added some documentation to the readme explaining the whitespace approach. I'm going to add further instructions to the v.1.3.0 blogpost.

> By the way: have you considered removing the whitespace?
> That is treat leading whitespace as "nolocalize" flag, and kill the whitespace to make final markup clean.

I'd say we should leave it up to the user to install an HTML minifier. The generated HTML is still valid, and a Jekyll minifier will get the job done anyway.

untra · 2017-07-22T18:13:59Z

If this PR looks good, mark it as reviewed and good ✅

vlsi · 2017-07-22T18:18:45Z

README.md

@@ -90,7 +90,13 @@ becomes
 Notice the link `<a href="/fr/menu/">...` directs to the french website.

 Even if you are falling back to `default_lang` page, relative links built on the *french* site will
-still link to *french* pages.
+still link to *french* pages.v


The last v looks odd.

vlsi · 2017-07-22T18:23:34Z

lib/jekyll/polyglot/patches/jekyll/site.rb

-      url_quoted = config['url']
-      url_quoted = Regexp.quote(url_quoted) unless url_quoted.nil?
-      %r{href=\"(?:#{url_quoted})?#{@baseurl}\/((?:#{regex}[^,'\"\s\/?\.#]+\.?)*(?:\/[^\]\[\)\(\"\'\s]*)?)\"}
+      %r{href=\"?#{@baseurl}\/((?:#{regex}[^,'\"\s\/?\.#]+\.?)*(?:\/[^\]\[\)\(\"\'\s]*)?)\"}


As far as I can tell, relative URLs like /en/about will still be processed, while absolute URLs that include /:lang are excluded. Is that difference between absolute/relative intentional?
Should this be documented/tested?

It should be intentional. The exclusion of the /:lang from absolute urls is needed to ensure that the i18n_headers are mostly left alone.

You are right though: a relative url like /en/about will be processed into /en/en/about, which isn't right. Maybe the relative urls should get the same treatment.

they're getting the same treatment 👍
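
A rough sketch of what "the same treatment" means for relative URLs, assuming one negative lookahead per configured language in front of the captured path. The pattern is deliberately simpler than the one in `site.rb`, and the variable names are illustrative:

```ruby
languages   = %w[en es de fr]
baseurl     = ""
active_lang = "fr"

# One negative lookahead per language: "(?!en/)(?!es/)(?!de/)(?!fr/)"
skip_langs = languages.map { |l| "(?!#{Regexp.quote(l)}/)" }.join

# Simplified relative pattern (an assumption, not the plugin's exact regex).
relative_href = %r{href="#{Regexp.quote(baseurl)}/(#{skip_langs}[^"\s]*)"}

html   = '<a href="/about/">About</a> <a href="/en/about/">English</a>'
result = html.gsub(relative_href, "href=\"#{baseurl}/#{active_lang}/\\1\"")

puts result
# prints <a href="/fr/about/">About</a> <a href="/en/about/">English</a>
```

The link that already carries a language segment is skipped, so `/en/about/` no longer turns into `/fr/en/about/`.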

vlsi · 2017-07-22T18:25:36Z

README.md

+If you defined a site `url` in your `_config.yaml`, polyglot will automatically relativize links pointing to your absolute site url. If you don't want them relativized, adding a space explicitly to an href prevents the the absolute url from being relativized.
+
+processed: `href="http://mywebsite.com/about"`
+unprocessed: `href=" http://mywebsite.com/about"`


Should the space trick work for absolute URLs only? Frankly speaking, I think it is easier to understand in the form of "if you want to keep a URL as is, just prepend it with a whitespace and polyglot will ignore the link".

If whitespace trick works for all kinds of URLs, then the wording should probably be improved. Currently it sounds like the feature is absolute-URL-only.

vlsi · 2017-07-22T18:32:50Z

On the other hand, the following versions build the site for me just fine (except #68).
I18N headers are fine, the links between pages are fine.

ruby 2.4.1p111 (2017-03-22 revision 58053)

Using i18n 0.8.6
Using minitest 5.10.3
Using thread_safe 0.3.6
Using public_suffix 2.0.5
Using bundler 1.15.2
Using colorator 1.1.0
Using multipart-post 2.0.0
Using ffi 1.9.18
Using forwardable-extended 2.6.0
Using gemoji 3.0.0
Using mini_portile2 2.2.0
Using rb-fsevent 0.10.2
Using kramdown 1.14.0
Using liquid 3.0.6
Using mercenary 0.3.6
Using rouge 1.11.1
Using safe_yaml 1.0.4
Using jekyll-paginate 1.1.0
Using tzinfo 1.2.3
Using addressable 2.5.1
Using faraday 0.12.2
Using rb-inotify 0.9.10
Using pathutil 0.14.0
Using nokogiri 1.8.0
Using activesupport 4.2.9
Using sawyer 0.8.1
Using sass-listen 4.0.0
Using listen 3.0.8
Using html-pipeline 2.6.0
Using octokit 4.7.0
Using sass 3.5.1
Using jekyll-watch 1.5.0
Using jekyll-gist 1.4.1
Using jekyll-sass-converter 1.5.0
Using jekyll 3.4.3
Using jekyll-feed 0.9.2
Using jekyll-polyglot 1.3.0 from git://github.com/untra/polyglot.git (at fix/i18n_headers@b033972)
Using jekyll-sitemap 0.13.0
Using jemoji 0.8.0
Using minimal-mistakes-jekyll 4.1.0 from /Users/vladimirsitnikov/Documents/code/minimal-mistakes (at vlsi_master@cec37d6)

untra · 2017-07-22T19:06:37Z

lib/jekyll/polyglot/patches/jekyll/site.rb

+    end
+
+    def relativize_absolute_urls(doc, regex, url)
+      doc.output.gsub!(regex, "href=\"#{url}/#{@active_lang}/" + '\1"')


should be

doc.output.gsub!(regex, "href=\"#{url}#{baseurl}/#{@active_lang}/" + '\1"')
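
To make the difference concrete, here is a small sketch with a non-empty `baseurl`. It assumes the regex already consumes `url` plus `baseurl` and captures only the path behind them. The values and the simplified pattern are made up for the example:

```ruby
url         = "http://mywebsite.com"
baseurl     = "/blog"
active_lang = "fr"

# Simplified pattern (assumed): captures the path after url + baseurl.
regex = %r{href="#{Regexp.quote(url)}#{Regexp.quote(baseurl)}/([^"\s]*)"}

html = '<a href="http://mywebsite.com/blog/about/">'

without_baseurl = html.gsub(regex, "href=\"#{url}/#{active_lang}/\\1\"")
with_baseurl    = html.gsub(regex, "href=\"#{url}#{baseurl}/#{active_lang}/\\1\"")

puts without_baseurl  # prints <a href="http://mywebsite.com/fr/about/">
puts with_baseurl     # prints <a href="http://mywebsite.com/blog/fr/about/">
```

Without `baseurl` in the replacement, the `/blog` prefix is silently dropped from the rewritten link.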

untra · 2017-07-22T19:12:46Z

lib/jekyll/polyglot/patches/jekyll/site.rb

+      @languages.each do |l|
+        regex += "(?!#{l}\/)"
+      end
+      %r{href=\"?#{url}\/((?:#{regex}[^,'\"\s\/?\.#]+\.?)*(?:\/[^\]\[\)\(\"\'\s]*)?)\"}


should be

%r{href=\"?#{url}#{baseurl}\/((?:#{regex}[^,'\"\s\/?\.#]+\.?)*(?:\/[^\]\[\)\(\"\'\s]*)?)\"}
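
A sketch of why the missing `baseurl` matters on the matching side as well, again with simplified patterns and made-up values. If the regex stops at `url`, the language lookaheads are tested against the `baseurl` segment instead of the language segment, so an already localized link would be picked up again:

```ruby
languages = %w[en es de fr]
url       = "http://mywebsite.com"
baseurl   = "/blog"

skip_langs = languages.map { |l| "(?!#{l}/)" }.join

# Simplified stand-ins for the two variants (assumptions, not the exact regexes).
without_baseurl = %r{href="#{Regexp.quote(url)}/(#{skip_langs}[^"\s]*)"}
with_baseurl    = %r{href="#{Regexp.quote(url)}#{Regexp.quote(baseurl)}/(#{skip_langs}[^"\s]*)"}

localized = '<link rel="alternate" href="http://mywebsite.com/blog/fr/about/"/>'

puts without_baseurl.match?(localized)  # true: lookaheads see "blog/", link is matched again
puts with_baseurl.match?(localized)     # false: lookaheads see "fr/", link is skipped
```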

untra and others added 5 commits July 22, 2017 09:17

alpha changes

4429d46

gitignore

be6b649

removed site/_site files

2f5067d

initial work on relative urls

2016efe

untra requested a review from vlsi July 22, 2017 17:43

added documentation

b033972

vlsi requested changes Jul 22, 2017

cleaned up readme

dceb57d

untra commented Jul 22, 2017

fixed missing baseurl from absolute regex

2a26a02

vlsi approved these changes Jul 22, 2017

untra merged commit 504a2f7 into master Jul 22, 2017

untra deleted the fix/i18n_headers branch July 22, 2017 19:25
