Fix/i18n headers #67

Merged
untra merged 8 commits into master from fix/i18n_headers on Jul 22, 2017

2 participants
@untra
Owner

untra commented Jul 22, 2017

  • `absolute_url_regex` handles the matching of absolute URLs that are not excluded or already lang'd
  • `relative_url_regex` will no longer match absolute URLs
  • `process_documents` will now instantiate the `relative_url_regex` and `absolute_url_regex`,
    and pass these into the methods responsible for `gsub!`ing the docs. Much faster.
  • fixed i18n_headers to use leading whitespace to avoid matching the default lang (`href=" http://..."` won't match)
  • cleaned up tests

untra added some commits Jul 18, 2017

split absolute url parsing from relative url parsing

@untra untra requested a review from vlsi Jul 22, 2017

untra Jul 22, 2017

Owner

Absolute URL parsing will stay in

@vlsi

> It makes sense to have some other way to mark unprocessed links.
> For instance: [about](http://mywebsite.com/fr/about/<!--nopolyglot-->) or [about](<!--nopolyglot-->http://mywebsite.com/fr/about/) or something else. That way polyglot could trim <!--nopolyglot--> and skip URL processing for the particular URL be it absolute or relative one.

I think I came up with a better solution:
processed: href="http://mywebsite.com/about"
unprocessed: href=" http://mywebsite.com/about"

The extra whitespace after the opening quote keeps the regex from matching, but the markup is still valid HTML and is rendered and linked properly by browsers. This fixes the i18n_headers, rendering them as:

<meta http-equiv="Content-Language" content="de">
<link rel="alternate" i18n="en" href=" http://localhost:4000/about/"/>
<link rel="alternate" i18n="es" href="http://localhost:4000/es/about/"/>
<link rel="alternate" i18n="de" href="http://localhost:4000/de/about/"/>
<link rel="alternate" i18n="fr" href="http://localhost:4000/fr/about/"/>

Let me know if this works for you. Thanks! 👍
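
To make the whitespace trick concrete, here is a minimal sketch. The regex below is a simplified stand-in and not polyglot's actual `absolute_url_regex`; the `localize` helper, the site URL, and the language list are assumptions for illustration only.

```ruby
# Simplified stand-in for an absolute-URL matcher (NOT polyglot's real regex).
# It matches href="<site_url>/<path>" only when the URL starts right after the
# opening quote and the path does not already begin with a language code.
def absolute_url_regex(site_url, langs)
  url = Regexp.quote(site_url)
  lang_alt = langs.map { |l| Regexp.quote(l) }.join('|')
  %r{href="#{url}/(?!(?:#{lang_alt})/)([^"\s]*)"}
end

# Hypothetical helper: prefix every matched link with the active language.
def localize(html, site_url, lang, langs)
  html.gsub(absolute_url_regex(site_url, langs)) do
    %(href="#{site_url}/#{lang}/#{Regexp.last_match(1)}")
  end
end

site = 'http://localhost:4000'
langs = %w[en es de fr]
html = <<~HTML
  <link rel="alternate" i18n="en" href=" http://localhost:4000/about/"/>
  <link rel="alternate" i18n="es" href="http://localhost:4000/es/about/"/>
  <a href="http://localhost:4000/about/">about</a>
HTML

# Only the last line is rewritten: the first is protected by the leading
# space, the second by its existing language prefix.
puts localize(html, site, 'de', langs)
```

In this sketch the leading space works because the pattern requires the site URL to follow `href="` immediately.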

vlsi Jul 22, 2017

Collaborator

> the extra whitespace after the first quote avoids the regex matching, but is still valid html

I think it is worth documenting.

vlsi Jul 22, 2017

Collaborator

By the way: have you considered removing the whitespace?
That is treat leading whitespace as "nolocalize" flag, and kill the whitespace to make final markup clean.

untra Jul 22, 2017

Owner

I added some documentation to the README explaining the whitespace approach. I'm going to add further instructions to the v.1.3.0 blog post.

> By the way: have you considered removing the whitespace?
> That is treat leading whitespace as "nolocalize" flag, and kill the whitespace to make final markup clean.

I'd say we should leave it up to the user to install an HTML minifier. The generated HTML is still valid, and a Jekyll minifier will certainly get the job done anyway.
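
For reference, the suggestion quoted above could be sketched as a small post-processing step. This is not something the PR implements; `strip_nolocalize_marker` is a hypothetical helper, and it would have to run only after all URL processing has finished, otherwise the marker would be gone before the regexes see it.

```ruby
# Hypothetical cleanup step (not part of this PR): once link processing is
# done, drop the whitespace that marked an href as "do not localize".
def strip_nolocalize_marker(html)
  html.gsub(/href="\s+/, 'href="')
end

marked = '<link rel="alternate" i18n="en" href=" http://localhost:4000/about/"/>'
puts strip_nolocalize_marker(marked)
# prints: <link rel="alternate" i18n="en" href="http://localhost:4000/about/"/>
```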

untra Jul 22, 2017

Owner

If this PR looks good, please mark it as reviewed and approved.

README.md
url_quoted = config['url']
url_quoted = Regexp.quote(url_quoted) unless url_quoted.nil?
%r{href=\"(?:#{url_quoted})?#{@baseurl}\/((?:#{regex}[^,'\"\s\/?\.#]+\.?)*(?:\/[^\]\[\)\(\"\'\s]*)?)\"}
%r{href=\"?#{@baseurl}\/((?:#{regex}[^,'\"\s\/?\.#]+\.?)*(?:\/[^\]\[\)\(\"\'\s]*)?)\"}

vlsi Jul 22, 2017

Collaborator

As far as I can tell, relative URLs like /en/about will still be processed, while absolute URLs that include /:lang are excluded. Is that difference between absolute/relative intentional?
Should this be documented/tested?

untra Jul 22, 2017

Owner

It should be intentional. Excluding /:lang from absolute URLs is needed to ensure that the i18n_headers are mostly left alone.

You are right, though: a relative URL like /en/about will be processed into /en/en/about, which isn't right. Maybe the relative URLs should get the same treatment.
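
The double-prefix problem can be reproduced with a small sketch. Both regexes below are simplified stand-ins, not polyglot's actual `relative_url_regex`; the helper names and the language list are assumptions. The guarded variant shows one possible form of "the same treatment": a negative lookahead that skips paths already starting with a language code.

```ruby
# Simplified stand-ins for a relative-URL matcher (NOT polyglot's real regex).

# Naive: matches any root-relative href, including ones that already carry a language.
def naive_relative_regex
  %r{href="/([^"\s]*)"}
end

# Guarded: a negative lookahead skips paths that already start with a language code.
def guarded_relative_regex(langs)
  lang_alt = langs.map { |l| Regexp.quote(l) }.join('|')
  %r{href="/(?!(?:#{lang_alt})(?:/|"))([^"\s]*)"}
end

# Hypothetical helper: prefix every matched link with the given language.
def prefix_lang(html, regex, lang)
  html.gsub(regex) { %(href="/#{lang}/#{Regexp.last_match(1)}") }
end

html = '<a href="/en/about">en</a> <a href="/about">about</a>'
puts prefix_lang(html, naive_relative_regex, 'en')
# prints: <a href="/en/en/about">en</a> <a href="/en/about">about</a>
puts prefix_lang(html, guarded_relative_regex(%w[en es de fr]), 'en')
# prints: <a href="/en/about">en</a> <a href="/en/about">about</a>
```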

untra Jul 22, 2017

Owner

they're getting the same treatment 👍

vlsi Jul 22, 2017

Collaborator

On the other hand, the following versions build the site for me just fine (except #68).
I18N headers are fine, the links between pages are fine.

ruby 2.4.1p111 (2017-03-22 revision 58053)

Using i18n 0.8.6
Using minitest 5.10.3
Using thread_safe 0.3.6
Using public_suffix 2.0.5
Using bundler 1.15.2
Using colorator 1.1.0
Using multipart-post 2.0.0
Using ffi 1.9.18
Using forwardable-extended 2.6.0
Using gemoji 3.0.0
Using mini_portile2 2.2.0
Using rb-fsevent 0.10.2
Using kramdown 1.14.0
Using liquid 3.0.6
Using mercenary 0.3.6
Using rouge 1.11.1
Using safe_yaml 1.0.4
Using jekyll-paginate 1.1.0
Using tzinfo 1.2.3
Using addressable 2.5.1
Using faraday 0.12.2
Using rb-inotify 0.9.10
Using pathutil 0.14.0
Using nokogiri 1.8.0
Using activesupport 4.2.9
Using sawyer 0.8.1
Using sass-listen 4.0.0
Using listen 3.0.8
Using html-pipeline 2.6.0
Using octokit 4.7.0
Using sass 3.5.1
Using jekyll-watch 1.5.0
Using jekyll-gist 1.4.1
Using jekyll-sass-converter 1.5.0
Using jekyll 3.4.3
Using jekyll-feed 0.9.2
Using jekyll-polyglot 1.3.0 from git://github.com/untra/polyglot.git (at fix/i18n_headers@b033972)
Using jekyll-sitemap 0.13.0
Using jemoji 0.8.0
Using minimal-mistakes-jekyll 4.1.0 from /Users/vladimirsitnikov/Documents/code/minimal-mistakes (at vlsi_master@cec37d6)
@vlsi

vlsi approved these changes Jul 22, 2017

@untra untra merged commit 504a2f7 into master Jul 22, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

@untra untra deleted the fix/i18n_headers branch Jul 22, 2017
