New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Whitespace issues #216

Closed
fw42 opened this Issue Jun 26, 2013 · 71 comments

Comments

Projects
None yet
@fw42
Member

fw42 commented Jun 26, 2013

We need to do something about the amount of generated whitespace in rendered Liquid templates. We have several issues and pull requests about that right now (#215, #214, #194, #171, #162) and no suitable fix.

I want to open this for discussion because it's not only annoying for third-party users but also considerably slowing down a lot of Shopify storefronts because of megabytes of whitespace resulting from big nested loops.

A few ideas have shown up:

  • Add Erb like tags (this slows down parsing a lot, as can be seen in #214).
  • Automatically collapse all white-space (this will break stuff where whitespace is intended, email templates for instance).
  • Create a filter for collapsing white space which can manually be wrapped around loops (not sure if this will make things much faster, but at least the output will be smaller - on the other hand, it has to be done by hand for each template).
  • Automatically collapse whitespace in loops if (and only if) the rendered output contains only whitespace (this will catch some of our worst-cases, but won't fix the problem entirely).
  • ...

Comments?

@boourns @hornairs @csfrancis @dylanahsmith @camilo @arthurnn
cc @jcrawford @nickpearson @avalanche123 @ragalie

@ghost ghost assigned fw42 Jun 26, 2013

@nickpearson

This comment has been minimized.

Contributor

nickpearson commented Jun 26, 2013

The ERB-like trims only slow down parsing if the trimming is done as part of the parsing regexes (as I originally did in #214). The newer implementation in #215 trims before parsing and is significantly faster (with almost no performance hit). Also, see my comment at the end there for a way to reduce the performance impact even further by only trimming when there is something to trim.

Unfortunately for Shopify, this means that template developers would have to update their templates to use {%-, -%}, etc., and most are probably not likely to do so. If it's always HTML, then perhaps Shopify could either (a) auto-replace {% with {%- and %} with -%} before passing the template to liquid (assuming the ERB trim mode gets merged), or (b) use an HTML compactor after the content has been passed through Liquid.

@fw42

This comment has been minimized.

Member

fw42 commented Jun 26, 2013

Maybe make your behaviour for "{%- ... -%}" the default one and introduce some way to disable it (globally for that template) when necessary (for emails, etc.)?

@fw42

This comment has been minimized.

Member

fw42 commented Jun 26, 2013

To illustrate this with an example:

This template

{% for 1 in (1..1000) %}
  {% if false %} whatever {% endif %}
{% endfor %}

generates almost 4K of whitespace, where it should (imho, at least in the majority of cases) generate exactly 0 bytes of output (and in fact does, when applying #215 and using those tags).

@fw42

This comment has been minimized.

Member

fw42 commented Jun 26, 2013

@hornairs suggested that during parsing, we look for children of code blocks ({% ... %}, like for loops) and if all of them are "code tags" too (i.e., don't generate output) and none of them are non-Liquid stuff or "output tags" ({{ ... }}), we mark them to automatically collapse all whitespace inside that block during rendering. The additional parsing overhead should be negligible I think.

This won't solve the problem entirely, but will be mostly backwards compatible and get rid of a lot of our worst cases. Optionally, we might merge #215 (or some variant of that) as an opt-in solution.

@tobi, do you have a strong opinion on this?

Other comments anyone?

@nickpearson

This comment has been minimized.

Contributor

nickpearson commented Jun 26, 2013

Do you think it would be better to make this a global setting or to accept an options hash in Template#parse here and here? Or both, where the global setting acts as the default but can be overridden by passing :trim => false to parse?

@tobi

This comment has been minimized.

Member

tobi commented Jun 26, 2013

I think we should do whatever PHP does.

  • tobi
    CEO Shopify

On Wed, Jun 26, 2013 at 11:31 AM, Florian Weingarten <
notifications@github.com> wrote:

@hornairs https://github.com/hornairs suggested that during parsing, we
look for children of code blocks ({% ... %}, like for loops) and if all
of them are "code tags" too (i.e., don't generate output) and none of them
are non-Liquid stuff or "output tags" ({{ ... }}), we mark them to
automatically collapse all whitespace inside that block during rendering.
The additional parsing overhead should be negligible I think.

This won't solve the problem entirely, but will be mostly backwards
compatible and get rid of a lot of our worst cases. Optionally, we might
merge #215 #215 (or some
variant of that) as an opt-in solution.

@tobi https://github.com/tobi, do you have a strong opinion on this?

Other comments anyone?


Reply to this email directly or view it on GitHubhttps://github.com//issues/216#issuecomment-20055754
.

@fw42

This comment has been minimized.

Member

fw42 commented Jun 26, 2013

@tobi I think switching to PHP's behaviour might break backwards compatibility for us too much. People might rely on our current behaviour (think <pre> tags and emails).

@fw42

This comment has been minimized.

Member

fw42 commented Jun 27, 2013

Two new approaches to parts of the problem: #217 and #218. The first one a bit less likely to break stuff, the second one a bit more general/aggressive. Comments welcome.

@airhorns

This comment has been minimized.

Contributor

airhorns commented Jun 27, 2013

PHP's behaviour is a good baseline and would go a ways to solving the problem, but not completely:

start
<?php for ($i = 1; $i <= 10; $i++) { ?>
   <?php true; ?>
   <?php true; ?>
   <?php true; ?>
<?php } ?>
end

gives

oration ~  ➜  php test.php
start
                                                            end

In liquid:

start
{% for i in (1..10) %}
  {% assign foo = i %}
{% endfor %}
end

gives

start





















end
@byroot

This comment has been minimized.

Contributor

byroot commented Jul 9, 2013

Actually in this particular example PHP do not perform better than Liquid.

If you look closely at the output PHP in your example its:

"start\n                                                                                          end\n\n"

and Liquid:

"start\n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\nend\n\n"

So 101 bytes versus 52.

But the comparison is not very fair, lets use an example more similar to Liquid one:

start
<?php for ($i = 1; $i <= 10; $i++) { ?>
   <?php true; ?>
<?php } ?>
end

You end up with:

"start\n                              end\n\n"

So 41 bytes.

But if you put a single space right after <?php true; ?> you'll get:

"start\n    \n    \n    \n    \n    \n    \n    \n    \n    \n    \nend\n\n"

Just like liquid.

The PHP behavior being "strip one line-ending directly following a php tag".
It should be trivial to implement and It will definitely save some bytes, but nothing in comparison of the indentation, and it will still break emails formatting.

You may save more by automatically converting 2/4 spaces in tab.

My 2 cents.

@bgotink

This comment has been minimized.

bgotink commented Jan 29, 2014

Off-topic: I love you guys! This templating engine is incredibly easy to use, even for people without any programming skills. I'm using this via Jekyll by the way ;)

On topic now:

The output

"start\n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\nend\n\n"

is what I would expect as output for

start
{% for i in (1..10) %}
  {% assign foo = i %}
{% endfor %}
end

The whitespace is present in the code. How is Liquid to know it is not intentional?

A solution for this whitespace issue would be to write the code on a single line without any whitespace between tags:

{% for i in (1..10) %}{% assign foo = i %}{% endfor %}

This is quite unreadable for large amounts of code though, so why don't you simply allow a tag to stretch multiple lines?

By changing

TagStart                    = /\{\%/
TagEnd                      = /\%\}/
[...]
VariableStart               = /\{\{/
VariableEnd                 = /\}\}/
VariableIncompleteEnd       = /\}\}?/

in lib/liquid.rb#L27 to

TagStart                    = /\{\%\s*?/
TagEnd                      = /\s*?\%\}/
[...]
VariableStart               = /\{\{\s*?/
VariableEnd                 = /\s*?\}\}/
VariableIncompleteEnd       = /\s*?\}\}?/

You would allow us to use the following syntax:

{%
  for post in site.posts %}{%
    if page.title %}{{
      page.title }}{%
    endif %}{%
  endfor
%}

Note that this code will generate a list of the titles, but without any spaces between them, e.g.

Post1Post2Post3

This small change

      page.title }}
{%

will alleviate this issue and result in

Post1
Post2
Post3



P.S. 1: This is completely backwards compatible, that is, old liquid code will still result in the exact same output.



P.S. 2: I've added the non-greedy \s*? instead of \s* as the latter broke some tests and error outputs, e.g.

bram:~/Workspaces/Liquid (master *=) $ rake test --trace
** Invoke test (first_time)
** Execute test
** Invoke base_test (first_time)
** Execute base_test
-- LAX ERROR MODE
Run options: 

# Running tests:

[117/408] ErrorHandlingTest#test_warnings = 0.00 s                                                           
  1) Failure:
test_warnings(ErrorHandlingTest) [/Users/bram/Workspaces/Liquid/test/liquid/error_handling_test.rb:99]:
<"Expected id but found end_of_string in \"{{ hello. }}\""> expected but was
<"Expected id but found end_of_string in \"{{hello. }}\"">.

All tests pass using Ruby 2.0 on OS X.




P.S. 3: I've run the benchmarks:
before

/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/bin/ruby ./performance/benchmark.rb lax
Rehearsal ------------------------------------------------
parse:         4.250000   0.010000   4.260000 (  4.265140)
parse & run:  10.320000   0.020000  10.340000 ( 10.342346)
-------------------------------------- total: 14.600000sec

                   user     system      total        real
parse:         4.180000   0.000000   4.180000 (  4.182315)
parse & run:  10.470000   0.020000  10.490000 ( 10.480743)

/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/bin/ruby ./performance/benchmark.rb strict
Rehearsal ------------------------------------------------
parse:         4.860000   0.010000   4.870000 (  4.864935)
parse & run:  10.890000   0.010000  10.900000 ( 10.905390)
-------------------------------------- total: 15.770000sec

                   user     system      total        real
parse:         4.860000   0.000000   4.860000 (  4.864340)
parse & run:  10.900000   0.010000  10.910000 ( 10.908346)

after

/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/bin/ruby ./performance/benchmark.rb lax
Rehearsal ------------------------------------------------
parse:         4.320000   0.000000   4.320000 (  4.320118)
parse & run:  10.390000   0.030000  10.420000 ( 10.426773)
-------------------------------------- total: 14.740000sec

                   user     system      total        real
parse:         4.230000   0.010000   4.240000 (  4.238727)
parse & run:  10.390000   0.010000  10.400000 ( 10.402394)

/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/bin/ruby ./performance/benchmark.rb strict
Rehearsal ------------------------------------------------
parse:         4.950000   0.010000   4.960000 (  4.956354)
parse & run:  11.000000   0.010000  11.010000 ( 11.007202)
-------------------------------------- total: 15.970000sec

                   user     system      total        real
parse:         4.940000   0.000000   4.940000 (  4.941804)
parse & run:  10.960000   0.020000  10.980000 ( 10.976216)
@fw42

This comment has been minimized.

Member

fw42 commented Jan 29, 2014

Right now, the template

start
{% for i in (1..10) %}
  {% assign foo = i %}
{% endfor %}
end

actually just generates start\n\nend. Liquid recognizes that the assign tag never generates any output and the body of block tags is stripped if it only contains whitespace (or tags that generate no output).

This might change in the future or at least become a non-default behaviour (#226).

@bgotink

This comment has been minimized.

bgotink commented Jan 29, 2014

I hadn't even noticed this change because Jekyll still uses 2.5.5...

Other than that: the problem is not solved in most usecases, as this optimization does not work if there's a chance that the block won't be whitespace:

irb(main):001:0> $:.unshift File.join(File.dirname(__FILE__), *%w{ .. lib })
=> ["./../lib", "/Library/Ruby/Site/2.0.0", "/Library/Ruby/Site/2.0.0/x86_64-darwin13", "/Library/Ruby/Site/2.0.0/universal-darwin13", "/Library/Ruby/Site", "/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/vendor_ruby/2.0.0", "/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/vendor_ruby/2.0.0/x86_64-darwin13", "/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/vendor_ruby/2.0.0/universal-darwin13", "/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/vendor_ruby", "/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0", "/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/x86_64-darwin13", "/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/universal-darwin13"]
irb(main):002:0> require 'liquid'
=> true
irb(main):003:0> puts Liquid::VERSION
3.0.0
=> nil
irb(main):004:0> Liquid::Template.parse("{% for i in (1..10) %}\n  {% assign foo = i %}\n{% endfor %}").render(nil)
=> ""
irb(main):005:0> Liquid::Template.parse("{% for i in (1..10) %}\n  {% assign foo = i %}\n\n{% endfor %}").render(nil)
=> ""
irb(main):006:0> Liquid::Template.parse("{% for i in (1..10) %}\n  {% assign foo = i %}{% if i == 2 %}foo{%endif%}\n{% endfor %}").render(nil)
=> "\n  \n\n  foo\n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n"
irb(main):007:0> Liquid::Template.parse("{% for i in (1..10) %}\n  {% assign foo = i %}{% if i == 12 %}foo{%endif%}\n{% endfor %}").render(nil)
=> "\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n\n  \n"
@fw42

This comment has been minimized.

Member

fw42 commented Jan 29, 2014

#226 would optimize this more effectively:

irb(main):013:0> require "liquid";
irb(main):014:0* t = Liquid::Template.parse("{% for i in (1..10) %}\n  {% assign foo = i %}{% if i == 2 %}foo{%endif%}\n{% endfor %}");
irb(main):015:0* t.registers[:strip_whitespace] = true;
irb(main):016:0* t.render
=> "  foo\n"
@Phrogz

This comment has been minimized.

Phrogz commented Mar 19, 2014

FWIW, as a once-user of Erb, I expected the suggestion by @bgotink to work, to be able to write:

{% for a in b %}{%
  for c in d
     %}{{a}}-{{c}}{%
  endfor
%}{% endfor %}
@ghost

This comment has been minimized.

ghost commented Apr 24, 2014

It would be lovely to have a {% spaceless %} tag. This would let us to remove whitespace between HTML tags where we want this to happen (or even in all HTML document).

This has been successfully done in Twig template engine for PHP.

@doktorbro

This comment has been minimized.

doktorbro commented Jun 3, 2014

@fw42 As many Jekyll users report whitespace issues, I think I inform you about my HTML compressor for Jekyll. It removes all relevant whitespaces.The compressor needs no plugins, it works on Github Pages too.

So Jekyll users have an easy working solution and you don’t need to fight whitespaces in the parser.

@sparanoid

This comment has been minimized.

sparanoid commented Jun 5, 2014

Here's another use case:

    {% capture entry_link %}
      {% if site.data.var.link_blog and post.link %}
        {{ post.link }}
      {% else %}
        {{ post.url | prepend: site.data.var.base | prepend: site.data.var.url }}
      {% endif %}
    {% endcapture %}

In my Jekyll atom feed:

      <link rel="alternate" type="text/html" href="{{ entry_link }}" />

This generates:

      <link rel="alternate" type="text/html" href=" http://sparanoid.com/note/example/ " />

This will parse the wrong link in many RSS readers to something like:

href="http://sparanoid.com/ http://sparanoid.com/note/example/"

and jekyll-compress-html or other compressor won't stripe whitespace for this situation.

However here's my workaround:

    {% capture entry_link %}
      {% if site.data.var.link_blog and post.link %}
        <link rel="alternate" type="text/html" href="{{ post.link }}" />
      {% else %}
        <link rel="alternate" type="text/html" href="{{ post.url | prepend: site.data.var.base | prepend: site.data.var.url }}" />
      {% endif %}
    {% endcapture %}

I think there should be some better solution for that.

@jasimmk

This comment has been minimized.

jasimmk commented Jul 6, 2014

Why don't we try to introduce {%- if something %} like Jinja2 handles the case? http://jinja.pocoo.org/docs/templates/#whitespace-control

IMHO its the best and easy way to handle whitespaces

@fw42

This comment has been minimized.

Member

fw42 commented Jul 6, 2014

I don't really want to introduce any new syntax or tags, this should work automatically, imho.

@bman-80stees

This comment has been minimized.

bman-80stees commented Jul 11, 2014

I would really like Shopify to automatically have the whitespace removed that is created where ever the liquid code resides on the template.
I've used

<previous html>
{% for image in product.images 
%}{% if image contains .col 
%}<img src="{{ image.url  }}">{% else 
%}<img src="{{ featured_image.url }}">{% endif 
%}
<more html>

This essentially removes the whitespace, but is not that clear to read. If you're not careful, then you could miss an opening or closing liquid tag.

@dyspop

This comment has been minimized.

dyspop commented Jun 8, 2016

if you're generating .md with liquid you should realize that that wasn't a carefully considered pair. after you get over that reality then you could maybe think outside the box. if you know which whitespace characters they are you can capture the output and reprocess it with a replace.

{% capture temp %}whatever code creates whitespace{% endcapture %} {{ temp | replace: ' ', '' }}

or something similar...

@narthur

This comment has been minimized.

narthur commented Jun 8, 2016

@dyspop Yes, I do something similar:

{% capture temp %}
   ... do stuff
{% endcapture %}

{% assign temp = temp | strip %}

As Shopify's docs state, the strip filter "strips tabs, spaces, and newlines (all whitespace) from the left and right side of a string."

Still, it's a bit kludgy and I do hope someone will implement a no-whitespace set of tags soon.

@waza123x

This comment has been minimized.

waza123x commented Jun 8, 2016

ok thanks

this helped:

{% capture my_title %}
   ... my super title code very big 9999 lines of liquid code
{% endcapture %}

<title>{{ my_title | strip }}</title>

@nickpearson

This comment has been minimized.

Contributor

nickpearson commented Jun 8, 2016

Until {%- or something is implemented, you can control whitespace by placing {%, %}, {{, and }} on different lines than the tags/output they contain. See my comment and @dylanahsmith's reply for more.

@waza123x

This comment has been minimized.

waza123x commented Jun 8, 2016

ye thanks, that is nice too!


a{{
  'b'
}}c{%
  if true
%}d{%
  endif 
%}e
@sondr3

This comment has been minimized.

sondr3 commented Jun 8, 2016

@narthur I realize things might be more complicated than it appears, but three years for something like this is a bit extreme in my opinion.

@fw42

This comment has been minimized.

Member

fw42 commented Jun 8, 2016

three years for something like this is a bit extreme in my opinion.

Please note that this is not an issue that we are actively working on. Yes, this GitHub issue is 3 years old. But that does not mean that we have people working on this for 3 years and still can't figure it out.

We realize that this is annoying and personally I hate that we have this problem in the first place. But unfortunately it's not really a priority for Shopify and not that big of a problem for Shopify's use cases of Liquid (the reality is that we have more important and more impactful stuff to work on), so we decided not to dedicate resources to this.

That decision might change in the future. Until then, we are happy to consider and review all external contributions and pull requests.

@johngilesyoder

This comment has been minimized.

johngilesyoder commented Jun 9, 2016

Chiming in to say that this is absolute bullcrap. In fact, this attitude/approach will likely be the tipping point for me to move on to a different ecommerce platform as a developer. Exceedingly tone-deaf decision. And that response from the individual who OPENED the ticket.

@jcrawford

This comment has been minimized.

jcrawford commented Jun 9, 2016

I was CC'd on this ticket and I created #162 a long time ago. I am guessing that is why this new ticket was created. If Shopify doesn't want to work with their customers or open source code users then just stop using their products. I have since moved on from using Liquid but that is not to say that I wouldn't use it again in the future.

Maybe a fork needs to be created and take a different path since the only things that will get fixed are Shopify important issues :)

@parkr

This comment has been minimized.

Contributor

parkr commented Jun 9, 2016

Just out of curiosity, why is whitespace such an important issue that Shopify developers should drop what they're doing and work on it? It seems very inconsequential to me, personally.

@fw42

This comment has been minimized.

Member

fw42 commented Jun 9, 2016

And that response from the individual who OPENED the ticket.

I opened this issue because I do believe that this should indeed be fixed, or at least discussed, which is what we did (and as I said, I'm still happy to discuss any potential solutions that you might want to suggest). The reality is though that there is also 5000 other things that also need to be fixed or implemented and this one here is pretty low on our list of priorities. I realize that that's really annoying for you (since you probably don't care about those 5000 other things) but that's how it is right now.

@jcrawford

This comment has been minimized.

jcrawford commented Jun 9, 2016

@parkr I had an issue with it because it was adding a lot of whitespace, I was using Jekyll, see issue #162 my main concern was that it was increasing the size of the HTML document, thereby increasing bandwidth used for each request.

@narthur

This comment has been minimized.

narthur commented Jun 9, 2016

@jcrawford Check this link. It might solve your problem. It references this Jekyll layout that minifies your HTML output.

@jcrawford

This comment has been minimized.

jcrawford commented Jun 9, 2016

@narthur that would definitely have solved my problem :) Who knows maybe I will keep that in mind and try Jeckyll again, Thank You!

@timwis

This comment has been minimized.

timwis commented Jun 9, 2016

I just want to chime in and say that having extra whitespace in my documents makes JSON documents look really strange and HTML documents look less neat and tidy. On the whole it has impacted my life very little. It does not make me angry that the feature isn't available. If I come across a real need for it, I'll either use the {% capture %} many work-arounds mentioned above or figure out how to implement it and submit a pull request.

@nemoDreamer

This comment has been minimized.

nemoDreamer commented Jun 9, 2016

My issue with the white-space was not bandwidth, etc... but not being able to use partials for inline substitution when combined w/ Github Flavored Markup (which respects newlines). So having This is my {icon_link} here would have created this, because of the normal newline at the end of my partial:

This is my <a href="..."><img ... /><span>Facebook</span></a><br />
here

But there's probably a GFM or Jekyll-specific workaround to ignore newlines temporarily.

@kleuter

This comment has been minimized.

kleuter commented Jul 30, 2016

Please fix it.

@baberuth22

This comment has been minimized.

baberuth22 commented Aug 5, 2016

Why is this still open after 3 years? My issue 1 space added to the end of any include. I am trying to use the include as a variable for shortcode usage.

@fw42

This comment has been minimized.

Member

fw42 commented Aug 11, 2016

Closed by #773

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment