Whitespace in HTML is a complex topic. Depending on the context, whitespace may be meaningful or not.
One of our principles for this Prettier plugin is to prefer correctness over beauty unless the user opts in, so it is important to preserve the semantic meaning of the source file when we reformat it.
Failure to do so results in broken themes and unhappy developers.
Unfortunately, it's hard to wrap your head around the entire problem, let alone the entire solution. This doc exists to give you an overview of the problem space before you look at the code and try to piece it together for yourself.
In general, we have two concerns:
- Maintaining the lack of whitespace when it is meaningful
- Removing whitespace when it isn't meaningful
Whitespace is a literal space character, tab character, or newline character (\n or \r\n).
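The definition above can be turned into a small predicate. This is only an illustrative sketch: the name isWhitespaceOnly is hypothetical and is not taken from the plugin's source.

```typescript
// Hypothetical helper (not from the plugin): returns true when `text` is
// non-empty and made up only of spaces, tabs, and newlines (\n or \r\n).
function isWhitespaceOnly(text: string): boolean {
  return /^(?: |\t|\r?\n)+$/.test(text);
}
```

For example, isWhitespaceOnly(" \t\r\n") is true, while isWhitespaceOnly("hello ") and isWhitespaceOnly("") are false.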
A node is leading whitespace sensitive when adding whitespace before the node changes the semantic meaning. For instance, in <p>hello<em>world</em></p>, the em tag is leading whitespace sensitive since adding whitespace before it changes the output:
- before: helloworld
- after: hello world
A node is trailing whitespace sensitive when adding whitespace after the node changes the semantic meaning. For instance, in <p><em>hello</em>world</p>, the em tag is trailing whitespace sensitive since adding whitespace after it changes the output:
- before: helloworld
- after: hello world
Whitespace (or the lack thereof) between nodes is meaningful when either of the following is true:
- the previous node is trailing whitespace sensitive
- the next node is leading whitespace sensitive
A node is one of the many AST (Abstract Syntax Tree) nodes that we get from parsing the Lava template.
There are two categories of tools for dealing with whitespace, or the lack of it:
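The "meaningful whitespace" rule can be sketched as a predicate over two sibling nodes. The SketchNode interface and its property names below are assumptions made for illustration; the real helpers live in augment-with-whitespace-helpers.ts and may be shaped differently.

```typescript
// Hypothetical, simplified node shape used only for this sketch.
interface SketchNode {
  isLeadingWhitespaceSensitive: boolean;
  isTrailingWhitespaceSensitive: boolean;
}

// Whitespace (or its absence) between `prev` and `next` is meaningful when
// the previous node is trailing whitespace sensitive OR the next node is
// leading whitespace sensitive.
function isWhitespaceMeaningfulBetween(prev: SketchNode, next: SketchNode): boolean {
  return prev.isTrailingWhitespaceSensitive || next.isLeadingWhitespaceSensitive;
}

// Models <p>hello<em>world</em></p>: the em tag is leading whitespace
// sensitive. The text node's flags are set to false purely for illustration.
const helloText: SketchNode = {
  isLeadingWhitespaceSensitive: false,
  isTrailingWhitespaceSensitive: false,
};
const emTag: SketchNode = {
  isLeadingWhitespaceSensitive: true,
  isTrailingWhitespaceSensitive: false,
};
const meaningful = isWhitespaceMeaningfulBetween(helloText, emTag);
```

Here meaningful is true, so the printer must not introduce whitespace between hello and the em tag.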
- HTML tools
- Lava tools
For HTML, the only solution is to "borrow" the sibling (or parent) node's tag delimiters:
<!-- before -->
<p><em>hello</em>world</p>
<!-- after -->
<p>
<em>hello</em
>world
</p>
What we see here is that the TextNode with the value world borrowed the end marker (>) of the em tag's closing tag.
For Lava, we can optionally add whitespace stripping characters to the node:
<!-- before -->
<p><em>hello</em>{% echo 'world' %}</p>
<!-- after -->
<p>
<em>hello</em>
{%- echo 'world' %}
</p>
What we see here is that the {% echo 'world' %} Lava tag added the whitespace stripping character - to its left side to maintain the lack of whitespace.
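As a rough string-level illustration of that transformation, the sketch below adds the left whitespace stripping character to a Lava tag. The function name addLeftStrip is hypothetical, and the real printer in print/lava.ts works on AST nodes rather than raw strings.

```typescript
// Hypothetical helper (not from the plugin): adds the left whitespace
// stripping character to a Lava tag written as a string.
function addLeftStrip(tag: string): string {
  // Already stripping on the left: leave it unchanged.
  if (tag.startsWith('{%-')) return tag;
  // Insert "-" right after the opening "{%" delimiter.
  if (tag.startsWith('{%')) return '{%-' + tag.slice(2);
  // Not a Lava tag: return the input untouched.
  return tag;
}

const stripped = addLeftStrip("{% echo 'world' %}");
```

With the input from the example, stripped is {%- echo 'world' %}, and calling addLeftStrip on that result again leaves it unchanged.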
Recalling our two concerns:
- Maintaining the lack of whitespace when it is meaningful
- Removing whitespace when it isn't meaningful
To maintain the lack of whitespace in HTML, we have a rule:
When the lack of whitespace around an HTML node is meaningful, maintain it with tag marker borrowing.
To maintain the lack of whitespace in Lava, we have this rule:
When the lack of whitespace around a Lava node is meaningful, maintain it with whitespace stripping.
When the two rules above are in conflict, we have another rule:
Prefer whitespace stripping over tag marker borrowing.
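One way to read the three rules together is as a small decision function. This is a sketch under an assumption: that the "conflict" arises when an HTML node and a Lava node are neighbors, so both rules could apply. The type and function names are hypothetical.

```typescript
// Hypothetical types for this sketch only.
type NodeKind = 'html' | 'lava';
type Strategy = 'whitespace-stripping' | 'tag-marker-borrowing';

// Picks how to maintain a meaningful lack of whitespace between two
// neighboring nodes. Assumption: if either neighbor is a Lava node,
// whitespace stripping is available and is preferred over borrowing.
function chooseStrategy(prev: NodeKind, next: NodeKind): Strategy {
  return prev === 'lava' || next === 'lava'
    ? 'whitespace-stripping'
    : 'tag-marker-borrowing';
}

// <em>hello</em>{% echo 'world' %}: HTML followed by Lava.
const mixed = chooseStrategy('html', 'lava');
// <em>hello</em>world: no Lava node involved.
const htmlOnly = chooseStrategy('html', 'html');
```

This matches the two examples earlier in this doc: the mixed case uses whitespace stripping, and the HTML-only case falls back to tag marker borrowing.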
Removing whitespace when it isn't meaningful only requires us to not include it in the output.
If you're wondering how we determine if a node is whitespace sensitive, see augment-with-whitespace-helpers.ts.
If you're wondering how we do tag marker borrowing, see print/tag.ts.
If you're wondering how we do conditional whitespace stripping, see print/lava.ts.
If you're wondering what kind of whitespace we use between nodes, see print/children.ts.