whitespace-handling.md

The problem

Whitespace in HTML is a complex topic. Depending on the context, whitespace may be meaningful or not.

When we're writing a Prettier plugin, it is important to maintain the semantic meaning of the source file when we reformat it, since one of our principles is to prefer correctness over beauty unless opted in.

Failure to do so results in broken themes and unhappy developers.

Unfortunately, it's hard to wrap your head around the entire problem, let alone the entire solution. This doc exists to give you an overview of the problem space before you look at the code and try to piece it together for yourself.

In general, we have two concerns:

  • Maintaining the lack of whitespace when it is meaningful
  • Removing whitespace when it isn't meaningful

Definitions

Whitespace is a literal space character, tab character, or newline character (\n or \r\n).

A node is leading whitespace sensitive when adding whitespace before the node changes the semantic meaning. For instance, in <p>hello<em>world</em></p>, the em tag is leading whitespace sensitive since adding whitespace before it changes the output:

  • before: helloworld
  • after: hello world

A node is trailing whitespace sensitive when adding whitespace after the node changes the semantic meaning. For instance, in <p><em>hello</em>world</p>, the em tag is trailing whitespace sensitive since adding whitespace after it changes the output:

  • before: helloworld
  • after: hello world

Whitespace (or the lack thereof) between nodes is meaningful when either of the following is true:

  • the previous node is trailing whitespace sensitive
  • the next node is leading whitespace sensitive
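
As a minimal sketch of the definition above, the check can be written as a single predicate. The SketchNode shape and the function name are hypothetical and exist only for illustration; the real AST nodes and helpers live in the plugin's source.

```typescript
// Hypothetical, simplified node shape used only for this sketch.
interface SketchNode {
  label: string;
  isLeadingWhitespaceSensitive: boolean;
  isTrailingWhitespaceSensitive: boolean;
}

// Whitespace (or the lack thereof) between two adjacent nodes is meaningful
// when the previous node is trailing whitespace sensitive OR the next node
// is leading whitespace sensitive.
function isWhitespaceMeaningfulBetween(prev: SketchNode, next: SketchNode): boolean {
  return prev.isTrailingWhitespaceSensitive || next.isLeadingWhitespaceSensitive;
}

// The em tag from the examples above is sensitive on both sides.
const em: SketchNode = {
  label: "<em>",
  isLeadingWhitespaceSensitive: true,
  isTrailingWhitespaceSensitive: true,
};

// An illustrative node that is not sensitive on either side.
const insensitive: SketchNode = {
  label: "insensitive",
  isLeadingWhitespaceSensitive: false,
  isTrailingWhitespaceSensitive: false,
};

console.log(isWhitespaceMeaningfulBetween(insensitive, em));
console.log(isWhitespaceMeaningfulBetween(insensitive, insensitive));
```

Note that one sensitive neighbour is enough: the predicate is an OR, not an AND.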

A node is one of the many AST (Abstract Syntax Tree) nodes that we get from parsing the Lava template.

The tools

There are two categories of tools to deal with whitespace or the lack of whitespace.

  • HTML tools
  • Lava tools

Maintaining lack of whitespace in HTML

For HTML, the only solution is to "borrow" the sibling (or parent) node's tag delimiters:

<!-- before -->
<p><em>hello</em>world</p>

<!-- after -->
<p>
  <em>hello</em
  >world
</p>

What we see here is that the TextNode with the value world borrowed the end marker (>) of the em tag's closing tag.
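
The borrowing shown above can be sketched at the string level as follows. This is only an illustration under simplifying assumptions: printWithBorrowedEndMarker is a hypothetical helper that works on raw strings, whereas the actual printer works on AST nodes (see print/tag.ts).

```typescript
// Hypothetical helper, not the plugin's actual printer.
// Moves the final ">" of an element onto the next line, directly in front of
// the text that follows, so that no whitespace is introduced between them.
function printWithBorrowedEndMarker(
  elementSource: string,
  followingText: string,
  indent: string = "  ",
): string[] {
  if (!elementSource.endsWith(">")) {
    throw new Error("expected the element source to end with '>'");
  }
  // Drop the closing tag's end marker from the element...
  const withoutEndMarker = elementSource.slice(0, -1);
  // ...and let the following text carry it instead.
  return [indent + withoutEndMarker, indent + ">" + followingText];
}

const borrowedLines = printWithBorrowedEndMarker("<em>hello</em>", "world");
console.log(borrowedLines.join("\n"));
```

The line break now sits inside the closing tag, where whitespace does not affect rendering, rather than between the tag and the text.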

Maintaining lack of whitespace in Lava

For Lava, we can optionally add whitespace stripping characters to the node:

<!-- before -->
<p><em>hello</em>{% echo 'world' %}</p>

<!-- after -->
<p>
  <em>hello</em>
  {%- echo 'world' %}
</p>

What we see here is that the {% echo 'world' %} Lava tag added the whitespace stripping character - on its left to maintain the lack of whitespace.
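
A rough string-level sketch of that transformation is shown below. The function name withLeftStripping is hypothetical, and the sketch only handles {% ... %} tags as they appear in the example above; the real logic works on AST nodes (see print/lava.ts).

```typescript
// Hypothetical helper, not the plugin's actual implementation.
// Adds the whitespace stripping character "-" to the left delimiter of a
// {% ... %} tag, unless it is already there.
function withLeftStripping(tagSource: string): string {
  if (!tagSource.startsWith("{%")) {
    throw new Error("expected a tag starting with '{%'");
  }
  if (tagSource.startsWith("{%-")) {
    return tagSource; // already strips whitespace on the left
  }
  return "{%-" + tagSource.slice(2);
}

console.log(withLeftStripping("{% echo 'world' %}"));
```

Because the tag strips the whitespace to its left when rendered, the printer is free to put it on its own line without changing the output.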

The solution

Recalling our two concerns:

  • Maintaining the lack of whitespace when it is meaningful
  • Removing whitespace when it isn't meaningful

To maintain the lack of whitespace in HTML, we have a rule:

When the lack of whitespace around an HTML node is meaningful, maintain it with tag marker borrowing.

To maintain the lack of whitespace in Lava, we have this rule:

When the lack of whitespace around a Lava node is meaningful, maintain it with whitespace stripping.

When the two rules above are in conflict, we have another rule:

Prefer whitespace stripping over tag marker borrowing.

Removing whitespace when it isn't meaningful only requires us to not include it in the output.
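
Put together, the rules above could be sketched as a small decision function. This is one reading of the rules, not the plugin's actual code: the RuleNode shape, the Strategy names, and chooseStrategy are all hypothetical, and the sketch assumes the source had no whitespace between the two nodes and that a conflict arises when at least one of the two neighbours is a Lava node.

```typescript
// Hypothetical types used only for this sketch.
type NodeKind = "html" | "lava";
type Strategy = "none" | "whitespace-stripping" | "tag-marker-borrowing";

interface RuleNode {
  kind: NodeKind;
  isLeadingWhitespaceSensitive: boolean;
  isTrailingWhitespaceSensitive: boolean;
}

// Decide how to keep two adjacent nodes "glued" together, assuming the
// source had no whitespace between them.
function chooseStrategy(prev: RuleNode, next: RuleNode): Strategy {
  const meaningful =
    prev.isTrailingWhitespaceSensitive || next.isLeadingWhitespaceSensitive;

  // Not meaningful: the printer may break the line freely.
  if (!meaningful) return "none";

  // Assumption: if either neighbour is a Lava node, stripping is possible,
  // and it is preferred over borrowing.
  if (prev.kind === "lava" || next.kind === "lava") return "whitespace-stripping";

  // Only HTML nodes are involved: borrow the tag markers.
  return "tag-marker-borrowing";
}

const sensitiveHtml: RuleNode = {
  kind: "html",
  isLeadingWhitespaceSensitive: true,
  isTrailingWhitespaceSensitive: true,
};
const sensitiveLava: RuleNode = {
  kind: "lava",
  isLeadingWhitespaceSensitive: true,
  isTrailingWhitespaceSensitive: true,
};
const insensitiveHtml: RuleNode = {
  kind: "html",
  isLeadingWhitespaceSensitive: false,
  isTrailingWhitespaceSensitive: false,
};

console.log(chooseStrategy(sensitiveHtml, sensitiveHtml));
console.log(chooseStrategy(sensitiveHtml, sensitiveLava));
console.log(chooseStrategy(insensitiveHtml, insensitiveHtml));
```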

Concretely, where to go from here?

If you're wondering how we determine if a node is whitespace sensitive, see augment-with-whitespace-helpers.ts.

If you're wondering how we do tag marker borrowing, see print/tag.ts.

If you're wondering how we do conditional whitespace stripping, see print/lava.ts.

If you're wondering what kind of whitespace we use between nodes, see print/children.ts.