Remove `erb_parser` and add `deface` for transforming HTML+ERB #60

marcoroth · 2023-01-22T01:50:03Z

This pull request removes the erb_parser dependency and adds deface for turning HTML+ERB into HTML with <erb> tags.

Here's a summary of what changed:

erb_parser was used in Phlexing::ErbTransformer to transform input such as <div><%= method %></div> into <div><erb loud> method </erb></div>. This was replaced with Deface::Parser.
erb_parser was used in Phlexing::RubyAnalyzer to extract the Ruby code from ERB templates. This was replaced with our own Phlexing::Parser class.
Most html variables were renamed to source.
A bunch of new tests were added, most notably for Phlexing::Minifier, Phlexing::Parser and Phlexing::ErbTransformer.
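
To make the first point concrete, here is a rough sketch of the kind of rewrite the transformer performs on ERB tags that sit between HTML tags. It is not how Deface::Parser is implemented: the `erb_to_markup` helper and the `ERB_TAG` pattern are hypothetical names for illustration, the `silent` marker for non-output tags is an assumption, and ERB comments, trim markers and ERB inside attributes are deliberately ignored.

```ruby
# Simplified illustration only, NOT Deface's implementation.
# Matches `<% ... %>` and `<%= ... %>`; group 1 is the optional "=",
# group 2 is the Ruby code between the delimiters.
ERB_TAG = /<%(=?)(.*?)%>/m

# Hypothetical helper: wraps each ERB tag in an <erb> element so that an
# HTML parser can treat it as a regular node.
def erb_to_markup(source)
  source.gsub(ERB_TAG) do
    # "loud" tags print their result, "silent" tags only run code (assumed naming).
    kind = Regexp.last_match(1) == "=" ? "loud" : "silent"
    code = Regexp.last_match(2)
    "<erb #{kind}>#{code}</erb>"
  end
end

puts erb_to_markup("<div><%= method %></div>")
# prints: <div><erb loud> method </erb></div>
```

A naive rewrite like this is exactly what falls apart once ERB appears inside an attribute value, which is the case the rest of this description is about.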

The most important reason for switching from erb_parser to deface is how deface transforms the HTML+ERB source into HTML.

Example

Input:

<div class="<%= something? "class-1" : "class-2" %>"><%= some_method %></div>

erb_parser transform output:

<div class=\"<erb interpolated=\"true\"> something? &quot;class-1&quot; : &quot;class-2&quot; </erb>\"><erb interpolated=\"true\"> some_method </erb></div>

When parsed with Nokogiri, this ends up as the following broken document, because <erb> tags (or any tags, for that matter) aren't allowed in attributes:

#(Document:0x5f104 { 
  name = "document", 
  children = [ 
    #(Element:0x5f230 { name = "div", attributes = [ #(Attr:0x5f35c { name = "class", value = "" })] })
  ] 
})

deface transform output:
Deface handles this by prefixing the attribute name with data-erb- and escaping the value. With that, we can detect which HTML attributes need to be interpolated and process the value of those attributes manually:

<div data-erb-class='&lt;%= something? \"class-1\" : \"class-2\" %&gt;'><erb loud> some_method </erb></div>

Parsing this with Nokogiri gives us the thing we are looking for:

#(Document:0x1eb504 {
  name = "document",
  children = [
    #(Element:0x1eb658 {
      name = "div",
      attributes = [ #(Attr:0x1eb7ac { name = "data-erb-class", value = "<%= something? \"class-1\" : \"class-2\" %>" })],
      children = [ #(Element:0x1eba2c { name = "erb", children = [ #(Text " some_method ")] })]
    })
  ]
})
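
As a sketch of what "processing the value manually" could look like once the document is parsed: the snippet below works on a plain Hash of attribute names and already-unescaped values rather than on Nokogiri nodes. The helper names (`partition_attributes`, `ruby_from_erb`), the extra `id` attribute and the single-output-tag assumption are all hypothetical and are not taken from the Phlexing code.

```ruby
# Prefix that marks attributes whose value contains ERB (see the output above).
ERB_ATTRIBUTE_PREFIX = "data-erb-"

# Hypothetical helper: splits attributes into ERB-carrying ones (with the
# prefix stripped from the name) and plain ones.
def partition_attributes(attributes)
  erb, plain = attributes.partition { |name, _value| name.start_with?(ERB_ATTRIBUTE_PREFIX) }
  erb = erb.to_h { |name, value| [name.delete_prefix(ERB_ATTRIBUTE_PREFIX), value] }
  [erb, plain.to_h]
end

# Hypothetical helper: extracts the Ruby expression from a value that is
# exactly one output tag, e.g. `<%= foo %>`. Returns nil for anything else.
def ruby_from_erb(value)
  value[/\A<%=(.*)%>\z/m, 1]&.strip
end

attributes = {
  "data-erb-class" => "<%= something? \"class-1\" : \"class-2\" %>",
  "id" => "main" # made-up plain attribute for contrast
}

erb, plain = partition_attributes(attributes)
puts ruby_from_erb(erb["class"])
# prints: something? "class-1" : "class-2"
```

Values that mix literal text with ERB, or contain more than one tag, would need more than this single regular expression.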

TL;DR:

This allows us to solve issues like #48.

gem/lib/phlexing/parser.rb

+
+      if source =~ html_tag
+        Nokogiri::HTML::Document.parse(source)
+      elsif initial =~ head_tag && source =~ body_tag


gem/lib/phlexing/parser.rb

+        Nokogiri::HTML::Document.parse(source)
+      elsif initial =~ head_tag && source =~ body_tag
+        Nokogiri::HTML::Document.parse(source).css("html").first
+      elsif initial =~ head_tag


marcoroth added 5 commits January 21, 2023 20:12

Swap erb_parser for deface to transform HTML+ERB into HTML

9994c44

Rename html -> source, add more tests, refactorings more whitespace handling

08e9f12

Tweak parser and add more tests

0515a11

fix remaining tests

f499b57

Remove remaining bits of erb_parser

5ef2e80

marcoroth added the enhancement New feature or request label Jan 22, 2023

github-advanced-security bot found potential problems Jan 22, 2023

marcoroth merged commit 985cb6c into main Jan 22, 2023

marcoroth deleted the swap-html-erb-transformer branch January 22, 2023 01:59

marcoroth mentioned this pull request Jan 22, 2023

Interpolating ERB in HTML attributes #48

Closed
