Remove erb_parser and add deface for transforming HTML+ERB (#60)
This pull request removes the `erb_parser` dependency and adds `deface`
for turning HTML+ERB into HTML with `<erb>` tags.

Here's a summary of what changed:

* `erb_parser` used to transform the input from
`<div><%= method %></div>` to `<div><erb loud> method </erb></div>` in
`Phlexing::ErbTransformer`. This was replaced with `Deface::Parser`.

* `erb_parser` used to extract the Ruby code from ERB templates in
`Phlexing::RubyAnalyzer`. This was replaced with our `Phlexing::Parser`
class.

* most `html` variables were renamed to `source`

* a bunch of new tests, most notably for `Phlexing::Minifier`,
`Phlexing::Parser` and `Phlexing::ErbTransformer`


The most important reason for switching from `erb_parser` to `deface` is
how `deface` transforms the HTML+ERB source into HTML.

### Example

**Input:**
```html+erb
<div class="<%= something? "class-1" : "class-2" %>"><%= some_method %></div>
```

**`erb_parser` transform output:**
```html
<div class=\"<erb interpolated=\"true\"> something? &quot;class-1&quot; : &quot;class-2&quot; </erb>\"><erb interpolated=\"true\"> some_method </erb></div>
```
Parsed with Nokogiri, this ends up as invalid HTML with an empty `class`
attribute, because `<erb>` tags (or any tags for that matter) aren't
allowed in attributes:
```ruby
#(Document:0x5f104 { 
  name = "document", 
  children = [ 
    #(Element:0x5f230 { name = "div", attributes = [ #(Attr:0x5f35c { name = "class", value = "" })] })
  ] 
})
```

**`deface` transform output:**
Deface handles this by prefixing the attribute name with `data-erb-` and
escaping the value. With that we are able to detect which HTML attributes
contain ERB, so we can process the value of those attributes manually:
```html
<div data-erb-class='&lt;%= something? \"class-1\" : \"class-2\" %&gt;'><erb loud> some_method </erb></div>
```
Parsing this with Nokogiri gives us the structure we are looking for:
```ruby
#(Document:0x1eb504 {
  name = "document",
  children = [
    #(Element:0x1eb658 {
      name = "div",
      attributes = [ #(Attr:0x1eb7ac { name = "data-erb-class", value = "<%= something? \"class-1\" : \"class-2\" %>" })],
      children = [ #(Element:0x1eba2c { name = "erb", children = [ #(Text " some_method ")] })]
    })
  ]
})
```
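As a rough illustration of the "process the value manually" step, here is a minimal sketch that separates `data-erb-` attributes from static ones. `split_erb_attributes` and `ERB_ATTRIBUTE_PREFIX` are hypothetical names, not part of Phlexing or Deface. The sketch works on raw attribute strings, so it unescapes the value itself; in the Nokogiri output above the value is already unescaped.

```ruby
require "cgi"

# Prefix seen in the Deface output above for attributes whose value contains ERB.
ERB_ATTRIBUTE_PREFIX = "data-erb-"

# Hypothetical helper: splits raw attributes into static attributes and
# ERB-backed attributes (prefix removed, HTML entities unescaped).
def split_erb_attributes(attributes)
  static = {}
  dynamic = {}

  attributes.each do |name, value|
    if name.start_with?(ERB_ATTRIBUTE_PREFIX)
      dynamic[name.delete_prefix(ERB_ATTRIBUTE_PREFIX)] = CGI.unescapeHTML(value)
    else
      static[name] = value
    end
  end

  { static: static, dynamic: dynamic }
end

result = split_erb_attributes({
  "id" => "main",
  "data-erb-class" => "&lt;%= something? \"class-1\" : \"class-2\" %&gt;"
})

puts result[:dynamic]["class"]
# prints: <%= something? "class-1" : "class-2" %>
```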

### TL;DR:

This allows us to solve issues like #48.
marcoroth authored Jan 22, 2023
1 parent 67d419b commit 985cb6c
Showing 20 changed files with 481 additions and 119 deletions.
12 changes: 7 additions & 5 deletions Gemfile.lock
@@ -26,7 +26,7 @@ PATH
remote: gem
specs:
phlexing (0.3.0)
erb_parser (~> 0.0.2)
deface (~> 1.9)
html_press (~> 0.8.2)
nokogiri (~> 1.0)
phlex (~> 1.1)
@@ -117,8 +117,12 @@ GEM
debug (1.7.1)
irb (>= 1.5.0)
reline (>= 0.3.1)
erb_parser (0.0.2)
treetop
deface (1.9.0)
actionview (>= 5.2)
nokogiri (>= 1.6)
polyglot
railties (>= 5.2)
rainbow (>= 2.1.0)
erubi (1.12.0)
execjs (2.8.1)
globalid (1.0.1)
Expand Down Expand Up @@ -230,8 +234,6 @@ GEM
prettier_print (>= 1.2.0)
thor (1.2.1)
timeout (0.3.1)
treetop (1.6.12)
polyglot (~> 0.3)
turbo-rails (1.3.2)
actionpack (>= 6.0.0)
activejob (>= 6.0.0)
6 changes: 3 additions & 3 deletions app/controllers/converters_controller.rb
@@ -5,18 +5,18 @@ def index
end

def create
content = params["input"] || ""
source = params["input"] || ""
whitespace = params["whitespace"] ? true : false
component = params["component"] ? true : false

component_name = params["component_name"].presence || Phlexing::NameSuggestor.suggest(content)
component_name = params["component_name"].presence || Phlexing::NameSuggestor.suggest(source)
component_name = component_name.gsub(" ", "_").camelize.squeeze("Component")

parent_component = params["parent_component"].presence || "Phlex::HTML"
parent_component = parent_component.gsub(" ", "_").camelize

@converter = Phlexing::Converter.new(
content,
source,
whitespace: whitespace,
component: component,
component_name: component_name,
58 changes: 53 additions & 5 deletions gem/Gemfile.lock
@@ -18,7 +18,7 @@ PATH
remote: .
specs:
phlexing (0.3.0)
erb_parser (~> 0.0.2)
deface (~> 1.9)
html_press (~> 0.8.2)
nokogiri (~> 1.0)
phlex (~> 1.1)
@@ -27,22 +27,54 @@ PATH
GEM
remote: https://rubygems.org/
specs:
actionpack (7.0.4.1)
actionview (= 7.0.4.1)
activesupport (= 7.0.4.1)
rack (~> 2.0, >= 2.2.0)
rack-test (>= 0.6.3)
rails-dom-testing (~> 2.0)
rails-html-sanitizer (~> 1.0, >= 1.2.0)
actionview (7.0.4.1)
activesupport (= 7.0.4.1)
builder (~> 3.1)
erubi (~> 1.4)
rails-dom-testing (~> 2.0)
rails-html-sanitizer (~> 1.1, >= 1.2.0)
activesupport (7.0.4.1)
concurrent-ruby (~> 1.0, >= 1.0.2)
i18n (>= 1.6, < 2)
minitest (>= 5.1)
tzinfo (~> 2.0)
ast (2.4.2)
builder (3.2.4)
concurrent-ruby (1.1.10)
crass (1.0.6)
css_press (0.3.2)
csspool-st (= 3.1.2)
json
csspool-st (3.1.2)
erb_parser (0.0.2)
treetop
deface (1.9.0)
actionview (>= 5.2)
nokogiri (>= 1.6)
polyglot
railties (>= 5.2)
rainbow (>= 2.1.0)
erubi (1.12.0)
execjs (2.8.1)
html_press (0.8.2)
htmlentities
multi_css (>= 0.1.0)
multi_js (>= 0.1.0)
htmlentities (4.3.4)
i18n (1.12.0)
concurrent-ruby (~> 1.0)
json (2.6.2)
loofah (2.19.1)
crass (~> 1.0.2)
nokogiri (>= 1.5.9)
maxitest (4.4.0)
minitest (>= 5.0.0, < 5.18.0)
method_source (1.0.0)
minitest (5.16.3)
multi_css (0.1.0)
css_press
@@ -60,6 +92,21 @@ GEM
polyglot (0.3.5)
prettier_print (1.2.0)
racc (1.6.2)
rack (2.2.6.2)
rack-test (2.0.2)
rack (>= 1.3)
rails-dom-testing (2.0.3)
activesupport (>= 4.2.0)
nokogiri (>= 1.6)
rails-html-sanitizer (1.5.0)
loofah (~> 2.19, >= 2.19.1)
railties (7.0.4.1)
actionpack (= 7.0.4.1)
activesupport (= 7.0.4.1)
method_source
rake (>= 12.2)
thor (~> 1.0)
zeitwerk (~> 2.5)
rainbow (3.1.1)
rake (13.0.6)
regexp_parser (2.6.2)
@@ -69,8 +116,9 @@ GEM
ruby-progressbar (1.11.0)
syntax_tree (5.2.0)
prettier_print (>= 1.2.0)
treetop (1.6.12)
polyglot (~> 0.3)
thor (1.2.1)
tzinfo (2.0.5)
concurrent-ruby (~> 1.0)
uglifier (2.7.2)
execjs (>= 0.3.0)
json (>= 1.8.0)
18 changes: 9 additions & 9 deletions gem/lib/phlexing/converter.rb
@@ -2,33 +2,33 @@

module Phlexing
class Converter
attr_accessor :html, :custom_elements, :options, :analyzer
attr_accessor :source, :custom_elements, :options, :analyzer

def self.convert(html, **options)
new(**options).convert(html)
def self.convert(source, **options)
new(**options).convert(source)
end

def convert(html)
@html = html
analyzer.analyze(html)
def convert(source)
@source = source
analyzer.analyze(source)

code
end

def initialize(html = nil, **options)
def initialize(source = nil, **options)
@custom_elements = Set.new
@analyzer = RubyAnalyzer.new
@options = Options.new(**options)

convert(html)
convert(source)
end

def code
options.component? ? component_code : template_code
end

def template_code
TemplateGenerator.generate(self, html)
TemplateGenerator.generate(self, source)
end

def component_code
30 changes: 20 additions & 10 deletions gem/lib/phlexing/erb_transformer.rb
@@ -1,25 +1,35 @@
# frozen_string_literal: true

require "erb_parser"
require "deface"

module Phlexing
class ErbTransformer
def self.transform(html)
transformed = html.to_s
def self.transform(source)
transformed = source.to_s
transformed = transform_template_tags(transformed)
transformed = transform_erb_tags(transformed)
transformed = transform_remove_newlines(transformed)
transformed = transform_template_tag(transformed)
transformed = transform_whitespace(transformed)

ErbParser.transform_xml(transformed)
transformed
rescue StandardError
html
source
end

def self.transform_remove_newlines(html)
html.tr("\n", "").tr("\r", "")
def self.transform_remove_newlines(source)
source.tr("\n", "").tr("\r", "")
end

def self.transform_template_tag(html)
html.gsub("<template", "<template-tag").gsub("</template", "</template-tag")
def self.transform_template_tags(source)
source.gsub("<template", "<template-tag").gsub("</template", "</template-tag")
end

def self.transform_erb_tags(source)
Deface::Parser.convert(source).to_html
end

def self.transform_whitespace(source)
source.strip
end
end
end
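To make the pipeline above easier to follow, here is a stand-alone sketch of its pure-string steps only. The Deface step (`transform_erb_tags`) is deliberately left out so the snippet runs without any gems, and `TransformSketch` is a hypothetical name used for illustration.

```ruby
# Stand-alone sketch of the string-only steps of the transformer.
# The Deface-based ERB step is omitted on purpose.
module TransformSketch
  def self.transform(source)
    transformed = source.to_s
    transformed = transform_template_tags(transformed)
    transformed = transform_remove_newlines(transformed)
    transform_whitespace(transformed)
  rescue StandardError
    # Same fallback as in the diff: hand back the input untouched.
    source
  end

  # Renames <template> to <template-tag>. Note this is not idempotent:
  # running it twice would produce <template-tag-tag>.
  def self.transform_template_tags(source)
    source.gsub("<template", "<template-tag").gsub("</template", "</template-tag")
  end

  # Drops every "\n" and "\r" character.
  def self.transform_remove_newlines(source)
    source.delete("\n\r")
  end

  # Trims leading and trailing whitespace.
  def self.transform_whitespace(source)
    source.strip
  end
end

puts TransformSketch.transform("  <template>\n<p>Hi</p>\r\n</template>\n")
# prints: <template-tag><p>Hi</p></template-tag>
```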
6 changes: 3 additions & 3 deletions gem/lib/phlexing/formatter.rb
@@ -4,10 +4,10 @@

module Phlexing
class Formatter
def self.format(code, max: 80)
SyntaxTree.format(code.to_s, max).strip
def self.format(source, max: 80)
SyntaxTree.format(source.to_s, max).strip
rescue SyntaxTree::Parser::ParseError
code
source
end
end
end
4 changes: 4 additions & 0 deletions gem/lib/phlexing/helpers.rb
@@ -30,6 +30,10 @@ def parens(string)
"(#{string})"
end

def unescape(source)
CGI.unescapeHTML(source)
end

def tag_name(node)
return "template_tag" if node.name == "template-tag"

6 changes: 3 additions & 3 deletions gem/lib/phlexing/minifier.rb
@@ -4,10 +4,10 @@

module Phlexing
class Minifier
def self.minify(html)
HtmlPress.press(html.to_s)
def self.minify(source)
HtmlPress.press(source.to_s)
rescue StandardError
html
source
end
end
end
6 changes: 3 additions & 3 deletions gem/lib/phlexing/name_suggestor.rb
@@ -4,9 +4,9 @@ module Phlexing
class NameSuggestor
using Refinements::StringRefinements

def self.suggest(html)
document = Parser.parse(html)
analyzer = RubyAnalyzer.analyze(html)
def self.suggest(source)
document = Parser.parse(source)
analyzer = RubyAnalyzer.analyze(source)

ivars = analyzer.ivars
locals = analyzer.locals
27 changes: 23 additions & 4 deletions gem/lib/phlexing/parser.rb
@@ -4,11 +4,30 @@

module Phlexing
class Parser
def self.parse(html)
transformed_erb = ErbTransformer.transform(html.to_s)
minified_erb = Minifier.minify(transformed_erb)
def self.parse(source)
initial = source
source = ErbTransformer.transform(source.to_s)
source = Minifier.minify(source)

Nokogiri::HTML.fragment(minified_erb)
# Credit:
# https://github.com/spree/deface/blob/6bf18df76715ee3eb3d0cd1b6eda822817ace91c/lib/deface/parser.rb#L105-L111
#

html_tag = /<html.*?(?:(?!>)[\s\S])*>/
head_tag = /<head.*?(?:(?!>)[\s\S])*>/
body_tag = /<body.*?(?:(?!>)[\s\S])*>/

if source =~ html_tag
Nokogiri::HTML::Document.parse(source)
elsif initial =~ head_tag && source =~ body_tag
Nokogiri::HTML::Document.parse(source).css("html").first
elsif initial =~ head_tag
Nokogiri::HTML::Document.parse(source).css("head").first
elsif source =~ body_tag
Nokogiri::HTML::Document.parse(source).css("body").first
else
Nokogiri::HTML::DocumentFragment.parse(source)
end
end
end
end
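The branching in `Parser.parse` can be looked at in isolation. The sketch below reuses the three patterns from the diff and returns a symbol naming the branch that would be taken, without calling Nokogiri. `parse_strategy` and the symbol names are hypothetical and exist only for this illustration.

```ruby
# Tag-detection patterns copied from the diff above.
HTML_TAG = /<html.*?(?:(?!>)[\s\S])*>/
HEAD_TAG = /<head.*?(?:(?!>)[\s\S])*>/
BODY_TAG = /<body.*?(?:(?!>)[\s\S])*>/

# Hypothetical helper: names the branch Parser.parse would take.
# `initial` is the untouched input; `source` is the input after
# ErbTransformer and Minifier have run.
def parse_strategy(initial, source)
  if source.match?(HTML_TAG)
    :document        # parse as a full document
  elsif initial.match?(HEAD_TAG) && source.match?(BODY_TAG)
    :html_element    # parse as a document, keep the <html> element
  elsif initial.match?(HEAD_TAG)
    :head_element    # parse as a document, keep the <head> element
  elsif source.match?(BODY_TAG)
    :body_element    # parse as a document, keep the <body> element
  else
    :fragment        # parse as a document fragment
  end
end

# For brevity the same string is passed as both arguments here.
snippet = "<div>Hello</div>"
puts parse_strategy(snippet, snippet)
# prints: fragment
```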
6 changes: 6 additions & 0 deletions gem/lib/phlexing/refinements/string_refinements.rb
@@ -9,6 +9,12 @@ def from(position)
self[position, length]
end

# https://github.com/rails/rails/blob/46c45935123e7ae003767900e7d22a6e41995701/activesupport/lib/active_support/core_ext/string/access.rb#L63-L66
def to(position)
position += size if position < 0
self[0, position + 1] || +""
end

# https://github.com/rails/rails/blob/46c45935123e7ae003767900e7d22a6e41995701/activesupport/lib/active_support/core_ext/string/filters.rb#L13-L15
def squish
dup.squish!
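A small sketch of how the new `to` refinement behaves, using the same method body as the diff. `StringToSketch` and `StringToDemo` are hypothetical names; the wrapper module exists only because a refinement is active solely in the scope that calls `using`.

```ruby
# Refinement with the same `to` body as in the diff above.
module StringToSketch
  refine String do
    def to(position)
      position += size if position < 0
      self[0, position + 1] || +""
    end
  end
end

# Hypothetical wrapper that activates the refinement and exposes it.
module StringToDemo
  using StringToSketch

  def self.prefix(string, position)
    string.to(position)
  end
end

puts StringToDemo.prefix("hello", 2)   # prints: hel
puts StringToDemo.prefix("hello", -2)  # prints: hell
```

A position past the end returns the whole string, and a negative position beyond the start returns an empty string.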
24 changes: 14 additions & 10 deletions gem/lib/phlexing/ruby_analyzer.rb
@@ -1,16 +1,15 @@
# frozen_string_literal: true

require "syntax_tree"
require "erb_parser"

module Phlexing
class RubyAnalyzer
extend Forwardable

attr_accessor :ivars, :locals, :idents

def self.analyze(html)
new.analyze(html)
def self.analyze(source)
new.analyze(source)
end

def initialize
@@ -20,9 +19,9 @@ def initialize
@visitor = Visitor.new(self)
end

def analyze(html)
html = html.to_s
ruby = extract_ruby_from_erb(html)
def analyze(source)
source = source.to_s
ruby = extract_ruby_from_erb(source)
program = SyntaxTree.parse(ruby)
@visitor.visit(program)

@@ -31,12 +30,17 @@ def analyze(html)
self
end

def extract_ruby_from_erb(html)
tokens = ErbParser.parse(html).tokens
lines = tokens.map { |tag| tag.is_a?(ErbParser::ErbTag) && !tag.to_s.start_with?("<%#") ? tag.ruby_code.delete_prefix("=") : nil }
def extract_ruby_from_erb(source)
document = Parser.parse(source)
nodes = document.css("erb")

lines = nodes.map { |node| node.text.to_s.strip }
lines = lines.map { |line| line.delete_prefix("=") }
lines = lines.map { |line| line.delete_prefix("-") }
lines = lines.map { |line| line.delete_suffix("-") }

lines.join("\n")
rescue ErbParser::TreetopRunner::ParseError
rescue StandardError
""
end
end
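The clean-up applied to each `<erb>` node's text in the new `extract_ruby_from_erb` can be tried without Nokogiri. In this sketch `clean_erb_lines` is a hypothetical stand-alone version that takes the node texts as plain strings; the sample inputs are made up for illustration.

```ruby
# Hypothetical stand-alone version of the mapping steps in
# extract_ruby_from_erb, operating on plain strings instead of nodes.
def clean_erb_lines(texts)
  texts
    .map { |text| text.to_s.strip }          # trim surrounding whitespace
    .map { |line| line.delete_prefix("=") }  # drop a leading output marker
    .map { |line| line.delete_prefix("-") }  # drop a leading trim marker
    .map { |line| line.delete_suffix("-") }  # drop a trailing trim marker
    .join("\n")
end

puts clean_erb_lines([" some_method ", "= @user.name ", "- if admin? -"]).inspect
# prints: "some_method\n @user.name\n if admin? "
```

Whitespace left behind next to a removed marker is not trimmed again, which is harmless once the result is parsed as Ruby.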