Escape CDATA inserted into script/style tags so they still parse in REXML and work in browsers.

The output is pretty ugly, but at least these tags will now work without crashing REXML or nuking the script/style. Fixes issue #120.
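
A minimal sketch of the problem this commit addresses, assuming REXML is available as `rexml/document` (a bundled gem on recent Rubies). The helper name `rexml_parses?` and the sample strings are hypothetical and not part of Maruku; the CSS is adapted from the issue120 spec in this commit.

```ruby
require 'rexml/document'

# Hypothetical helper (not part of Maruku): true when REXML can parse
# the fragment, false when it raises a parse error.
def rexml_parses?(xml)
  REXML::Document.new(xml)
  true
rescue REXML::ParseException
  false
end

css = "p > .highlight { background-image: url('/foo?bar&baz'); }"

# A bare ampersand in element text is not well-formed XML.
bare = "<style>#{css}</style>"

# The same CSS inside a CDATA section whose markers are hidden in CSS comments.
wrapped = "<style>/*<![CDATA[*/\n#{css}\n/*]]>*/</style>"

puts rexml_parses?(bare)
puts rexml_parses?(wrapped)
```

The comment markers around `<![CDATA[` and `]]>` are what keep a browser's CSS or JavaScript parser from choking on the CDATA delimiters, while an XML parser still sees an ordinary CDATA section.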
bhollis committed Jan 3, 2014
1 parent fd09172 commit 90cb184
Showing 4 changed files with 100 additions and 13 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@
* More robust table handling.
* Better handling of lists.
* Fix the "blahtex" math engine on 1.8.7.
* "script" and "style" tags now have their generated CDATA tags escaped so the scripts/styles actually work. #120

0.7.0
-----
19 changes: 14 additions & 5 deletions lib/maruku/input/html_helper.rb
@@ -93,9 +93,10 @@ def eat_this(line)
when :inside_cdata
if @m = CDataEnd.match(@rest)
my_debug "#{@state}: matched #{@m.to_s.inspect}"
@already << @m.pre_match << @m.to_s
@rest = @m.post_match
self.state = %w(script style).include?(@tag_stack.last) ? :inside_script_style : :inside_element
@already << @m.pre_match
@already << @m.to_s unless self.state == :inside_script_style
@rest = @m.post_match
else
@already << @rest
@rest = ""
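
The state chosen after a CDATA end marker depends on an exact membership test against the tag names. A small sketch of why the array literal `%w(...)` is the form that gives that behavior, and how the similar-looking `%(...)` string literal behaves instead. The constant names are hypothetical and only for illustration.

```ruby
# %w(...) builds an array of words, so include? is an exact membership test.
AS_ARRAY = %w(script style)

# %(...) builds a single string, so include? becomes a substring test.
AS_STRING = %(script style)

puts AS_ARRAY.include?('script')   # exact match on an element
puts AS_ARRAY.include?('p')        # "p" is not an element of the array
puts AS_STRING.include?('p')       # "p" is a substring of "script style"

# An empty tag stack yields nil from Array#last; the array test tolerates it,
# while String#include? raises TypeError for nil.
puts AS_ARRAY.include?(nil)
begin
  AS_STRING.include?(nil)
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```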
@@ -110,8 +111,8 @@ def eat_this(line)
my_debug "#{@state}: CDATA: #{@m.to_s.inspect}"
@already << @m.pre_match << @m.to_s
@rest = @m.post_match
self.state = :inside_cdata
end
self.state = :inside_cdata
elsif @m = Tag.match(@rest)
is_closing = !!@m[1]
tag = @m[2]
@@ -121,7 +122,7 @@ def eat_this(line)
@rest = @m.post_match
# This is necessary to properly parse
# script tags
@already << "]]>" unless @already.rstrip.end_with?("]]>")
@already << script_style_cdata_end(tag) unless @already.rstrip.end_with?(']]>')
self.state = :inside_element
handle_tag false # don't double-add pre_match
else
@@ -191,7 +192,7 @@ def handle_tag(add_pre_match = true)
if %w(script style).include?(@tag_stack.last)
# This is necessary to properly parse
# script tags
@already << "<![CDATA["
@already << script_style_cdata_start(@tag_stack.last)
self.state = :inside_script_style
end
end
@@ -218,5 +219,13 @@ def stuff_you_read
def is_finished?
(self.state == :inside_element) and @tag_stack.empty?
end

def script_style_cdata_start(tag)
(tag == 'script') ? "//<![CDATA[\n" : "/*<![CDATA[*/\n"
end

def script_style_cdata_end(tag)
(tag == 'script') ? "\n//]]>" : "\n/*]]>*/"
end
end # html helper
end
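
A standalone sketch of what the two new helpers produce. The helper bodies are copied from the diff above; `wrap_in_commented_cdata` is a hypothetical convenience method added only for this illustration and does not exist in Maruku, which appends the markers to `@already` as it parses.

```ruby
# Copied from the diff: opening CDATA marker, hidden behind a
# JavaScript line comment or a CSS block comment.
def script_style_cdata_start(tag)
  (tag == 'script') ? "//<![CDATA[\n" : "/*<![CDATA[*/\n"
end

# Copied from the diff: the matching commented closing marker.
def script_style_cdata_end(tag)
  (tag == 'script') ? "\n//]]>" : "\n/*]]>*/"
end

# Hypothetical helper (not in Maruku): wrap a body the way the parser
# would, so the combined output is easy to inspect.
def wrap_in_commented_cdata(tag, body)
  "<#{tag}>#{script_style_cdata_start(tag)}#{body}#{script_style_cdata_end(tag)}</#{tag}>"
end

puts wrap_in_commented_cdata('script', 'var x = true && true;')
puts wrap_in_commented_cdata('style', 'p > .highlight { color: red; }')
```

Script content gets `//` line comments and style content gets `/* */` block comments because CSS has no line-comment syntax; any tag other than `script` falls through to the CSS form.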
54 changes: 54 additions & 0 deletions spec/block_docs/issue120.md
@@ -0,0 +1,54 @@
Style tags should be OK with unescaped angle brackets and ampersands. https://github.com/bhollis/maruku/issues/120
NOTE: Commented CDATA is output because we use XHTML - for HTML mode it should be omitted.
*** Parameters: ***
{}
*** Markdown input: ***
<style type="text/css">
p > .highlight {
color: red;
background-image: url('/foo?bar&baz');
}
</style>

<style type="text/css">/*<![CDATA[*/
p > .highlight {
color: red;
background-image: url('/foo?bar&baz');
}
/*]]>*/</style>

<style type="text/css">
/*<![CDATA[*/
p > .highlight {
color: red;
background-image: url('/foo?bar&baz');
}
/*]]>*/
</style>
*** Output of inspect ***

*** Output of to_html ***
<style type='text/css'>/*<![CDATA[*/

p > .highlight {
color: red;
background-image: url('/foo?bar&baz');
}

/*]]>*/</style><style type='text/css'>/*<![CDATA[*/
/*<![CDATA[*/
p > .highlight {
color: red;
background-image: url('/foo?bar&baz');
}
/**/
/*]]>*/</style><style type='text/css'>/*<![CDATA[*/

/*<![CDATA[*/
p > .highlight {
color: red;
background-image: url('/foo?bar&baz');
}
/**/

/*]]>*/</style>
39 changes: 31 additions & 8 deletions spec/block_docs/issue40.md
@@ -20,21 +20,44 @@ NOTE: CDATA is output because we use XHTML - for HTML mode it should be omitted.
var x = true && true;
]]>
</script>

<script>//<![CDATA[
var x = true && true;
//]]></script>

<script>
//<![CDATA[
var x = true && true;
//]]>
</script>
*** Output of inspect ***

*** Output of to_html ***
<script><![CDATA[
<script>//<![CDATA[

var x = true && true;
]]></script>

<script><![CDATA[foo && bar]]></script>
//]]></script><script>//<![CDATA[
foo && bar
//]]></script><script>//<![CDATA[

<script><![CDATA[
var x = true && true;
]]></script>

<script><![CDATA[foo && bar]]></script>
//]]></script><script>//<![CDATA[
foo && bar
//]]></script><script>//<![CDATA[

<script><![CDATA[
var x = true && true;
]]></script>


//]]></script><script>//<![CDATA[
//<![CDATA[
var x = true && true;
//
//]]></script><script>//<![CDATA[

//<![CDATA[
var x = true && true;
//

//]]></script>
