unescapeHTML crashes with certain *.html_safe inputs #12672

Open
ptsneves opened this Issue Oct 28, 2013 · 22 comments

Comments

Projects
None yet
@ptsneves

I have a situation where I need to call unescapeHTML on a string that was marked as html_safe but when certain strings are set for unescape a crash happens with an error like:

CGI.unescapeHTML('The experimental macro Hello Latex["\"sdsd\""'.html_safe)
TypeError: can't dup NilClass
        from /usr/lib/ruby/1.9.1/cgi/util.rb:56:in `dup'
        from /usr/lib/ruby/1.9.1/cgi/util.rb:56:in `block in unescapeHTML'
        from /var/lib/gems/1.9.1/gems/activesupport-3.2.13/lib/active_support/core_ext/string/output_safety.rb:169:in `gsub'
        from /var/lib/gems/1.9.1/gems/activesupport-3.2.13/lib/active_support/core_ext/string/output_safety.rb:169:in `gsub'

I can reproduce it in a rails console:

txt='The experimental macro Hello Latex["\"sdsd\""'
txt=txt.html_safe
CGI.unescapeHTML(txt)

The string is:

'The experimental macro Hello Latex["\"sdsd\""'

One way to not trigger this problem is to interpolate the SafeBuffer generated by html_safe in a string. Then, no crash occurs. Using the to_s method of SafeBuffer does not work because it returns self.

txt='The experimental macro Hello Latex["\"sdsd\&quote;&quote;'
txt="#{txt.html_safe}"
CGI.unescapeHTML(txt)

My configuration is as follows:
Environment:
Ruby version 1.9.3-p194 (2012-04-20) [x86_64-linux]
Rails version 3.2.13
Database adapter Mysql2

@crazymykl

This comment has been minimized.

Show comment Hide comment
@crazymykl

crazymykl Oct 30, 2013

Contributor

Raising an exception is not quite the same severity as crashing.

I can confirm this also happens on Ruby 2.0.0-p247.

Contributor

crazymykl commented Oct 30, 2013

Raising an exception is not quite the same severity as crashing.

I can confirm this also happens on Ruby 2.0.0-p247.

@ptsneves

This comment has been minimized.

Show comment Hide comment
@ptsneves

ptsneves Oct 30, 2013

I guess the severity is a relative concept then :) For me, a core library that returns an exception like a nillclass and makes my application show an internal error to my users, is not acceptable and quite a severe problem. I don't even know if you can crash managed languages, if you exclude errors in Ruby itself. But i digress and I am certainly a noob in ruby and rails. Although in the old days null things certainly crashed parties.

What is important is that I consider this a bug(undocumented behavior) and severe(core library throwing a generic exception which is not reasonably handlable by the application). The time wasted to hunt this bugger was also significant.

Last but not least thank you for confirming on another version of Ruby. Do you have any hint where the problem might lie. If you could point me in a direction I could try to contribute a patch.

I guess the severity is a relative concept then :) For me, a core library that returns an exception like a nillclass and makes my application show an internal error to my users, is not acceptable and quite a severe problem. I don't even know if you can crash managed languages, if you exclude errors in Ruby itself. But i digress and I am certainly a noob in ruby and rails. Although in the old days null things certainly crashed parties.

What is important is that I consider this a bug(undocumented behavior) and severe(core library throwing a generic exception which is not reasonably handlable by the application). The time wasted to hunt this bugger was also significant.

Last but not least thank you for confirming on another version of Ruby. Do you have any hint where the problem might lie. If you could point me in a direction I could try to contribute a patch.

@notalex

This comment has been minimized.

Show comment Hide comment
@notalex

notalex Nov 13, 2013

Contributor

@ptsneves, thank you for reporting this bug. It would have been harder to trace without your initial effort.

CGI.unescape_html has code similar to the following:

string.gsub(/&(amp|quot);/) { match = $1.dup ... }

The problem is with sub or gsub called in block mode on an ActiveSupport::SafeBuffer object:

'foo and bar'.html_safe.gsub(/(foo).+(bar)/, '\1--\2')  # => 'foo--bar'
'foo and bar'.html_safe.gsub(/(foo).+(bar)/) { "#{ $1 }--#{ $2 }" }  # => "--"

# Works for String objects.
'foo and bar'.gsub(/(foo).+(bar)/) { "#{ $1 }--#{ $2 }" }  # => "foo--bar"

Since $1 is nil, the unescape_html method throws the Exception.

It's an ActiveSupport bug. I will try to work on a patch for this.

Contributor

notalex commented Nov 13, 2013

@ptsneves, thank you for reporting this bug. It would have been harder to trace without your initial effort.

CGI.unescape_html has code similar to the following:

string.gsub(/&(amp|quot);/) { match = $1.dup ... }

The problem is with sub or gsub called in block mode on an ActiveSupport::SafeBuffer object:

'foo and bar'.html_safe.gsub(/(foo).+(bar)/, '\1--\2')  # => 'foo--bar'
'foo and bar'.html_safe.gsub(/(foo).+(bar)/) { "#{ $1 }--#{ $2 }" }  # => "--"

# Works for String objects.
'foo and bar'.gsub(/(foo).+(bar)/) { "#{ $1 }--#{ $2 }" }  # => "foo--bar"

Since $1 is nil, the unescape_html method throws the Exception.

It's an ActiveSupport bug. I will try to work on a patch for this.

@notalex

This comment has been minimized.

Show comment Hide comment
@notalex

notalex Nov 13, 2013

Contributor

This is a known issue with gsub and SafeBuffer objects. #1555.

Contributor

notalex commented Nov 13, 2013

This is a known issue with gsub and SafeBuffer objects. #1555.

@notalex

This comment has been minimized.

Show comment Hide comment
@notalex

notalex Nov 13, 2013

Contributor

@ptsneves, Using CGI.unescapeHTML(txt.to_str) is the currently recommended solution.

Contributor

notalex commented Nov 13, 2013

@ptsneves, Using CGI.unescapeHTML(txt.to_str) is the currently recommended solution.

@acapilleri

This comment has been minimized.

Show comment Hide comment
@acapilleri

acapilleri Dec 10, 2013

Contributor

@amatsuda , It seems that this is the reason of this issue.
any thoughts ?

Contributor

acapilleri commented Dec 10, 2013

@amatsuda , It seems that this is the reason of this issue.
any thoughts ?

@leafac

This comment has been minimized.

Show comment Hide comment
@leafac

leafac Dec 15, 2013

Contributor

@acapilleri, I'm sorry but I don't understand why you think this may be the reason of the issue.

The following script is the reason:

class StringSubclass < String
  def gsub(*args, &block)
    to_str.gsub(*args, &block)
  end
end

StringSubclass.new('abc').gsub(/(b)/) { p $1 } # prints "b", as expected

Could you please explain better?

Contributor

leafac commented Dec 15, 2013

@acapilleri, I'm sorry but I don't understand why you think this may be the reason of the issue.

The following script is the reason:

class StringSubclass < String
  def gsub(*args, &block)
    to_str.gsub(*args, &block)
  end
end

StringSubclass.new('abc').gsub(/(b)/) { p $1 } # prints "b", as expected

Could you please explain better?

@acapilleri

This comment has been minimized.

Show comment Hide comment
@acapilleri

acapilleri Dec 16, 2013

Contributor

@leafac you're right in your example. But in our case we have

> StringSubclass.new('abc').gsub(/(b)/) { p $1.dup }
>TypeError: can't dup NilClass
    from (irb):18:in `dup'
    from (irb):18:in `block in irb_binding'
    from (irb):8:in `gsub'
    from (irb):8:in `gsub'
    from (irb):18

Also if your read the historyof this file there are many commits about security issue.

Contributor

acapilleri commented Dec 16, 2013

@leafac you're right in your example. But in our case we have

> StringSubclass.new('abc').gsub(/(b)/) { p $1.dup }
>TypeError: can't dup NilClass
    from (irb):18:in `dup'
    from (irb):18:in `block in irb_binding'
    from (irb):8:in `gsub'
    from (irb):8:in `gsub'
    from (irb):18

Also if your read the historyof this file there are many commits about security issue.

@dmathieu

This comment has been minimized.

Show comment Hide comment
@dmathieu

dmathieu Dec 16, 2013

Contributor

This is definitely the same issue as #1555. You can't use the $x variables provided in gsub with safe strings.
So you need to do as @notalex suggests: CGI.unescapeHTML(txt.to_str).

We're lacking of documentation on this though.

Contributor

dmathieu commented Dec 16, 2013

This is definitely the same issue as #1555. You can't use the $x variables provided in gsub with safe strings.
So you need to do as @notalex suggests: CGI.unescapeHTML(txt.to_str).

We're lacking of documentation on this though.

@leafac

This comment has been minimized.

Show comment Hide comment
@leafac

leafac Dec 16, 2013

Contributor

@acapilleri, actually my example is wrong. I ran it before posting, of course, but I must have messed something up and I have no idea why! Anyway, I ran the example again, it prints nil, not "b" as I previously stated.

This is actually very good, because it's consistent with all the previous comments on the subject.

Now I understand that due to the way Ruby's magic $<n> variables work, CGI.unescapeHTML(txt.to_str) is the only possible working solution to the issue. Which is a shame, because it breaks the Ruby Core API, but those magic variables are a mess by themselves to begin with. Matz himself said in RubyConf he wouldn't include them if he was doing Ruby all over again.

I volunteer to document this behavior. Where do you think this documentation should exist? activesupport/lib/active_support/core_ext/string/output_safety.rb itself? Should I PR to Rails or DocRails?

Contributor

leafac commented Dec 16, 2013

@acapilleri, actually my example is wrong. I ran it before posting, of course, but I must have messed something up and I have no idea why! Anyway, I ran the example again, it prints nil, not "b" as I previously stated.

This is actually very good, because it's consistent with all the previous comments on the subject.

Now I understand that due to the way Ruby's magic $<n> variables work, CGI.unescapeHTML(txt.to_str) is the only possible working solution to the issue. Which is a shame, because it breaks the Ruby Core API, but those magic variables are a mess by themselves to begin with. Matz himself said in RubyConf he wouldn't include them if he was doing Ruby all over again.

I volunteer to document this behavior. Where do you think this documentation should exist? activesupport/lib/active_support/core_ext/string/output_safety.rb itself? Should I PR to Rails or DocRails?

@acapilleri

This comment has been minimized.

Show comment Hide comment
@acapilleri

acapilleri Dec 16, 2013

Contributor

Honestly I don't know where we can document this, we need to wait for rails core opinion about this issue.

Contributor

acapilleri commented Dec 16, 2013

Honestly I don't know where we can document this, we need to wait for rails core opinion about this issue.

@ptsneves ptsneves added the stale label Apr 23, 2014

@rafaelfranca

This comment has been minimized.

Show comment Hide comment
@rafaelfranca

rafaelfranca May 1, 2014

Member

This issue has been automatically marked as stale because it has not been commented on for at least
three months.

The resources of the Rails team are limited, and so we are asking for your help.

If you can still reproduce this error on the 4-1-stable, 4-0-stable branches or on master,
please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions.

Member

rafaelfranca commented May 1, 2014

This issue has been automatically marked as stale because it has not been commented on for at least
three months.

The resources of the Rails team are limited, and so we are asking for your help.

If you can still reproduce this error on the 4-1-stable, 4-0-stable branches or on master,
please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions.

@leafac

This comment has been minimized.

Show comment Hide comment
@leafac

leafac May 2, 2014

Contributor

@rafaelfranca I'm still able to reproduce the issue in all mentioned version of Rails.

The step remains the same:

CGI.unescapeHTML('The experimental macro Hello Latex[&amp;quot;\&amp;quot;sdsd\&amp;quot;&amp;quot;'.html_safe)

The reason this happens is related to how Ruby works with magic variables such as $1 and it's not supposed to change anytime in near future, as far as I know.

Therefore, the only solution is to document the behavior. Our only doubt is where to do it. activesupport/lib/active_support/core_ext/string/output_safety.rb seems like a good candidate, but we're not sure about it.

Contributor

leafac commented May 2, 2014

@rafaelfranca I'm still able to reproduce the issue in all mentioned version of Rails.

The step remains the same:

CGI.unescapeHTML('The experimental macro Hello Latex[&amp;quot;\&amp;quot;sdsd\&amp;quot;&amp;quot;'.html_safe)

The reason this happens is related to how Ruby works with magic variables such as $1 and it's not supposed to change anytime in near future, as far as I know.

Therefore, the only solution is to document the behavior. Our only doubt is where to do it. activesupport/lib/active_support/core_ext/string/output_safety.rb seems like a good candidate, but we're not sure about it.

@matthewd

This comment has been minimized.

Show comment Hide comment
@matthewd

matthewd May 2, 2014

Member

Well, it can be made to work... but I doubt we'd want to. It certainly doesn't seem that it can be solved nicely.

Perhaps we should deprecate the use of a block with sub/gsub, and have the message tell people they should use to_str. (Hopefully, phrased in such a way that if you're using something like CGI.unescapeHTML, you'll work out you should make the to_str call yourself.)

To address your other question, in case you haven't learned the answer in the meantime: you should make a PR to rails/rails; docrails does not accept pull requests.

I suggest you try working this information into output_safety.rb, and submit it as a PR -- I think it'll be easier to talk about how well if fits there given an example of what it might say.

Sound like a plan?

Member

matthewd commented May 2, 2014

Well, it can be made to work... but I doubt we'd want to. It certainly doesn't seem that it can be solved nicely.

Perhaps we should deprecate the use of a block with sub/gsub, and have the message tell people they should use to_str. (Hopefully, phrased in such a way that if you're using something like CGI.unescapeHTML, you'll work out you should make the to_str call yourself.)

To address your other question, in case you haven't learned the answer in the meantime: you should make a PR to rails/rails; docrails does not accept pull requests.

I suggest you try working this information into output_safety.rb, and submit it as a PR -- I think it'll be easier to talk about how well if fits there given an example of what it might say.

Sound like a plan?

@robin850 robin850 removed the stale label May 2, 2014

@leafac

This comment has been minimized.

Show comment Hide comment
@leafac

leafac May 9, 2014

Contributor

@matthewd I like the idea of deprecating these methods when called with blocks. I attached some code to this issue with that implementation. I can make it a PR if that's approved.

Contributor

leafac commented May 9, 2014

@matthewd I like the idea of deprecating these methods when called with blocks. I attached some code to this issue with that implementation. I can make it a PR if that's approved.

@rails-bot rails-bot added the stale label Aug 19, 2014

@rails-bot

This comment has been minimized.

Show comment Hide comment
@rails-bot

rails-bot Aug 19, 2014

This issue has been automatically marked as stale because it has not been commented on for at least
three months.

The resources of the Rails team are limited, and so we are asking for your help.

If you can still reproduce this error on the 4-1-stable, 4-0-stable branches or on master,
please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions.

This issue has been automatically marked as stale because it has not been commented on for at least
three months.

The resources of the Rails team are limited, and so we are asking for your help.

If you can still reproduce this error on the 4-1-stable, 4-0-stable branches or on master,
please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions.

@juliosantos

This comment has been minimized.

Show comment Hide comment
@juliosantos

juliosantos Sep 10, 2014

Came across this today and thought I'd keep this alive.

Rails 4.0.2
ruby 2.0.0p353 (2013-11-22 revision 43784) [x86_64-darwin13.1.0]

2.0.0-p353 :002 > CGI.unescapeHTML('The experimental macro Hello Latex[&amp;quot;\&amp;quot;sdsd\&amp;quot;&amp;quot;'.html_safe)
TypeError: can't dup NilClass
    from /Users/juliosantos/.rvm/rubies/ruby-2.0.0-p353/lib/ruby/2.0.0/cgi/util.rb:59:in `dup'
    from /Users/juliosantos/.rvm/rubies/ruby-2.0.0-p353/lib/ruby/2.0.0/cgi/util.rb:59:in `block in unescapeHTML'
    from /Users/juliosantos/.rvm/gems/ruby-2.0.0-p353/gems/activesupport-4.0.2/lib/active_support/core_ext/string/output_safety.rb:177:in `gsub'
    from /Users/juliosantos/.rvm/gems/ruby-2.0.0-p353/gems/activesupport-4.0.2/lib/active_support/core_ext/string/output_safety.rb:177:in `gsub'
    from /Users/juliosantos/.rvm/rubies/ruby-2.0.0-p353/lib/ruby/2.0.0/cgi/util.rb:58:in `unescapeHTML'
    from (irb):2
    from /Users/juliosantos/.rvm/gems/ruby-2.0.0-p353/gems/railties-4.0.2/lib/rails/commands/console.rb:90:in `start'
    from /Users/juliosantos/.rvm/gems/ruby-2.0.0-p353/gems/railties-4.0.2/lib/rails/commands/console.rb:9:in `start'
    from /Users/juliosantos/.rvm/gems/ruby-2.0.0-p353/gems/railties-4.0.2/lib/rails/commands.rb:62:in `<top (required)>'
    from bin/rails:4:in `require'
    from bin/rails:4:in `<main>'

Came across this today and thought I'd keep this alive.

Rails 4.0.2
ruby 2.0.0p353 (2013-11-22 revision 43784) [x86_64-darwin13.1.0]

2.0.0-p353 :002 > CGI.unescapeHTML('The experimental macro Hello Latex[&amp;quot;\&amp;quot;sdsd\&amp;quot;&amp;quot;'.html_safe)
TypeError: can't dup NilClass
    from /Users/juliosantos/.rvm/rubies/ruby-2.0.0-p353/lib/ruby/2.0.0/cgi/util.rb:59:in `dup'
    from /Users/juliosantos/.rvm/rubies/ruby-2.0.0-p353/lib/ruby/2.0.0/cgi/util.rb:59:in `block in unescapeHTML'
    from /Users/juliosantos/.rvm/gems/ruby-2.0.0-p353/gems/activesupport-4.0.2/lib/active_support/core_ext/string/output_safety.rb:177:in `gsub'
    from /Users/juliosantos/.rvm/gems/ruby-2.0.0-p353/gems/activesupport-4.0.2/lib/active_support/core_ext/string/output_safety.rb:177:in `gsub'
    from /Users/juliosantos/.rvm/rubies/ruby-2.0.0-p353/lib/ruby/2.0.0/cgi/util.rb:58:in `unescapeHTML'
    from (irb):2
    from /Users/juliosantos/.rvm/gems/ruby-2.0.0-p353/gems/railties-4.0.2/lib/rails/commands/console.rb:90:in `start'
    from /Users/juliosantos/.rvm/gems/ruby-2.0.0-p353/gems/railties-4.0.2/lib/rails/commands/console.rb:9:in `start'
    from /Users/juliosantos/.rvm/gems/ruby-2.0.0-p353/gems/railties-4.0.2/lib/rails/commands.rb:62:in `<top (required)>'
    from bin/rails:4:in `require'
    from bin/rails:4:in `<main>'
@rafaelfranca

This comment has been minimized.

Show comment Hide comment
@rafaelfranca

rafaelfranca Sep 10, 2014

Member

@leafac lets go with this path of removing block form. We removed gsub and sub entirely before but we reverted at 6b010c2.

Member

rafaelfranca commented Sep 10, 2014

@leafac lets go with this path of removing block form. We removed gsub and sub entirely before but we reverted at 6b010c2.

@leafac

This comment has been minimized.

Show comment Hide comment
@leafac

leafac Sep 16, 2014

Contributor

@rafaelfranca That is to say that you want me to make a PR out of these commits: 9b453dd and d395daf ?

Contributor

leafac commented Sep 16, 2014

@rafaelfranca That is to say that you want me to make a PR out of these commits: 9b453dd and d395daf ?

@rafaelfranca

This comment has been minimized.

Show comment Hide comment
@rafaelfranca

rafaelfranca Sep 16, 2014

Member

Yes.

Member

rafaelfranca commented Sep 16, 2014

Yes.

@leafac

This comment has been minimized.

Show comment Hide comment
@leafac

leafac Sep 18, 2014

Contributor

@rafaelfranca there you go: #16957

😄

Contributor

leafac commented Sep 18, 2014

@rafaelfranca there you go: #16957

😄

@Dorian

This comment has been minimized.

Show comment Hide comment
@Dorian

Dorian Jan 8, 2018

Contributor

The example doesn't crash on Rails 5.2.0.beta2, so seems like it was fixed.

Contributor

Dorian commented Jan 8, 2018

The example doesn't crash on Rails 5.2.0.beta2, so seems like it was fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment