Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

SafeBuffer#gsub breaks block form match variables $1, $2, $`, $&, and $’ #1555

Closed
tardate opened this Issue · 26 comments
@tardate

commit 53a2c0b introduced a change to ensure gsub returns an unsafe string.

When gsub is used in block form, variables such as $1, $2, $`, $&, and $’ will be set appropriately. Now it seems that SafeBuffer's intervention loses the context and these variables are no longer available in the block.

This is the simplest test I can make to demonstrate the issue (which currently fails on 3.0-stable)...

class SafeBufferTest < ActiveSupport::TestCase
  def setup
    @buffer = ActiveSupport::SafeBuffer.new
  end
  [..]
  test "Should not break gsub block form match variables" do
    @buffer << 'matchme'
    @buffer.gsub(/(matchme)/) { assert_equal 'matchme', $1 }
  end

So far it's know that this breaks escape_javascript since it uses $1 matcher (see #1553 ). I presume there may be others also affected.

Investigating what a good fix would be.. seems awful heavy handed to have to capture and reset all those variables, but on the other hand, its bad form to leave a ruby core method different from it's documented behaviour.

@josevalim
Owner

There is not much we can do here. Maybe we should get rid of gsub altogether. /cc @tenderlove

@tardate

Yes, you're right. I've investigated further and this one seems to cross the border into ruby internals. Not 100% sure but might even be considered a ruby bug: just the act of chaining up the inheritance hierarchy with super causes "last match" globs ($1, $2 etc) to be lost from the caller's context.

Action? Removing gsub support for SafeBuffer is one, but the ramifications may be quite widespread. Or just document the modified behaviour of gsub with a SafeBuffer?

@tenderlove
Owner

IMO docco is the best solution. We can't remove gsub, and even a regular subclass of string that just calls super will break:

irb(main):001:0> class X < String; def gsub(*args); super; end end
=> nil
irb(main):002:0> X.new('hello').gsub(/(l)/) { $1 + 'm' }
NoMethodError: undefined method `+' for nil:NilClass
    from (irb):2:in `block in irb_binding'
    from (irb):1:in `gsub'
    from (irb):1:in `gsub'
    from (irb):2
    from /Users/aaron/.local/bin/irb:12:in `<main>'
irb(main):003:0>
@nex3

A more serious monkeypatch could solve this, although it may have unfortunate performance implications. If you do something like this:

def gsub(rx, *args)
  super {|s| s =~ rx; yield}
end

you should be able to properly set the magic variables before running the block.

@dolzenko

+1 can't use the Rack::Utils.escape on SafeBuffer now

ree-1.8.7-2010.02 > Rack::Utils.escape('>')
 => "%3E" 

ree-1.8.7-2010.02 > Rack::Utils.escape(ActiveSupport::SafeBuffer.new('>'))
NoMethodError: undefined method `bytesize' for nil:NilClass
@tardate

I've pushed a suggested fix on #1622 .. (update: and just yanked it because it's no good)

@tardate

(per @dmathieu) http://www.ruby-forum.com/topic/198458 - best explanation/discussion I've seen of why the magic matching variables break when you attempt to intercept gsub (and why it's nigh impossible to "just fix it")

@janx

rails_autolink is affected https://github.com/tenderlove/rails_autolink/blob/master/lib/rails_autolink/helpers.rb: Line #59 turns text into safebuffer, Line #85 and #117 invoke gsub{} on it.

@jake3030 jake3030 referenced this issue from a commit in jake3030/rails
Christos Zisopoulos Fix for Integration::Session follow_redirect! headers['location'] bug…
… with Rack [#1555 state:resolved]

Signed-off-by: Joshua Peek <josh@joshpeek.com>
69387ce
@dmathieu
Collaborator

@janx line 59 in rails_autolink does not return a safebuffer. It uses #to_str, which transforms any safebuffer into a string.

ruby-1.9.2-p180 :004 > "test".html_safe.class
 => ActiveSupport::SafeBuffer 
ruby-1.9.2-p180 :006 > "test".html_safe.to_str.class
=> String 

Therefore it should not have this gsub problem.

@janx

@dmathieu You're right. I gave wrong link/lines, the problem cause of our app was lib/rails_autolink.rb in commit fe9d2c7bb4e099a70cdb5419f4b2068e7555cfb0 (v1.0.1), so it's already fixed in edge. Thanks.

@dmathieu
Collaborator

I think we should perhaps, either :

  • Send a warning for every safebuffer which uses an unsafe method with a block, saying it might not work.
  • Raise an exception on SafeBuffer#gsub when there is a block.

The second solution seems better to me.
@josevalim, @tenderlove, what do you think ?

@tonycoco

Agreed. We need to have this raise an exception at the very least.

<%= render('layouts/application.html.erb').gsub(/\{\{\s*([a-z0-9_]+)\s*\}\}/i) do |match|
  respond_to?("#{$1}_for_cobranding") ? send("#{$1}_for_cobranding") : match
end %>

Simple code like this, which I use to do different branding in layouts, just does not work anymore.

A simple fix is just to:

<%= render('layouts/application.html.erb').to_str.gsub(/\{\{\s*([a-z0-9_]+)\s*\}\}/i) do |match|
  respond_to?("#{$1}_for_cobranding") ? send("#{$1}_for_cobranding") : match
end.html_safe %>
@dmathieu dmathieu referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@dmathieu
Collaborator

I have disabled gsub and sub from safe buffers. See #2248

@sillylogger

I just want to point out this change breaks lib/mail/body.rb's delivering html email templates as they're encoded with:

def to_lf
    gsub(/\n|\r\n|\r/) { "\n" }
end

That's going to bite a few people when they find out their emails are all &lt;table&gt;

@dmathieu
Collaborator

yes, but there's no way to detect whether the block uses $* vars or not.

@dmathieu
Collaborator

Basically, gsub shouldn't be used on safe buffers because you'll have unexpected behaviors with block.
That's why my patch removes it.

After, as it works as long as you don't use $* vars, it might be enough to not remove the feature.
Then both my PR and this issue can be closed ...
But I think that's for a core committer to decide.
cc @josevalim @spastorino

@steveh

This also breaks things like CGI.escapeHTML and HTMLEntities.new.encode, and their decoding equivalents.

I think if you're going to modify behaviour from ruby-core, it should raise an exception rather than just return nil for $1.

@dmathieu
Collaborator

@steveh : that's what my PR, #2248 does.

@steveh

@dmathieu Ah, managed to miss that completely. Thanks, will watch that PR instead.

@akaspick

Just spent way too long dealing with this issue myself realizing Rails was borking gsub.

My vote is to raise an exception if a block is used if core doesn't want to remove it from the list of unsafe methods. In the mean time, I have to force standard gsub usage with String.new(my_text).gsub { }

@dmathieu
Collaborator

@akaspick : it can't be removed from the list of unsafe methods. Because it's not a safe method. It'll lead to even more confusion.
A cleaner way to do it is :

my_text.to_str.gsub {}
@janxious

The gsub part of this affected me, by way of HTTParty. It was totally unexpected and very difficult to track down.

@dmathieu dmathieu referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@dmathieu dmathieu referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@dmathieu dmathieu referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@josevalim josevalim closed this in e9f48cd
@josevalim josevalim closed this in b4a6e2f
@jrochkind

So gsub isnt' allowed on a safe buffer anymore. So if you want to take a string and html_escape it, and THEN call gsub on it, and mark the results as html_safe.... how the heck do you do it?

Since calling html_escape will result in an html_safe string that exhibits this bug, here's one way:

string = String.new(html_escape(  string ))  # html escaped, but not a SafeBuffer
string.gsub(....).html_safe  # obviously it's your responsibility to make sure whatever you are sub'ing in is safe, as any other place you explciitly call #html_safe to make a SafeBuffer. 

Is there any better way to do this? This definitely has resulted in some confusing situations here. This use case -- wanting to take an un-safe string, html-escape everything in it, and then gsub on it... is THAT crazy of a case, is it? The solution here still doesn't get people out of having to do crazy things like above, although at least it raises instead of failing mysteriously, giving you a clue you have to do something crazy.

@dmathieu
Collaborator

Yes, there's a much better way to do this :

string = html_escape(string).to_str  # Not a SafeBuffer
string.gsub(....).html_safe  # obviously it's your responsibility to make sure whatever you are sub'ing in is safe, as any other place you explciitly call #html_safe to make a SafeBuffer. 
@toddmazierski toddmazierski referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@toupeira

I also just spent a good deal of my day tracking down this bug.

Issue #1555 was resolved by pull request #2248 which raises an exception for these methods, but this change was reverted on the same day for some reason. I think raising an exception when called with a block would be far more sensible than knowingly letting it run in a broken state?

@lmars lmars referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@lmars lmars referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@jcamenisch jcamenisch referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@jcamenisch jcamenisch referenced this issue in lucaspiller/plain-david
Merged

Convert html input to string before processing #5

@BattleBrisket

Just ran into this problem myself, and lucked into finding this thread. Wanted to add one important note for other travelers.

The solution provided by @dmathieu...

my_text.to_str.gsub {}

worked for me, however note that you must use to_str rather than to_s.

SafeBuffer inherits from String and overrides to_sto return self, but does not touch to_str, which is why it works in this case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.