SafeBuffer#gsub breaks block form match variables $1, $2, $`, $&, and $’ #1555

tardate opened this Issue Jun 8, 2011 · 26 comments

commit 53a2c0b introduced a change to ensure gsub returns an unsafe string.

When gsub is used in block form, variables such as $1, $2, $`, $&, and $’ will be set appropriately. Now it seems that SafeBuffer's intervention loses the context and these variables are no longer available in the block.

This is the simplest test I can make to demonstrate the issue (which currently fails on 3.0-stable)...

class SafeBufferTest < ActiveSupport::TestCase
  def setup
    @buffer =
  test "Should not break gsub block form match variables" do
    @buffer << 'matchme'
    @buffer.gsub(/(matchme)/) { assert_equal 'matchme', $1 }

So far it's know that this breaks escape_javascript since it uses $1 matcher (see #1553 ). I presume there may be others also affected.

Investigating what a good fix would be.. seems awful heavy handed to have to capture and reset all those variables, but on the other hand, its bad form to leave a ruby core method different from it's documented behaviour.

Ruby on Rails member

There is not much we can do here. Maybe we should get rid of gsub altogether. /cc @tenderlove


Yes, you're right. I've investigated further and this one seems to cross the border into ruby internals. Not 100% sure but might even be considered a ruby bug: just the act of chaining up the inheritance hierarchy with super causes "last match" globs ($1, $2 etc) to be lost from the caller's context.

Action? Removing gsub support for SafeBuffer is one, but the ramifications may be quite widespread. Or just document the modified behaviour of gsub with a SafeBuffer?

Ruby on Rails member

IMO docco is the best solution. We can't remove gsub, and even a regular subclass of string that just calls super will break:

irb(main):001:0> class X < String; def gsub(*args); super; end end
=> nil
irb(main):002:0>'hello').gsub(/(l)/) { $1 + 'm' }
NoMethodError: undefined method `+' for nil:NilClass
    from (irb):2:in `block in irb_binding'
    from (irb):1:in `gsub'
    from (irb):1:in `gsub'
    from (irb):2
    from /Users/aaron/.local/bin/irb:12:in `<main>'

A more serious monkeypatch could solve this, although it may have unfortunate performance implications. If you do something like this:

def gsub(rx, *args)
  super {|s| s =~ rx; yield}

you should be able to properly set the magic variables before running the block.


+1 can't use the Rack::Utils.escape on SafeBuffer now

ree-1.8.7-2010.02 > Rack::Utils.escape('>')
 => "%3E" 

ree-1.8.7-2010.02 > Rack::Utils.escape('>'))
NoMethodError: undefined method `bytesize' for nil:NilClass

I've pushed a suggested fix on #1622 .. (update: and just yanked it because it's no good)


(per @dmathieu) - best explanation/discussion I've seen of why the magic matching variables break when you attempt to intercept gsub (and why it's nigh impossible to "just fix it")


rails_autolink is affected Line #59 turns text into safebuffer, Line #85 and #117 invoke gsub{} on it.

@jake3030 jake3030 pushed a commit to jake3030/rails that referenced this issue Jun 28, 2011
Christos Zisopoulos Fix for Integration::Session follow_redirect! headers['location'] bug…
… with Rack [#1555 state:resolved]

Signed-off-by: Joshua Peek <>

@janx line 59 in rails_autolink does not return a safebuffer. It uses #to_str, which transforms any safebuffer into a string.

ruby-1.9.2-p180 :004 > "test".html_safe.class
 => ActiveSupport::SafeBuffer 
ruby-1.9.2-p180 :006 > "test".html_safe.to_str.class
=> String 

Therefore it should not have this gsub problem.


@dmathieu You're right. I gave wrong link/lines, the problem cause of our app was lib/rails_autolink.rb in commit fe9d2c7bb4e099a70cdb5419f4b2068e7555cfb0 (v1.0.1), so it's already fixed in edge. Thanks.


I think we should perhaps, either :

  • Send a warning for every safebuffer which uses an unsafe method with a block, saying it might not work.
  • Raise an exception on SafeBuffer#gsub when there is a block.

The second solution seems better to me.
@josevalim, @tenderlove, what do you think ?


Agreed. We need to have this raise an exception at the very least.

<%= render('layouts/application.html.erb').gsub(/\{\{\s*([a-z0-9_]+)\s*\}\}/i) do |match|
  respond_to?("#{$1}_for_cobranding") ? send("#{$1}_for_cobranding") : match
end %>

Simple code like this, which I use to do different branding in layouts, just does not work anymore.

A simple fix is just to:

<%= render('layouts/application.html.erb').to_str.gsub(/\{\{\s*([a-z0-9_]+)\s*\}\}/i) do |match|
  respond_to?("#{$1}_for_cobranding") ? send("#{$1}_for_cobranding") : match
end.html_safe %>

I have disabled gsub and sub from safe buffers. See #2248


I just want to point out this change breaks lib/mail/body.rb's delivering html email templates as they're encoded with:

def to_lf
    gsub(/\n|\r\n|\r/) { "\n" }

That's going to bite a few people when they find out their emails are all &lt;table&gt;


yes, but there's no way to detect whether the block uses $* vars or not.


Basically, gsub shouldn't be used on safe buffers because you'll have unexpected behaviors with block.
That's why my patch removes it.

After, as it works as long as you don't use $* vars, it might be enough to not remove the feature.
Then both my PR and this issue can be closed ...
But I think that's for a core committer to decide.
cc @josevalim @spastorino


This also breaks things like CGI.escapeHTML and, and their decoding equivalents.

I think if you're going to modify behaviour from ruby-core, it should raise an exception rather than just return nil for $1.


@steveh : that's what my PR, #2248 does.


@dmathieu Ah, managed to miss that completely. Thanks, will watch that PR instead.


Just spent way too long dealing with this issue myself realizing Rails was borking gsub.

My vote is to raise an exception if a block is used if core doesn't want to remove it from the list of unsafe methods. In the mean time, I have to force standard gsub usage with { }


@akaspick : it can't be removed from the list of unsafe methods. Because it's not a safe method. It'll lead to even more confusion.
A cleaner way to do it is :

my_text.to_str.gsub {}

The gsub part of this affected me, by way of HTTParty. It was totally unexpected and very difficult to track down.

@josevalim josevalim closed this in e9f48cd Sep 8, 2011
@josevalim josevalim closed this in b4a6e2f Sep 8, 2011

So gsub isnt' allowed on a safe buffer anymore. So if you want to take a string and html_escape it, and THEN call gsub on it, and mark the results as html_safe.... how the heck do you do it?

Since calling html_escape will result in an html_safe string that exhibits this bug, here's one way:

string =  string ))  # html escaped, but not a SafeBuffer
string.gsub(....).html_safe  # obviously it's your responsibility to make sure whatever you are sub'ing in is safe, as any other place you explciitly call #html_safe to make a SafeBuffer. 

Is there any better way to do this? This definitely has resulted in some confusing situations here. This use case -- wanting to take an un-safe string, html-escape everything in it, and then gsub on it... is THAT crazy of a case, is it? The solution here still doesn't get people out of having to do crazy things like above, although at least it raises instead of failing mysteriously, giving you a clue you have to do something crazy.


Yes, there's a much better way to do this :

string = html_escape(string).to_str  # Not a SafeBuffer
string.gsub(....).html_safe  # obviously it's your responsibility to make sure whatever you are sub'ing in is safe, as any other place you explciitly call #html_safe to make a SafeBuffer. 

I also just spent a good deal of my day tracking down this bug.

Issue #1555 was resolved by pull request #2248 which raises an exception for these methods, but this change was reverted on the same day for some reason. I think raising an exception when called with a block would be far more sensible than knowingly letting it run in a broken state?

@jcamenisch jcamenisch referenced this issue in lucaspiller/plain-david Aug 28, 2014

Convert html input to string before processing #5


Just ran into this problem myself, and lucked into finding this thread. Wanted to add one important note for other travelers.

The solution provided by @dmathieu...

my_text.to_str.gsub {}

worked for me, however note that you must use to_str rather than to_s.

SafeBuffer inherits from String and overrides to_sto return self, but does not touch to_str, which is why it works in this case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment