Skip to content
This repository

SafeBuffer#gsub breaks block form match variables $1, $2, $`, $&, and $’ #1555

Closed
tardate opened this Issue June 08, 2011 · 25 comments
Paul Gallagher

commit 53a2c0b introduced a change to ensure gsub returns an unsafe string.

When gsub is used in block form, variables such as $1, $2, $`, $&, and $’ will be set appropriately. Now it seems that SafeBuffer's intervention loses the context and these variables are no longer available in the block.

This is the simplest test I can make to demonstrate the issue (which currently fails on 3.0-stable)...

class SafeBufferTest < ActiveSupport::TestCase
  def setup
    @buffer = ActiveSupport::SafeBuffer.new
  end
  [..]
  test "Should not break gsub block form match variables" do
    @buffer << 'matchme'
    @buffer.gsub(/(matchme)/) { assert_equal 'matchme', $1 }
  end

So far it's know that this breaks escape_javascript since it uses $1 matcher (see #1553 ). I presume there may be others also affected.

Investigating what a good fix would be.. seems awful heavy handed to have to capture and reset all those variables, but on the other hand, its bad form to leave a ruby core method different from it's documented behaviour.

José Valim
Owner

There is not much we can do here. Maybe we should get rid of gsub altogether. /cc @tenderlove

Paul Gallagher

Yes, you're right. I've investigated further and this one seems to cross the border into ruby internals. Not 100% sure but might even be considered a ruby bug: just the act of chaining up the inheritance hierarchy with super causes "last match" globs ($1, $2 etc) to be lost from the caller's context.

Action? Removing gsub support for SafeBuffer is one, but the ramifications may be quite widespread. Or just document the modified behaviour of gsub with a SafeBuffer?

Aaron Patterson
Owner

IMO docco is the best solution. We can't remove gsub, and even a regular subclass of string that just calls super will break:

irb(main):001:0> class X < String; def gsub(*args); super; end end
=> nil
irb(main):002:0> X.new('hello').gsub(/(l)/) { $1 + 'm' }
NoMethodError: undefined method `+' for nil:NilClass
    from (irb):2:in `block in irb_binding'
    from (irb):1:in `gsub'
    from (irb):1:in `gsub'
    from (irb):2
    from /Users/aaron/.local/bin/irb:12:in `<main>'
irb(main):003:0>
Nathan Weizenbaum
nex3 commented June 08, 2011

A more serious monkeypatch could solve this, although it may have unfortunate performance implications. If you do something like this:

def gsub(rx, *args)
  super {|s| s =~ rx; yield}
end

you should be able to properly set the magic variables before running the block.

Evgeniy Dolzhenko

+1 can't use the Rack::Utils.escape on SafeBuffer now

ree-1.8.7-2010.02 > Rack::Utils.escape('>')
 => "%3E" 

ree-1.8.7-2010.02 > Rack::Utils.escape(ActiveSupport::SafeBuffer.new('>'))
NoMethodError: undefined method `bytesize' for nil:NilClass
Paul Gallagher

I've pushed a suggested fix on #1622 .. (update: and just yanked it because it's no good)

Paul Gallagher

(per @dmathieu) http://www.ruby-forum.com/topic/198458 - best explanation/discussion I've seen of why the magic matching variables break when you attempt to intercept gsub (and why it's nigh impossible to "just fix it")

Jan
janx commented June 19, 2011

rails_autolink is affected https://github.com/tenderlove/rails_autolink/blob/master/lib/rails_autolink/helpers.rb: Line #59 turns text into safebuffer, Line #85 and #117 invoke gsub{} on it.

Jake Varghese jake3030 referenced this issue from a commit in jake3030/rails December 10, 2008
Christos Zisopoulos Fix for Integration::Session follow_redirect! headers['location'] bug…
… with Rack [#1555 state:resolved]

Signed-off-by: Joshua Peek <josh@joshpeek.com>
69387ce
Damien Mathieu
Collaborator

@janx line 59 in rails_autolink does not return a safebuffer. It uses #to_str, which transforms any safebuffer into a string.

ruby-1.9.2-p180 :004 > "test".html_safe.class
 => ActiveSupport::SafeBuffer 
ruby-1.9.2-p180 :006 > "test".html_safe.to_str.class
=> String 

Therefore it should not have this gsub problem.

Jan
janx commented July 03, 2011

@dmathieu You're right. I gave wrong link/lines, the problem cause of our app was lib/rails_autolink.rb in commit fe9d2c7 (v1.0.1), so it's already fixed in edge. Thanks.

Damien Mathieu
Collaborator

I think we should perhaps, either :

  • Send a warning for every safebuffer which uses an unsafe method with a block, saying it might not work.
  • Raise an exception on SafeBuffer#gsub when there is a block.

The second solution seems better to me.
@josevalim, @tenderlove, what do you think ?

Tony Coconate

Agreed. We need to have this raise an exception at the very least.

<%= render('layouts/application.html.erb').gsub(/\{\{\s*([a-z0-9_]+)\s*\}\}/i) do |match|
  respond_to?("#{$1}_for_cobranding") ? send("#{$1}_for_cobranding") : match
end %>

Simple code like this, which I use to do different branding in layouts, just does not work anymore.

A simple fix is just to:

<%= render('layouts/application.html.erb').to_str.gsub(/\{\{\s*([a-z0-9_]+)\s*\}\}/i) do |match|
  respond_to?("#{$1}_for_cobranding") ? send("#{$1}_for_cobranding") : match
end.html_safe %>
Damien Mathieu dmathieu referenced this issue from a commit July 25, 2011
Commit has since been removed from the repository and is no longer available.
Damien Mathieu
Collaborator

I have disabled gsub and sub from safe buffers. See #2248

Tommy Sullivan

I just want to point out this change breaks lib/mail/body.rb's delivering html email templates as they're encoded with:

def to_lf
    gsub(/\n|\r\n|\r/) { "\n" }
end

That's going to bite a few people when they find out their emails are all &lt;table&gt;

Damien Mathieu
Collaborator

yes, but there's no way to detect whether the block uses $* vars or not.

Damien Mathieu
Collaborator

Basically, gsub shouldn't be used on safe buffers because you'll have unexpected behaviors with block.
That's why my patch removes it.

After, as it works as long as you don't use $* vars, it might be enough to not remove the feature.
Then both my PR and this issue can be closed ...
But I think that's for a core committer to decide.
cc @josevalim @spastorino

Steve Hoeksema

This also breaks things like CGI.escapeHTML and HTMLEntities.new.encode, and their decoding equivalents.

I think if you're going to modify behaviour from ruby-core, it should raise an exception rather than just return nil for $1.

Damien Mathieu
Collaborator

@steveh : that's what my PR, #2248 does.

Steve Hoeksema

@dmathieu Ah, managed to miss that completely. Thanks, will watch that PR instead.

Andrew Kaspick

Just spent way too long dealing with this issue myself realizing Rails was borking gsub.

My vote is to raise an exception if a block is used if core doesn't want to remove it from the list of unsafe methods. In the mean time, I have to force standard gsub usage with String.new(my_text).gsub { }

Damien Mathieu
Collaborator

@akaspick : it can't be removed from the list of unsafe methods. Because it's not a safe method. It'll lead to even more confusion.
A cleaner way to do it is :

my_text.to_str.gsub {}
Joel "The Merciless" Meador

The gsub part of this affected me, by way of HTTParty. It was totally unexpected and very difficult to track down.

Damien Mathieu dmathieu referenced this issue from a commit September 04, 2011
Commit has since been removed from the repository and is no longer available.
Damien Mathieu dmathieu referenced this issue from a commit September 07, 2011
Commit has since been removed from the repository and is no longer available.
Damien Mathieu dmathieu referenced this issue from a commit September 08, 2011
Commit has since been removed from the repository and is no longer available.
José Valim josevalim closed this in e9f48cd September 08, 2011
José Valim josevalim closed this in b4a6e2f September 08, 2011
Jonathan Rochkind

So gsub isnt' allowed on a safe buffer anymore. So if you want to take a string and html_escape it, and THEN call gsub on it, and mark the results as html_safe.... how the heck do you do it?

Since calling html_escape will result in an html_safe string that exhibits this bug, here's one way:

string = String.new(html_escape(  string ))  # html escaped, but not a SafeBuffer
string.gsub(....).html_safe  # obviously it's your responsibility to make sure whatever you are sub'ing in is safe, as any other place you explciitly call #html_safe to make a SafeBuffer. 

Is there any better way to do this? This definitely has resulted in some confusing situations here. This use case -- wanting to take an un-safe string, html-escape everything in it, and then gsub on it... is THAT crazy of a case, is it? The solution here still doesn't get people out of having to do crazy things like above, although at least it raises instead of failing mysteriously, giving you a clue you have to do something crazy.

Damien Mathieu
Collaborator

Yes, there's a much better way to do this :

string = html_escape(string).to_str  # Not a SafeBuffer
string.gsub(....).html_safe  # obviously it's your responsibility to make sure whatever you are sub'ing in is safe, as any other place you explciitly call #html_safe to make a SafeBuffer. 
Chris Wilson twopoint718 referenced this issue in alexreisner/geocoder February 20, 2013
Closed

Prevent calling CGI.escape on a SafeBuffer #399

Kirill Kalachev dhampik referenced this issue in JangoSteve/remotipart July 16, 2013
Open

Bug: Remote javascript with &quot; is not executed #89

Markus Koller

I also just spent a good deal of my day tracking down this bug.

Issue #1555 was resolved by pull request #2248 which raises an exception for these methods, but this change was reverted on the same day for some reason. I think raising an exception when called with a block would be far more sensible than knowingly letting it run in a broken state?

Lewis Marshall lmars referenced this issue from a commit January 30, 2014
Commit has since been removed from the repository and is no longer available.
Lewis Marshall lmars referenced this issue from a commit January 30, 2014
Commit has since been removed from the repository and is no longer available.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.