New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash in mrb_str_lines #3359

Closed
clayton-shopify opened this Issue Dec 20, 2016 · 12 comments

Comments

Projects
None yet
4 participants
@clayton-shopify
Contributor

clayton-shopify commented Dec 20, 2016

The following input causes a crash:

("\n" * 5000).lines { break }

This issue was reported by https://hackerone.com/bee13oy

@EiNSTeiN-

This comment has been minimized.

Show comment
Hide comment
@EiNSTeiN-

EiNSTeiN- Jan 9, 2017

Contributor

The crash occurs in mrb_class but this seems to be a red herring. I fixed the initial bad dereference like this:

diff --git a/include/mruby/class.h b/include/mruby/class.h
index ce953af..1af5dc5 100644
--- a/include/mruby/class.h
+++ b/include/mruby/class.h
@@ -45,6 +45,7 @@ mrb_class(mrb_state *mrb, mrb_value v)
   case MRB_TT_CPTR:
     return mrb->object_class;
   case MRB_TT_ENV:
+  case MRB_TT_UNDEF:
     return NULL;
   default:
     return mrb_obj_ptr(v)->c;

It occured because v is a MRB_TT_UNDEF.

Digging further it seems the root cause is mrb_method_missing ends up calling mrb_any_to_s(mrb, self) with self being a MRB_TT_UNDEF which ends up calling mrb_obj_classname with a NULL class.

Contributor

EiNSTeiN- commented Jan 9, 2017

The crash occurs in mrb_class but this seems to be a red herring. I fixed the initial bad dereference like this:

diff --git a/include/mruby/class.h b/include/mruby/class.h
index ce953af..1af5dc5 100644
--- a/include/mruby/class.h
+++ b/include/mruby/class.h
@@ -45,6 +45,7 @@ mrb_class(mrb_state *mrb, mrb_value v)
   case MRB_TT_CPTR:
     return mrb->object_class;
   case MRB_TT_ENV:
+  case MRB_TT_UNDEF:
     return NULL;
   default:
     return mrb_obj_ptr(v)->c;

It occured because v is a MRB_TT_UNDEF.

Digging further it seems the root cause is mrb_method_missing ends up calling mrb_any_to_s(mrb, self) with self being a MRB_TT_UNDEF which ends up calling mrb_obj_classname with a NULL class.

@clayton-shopify

This comment has been minimized.

Show comment
Hide comment
@clayton-shopify

clayton-shopify Jan 10, 2017

Contributor

Another thing I noticed is that lines should terminate after the first break but it seems to carry on.

Contributor

clayton-shopify commented Jan 10, 2017

Another thing I noticed is that lines should terminate after the first break but it seems to carry on.

@clayton-shopify

This comment has been minimized.

Show comment
Hide comment
@clayton-shopify

clayton-shopify Jan 11, 2017

Contributor

It looks like self gets garbage collected halfway through mrb_str_lines.

Contributor

clayton-shopify commented Jan 11, 2017

It looks like self gets garbage collected halfway through mrb_str_lines.

@matz

This comment has been minimized.

Show comment
Hide comment
@matz

matz Jan 12, 2017

Member

The source of the problem is mrb_yield_argv() does not support break from the block.
String#lines (with a block) should be implemented in Ruby, but I don't have an efficient implementation yet.

Member

matz commented Jan 12, 2017

The source of the problem is mrb_yield_argv() does not support break from the block.
String#lines (with a block) should be implemented in Ruby, but I don't have an efficient implementation yet.

@clayton-shopify

This comment has been minimized.

Show comment
Hide comment
@clayton-shopify

clayton-shopify Jan 12, 2017

Contributor

OK. Do you have a suggestion for how to stop the break from corrupting program state?

Contributor

clayton-shopify commented Jan 12, 2017

OK. Do you have a suggestion for how to stop the break from corrupting program state?

@matz

This comment has been minimized.

Show comment
Hide comment
@matz

matz Jan 12, 2017

Member

Not yet, sorry.

Member

matz commented Jan 12, 2017

Not yet, sorry.

@ksss

This comment has been minimized.

Show comment
Hide comment
@ksss

ksss Jan 20, 2017

Contributor

I tried to write String#each_line by Ruby.

class String
  def each_line(rs = $/, &block)
    return to_enum(:each_line, rs) unless block
    return block.call(self) if rs.nil?

    unless rs.respond_to?(:to_str)
      raise TypeError, "no implicit conversion of #{rs.class} into String"
    end
    rs_str = rs.to_str
    unless rs_str.kind_of?(String)
      raise TypeError, "no implicit conversion of #{rs.class} into String"
    end

    paragraph_mode = false
    if rs_str.empty?
      rs_str = "\n\n"
      paragraph_mode = true
    end

    off = i = 0
    this = dup
    len = length
    rs_len = rs_str.length
    while i < len
      i = index(rs_str, off) || len
      i += rs_len
      if paragraph_mode
        i += 1 while this[i, 1] == "\n"
      end
      block.call(this[off, i - off])
      off = i
    end

    self
  end
end

You know, It was slow about x10 over.

def benchmark
  t = Time.now
  yield
  puts "#{Time.now - t}s"
end

str = "\n" * 30000

benchmark do
  str.lines{}
end

benchmark do
  str.each_line{}
end
0.008395s
0.094594s

But, It works.

1.times do
  ("\n" * 5000).each_line { |i| p i; break }
end
# => "\n"

String#lines with block is deprecated method in CRuby. (Probably from 2.0.0 https://bugs.ruby-lang.org/issues/6670)

$ ruby -we '("\n").lines{}'
-e:1: warning: passing a block to String#lines is deprecated

This is why I implement String#each_line.
And also String#lines will be able to write like this.

class String
  def lines(rs = $/, &block)
    if block
      each_line(rs, &block)
    else
      each_line(rs).to_a
    end
  end
end

I hope this helps. Thanks.

Contributor

ksss commented Jan 20, 2017

I tried to write String#each_line by Ruby.

class String
  def each_line(rs = $/, &block)
    return to_enum(:each_line, rs) unless block
    return block.call(self) if rs.nil?

    unless rs.respond_to?(:to_str)
      raise TypeError, "no implicit conversion of #{rs.class} into String"
    end
    rs_str = rs.to_str
    unless rs_str.kind_of?(String)
      raise TypeError, "no implicit conversion of #{rs.class} into String"
    end

    paragraph_mode = false
    if rs_str.empty?
      rs_str = "\n\n"
      paragraph_mode = true
    end

    off = i = 0
    this = dup
    len = length
    rs_len = rs_str.length
    while i < len
      i = index(rs_str, off) || len
      i += rs_len
      if paragraph_mode
        i += 1 while this[i, 1] == "\n"
      end
      block.call(this[off, i - off])
      off = i
    end

    self
  end
end

You know, It was slow about x10 over.

def benchmark
  t = Time.now
  yield
  puts "#{Time.now - t}s"
end

str = "\n" * 30000

benchmark do
  str.lines{}
end

benchmark do
  str.each_line{}
end
0.008395s
0.094594s

But, It works.

1.times do
  ("\n" * 5000).each_line { |i| p i; break }
end
# => "\n"

String#lines with block is deprecated method in CRuby. (Probably from 2.0.0 https://bugs.ruby-lang.org/issues/6670)

$ ruby -we '("\n").lines{}'
-e:1: warning: passing a block to String#lines is deprecated

This is why I implement String#each_line.
And also String#lines will be able to write like this.

class String
  def lines(rs = $/, &block)
    if block
      each_line(rs, &block)
    else
      each_line(rs).to_a
    end
  end
end

I hope this helps. Thanks.

@clayton-shopify

This comment has been minimized.

Show comment
Hide comment
@clayton-shopify

clayton-shopify Feb 1, 2017

Contributor

Implementing lines in Ruby won't solve the problem because the are other things that call mrb_yield_with_class, for instance instance_exec, Class.initialize and Kernel.initialize. Passing a block containing break to any of these can also cause a crash.

Contributor

clayton-shopify commented Feb 1, 2017

Implementing lines in Ruby won't solve the problem because the are other things that call mrb_yield_with_class, for instance instance_exec, Class.initialize and Kernel.initialize. Passing a block containing break to any of these can also cause a crash.

@matz

This comment has been minimized.

Show comment
Hide comment
@matz

matz Feb 4, 2017

Member

@clayton-shopify FYI, currently String#lines is an only C implemented method that calls mrb_yield repeatedly. Other methods do not have the problem. Of course, we need to address the root cause for the future use-case.

Member

matz commented Feb 4, 2017

@clayton-shopify FYI, currently String#lines is an only C implemented method that calls mrb_yield repeatedly. Other methods do not have the problem. Of course, we need to address the root cause for the future use-case.

@clayton-shopify

This comment has been minimized.

Show comment
Hide comment
@clayton-shopify

clayton-shopify Feb 6, 2017

Contributor

@matz There seems to be trouble in other places that use mrb_yield though. These inputs all demonstrate segfaults:

loop do
  Struct.new { break }
  break
end
puts [0].to_s
loop do
  instance_exec { break }
  break
end
puts [0].to_s
loop do
  ObjectSpace.each_object { break }
  break
end
puts [0].to_s
loop do
  Class.initialize { break }
  break
end
puts [0].to_s
loop do
  Module.initialize { break }
  break
end
puts [0].to_s
Contributor

clayton-shopify commented Feb 6, 2017

@matz There seems to be trouble in other places that use mrb_yield though. These inputs all demonstrate segfaults:

loop do
  Struct.new { break }
  break
end
puts [0].to_s
loop do
  instance_exec { break }
  break
end
puts [0].to_s
loop do
  ObjectSpace.each_object { break }
  break
end
puts [0].to_s
loop do
  Class.initialize { break }
  break
end
puts [0].to_s
loop do
  Module.initialize { break }
  break
end
puts [0].to_s

@matz matz closed this in 77c2aa7 Apr 19, 2017

matz added a commit that referenced this issue Apr 19, 2017

Use trampoline technique for `instance_exec`; ref #3359
A new function `mrb_yield_cont()` is provided. You have to call it
at the end of a C defined method, e.g. `return mrb_yield_cont()`.
@matz

This comment has been minimized.

Show comment
Hide comment
@matz

matz Apr 19, 2017

Member

Those programs now do not crash. But some of them behave differently.
I will investigate.

Member

matz commented Apr 19, 2017

Those programs now do not crash. But some of them behave differently.
I will investigate.

@matz

This comment has been minimized.

Show comment
Hide comment
@matz

matz Jun 14, 2017

Member

I think I got the idea to solve the issue, but it would take some time, probably after 1.3 release.

Member

matz commented Jun 14, 2017

I think I got the idea to solve the issue, but it would take some time, probably after 1.3 release.

@matz matz reopened this Jun 14, 2017

@matz matz closed this in d4d99dd Jun 16, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment