Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StringScanner.new should use StringValue #696

Closed
rubychan opened this issue Feb 20, 2011 · 7 comments
Closed

StringScanner.new should use StringValue #696

rubychan opened this issue Feb 20, 2011 · 7 comments

Comments

@rubychan
Copy link

According to the original C implementation, StringScanner.new does not copy the string, but only coerce it:

static VALUE
strscan_initialize(int argc, VALUE *argv, VALUE self)
{
    struct strscanner *p;
    VALUE str, need_dup;

    Data_Get_Struct(self, struct strscanner, p);
    rb_scan_args(argc, argv, "11", &str, &need_dup);
    StringValue(str);
    p->str = str;

    return self;
}

So, the method in lib/strscan.rb should be:

def initialize(string, dup=false)
  @string = StringValue(string)
  reset_state
end

In addition, #string= and #concat should use StringValue to coerce the string. #concat may not need it since String#<< already uses StringValue, but it seems cleaner to apply it there, too.

The change speeds up this benchmark quite a bit:

def Bench.run
  a = ''
  1_000_000.times do
    StringScanner.new a
    a << '.'
  end
end

Runs for minutes without the fix.

@kronos
Copy link
Member

kronos commented Feb 20, 2011

Your variant is wrong because it fails specs where string is a subclass of String class. But this one works well:

def initialize(string, dup=false)
  @string = string.to_str
  reset_state
end

After doing this, the bench with rubinius runs faster than with 1.9.2/1.8.7

@rubychan
Copy link
Author

Your variant is wrong

Isn't MRI doing the same?

@rubychan
Copy link
Author

In MRI 1.9.2 and 1.8.7:

StringScanner.new(Class.new(String).new('test')).string.class == String
=> false 

@kronos
Copy link
Member

kronos commented Feb 20, 2011

Really, I don't know much about MRI. I just took your variant and ran specs)

@rubychan
Copy link
Author

It seems there's a spec missing for StringScanner#string. Something like this:

describe "StringScanner#string" do
  ...
  it "returns an instance of the subclass when passed a String subclass" do
    cls = Class.new(String)
    sub = cls.new("abc")

    s = StringScanner.new(sub)

    string = s.string
    string.should be_an_instance_of(cls)
    string.should_not be_an_instance_of(String)
  end
end

@rubychan
Copy link
Author

How about this: http://gist.github.com/836007

@evanphx
Copy link
Member

evanphx commented Feb 22, 2011

Use coersion if possible in StringScanner. Closed by 7cbb344

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants