Skip to content

rb_str_resize may be called on a non-String type #342

@nirvdrum

Description

@nirvdrum

Revision 797ce53 introduced a change to resize strings while parsing in order to improve performance in some important cases. Unfortunately, it looks like it also attempts to resize types that aren't strings.

For example, the json_string_matching_test class that ships in the repo demonstrates a situation where internally, *result would be a Time object, not a String. I think through a remarkable set of coincidences, RSTRING_LEN called on most non-String objects happens to return 0. And rb_str_resize on a non-String object sees that object as having a length 0. Attempting to resize a 0-length string to 0 happens to head down a path that doesn't cause the process to segfault (any value > 0 would).

Assuming my analysis is correct, I think the stated performance objective can be maintained with much more safety by doing a type check:

if (RB_TYPE_P(*result, T_STRING)) {
  rb_str_resize(*result, RSTRING_LEN(*result));
}

If there's buy-in for that, I'll pull together a pull request.

I just want to state that what's in the extension does happen to work in MRI, but I think it's exploiting a set of internals (e.g., object layout) that are subject to change and could cause the whole thing to crash. If this is an acceptable risk for some additional performance, please feel free to close the issue out.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions