Copy encoding flags when copying a regex [Bug #20039]
* 🐛 Fixes [Bug #20039](https://bugs.ruby-lang.org/issues/20039)

When a Regexp is initialized with another Regexp, we simply copy the
properties from the original. However, the encoding flags on the
original were not being copied. This caused an issue when the original
contained multibyte characters and was matched against a US-ASCII
string: without the fixed-encoding flag (`KCODE_FIXED`) carried over to
the new Regexp, the match would raise an error. See the included test
for an example.

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
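
The behavior described above can be checked from Ruby without reading `re.c`. The following is a minimal sketch, not part of the commit: `encoding_flags_preserved?` is a hypothetical helper introduced here for illustration, and it assumes that `Regexp#fixed_encoding?` reflects the `KCODE_FIXED` flag. The pattern and subject string are the ones used in the included test.

```ruby
# Hypothetical helper (not part of the commit): reports whether copying a
# Regexp with Regexp.new keeps its encoding and its fixed-encoding flag.
def encoding_flags_preserved?(original)
  copy = Regexp.new(original)
  copy.encoding == original.encoding &&
    copy.fixed_encoding? == original.fixed_encoding?
end

# Same pattern and subject as the included test.
original = Regexp.new("\u2018hello\u2019".encode("UTF-8"))
ascii_subject = "".encode("US-ASCII")

# The multibyte literal pins the original pattern to UTF-8.
puts original.fixed_encoding?

# On a Ruby that includes this change, the copy keeps the same flag.
puts encoding_flags_preserved?(original)

# With the flag carried over, matching the copy against a US-ASCII string
# is expected to return false rather than raise.
puts ascii_subject.match?(Regexp.new(original))
```

On a build without the fix, the second check would be expected to report `false` and the final match to raise, which is the failure the test guards against.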
dustinbrownman and nobu committed Dec 7, 2023
1 parent 1ace218 commit d89280e
Showing 2 changed files with 12 additions and 0 deletions.
2 changes: 2 additions & 0 deletions re.c
@@ -3853,6 +3853,8 @@ reg_copy(VALUE copy, VALUE orig)
     RB_OBJ_WRITE(copy, &RREGEXP(copy)->src, RREGEXP(orig)->src);
     RREGEXP_PTR(copy)->timelimit = RREGEXP_PTR(orig)->timelimit;
     rb_enc_copy(copy, orig);
+    FL_SET_RAW(copy, FL_TEST_RAW(orig, KCODE_FIXED|REG_ENCODING_NONE));
+
     return copy;
 }

10 changes: 10 additions & 0 deletions test/ruby/test_regexp.rb
@@ -1936,6 +1936,16 @@ def test_bug_19476 # [Bug #19476]
     assert_equal("123456789".match(/(?:x?\dx?){2,}/)[0], "123456789")
   end

+  def test_encoding_flags_are_preserved_when_initialized_with_another_regexp
+    re = Regexp.new("\u2018hello\u2019".encode("UTF-8"))
+    str = "".encode("US-ASCII")
+
+    assert_nothing_raised do
+      str.match?(re)
+      str.match?(Regexp.new(re))
+    end
+  end
+
   def test_bug_19537 # [Bug #19537]
     str = 'aac'
     re = '^([ab]{1,3})(a?)*$'