Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MatchData#offset return utf8 character offset #2581

Open
wants to merge 1 commit into
base: trunk
Choose a base branch
from

Conversation

jarkonik
Copy link

Fixes #2580

.haystack
.char_indices()
.position(|n| n.0 >= self.region.offset())
.unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this unwrap is firing which is causing the Crash workflow to fail. Commit 0c99a8c.

This PR failing the crash worfklow with a panic in impl Drop for Guard is a bit concerning and appears to be a lurking bit of unsoundness.

https://github.com/artichoke/artichoke/actions/runs/5133653787/jobs/9294514151?pr=2581

thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', /home/runner/work/artichoke/artichoke/artichoke-backend/src/extn/core/matchdata/mod.rs:315:18
stack backtrace:
   0: rust_begin_unwind
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:579:5
   1: core::panicking::panic_fmt
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/panicking.rs:64:14
   2: core::panicking::panic
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/panicking.rs:114:5
   3: core::option::Option<T>::unwrap
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/option.rs:942:21
   4: artichoke_backend::extn::core::matchdata::MatchData::offset
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/extn/core/matchdata/mod.rs:311:26
   5: artichoke_backend::extn::core::matchdata::MatchData::begin
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/extn/core/matchdata/mod.rs:179:35
   6: artichoke_backend::extn::core::matchdata::trampoline::begin
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/extn/core/matchdata/trampoline.rs:17:17
   7: artichoke_backend::extn::core::matchdata::mruby::matchdata_begin
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/extn/core/matchdata/mruby.rs:49:18
   8: mrb_vm_exec
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:1792:18
   9: mrb_vm_run
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:1282:12
  10: mrb_top_run
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:3115:7
  11: mrb_load_exec
             at /home/runner/work/artichoke/artichoke/artichoke-backend/mrbgems/mruby-compiler/core/parse.y:6918:7
  12: mrb_load_nstring_cxt
             at /home/runner/work/artichoke/artichoke/artichoke-backend/mrbgems/mruby-compiler/core/parse.y:6990:10
  13: <artichoke_backend::sys::protect::Eval as artichoke_backend::sys::protect::Protect>::run
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/sys/protect.rs:115:9
  14: protect_body
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/mrbgems/mruby-error/src/exception.c:13:10
  15: mrb_protect_error
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:431:14
  16: mrb_protect
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/mrbgems/mruby-error/src/exception.c:20:10
  17: artichoke_backend::sys::protect::protect
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/sys/protect.rs:46:17
  18: artichoke_backend::sys::protect::eval
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/sys/protect.rs:25:5
  19: artichoke_backend::eval::<impl artichoke_core::eval::Eval for artichoke_backend::artichoke::Artichoke>::eval::{{closure}}
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/eval.rs:27:42
  20: artichoke_backend::artichoke::Artichoke::with_ffi_boundary
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/artichoke.rs:98:26
  21: artichoke_backend::eval::<impl artichoke_core::eval::Eval for artichoke_backend::artichoke::Artichoke>::eval
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/eval.rs:27:13
  22: artichoke_backend::load::<impl artichoke_core::load::LoadSources for artichoke_backend::artichoke::Artichoke>::load_source
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/load.rs:109:9
  23: artichoke_backend::extn::core::kernel::require::load
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/extn/core/kernel/require.rs:34:22
  24: artichoke_backend::extn::core::kernel::trampoline::load
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/extn/core/kernel/trampoline.rs:114:19
  25: artichoke_backend::extn::core::kernel::mruby::kernel_load
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/extn/core/kernel/mruby.rs:61:18
  26: mrb_vm_exec
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:1792:18
  27: mrb_vm_run
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:1282:12
  28: mrb_run
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:3100:10
  29: mrb_funcall_with_block
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:681:13
  30: mrb_funcall_argv
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:692:10
  31: <artichoke_backend::sys::protect::Funcall as artichoke_backend::sys::protect::Protect>::run
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/sys/protect.rs:91:13
  32: protect_body
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/mrbgems/mruby-error/src/exception.c:13:10
  33: mrb_protect_error
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:431:14
  34: mrb_protect
             at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/mrbgems/mruby-error/src/exception.c:20:10
  35: artichoke_backend::sys::protect::protect
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/sys/protect.rs:46:17
  36: artichoke_backend::sys::protect::funcall
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/sys/protect.rs:16:5
  37: <artichoke_backend::value::Value as artichoke_core::value::Value>::funcall::{{closure}}
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/value.rs:159:17
  38: artichoke_backend::artichoke::Artichoke::with_ffi_boundary
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/artichoke.rs:98:26
  39: <artichoke_backend::value::Value as artichoke_core::value::Value>::funcall
             at /home/runner/work/artichoke/artichoke/artichoke-backend/src/value.rs:158:13
  40: spec_runner::mspec::run
             at ./src/mspec.rs:93:18
  41: spec_runner::try_main
             at ./src/main.rs:172:24
  42: spec_runner::main
             at ./src/main.rs:110:18
  43: core::ops::function::FnOnce::call_once
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread 'main' panicked at 'Dropping Guard with no State', /home/runner/work/artichoke/artichoke/artichoke-backend/src/artichoke.rs:233:45
stack backtrace:
   0:     0x556f0000f28a - std::backtrace_rs::backtrace::libunwind::trace::ha9053a9a07ca49cb
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x556f0000f28a - std::backtrace_rs::backtrace::trace_unsynchronized::h9c2852a457ad564e
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x556f0000f28a - std::sys_common::backtrace::_print_fmt::h457936fbfaa0070f
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys_common/backtrace.rs:65:5
   3:     0x556f0000f28a - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h5779d7bf7f70cb0c
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x556f0003009e - core::fmt::write::h5a4baaff1bcd3eb5
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/fmt/mod.rs:1232:17
   5:     0x556f0000c875 - std::io::Write::write_fmt::h4bc1f301cb9e9cce
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/io/mod.rs:1684:15
   6:     0x556f0000f055 - std::sys_common::backtrace::_print::h5fcdc36060f177e8
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys_common/backtrace.rs:47:5
   7:     0x556f0000f055 - std::sys_common::backtrace::print::h54ca9458b876c8bf
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys_common/backtrace.rs:34:9
   8:     0x556f000106af - std::panicking::default_hook::{{closure}}::hbe471161c7664ed6
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:271:22
   9:     0x556f000103eb - std::panicking::default_hook::ha3500da57aa4ac4f
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:290:9
  10:     0x556f00010c58 - std::panicking::rust_panic_with_hook::h50c09d000dc561d2
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:692:13
  11:     0x556f00010b12 - std::panicking::begin_panic_handler::{{closure}}::h9e2b2176e00e0d9c
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:581:13
  12:     0x556f0000f6f6 - std::sys_common::backtrace::__rust_end_short_backtrace::h5739b8e512c09d02
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys_common/backtrace.rs:150:18
  13:     0x556f00010862 - rust_begin_unwind
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:579:5
  14:     0x556eff8c6163 - core::panicking::panic_fmt::hf33a1475b4dc5c3e
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/panicking.rs:64:14
  15:     0x556effad40b0 - <artichoke_backend::artichoke::Guard as core::ops::drop::Drop>::drop::{{closure}}::h03acbacb8665d386
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/src/artichoke.rs:233:45
  16:     0x556effb18290 - core::option::Option<T>::unwrap_or_else::h98c260c82ae7a891
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/option.rs:992:21
  17:     0x556effad3fba - <artichoke_backend::artichoke::Guard as core::ops::drop::Drop>::drop::h5c639a5a03628e50
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/src/artichoke.rs:233:21
  18:     0x556effb8d8eb - core::ptr::drop_in_place<artichoke_backend::artichoke::Guard>::h37be7690eb755a0a
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/ptr/mod.rs:490:1
  19:     0x556effc6e7d9 - artichoke_backend::extn::core::kernel::mruby::kernel_load::h8e83fce385d4f283
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/src/extn/core/kernel/mruby.rs:66:1
  20:     0x556effce0a8e - mrb_vm_exec
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:1792:18
  21:     0x556effcdd291 - mrb_vm_run
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:1282:12
  22:     0x556effcf16b5 - mrb_run
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:3100:10
  23:     0x556effcdb955 - mrb_funcall_with_block
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:681:13
  24:     0x556effcdba14 - mrb_funcall_argv
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:692:10
  25:     0x556effb518fd - <artichoke_backend::sys::protect::Funcall as artichoke_backend::sys::protect::Protect>::run::had268d8aff26327a
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/src/sys/protect.rs:91:13
  26:     0x556effd0696d - protect_body
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/mrbgems/mruby-error/src/exception.c:13:10
  27:     0x556effcdaac9 - mrb_protect_error
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/src/vm.c:431:14
  28:     0x556effd069d1 - mrb_protect
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/vendor/mruby/mrbgems/mruby-error/src/exception.c:20:10
  29:     0x556effb50c40 - artichoke_backend::sys::protect::protect::h2564e1bb47afdc74
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/src/sys/protect.rs:46:17
  30:     0x556effb50a47 - artichoke_backend::sys::protect::funcall::h63bc21215de2d8f1
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/src/sys/protect.rs:16:5
  31:     0x556effb39c55 - <artichoke_backend::value::Value as artichoke_core::value::Value>::funcall::{{closure}}::h1192b775fafb7460
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/src/value.rs:159:17
  32:     0x556effac52ec - artichoke_backend::artichoke::Artichoke::with_ffi_boundary::h84192cd4a228a955
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/src/artichoke.rs:98:26
  33:     0x556effb3987c - <artichoke_backend::value::Value as artichoke_core::value::Value>::funcall::hee37548fe29604bf
                               at /home/runner/work/artichoke/artichoke/artichoke-backend/src/value.rs:158:13
  34:     0x556eff9171ba - spec_runner::mspec::run::h893fe048f9a10b30
                               at /home/runner/work/artichoke/artichoke/spec-runner/src/mspec.rs:93:18
  35:     0x556eff918959 - spec_runner::try_main::h8200c89b51f585c3
                               at /home/runner/work/artichoke/artichoke/spec-runner/src/main.rs:172:24
  36:     0x556eff917a46 - spec_runner::main::h8e1a9ceafc1dd7c0
                               at /home/runner/work/artichoke/artichoke/spec-runner/src/main.rs:110:18
  37:     0x556eff9101bb - core::ops::function::FnOnce::call_once::hbfb58f9e389056b5
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/ops/function.rs:250:5
  38:     0x556eff90a0fe - std::sys_common::backtrace::__rust_begin_short_backtrace::h762524ad39c17153
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys_common/backtrace.rs:134:18
  39:     0x556eff91afc1 - std::rt::lang_start::{{closure}}::ha85c3fe14162c26b
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/rt.rs:166:18
  40:     0x556f0000749c - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::hd6efcd3bec896f2c
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/ops/function.rs:287:13
  41:     0x556f0000749c - std::panicking::try::do_call::hce04e543bb1f4cbb
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:487:40
  42:     0x556f0000749c - std::panicking::try::h3342dd4e1f680968
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:451:19
  43:     0x556f0000749c - std::panic::catch_unwind::h148ce1e59ac0cee7
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panic.rs:140:14
  44:     0x556f0000749c - std::rt::lang_start_internal::{{closure}}::h25f9dda2057a67fe
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/rt.rs:148:48
  45:     0x556f0000749c - std::panicking::try::do_call::h7caaaeaf9401650b
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:487:40
  46:     0x556f0000749c - std::panicking::try::he7d15285746cbbc2
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:451:19
  47:     0x556f0000749c - std::panic::catch_unwind::h89fb4f50c0301fe0
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panic.rs:140:14
  48:     0x556f0000749c - std::rt::lang_start_internal::h078acd489417d3c1
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/rt.rs:148:20
  49:     0x556eff91af9a - std::rt::lang_start::ha418d6f7c511ad91
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/rt.rs:165:17
  50:     0x556eff919ace - main
  51:     0x7f6189e73d90 - <unknown>
  52:     0x7f6189e73e40 - __libc_start_main
  53:     0x556eff8c67f5 - _start
  54:                0x0 - <unknown>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the offset isn't guaranteed to exist in the MatchData. The None case needs to be handled.

@lopopolo
Copy link
Member

lopopolo commented Jun 2, 2023

I believe this PR is blocked on #1909.

I'll try to work on it this weekend.

Copy link
Member

@lopopolo lopopolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the yak shave it's looking like this PR will turn into, but this part of the code needs love.

Regexp and MatchData were the first custom types added to Artichoke and I've developed better patterns for impling native APIs since these types were added.

If you're interested in digging in deep here, the spinoso-regexp crate is where I'd love to see some additional investment.

let offset = self.region.offset();
let offset = self
.haystack
.char_indices()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calling bstr directly here is buggy for a couple of reasons:

  • It presumes that the String value the haystack bytes were derived from has a UTF-8 string encoding.
  • bstr's CharIndices iterator, like all of the library's UTF-8 oriented methods, has different behavior than Artichoke for some types of invalid UTF-8 byte sequences. bstr uses the substitution of maximal subparts replacement strategy which means some invalid UTF-8 byte sequences like \xF0\x9F\x8 get collapsed to one replacement character, whereas Artichoke treats these 3 bytes as 3 characters.

end

def offset_returns_utf8_character_index
raise unless 'тест'.match('с').offset(0) == [2, 3]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

echoing from discord: can we add a test similar to this which includes an invalid UTF-8 byte sequence?

Something like this:

    raise unless "\xF0\x9F\x87тест".match('с').offset(0) == [4, 5]

Similarly, if we could duplicate all of the test cases for binary strings (e.g. 'тест'.b), we'd be able to ensure we are supporting non UTF-8 encodings as well.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using current stable MRI (3.2.2) the test case that you've proposed raises invalid byte sequence in UTF-8 (ArgumentError), while Artichoke raises ArgumentError (regex crate utf8 backend for Regexp only supports UTF-8 haystacks) due to an error returned from str::from_utf8. So other than the different message wording, this seems to be in line with MRI behavior.

@@ -308,7 +308,12 @@ impl MatchData {
} else {
haystack.len()
};
let offset = self.region.offset();
let offset = self
.haystack
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having the haystack be stored in the MatchData value as a Vec<u8> instead of a spinoso_string::String is a mistake that will make fixing this bug correctly impossible.

See also this related ticket: #614.

If you're up for it, we'll want to change the MatchData rust struct to not store the haystack and regexp within it. Additionally, we should be storing the Values that represent the haystack and regexp as instance variables on the MatchData's Value (Setting these Values as ivars is necessary so mruby can correctly mark these stored values during GC due to the FFI internals of the Rust/C MatchData bridge. we can chat about this on Discord if you're curious about learning more.)

I think setting ivars on values is not something Artichoke supports right now. This is a missing API in artichoke-core: https://artichoke.github.io/artichoke/artichoke_core/.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A hack to workaround this gap would be to store the spinoso_string::String on the Rust MatchData struct instead of a Vec<u8>.

const FUNCTIONAL_TEST: &[u8] = include_bytes!("matchdata_test.rb");

#[test]
fn functional() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for adding this!

@lopopolo lopopolo added A-ruby-core Area: Ruby Core types. A-spec Area: ruby/spec infrastructure and completeness. labels Jun 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ruby-core Area: Ruby Core types. A-spec Area: ruby/spec infrastructure and completeness.
Development

Successfully merging this pull request may close these issues.

MatchData#offset returns byte offset instead of utf8 char offset for utf8 strings
2 participants