Bug: Bad behaviour with combination of: StringScanner#scan, multiline regexp, multiple lines #2616

Open
@hmdne

Describe the bug

When StringScanner#scan is given a multiline (//m) regexp, it does not match only at the current scan position: it also matches text that follows a later newline.

In Ruby, the multiline modifier changes the semantics of . to mean "any character" instead of "any character except \n".

In JavaScript, by contrast, the m flag changes the semantics of ^ and $ so that they match at the beginning or end of either the string or a line, which is how they behave in Ruby even without //m. However, JavaScript has no \A or \z.
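The Ruby side of this difference can be checked with a few one-liners. This is only a sketch of the semantics described above, run under MRI; the sample strings are made up for illustration.

```ruby
# In Ruby, /m only changes what `.` can match.
p "a\nb".match?(/a.b/)    # `.` refuses to match "\n" without /m
p "a\nb".match?(/a.b/m)   # `.` matches "\n" with /m

# ^ is a line anchor in Ruby whether or not /m is given,
# while \A matches only at the very start of the string.
p "x\nhello" =~ /^h/      # index of the "h" that follows the newline
p "x\nhello" =~ /\Ah/     # no match: the string starts with "x"
```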

Opal's StringScanner#scan implementation adds ^ to the beginning of the supplied regexp and copies the flags.

Combined, these cause a hard-to-discover bug. I don't have a simple solution for it, except perhaps using a lookbehind to simulate \A in Opal's clone of StringScanner, but lookbehind has only recently been implemented in Safari.
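The mechanism can be imitated in plain Ruby, because Ruby's ^ behaves like JavaScript's ^ under the m flag. The sketch below is not Opal's actual code: the `anchored` helper is hypothetical, and the `(?<![\s\S])` lookbehind is just one assumed way to simulate \A in an engine that lacks it.

```ruby
# Hypothetical helper: prepend an anchor to a pattern and keep its options,
# roughly mirroring "add ^ to the beginning and copy the flags".
def anchored(pattern, anchor)
  Regexp.new("#{anchor}(?:#{pattern.source})", pattern.options)
end

# What remains of "hello\nhello" after the first "h" has been scanned.
rest = "ello\nhello"

line_anchored   = anchored(/h/m, "^")               # also matches after a newline
string_anchored = anchored(/h/m, "\\A")             # start of string only
lookbehind      = anchored(/h/m, "(?<![\\s\\S])")   # assumed \A substitute: no character before

p rest =~ line_anchored     # finds the "h" on the second line
p rest =~ string_anchored   # nil: nothing to scan at the current position
p rest =~ lookbehind        # nil, same as \A
p "hello" =~ lookbehind     # 0: still matches at the true start
```

With the line anchor, the scanner sees a match even though the text at the current position is "e", which is the behaviour reported here; the other two anchors reject it.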

Opal version: master

To Reproduce

[user@localhost mnt]# opal -rstrscan -e 'StringScanner.new("hello\nhello").tap { |i| p 10.times.map { i.scan(/h/m) } }'
["h", "h", "h", "h", "h", "h", "h", nil, nil, nil]
[user@localhost mnt]# ruby -rstrscan -e 'StringScanner.new("hello\nhello").tap { |i| p 10.times.map { i.scan(/h/m) } }'
["h", nil, nil, nil, nil, nil, nil, nil, nil, nil]
