Skip to content

Lookaheads produce incorrect results (toRegExp returns empty-matching regex) #13

@futpib

Description

@futpib

Description

Lookahead assertions ((?=...) and (?!...)) are not working correctly. They produce incorrect results in both the internal parser and the public API.

Public API Impact

.toRegExp() produces wrong output

node -e "import('@gruhn/regex-utils').then(m => console.log('/(?=a)/ ->', m.RB(/(?=a)/).toRegExp()))"
# Output: /(?=a)/ -> /^$.^$/

node -e "import('@gruhn/regex-utils').then(m => console.log('/a{2}/ ->', m.RB(/a{2}/).toRegExp()))"
# Output: /a{2}/ -> /^.*aa.*$/

The lookahead /(?=a)/ becomes /^$.^$/ (matches nothing), while /a{2}/ correctly becomes /^.*aa.*$/.

isStdRegex flag is false for lookaheads

node -e "import('@gruhn/regex-utils').then(m => console.log(m.RB(/a{2}/).regex.isStdRegex))"
# Output: true

node -e "import('@gruhn/regex-utils').then(m => console.log(m.RB(/(?=a)/).regex.isStdRegex))"
# Output: false

Internal Parser Issue

The internal parseRegExpString function has a specific issue with quantified lookaheads - it parses the quantifier as literal characters instead of applying it to the lookahead:

# Quantifier on normal atom - correctly produces "repeat" type
node -e "import('./node_modules/@gruhn/regex-utils/dist/regex-parser.js').then(m => console.log(m.parseRegExpString('a{2}').type))"
# Output: repeat

# Quantifier on lookahead - lookahead captures {2} as literals in 'right' field
node -e "import('./node_modules/@gruhn/regex-utils/dist/regex-parser.js').then(m => console.log(JSON.stringify(m.parseRegExpString('(?=a){2}'))))"
# The {2} appears as literal characters in the 'right' field instead of wrapping the lookahead in a 'repeat' node

Expected Behavior

JavaScript allows quantified lookaheads:

node -e "console.log(/(?=a){2}/.test('a'))"
# Output: true

node -e "console.log(new RegExp('(?=a){2}').source)"
# Output: (?=a){2}

Lookaheads should:

  1. Produce correct toRegExp() output (not /^$.^$/)
  2. Have isStdRegex: true
  3. Parse quantifiers as quantifiers, not literals

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions