Bug: regular expression that never stops matching #113734

Open
@longxya

Description

In compiled mode, when a group has a negative-length capture, a backreference to that group moves the current match index in the input (textpos) by that negative length, i.e. in the direction opposite to the match.

In regular expressions, matching typically involves a pointer that moves through the input string while the engine tries different paths to find a match. In this situation the pointer does not move correctly, so the engine checks the same position repeatedly, which leads to performance problems or resource exhaustion.

In the test code, after each successful match the current match index always moves back to the start position of the previous match. Because of this, the regex keeps matching until it times out.

This bug makes the length of the matched item negative, so a successful match does not advance the textpos pointer. The next match attempt therefore starts at nearly the same index, and matching never finishes.
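Since the description above singles out compiled mode, one way to examine the behavior is to run the same pattern with and without `RegexOptions.Compiled` and print a bounded number of matches. The following is only a sketch: `DescribeMatches` is a hypothetical helper, the cap of 6 matches is an arbitrary choice, and no particular output is claimed for either mode.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Pattern and input are copied from the reproduction in this issue.
const string pattern = @"\d+(?'1'.)(?<=(?'2-1'.).{5})\2";
const string input = "WTF123a1";
const int cap = 6; // arbitrary bound so the loop always terminates

foreach (RegexOptions options in new[] { RegexOptions.None, RegexOptions.Compiled })
{
    Console.WriteLine($"--- {options} ---");
    foreach (string line in DescribeMatches(pattern, input, options, cap))
    {
        Console.WriteLine(line);
    }
}

// Hypothetical helper: returns one line per match, stopping after `cap`
// matches or when the per-match timeout fires.
static List<string> DescribeMatches(string pattern, string input, RegexOptions options, int cap)
{
    var lines = new List<string>();
    var regex = new Regex(pattern, options, TimeSpan.FromMilliseconds(100));
    try
    {
        Match m = regex.Match(input);
        while (m.Success && lines.Count < cap)
        {
            // Only Index and Length are read; Value is avoided on purpose,
            // because the reported capture length may not be valid.
            Group g2 = m.Groups[2];
            lines.Add($"match {m.Index},{m.Length}  group2 {g2.Index},{g2.Length}");
            m = m.NextMatch();
        }
    }
    catch (RegexMatchTimeoutException e)
    {
        lines.Add("timeout: " + e.Message);
    }
    return lines;
}
```

If the two modes print different sequences, that would support the claim that the problem is specific to the compiled engine.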

This is a sub-issue of this issue.

Reproduction Steps

using System;
using System.Text.RegularExpressions;

string pattern = @"\d+(?'1'.)(?<=(?'2-1'.).{5})\2";
string input = "WTF123a1";
int timeOut = 100; // per-match timeout, in milliseconds
Regex regex = new Regex(pattern, RegexOptions.Compiled, TimeSpan.FromMilliseconds(timeOut));
var mhes = regex.Matches(input); // the collection is populated lazily
try
{
	// var matchCount = mhes.Count; // will throw a timeout exception if there is enough memory
	// Print the index and length of the first 1000 matches.
	for (var i = 0; i < 1000; i++)
	{
		Console.WriteLine(mhes[i].Index + " , " + mhes[i].Length);
	}
}
catch (Exception e)
{
	Console.WriteLine(e.Message);
}

Output:

3 , 0
3 , 1
3 , 0
3 , 1
3 , 0
3 , 1
3 , 0
  .
  .
  .
3 , 0
3 , 1
3 , 0
3 , 1
3 , 0
3 , 1
3 , 0

Expected behavior

There is no specific expected output.
However, for an input shorter than 10 characters, it does not make sense to get more than 1000 matches.
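The expectation above can be turned into a rough sanity check. The sketch below assumes that a scan over an input of length n should yield at most n + 1 matches, since each new match is expected to start at or after the end of the previous one and an empty match should push the scan forward by one position. `ExceedsMatchBound` is a hypothetical helper, not part of the .NET API.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern and input are copied from the reproduction in this issue.
var probe = new Regex(@"\d+(?'1'.)(?<=(?'2-1'.).{5})\2",
                      RegexOptions.Compiled, TimeSpan.FromMilliseconds(100));
try
{
    Console.WriteLine(ExceedsMatchBound(probe, "WTF123a1") ? "bound exceeded" : "within bound");
}
catch (RegexMatchTimeoutException e)
{
    Console.WriteLine("timeout: " + e.Message);
}

// Hypothetical helper: returns true as soon as the number of matches
// exceeds input.Length + 1, so it terminates even if the engine would
// keep producing matches indefinitely.
static bool ExceedsMatchBound(Regex regex, string input)
{
    int limit = input.Length + 1;
    int count = 0;
    for (Match m = regex.Match(input); m.Success; m = m.NextMatch())
    {
        count++;
        if (count > limit)
        {
            return true;
        }
    }
    return false;
}
```

A check like this could serve as a regression test, because it stops after a small, fixed number of iterations instead of waiting for a timeout.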

Actual behavior

Output:

3 , 0
3 , 1
3 , 0
3 , 1
3 , 0
3 , 1
3 , 0
  .
  .
  .
3 , 0
3 , 1
3 , 0
3 , 1
3 , 0
3 , 1
3 , 0

Regression?

No response

Known Workarounds

No response

Configuration

No response

Other information

No response
