regexp: crash in new backtrack engine #10319
The line in question is
I extracted the specific regexp and input that caused this but in the obvious 1-line program there is no crash. This suggests there is something bad in the caching of machines. I don't have a simple program to provoke this, but maybe that's enough information anyway.
Instead of trying to reproduce it, let's look at what's already been reported. b.cap[0] = pos is being executed and failing because len(b.cap) is apparently 0. Why is it okay to write to b.cap[0] on that line? It looks to me like in the stack trace the argument reqcap passed to backtrack is 0.
I think these are separate issues.
When porting the backtracking code, I made the assumption that len(m.matchcap) is always >= 2. Under that assumption, b.cap is initialized (via the call to b.reset) to have length at least len(m.matchcap), so accesses to b.cap[0] and b.cap[1] are always valid. The assumption holds whenever a machine only runs the backtracker, because when a machine is initialized in progMachine, the minimum length of m.matchcap is set to 2. However, when the standard matcher is used, m.matchcap is resliced (in m.init) to ncap, which may be less than 2. So if the backtracker is run after the standard matcher on the same machine and no captures are requested, the assumption no longer holds.
So the backtracking code can either check reqcap before setting b.cap[0] or b.cap[1], or initialize b.cap to have length at least 2.
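To make the failure mode and the second proposed fix concrete, here is a minimal sketch. It is not the regexp package's code: the machine type, newMachine, initStandard, backtrackCap and setStart are hypothetical stand-ins that only model how the capture slice is sized, based on the description above.

```go
package main

import "fmt"

// machine is a hypothetical stand-in for the regexp machine; only the
// capture slice matters for this illustration.
type machine struct {
	matchcap []int
}

// newMachine models the described progMachine behaviour: matchcap starts
// with a length of at least 2.
func newMachine(ncap int) *machine {
	n := ncap
	if n < 2 {
		n = 2
	}
	return &machine{matchcap: make([]int, n)}
}

// initStandard models the described m.init of the standard matcher, which
// reslices matchcap down to ncap (possibly 0).
func (m *machine) initStandard(ncap int) {
	m.matchcap = m.matchcap[:ncap]
}

// backtrackCap sizes the backtracker's capture slice from the machine.
// With minTwo set, it applies the proposed fix of forcing length >= 2.
func backtrackCap(m *machine, minTwo bool) []int {
	n := len(m.matchcap)
	if minTwo && n < 2 {
		n = 2
	}
	return make([]int, n)
}

// setStart performs the write corresponding to b.cap[0] = pos and reports
// whether it stayed in range instead of letting the panic escape.
func setStart(caps []int, pos int) (ok bool) {
	defer func() {
		if recover() != nil {
			ok = false
		}
	}()
	caps[0] = pos
	return true
}

func main() {
	m := newMachine(0)
	m.initStandard(0) // standard matcher ran first with no captures requested

	fmt.Println(setStart(backtrackCap(m, false), 3)) // false: index out of range
	fmt.Println(setStart(backtrackCap(m, true), 3))  // true: length forced to 2
}
```

The other option mentioned above, guarding each write with a check on reqcap, would avoid the extra allocation but needs a check at every site that touches b.cap[0] or b.cap[1].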