Description
This library makes heavy use of regular expressions. While most of them should be fairly performant, there is certainly room for improvement. Examples of improvements might include:
- Replacing non-regex parsing logic with regular expressions (if that's quicker)
- Replacing regex-based parsing with logic that doesn't use regular expressions (if that's quicker)
- Combining multiple regexes into one (if that's quicker)
- Fixing excessive backtracking in expressions
- Other improvements to existing expressions
- ???
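
As a rough illustration of the backtracking item above, here is a minimal sketch. Both patterns are hypothetical and not taken from this library. It compares a nested quantifier against a possessive one on a subject that cannot match. The exact outcome of the first call depends on your PCRE configuration (`pcre.jit`, `pcre.backtrack_limit`), so the script only prints what it observes.

```php
<?php
declare(strict_types=1);

// Hypothetical patterns for illustration only; neither comes from the library.

// Nested quantifiers: when the overall match fails, the engine can retry
// many different ways of splitting the run of "a" characters between
// the inner and outer repetition.
$nested = '/^(a+)+$/';

// Possessive quantifier: once "a++" has consumed the run of "a" characters,
// it never gives any of them back, so the failure is detected quickly.
$possessive = '/^a++$/';

// A run of "a" followed by a character that makes the match impossible.
$subject = str_repeat('a', 24) . 'b';

$nestedResult = preg_match($nested, $subject);
$nestedError  = preg_last_error();

$possessiveResult = preg_match($possessive, $subject);
$possessiveError  = preg_last_error();

// preg_match() returns 1 on a match, 0 on no match, and false on error
// (for example when the backtrack limit is exceeded).
var_dump($nestedResult, $nestedError === PREG_BACKTRACK_LIMIT_ERROR);
var_dump($possessiveResult, $possessiveError === PREG_NO_ERROR);
```

Atomic groups (`(?>...)`) are another way to get the same effect where a possessive quantifier does not fit. Whether either change is safe for a given expression depends on what that expression needs to match, so the existing tests should be run after any such change.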
Tools that could help here include:
- The debugger on https://regex101.com/, especially to check for excessive backtracking
- Our benchmark.php script
- A performance profiler like Blackfire
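
For quick experiments before reaching for the tools above, a small timing harness can help. The following is only a sketch: `timeCandidate()` is a hypothetical helper, not part of this library or its `benchmark.php` script, and the two candidates (counting leading spaces with a regex versus `strspn()`) are made up for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): runs $candidate
 * $iterations times per round and returns the fastest round's
 * wall-clock time in nanoseconds.
 */
function timeCandidate(callable $candidate, int $iterations = 10000, int $rounds = 5): int
{
    $best = PHP_INT_MAX;
    for ($r = 0; $r < $rounds; $r++) {
        $start = hrtime(true);
        for ($i = 0; $i < $iterations; $i++) {
            $candidate();
        }
        $elapsed = (int) (hrtime(true) - $start);
        $best = min($best, $elapsed);
    }

    return $best;
}

$line = '    indented code';

// Candidate A: count leading spaces with a regular expression.
$viaRegex = static function () use ($line): int {
    preg_match('/^ */', $line, $matches);

    return strlen($matches[0]);
};

// Candidate B: count leading spaces without a regular expression.
$viaStrspn = static function () use ($line): int {
    return strspn($line, ' ');
};

// Confirm both candidates agree before comparing their speed.
if ($viaRegex() !== $viaStrspn()) {
    throw new RuntimeException('Candidates disagree; timing would be meaningless.');
}

printf("regex:  %d ns\n", timeCandidate($viaRegex));
printf("strspn: %d ns\n", timeCandidate($viaStrspn));
```

Micro-benchmarks like this are noisy and do not reflect real documents, so treat the numbers as a hint and confirm any gain with the full benchmark or a profiler.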
A partial list of areas where regex is used in this library includes:
- https://github.com/thephpleague/commonmark/blob/main/src/Util/RegexHelper.php
- https://github.com/thephpleague/commonmark/blob/main/src/Parser/Cursor.php
- Implementations of:
  - `BlockStartParserInterface::tryStart()`
  - `BlockContinueParserInterface::tryContinue()`
  - `InlineParserInterface::parse()`
- How `InlineParserMatch` builds regular expressions, which are then used by `InlineParser`
I will accept (almost) any PR that aims to improve performance, though I would ask that you keep the following in mind:
- The performance improvement should be measurable, using either our performance benchmark or some other means
- Improvements that don't break BC are preferred, though substantial improvements requiring a major version bump would be considered
- The rationale behind the improvements should either be obvious or have a description in the PR explaining what you did and why