@yoeunes yoeunes commented Dec 7, 2025

Description

This pull request introduces several new features and improvements to the regex-parser, including:

  • A command-line tool for regex operations.
  • Syntax highlighting for different outputs (CLI and HTML).
  • Regex modernizer to convert legacy regex to clean, concise PCRE2 patterns.
  • Support for extended character class syntax.
  • Support for extended mode in regex parsing.
  • Support for Unicode named characters.
  • Disallows NULL character octal escape '\0'.
  • Enhanced CLI help output with color and formatting.

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update

Related Issue

Fixes #(issue number)

Testing

  • Unit tests added/updated
  • Tests pass locally
  • No new warnings

Checklist

  • My code follows the project style
  • I have run phpstan and phpunit
  • Documentation updated if needed
  • No breaking changes without discussion

Extends the lexer and parser to support Unicode characters specified by name using the \N{name} syntax.

This change enables the use of Unicode characters by their official names in regular expressions, enhancing readability and precision.
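As a rough illustration of what resolving `\N{name}` involves, the sketch below expands named escapes into the characters they denote. It is written in Python rather than the library's PHP, the function name `resolve_named_escapes` is hypothetical, and it does not handle an escaped backslash in front of `N`; it is not the parser's actual implementation.

```python
import re
import unicodedata

# Matches \N{SOME NAME}; a simplification that ignores escaped backslashes.
_NAMED_ESCAPE = re.compile(r"\\N\{([^}]+)\}")


def resolve_named_escapes(pattern: str) -> str:
    """Replace each \\N{name} with the character it names (hypothetical helper)."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        try:
            char = unicodedata.lookup(name)
        except KeyError:
            raise ValueError(f"unknown Unicode character name: {name!r}") from None
        # Escape the result so a named metacharacter stays a literal.
        return re.escape(char)

    return _NAMED_ESCAPE.sub(replace, pattern)


resolved = resolve_named_escapes(r"caf\N{LATIN SMALL LETTER E WITH ACUTE}")
```

Escaping the resolved character keeps something like `\N{ASTERISK}` from turning into a quantifier.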

Also includes some PCRE2 character types and grapheme assertions.

Implements the 'x' flag to ignore whitespace and comments within regular expressions.
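A minimal sketch of the idea, assuming the usual extended-mode rules: unescaped whitespace and `#` comments are dropped outside character classes and kept inside them. It is written in Python for illustration, `strip_extended` is a hypothetical name, and edge cases such as a literal `]` at the start of a class are not handled.

```python
def strip_extended(pattern: str) -> str:
    """Drop whitespace and #-comments outside character classes (sketch)."""
    out = []
    i = 0
    in_class = False
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Keep any escape sequence, including an escaped space or '#'.
            out.append(pattern[i:i + 2])
            i += 2
        elif in_class:
            if ch == "]":
                in_class = False
            out.append(ch)
            i += 1
        elif ch == "[":
            in_class = True
            out.append(ch)
            i += 1
        elif ch.isspace():
            i += 1
        elif ch == "#":
            # A comment runs to the end of the line.
            while i < len(pattern) and pattern[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


compact = strip_extended("(\\d{4})  # year\n - (\\d{2})  # month")
```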

This change enhances the parser to correctly handle regex patterns written in extended mode, allowing for better readability and maintainability of complex patterns.

Ensures that the octal escape sequence '\0' is not allowed in regular expressions, as it represents the NULL character, which can lead to unexpected behavior or security vulnerabilities.
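The check could look roughly like the Python sketch below. The names `NulEscapeError` and `reject_nul_escape` are hypothetical, and the sketch assumes that only escapes evaluating to code point 0 (`\0`, `\00`, `\000`) are rejected while other octal escapes such as `\012` pass; the validator's real rule may differ.

```python
class NulEscapeError(ValueError):
    """Raised when a pattern contains an octal escape for the NUL character."""


def reject_nul_escape(pattern: str) -> None:
    i = 0
    while i < len(pattern):
        if pattern[i] != "\\" or i + 1 >= len(pattern):
            i += 1
            continue
        if pattern[i + 1] == "0":
            # Read up to three octal digits following the backslash.
            j = i + 1
            while j < len(pattern) and j < i + 4 and pattern[j] in "01234567":
                j += 1
            if int(pattern[i + 1:j], 8) == 0:
                raise NulEscapeError(f"NUL octal escape at offset {i}")
            i = j
        else:
            # Skip the escaped character, so '\\\\0' is not misread as '\\0'.
            i += 2
```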

Updates exception message and adds a test case to specifically target this scenario, ensuring that the validator correctly identifies and rejects the use of '\0'.

This commit introduces support for extended character class syntax,
including control characters (\cM) and class operations (intersection &&
and subtraction --).
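To make the semantics concrete, the Python sketch below models character classes as sets of characters: intersection keeps what both operands share, subtraction removes the right operand from the left, and `\cX` maps a letter to its control character by flipping bit 0x40. The helper names and the pattern spellings in the comments are illustrative assumptions, not taken from the parser.

```python
def char_range(first: str, last: str) -> set[str]:
    """All characters from first to last, inclusive."""
    return {chr(code) for code in range(ord(first), ord(last) + 1)}


def control_char(letter: str) -> str:
    """Character denoted by \\cX: the upper-cased letter with bit 0x40 flipped."""
    return chr(ord(letter.upper()) ^ 0x40)


lowercase = char_range("a", "z")
vowels = set("aeiou")

consonants = lowercase - vowels                  # models a class like [a-z--[aeiou]]
hex_letters = lowercase & char_range("a", "f")   # models a class like [a-z&&[a-f]]
carriage_return = control_char("M")              # \cM is U+000D
```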

It refactors the character class node to use an expression that
represents the class contents, which can be a literal, character
type, range, or an alternation of these, enabling more complex
character class definitions. This enhancement aligns with PCRE2
functionality, allowing for more sophisticated regular expressions.

Adds documentation on known limitations.

Improves code readability by standardizing parameter descriptions in docblocks and ensuring consistent spacing and formatting across various node visitors and the parser.

Adds a TODO file to track future enhancements.

Implements a new feature to automatically modernize legacy regex patterns.

This includes:
- Converting character class ranges to shorthands (\d, \w, \s).
- Removing unnecessary escaping.
- Modernizing backreference syntax.
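The first two rewrites can be sketched at the string level as follows. This is a simplified Python illustration, not the library's AST-based approach: the lookup tables are assumptions, escapes inside a class are not handled, and the shorthands only match the same characters as the ranges when the engine is not in a Unicode-aware mode. Backreference rewriting is left out because the target syntax is not stated here.

```python
import re

# Assumed mapping from verbose classes to shorthands.
SHORTHANDS = {"[0-9]": r"\d", "[^0-9]": r"\D", "[a-zA-Z0-9_]": r"\w"}
# Assumed set of characters that never need a backslash outside a class.
NEEDLESS_ESCAPES = set("@%&=:;,!~")


def modernize(pattern: str) -> str:
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # Drop the backslash only when it carries no meaning.
            out.append(nxt if nxt in NEEDLESS_ESCAPES else ch + nxt)
            i += 2
        elif ch == "[":
            end = pattern.find("]", i + 1)
            if end == -1:
                out.append(pattern[i:])
                break
            char_class = pattern[i:end + 1]
            out.append(SHORTHANDS.get(char_class, char_class))
            i = end + 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


modern = modernize(r"[0-9]+\@[a-zA-Z0-9_]+")
```

With `re.ASCII`, the rewritten pattern accepts the same inputs as the original in Python's engine, which is the property the modernizer aims to preserve.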

The feature aims to improve readability and maintainability of regular expressions without altering their behavior.

Introduces a node visitor that modernizes regular expressions.
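A tree-rewriting pass of this kind might look like the Python sketch below, shown here for unwrapping redundant non-capturing groups. The node classes (`Literal`, `Group`, `Sequence`, `Quantified`) are hypothetical stand-ins for the library's AST, literals are assumed to contain no metacharacters, and alternation is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Literal:
    value: str


@dataclass
class Group:
    child: object
    capturing: bool


@dataclass
class Sequence:
    children: list


@dataclass
class Quantified:
    child: object
    quantifier: str


def render(node) -> str:
    if isinstance(node, Literal):
        return node.value
    if isinstance(node, Sequence):
        return "".join(render(child) for child in node.children)
    if isinstance(node, Group):
        return ("(" if node.capturing else "(?:") + render(node.child) + ")"
    if isinstance(node, Quantified):
        return render(node.child) + node.quantifier
    raise TypeError(f"unexpected node: {node!r}")


def unwrap(node, quantified: bool = False):
    """Return a copy of the tree without redundant non-capturing groups."""
    if isinstance(node, Literal):
        return node
    if isinstance(node, Sequence):
        return Sequence([unwrap(child) for child in node.children])
    if isinstance(node, Quantified):
        return Quantified(unwrap(node.child, quantified=True), node.quantifier)
    if isinstance(node, Group):
        child = unwrap(node.child)
        if node.capturing:
            return Group(child, True)
        single_atom = isinstance(child, Literal) and len(child.value) == 1
        # Under a quantifier the group is redundant only around a single atom.
        if not quantified or single_atom:
            return child
        return Group(child, False)
    raise TypeError(f"unexpected node: {node!r}")
```

The `quantified` flag is what keeps `(?:bc)+` intact while allowing `(?:d)*` to become `d*`.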

This visitor improves regex readability and conciseness by:
- Converting character class ranges to shorthands (\d, \w, \s)
- Removing unnecessary escaping
- Unwrapping redundant non-capturing groups
- Modernizing backreference syntax

Cleans up the repository by removing outdated guidelines and exclude files, simplifying the project structure and reducing clutter.

Adds syntax highlighting for regex patterns to improve readability, with support for both console (ANSI codes) and HTML (span tags) outputs.

This allows developers to visualize complex regular expressions more easily.

Corrects issues in regex highlighting by using the correct
properties on nodes and ensuring HTML entities are properly encoded.

This change improves the accuracy and readability of highlighted
regular expressions in both console and HTML output.

Introduces an abstract `HighlighterVisitor` to centralize common logic for regex syntax highlighting.

This commit streamlines the highlighting process by providing a base class for output-specific formatters (e.g., CLI and HTML).
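The shape of that design can be sketched in Python as a base class that owns the traversal and subclasses that only decide how a token is wrapped. The class names, token kinds, colors, and CSS class names below are assumptions for illustration, and the tokenizer is deliberately crude compared with walking a real AST.

```python
import html
from abc import ABC, abstractmethod


class Highlighter(ABC):
    """Shared logic: split the pattern into tokens and wrap each one."""

    def highlight(self, pattern: str) -> str:
        return "".join(self.wrap(kind, text) for kind, text in self.tokenize(pattern))

    @staticmethod
    def tokenize(pattern: str):
        i = 0
        while i < len(pattern):
            ch = pattern[i]
            if ch == "\\" and i + 1 < len(pattern):
                yield "escape", pattern[i:i + 2]
                i += 2
                continue
            if ch in "()[]|":
                yield "meta", ch
            elif ch in "*+?":
                yield "quantifier", ch
            else:
                yield "literal", ch
            i += 1

    @abstractmethod
    def wrap(self, kind: str, text: str) -> str:
        """Format one token for the target output."""


class AnsiHighlighter(Highlighter):
    COLORS = {"escape": "\033[32m", "meta": "\033[34m", "quantifier": "\033[33m"}
    RESET = "\033[0m"

    def wrap(self, kind: str, text: str) -> str:
        color = self.COLORS.get(kind)
        return f"{color}{text}{self.RESET}" if color else text


class HtmlHighlighter(Highlighter):
    def wrap(self, kind: str, text: str) -> str:
        escaped = html.escape(text)  # encode <, >, & and quotes
        if kind == "literal":
            return escaped
        return f'<span class="regex-{kind}">{escaped}</span>'
```

Adding another output format then only requires a new `wrap` implementation.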

Removes duplicate code and simplifies the creation of new highlighters.

Adds a generic `highlight()` method to automatically detect and apply the appropriate highlighter based on the environment (CLI or HTML).

Introduces a command-line tool to perform several operations on regular expressions, including parsing, validation, ReDoS analysis, and highlighting.

Refactors the previous separate scripts into a single tool with multiple commands for better usability and maintainability.
The tool uses Symfony Console component for handling input and output.

Removes Symfony Console component to simplify the CLI tool.

Improves the tool's usability by providing a clearer help message and usage instructions.

Adds `analyze` and `highlight` commands.

Fixes test namespace.

Updates test case to use more descriptive names for fixture files,
improving readability and maintainability of the test suite.

Introduces a command-line interface for the RegexParser library.

This tool provides functionalities such as parsing, analyzing, highlighting, and validating regular expressions directly from the command line.
It supports different output formats and ANSI color codes for improved readability.

The CLI tool allows users to quickly test and debug regular expressions.

Improves the readability and visual appeal of the CLI help message by adding color-coded elements for better clarity.

This change introduces color to distinguish between different sections and options, making the help message easier to understand and navigate. The colors are defined as constants.

Refines the package description, keywords, and support information for enhanced clarity and discoverability.

Adds homepage, documentation and funding information.

This commit reorders the `require-dev` dependencies in the `composer.json` file.

This change improves the consistency and readability of the dependency list.

@yoeunes yoeunes merged commit 78991b7 into main Dec 7, 2025
8 checks passed