Major refactor — recursive descent parser, arithmetic, unary operators, and token system overhaul#26
Conversation
Each AST node now implements evaluate(EvaluationContext): mixed, eliminating the two massive match expressions in AstEvaluator that violated the Open/Closed Principle. - Added EvaluationContext DTO (wraps TokenStream + TokenFactory) - Added evaluate() to Node base class - Each concrete node implements its own evaluation logic - AstEvaluator is now a thin orchestrator with zero match expressions - All 263 tests pass
Made evaluate() delegate to resolve() instead of both methods independently calling $node->evaluate(). This removes the latent bug risk where one method could be modified without updating the other.
…ction The $afterValue boolean flag was fragile because it required manual maintenance at every token emission point. Adding new token types required remembering to update the flag, and whitespace/newline tokens implicitly relied on the previous state persisting. Replaced it with lastTokenIsValue() which walks backwards through the token stack, skipping ignorable tokens (whitespace/comments), and checks the actual token type hierarchy: - Does the token implement the Value interface? - Is it a closing parenthesis or closing array? This is more robust because: - Context is derived from actual emitted tokens, not a manual flag - New token types implementing Value automatically work - Whitespace/comments are naturally skipped via canBeIgnored() - The logic is self-documenting and testable
The getValue() method was stripping surrounding quotes but not unescaping the content, so escape sequences like \n, \t, \, etc. were being treated as literal characters instead of their actual control character equivalents. Added an unescape() method that handles common JavaScript escape sequences: \n, \r, \t, \, ", \', $, \0. Unknown sequences are preserved as-is. Added 9 unit tests in LexerTest.php and 1 integration test in ScalarTest.php.
DivisionNode and ModuloNode now check for a zero divisor before performing arithmetic operations, following JavaScript semantics: - Division by zero returns INF, -INF, or NAN (matching JS Infinity/NaN) - Modulo by zero returns NAN (matching JS NaN) Added 5 test methods in OperatorsTest.php covering all edge cases.
The getNativeValue() method was defined in the abstract ValueNode class and implemented in all 6 concrete subclasses (BoolNode, FloatNode, IntegerNode, NullNode, RegexNode, StringNode), but was never called anywhere in the codebase. The evaluate() method already returns the native PHP value directly, making getNativeValue() entirely redundant.
…okenKind enum - Created TokenKind enum with all token types - Created GenericToken class backed by TokenKind to replace single-purpose token classes - Added abstract getKind(): TokenKind to BaseToken - Made getType() concrete in BaseToken, deriving TokenType from getKind() - Removed isOfType() method (unused) - Removed kindToType() from TokenFactory (duplicated in BaseToken::getType()) - Updated canBeIgnored() to use getKind() directly - Updated ExpressionFactory to use TokenKind matching instead of instanceof - Removed 27 old token class files - Updated all tests accordingly All 278 tests pass.
- Split TokenStream into VariableRegistry, FunctionRegistry, MethodRegistry - Made VariableRegistry mutable to avoid recreating dependency tree on each parse() - Inlined TokenIteratorFactory in RuleEngine constructor - Made defaultVariables a promoted constructor property - Removed lazy initialization ( flag) from FunctionRegistry and MethodRegistry - Restored readonly on RuleEngine
The cached AST in Rule::isTrue() and Rule::result() was never actually used because they called RuleEngine::evaluate() and RuleEngine::result() with the raw rule string, which re-parsed it every time. - Added RuleEngine::evaluateNode(Node) and RuleEngine::resolveNode(Node) to allow evaluating a pre-parsed AST without re-parsing - Updated Rule::isTrue() to use evaluateNode() with the cached AST - Updated Rule::result() to use resolveNode() with the cached AST - Refactored RuleEngine::evaluate() and RuleEngine::result() to delegate to the new node-based methods for consistency
The get accessor on the property was returning ->error, which triggered the get accessor again, causing infinite recursion. Since the get hook had no transformation logic, the entire property hook was redundant and has been replaced with a simple typed property.
… with ArrayObject
- Replace BaseToken & Operator intersection type with BaseToken in ExpressionFactory, ExpressionFactoryInterface, EvaluableExpression, and ParserException - Remove implements Operator from GenericToken - Remove implements Method from TokenMethod - Delete unused empty marker interfaces: Logical, Method, Operator, Parenthesis, Whitespace - Keep Value interface as it's still used with instanceof in Lexer
…onRegistry Refactored FunctionRegistry to store class strings instead of Closures, with instance caching so each function class is instantiated only once per registry lifetime. Updated FunctionCallNode and TokenIterator to use CallableInterface::call() instead of Closure invocation.
Extracted all mutable scanning state ($input, $pos, $length) into a new LexerContext value object. The Lexer no longer holds any mutable state as instance properties - a LexerContext is created locally inside tokenize() and passed to all private helper methods. This makes the Lexer fully reentrant: multiple calls to tokenize() on the same Lexer instance will not interfere with each other. - Created src/Tokenizer/LexerContext.php with peek(), current(), advance(), isValid(), and startsWith() convenience methods - Refactored Lexer.php to remove $input, $pos, $length instance properties - All private methods now accept LexerContext instead of accessing $this->input/$this->pos/$this->length
…y wiring to RuleEngineBuilder The RuleEngine constructor was a classic God Constructor - it created and wired all internal dependencies (TokenFactory, Lexer, VariableRegistry, FunctionRegistry, MethodRegistry, ObjectMethodCallerFactory, TokenIteratorFactory, Parser, AstEvaluator) directly inside the constructor body. Changes: - RuleEngine now accepts fully-wired dependencies (Parser, AstEvaluator, VariableRegistry) via constructor injection - RuleEngineBuilder.build() now handles all object graph assembly - Rule.php updated to use builder instead of direct RuleEngine instantiation - Updated test that directly instantiated RuleEngine with old constructor
Replaced the do...while loop with manual index tracking in
findCallableMethod() with a cleaner two-step approach:
1. Check exact method name first with is_callable()
2. Use foreach loop over prefixes for fallback matching
This preserves the same behavior and prefix priority order
('', 'get', 'is', 'get_', 'is_') while being more maintainable
and less error-prone.
…ted test - Removed entire src/Expression/ directory (BaseExpression, all comparison expression classes, ExpressionFactory, ExpressionFactoryInterface) - Removed src/Parser/EvaluableExpression.php and EvaluableExpressionFactory.php which depended on the Expression classes - Removed tests/unit/Expression/ExpressionFactoryTest.php which tested dead code All comparison logic is handled by ComparisonNode + ComparisonOperator enum.
- Remove the redundant Token enum (src/TokenStream/Token/Token.php) - Update TokenFactory::createFromToken() to accept TokenKind instead of Token - Remove the 40-line tokenToKind() mapping function - Update Lexer to use TokenKind instead of Token throughout - Update test to use TokenKind directly
…on, getMethod) These methods were only referenced in tests and never used in production code. Also cleaned up the now-unused constructor parameters from TokenIterator and TokenIteratorFactory, and updated RuleEngineBuilder accordingly.
…ed streaming Replace the O(n) array storage in tokenize() with a PHP generator (yield), eliminating the need to hold all tokens in memory simultaneously. - Remove array and ArrayIterator wrapping - Replace lastTokenIsValue(array ) with isTokenValue(?BaseToken ) for O(1) context tracking instead of O(n) stack walk - Track last non-ignorable token via variable - Remove unused ArrayIterator import
…y from TokenIterator - Remove TokenIteratorFactory (pointless abstraction — single create() method that just does 'new TokenIterator(...)') - Inline TokenIterator creation directly in Parser::parse() - Remove peekRaw() from TokenIterator (identical to current(), misleading name) - Replace all peekRaw() calls with current() in Parser - Remove readonly from TokenIterator (implements Iterator with mutating methods) - Update RuleEngineBuilder and ParserTest to remove factory dependency
isValid() was duplicating the full parse+evaluate pipeline that getError() already performed. Refactored isValid() to delegate to getError() by checking if the error string is empty, eliminating the code duplication while preserving identical behavior.
The class 'TokenizerInterface' was actually an abstract class, not an interface. Renamed it to 'Tokenizer' to accurately reflect its nature, and changed the property from public to protected to fix the encapsulation issue. Updated all references in Lexer, Parser, RuleEngineBuilder, and ParserTest.
- Create abstract Lexer class at src/Lexer/Lexer.php - Rename concrete class to DefaultLexer at src/Lexer/DefaultLexer.php - Update LexerContext namespace from Tokenizer to Lexer - Update all imports and references in Parser, RuleEngineBuilder, Highlighter - Update test files (ParserTest, LexerTest) - Remove old src/Tokenizer/ directory - Rename tests/unit/Tokenizer/ to tests/unit/Lexer/ - All 261 tests pass
…idate token classes - Replace the old stack-based Parser with a clean recursive descent parser that builds an AST directly, with documented operator precedence levels - Consolidate 15+ single-purpose token classes into GenericToken with TokenKind enum for type identification - Delete 10 unused token class files (TokenArray, TokenBoolFalse, TokenBoolTrue, TokenFloat, TokenFunction, TokenInteger, TokenNull, TokenObject, TokenRegex, TokenVariable) - Remove dead Value marker interface - Update all grammar method implementations to use isOfKind() instead of instanceof checks - Update TokenFactory to create GenericToken instances - Fix TokenBool to extend GenericToken - Update tests to use GenericToken instead of deleted token classes - All 261 tests pass with 947 assertions
- Change CallableInterface::call() signature from ?BaseToken to mixed - Update CallableFunction to accept mixed token and parseParameter() return type - Remove token wrapping in FunctionCallNode::resolveArguments() - Remove token wrapping in MethodCallNode::resolveArguments() - Pass raw object value to MethodRegistry::get() for non-object/non-regex types - Update ObjectMethodCaller::call() to accept mixed params, remove getTokenValues() - Update all 12 method and 2 function implementations to work with raw values
- Updated README.md to reference PSR-12 instead of PSR-2 - Updated .styleci.yml preset from psr2 to psr12 - Fixed 338 PSR-12 coding style violations across 109 PHP files - Moved file-level docblocks before declare(strict_types=1) per PSR-12 - Added phpcs:ignore annotations for intentional snake_case test methods - Added squizlabs/php_codesniffer as dev dependency - All 261 tests continue to pass
- phpunit/phpunit: ^12.3 → ^13.1 (12.3.15 → 13.1.7) - mockery/mockery: ^1.6 (1.6.12, already up to date) - squizlabs/php_codesniffer: ^4.0 (4.0.1, already up to date) All transitive dependencies updated accordingly.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Release Notes
New Features
+,-,*,/,%. Parentheses can be used to override precedence. (2 + 3 * 4 == 14)+operator now performs string concatenation when either operand is a string, and numeric addition otherwise. ("hello " + "world")-5), double negation (--5), and logical NOT (!false,!(1 == 2),!foo).Rule::result()returns the actual computed value of an expression (e.g.,int(15)for5 * 3), not just true/false.Major Refactors & Improvements
GenericToken+TokenKindenum, drastically reducing complexity.Lexer::tokenize()converted from array-based to generator-based streaming for better memory efficiency.TokenStreamclass.RuleEngineinto a dedicated builder.matchexpressions with the Interpreter pattern for cleaner node evaluation.InternalCallableDTO.Expressiondirectory,EvaluableExpression, unusedTokenIteratormethods, and more.Bug Fixes
TokenEncapsedString.Rule.phpwhere the AST cache was never utilized.Rule.phpproperty hook.SplObjectStoragewithArrayObjectto fix wrong data structure usage.ObjectMethodCaller.isValid()/getError().AstEvaluator.$afterValueboolean with stack-based token inspection.Dependency Updates
>=8.5phpunit/phpunitto^13.1mockery/mockeryto^1.6squizlabs/php_codesnifferto^4.0