-
-
Notifications
You must be signed in to change notification settings - Fork 8
IsShortArrayOrList: refactor for better accuracy and improved performance #392
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
jrfnl
merged 19 commits into
develop
from
utils/isshortlist-isshortarray-refactor-improve-performance
Oct 25, 2022
Merged
IsShortArrayOrList: refactor for better accuracy and improved performance #392
jrfnl
merged 19 commits into
develop
from
utils/isshortlist-isshortarray-refactor-improve-performance
Oct 25, 2022
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
* Add test with PHP 8.0 nullsafe object operators to safeguard against a potential tokenizer issue. As per upstream PR squizlabs/PHP_CodeSniffer 3046 * Improve a test case. Ensure that the test actually needs to token walk and doesn't hit the "next after is closer of wrapping brackets" condition. * Improve a test case. Ensure that the test actually needs to token walk and doesn't hit the "next after is semicolon" condition. * Improve some test case names ensuring that all test cases are prefixed with the expected outcome, i.e. "short array" or "short list". * Add extra test covering a particular parse error with `foreach`. * Add extra tests involving arrays with operators before/after. * Move a test up to a more appropriate place as it doesn't involve an invalid syntax.
This is step 1 in a refactor to improve short list/short array determination. In this commit, the pre-existing code from the `Arrays::isShortArray()` and `Lists::isShortList()` methods is consolidated into a class which handles both determinations. This initial refactor does not contain any functional changes other than to remove (most of) the code which handled BC with PHPCS < 3.7.1. Includes updating the `@covers` tags. Only the caching related tests are now still testing the `Arrays::isShortArray()` and `Lists::isShortList()` methods. Everything else is testing the new `PHPCSUtils\Internal\IsShortArrayOrList` class.
This commit moves the original test set for the `IsShortArrayOrList` functionality from the `Utils\Lists` test directory to the `Internal\IsShortArrayOrList` test directory. Includes updating the `README-IsShortArrayTests.md` file and adding a `README-IsShortListTests.md` file documenting where the tests are located.
…rList` class This commit updates most tests to use the `IsShortArrayOrList` class directly instead of going via the `Arrays::isShortArray()` and `Lists::isShortList()` wrapper functions. Reasons: * Running each tests via both wrapper functions, while the output from the underlying class will be the same (after the code consolidation), is redundant. * By using the `IsShortArrayOrList` class directly in the tests, we bypass the caching which is handled in the wrapper functions. This makes sure that the tests are "pure" and not incidentally affected by the caching. Additionally: * The `testValidDataProvider()` methods, which handled some quality control for the tests, have been removed as they are no longer needed. * Split the tests using the `IsShortArrayOrListTest::dataNotShortArrayShortListBracket()` into two different data sets with associated tests: 1. A test set for testing the behaviour when non-square bracket/non-short array tokens are being passed. 2. A test set for testing the behaviour when square bracket/short array tokens which are in actual fact real square brackets are being passed. * The test class and file names and `@covers` tags for the "TokenizerBC" tests have been adjusted to be more specific to the method within the `isShortArrayOrList` class covered by these tests. The `@covers` tags have also been moved to the class docblocks as these tests now only need one `@covers` tag for the whole class. Altogether, this reduces the number of tests, while still maintaining the same level of quality control and still covering all tests cases.
… tests Making sure that: 1. All types of nesting is being tested, i.e. start of outer, middle of outer, end of outer. 2. The contents of the brackets holds no clue to whether it is an array or list, to make sure the test is "pure". These tests _could_ now be considered redundant as the tokenizer issues they were testing for in the context of BC support for older PHPCS version, should no longer exist now the minimum PHPCS version has gone up. Keeping the tests in place (and expanding them) will safeguard against regressions in the PHPCS tokenizer, so is the safer choice.
Split off the tests which are specific to the `IsShortArrayOrList::__construct()` method to a dedicated test file and add some additional tests to make sure the method is fully covered.
Split off the tests which are specific to the `IsShortArrayOrList::isSquareBracket()` method to a dedicated test file and add some additional tests just in case. Note: one test with square brackets remains in the original test file to cover the function call to `isSquareBracket()` within the `solve()` method.
…ed tokenizer issue Turns out a square bracket after a non-braced control structure is always tokenized as `T_OPEN_SQUARE_BRACKET`, while it should be tokenized as `T_OPEN_SHORT_ARRAY`. The fix for this issue will be included in PHPCS 3.7.2, but that version hasn't been released yet. Refs: * sirbrillig/phpcs-variable-analysis 263 * squizlabs/PHP_CodeSniffer 3632
Split off the tests which are specific to the `IsShortArrayOrList::isInForeach()` method to a dedicated test file. Includes adding numerous extra tests. Note: one test involving `foreach` remains in the original test file to cover the function call to `isInForeach()` within the `solve()` method.
This commit improves the stability of the `IsShortArrayOrList::isInForeach()` method by: * Using the `Context::inForeachCondition()` method to determine which part of a `foreach` condition we are looking at. * Handling non-nested short arrays in the "before as" part of the condition, which should improve performance a little. * Ensuring that only non-nested short lists in the "after as" part of the condition are handled by this method. Anything nested should be handled based on the _outer_ set of brackets as otherwise the results will be unreliable. This also prevents incorrect results for code like this `foreach(["a"=>[]] as $v)`, where the empty array in the "before as" would previously be identified as a short list due to the double arrow before it. Includes additional tests. Refs: * PHPCompatibility/PHPCompatibility 1341
Make sure all named data sets use keys within the data set as well for a more readable data provider.
This is step 2 in a refactor to improve short list/short array determination. While caching had previously already been implemented, as things were, there could be four caches which would each refer to the same result, i.e. 1. A cache for an opener token passed to `Arrays::isShortArray()`. 2. A cache for an closer token of the same set of brackets passed to `Arrays::isShortArray()`. 3. A cache for an opener token of the same set of brackets passed to `Lists::isShortList()`. 4. A cache for an closer token of the same set of brackets passed to `Lists::isShortList()`. This also meant that, depending on how the methods were called, the underlying `IsShortArrayOrList` class could be called up to four times to determine what a set of brackets signified, while the result would already be known after the first time and is not subject to change. This commit introduces a wrapper class around the `IsShortArrayOrList` class which will handle the caching and will only create one cache entry for all four potential starting points, making the caching far more efficient and effective. Includes adding a set of dedicated tests to cover the functionality in the `IsShortArrayOrListWithCache` class, replacing the previous caching tests in the `IsShortArrayOrListTest` class.
As the new `IsShortArrayOrListWithCache` class will now be the entry point to the `IsShortArrayOrList` class, we can rely upon the token being passed to the `IsShortArrayOrList` always being an _opener_ for a complete set of brackets and can remove some redundancy from the `IsShortArrayOrList` class. Includes making the exception for when an incorrect token is passed more specific. Includes updating the `IsShortArrayOrList` tests to no longer pass closer tokens to the `IsShortArrayOrList` class.
This is step 3 in a refactor to improve short list/short array determination. This refactor adds a number of extra interim steps to try and avoid, whenever possible, walking the complete file to find "outer" brackets for nested short lists/arrays. This should improve performance of the functionality. Additionally, the interim steps will also provide higher accuracy of the results. Includes adding a set of dedicated tests to cover the functionality in each method in the `IsShortArrayOrList` class. The generic `PHPCSUtils\Internal\IsShortArrayOrList\IsShortArrayOrListTest` still remains for arbitrary short list vs short array tests (though some tests have moved to the method specific test classes).
A significant number of these tests would previously lead to incorrect results for the short list/short array determination. The refactor fixed these. * Add tests involving short lists with short arrays as keys. I seriously mean the inline remark: "Please sack anyone who writes code like this". * Add extra test for nested short list with empties. * Add extra tests involving comma separated lists of variables as in that case, there is the most significant risk of mis-identifying the token. * Add extra tests for short arrays/lists as values in a short array.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Aside from adding a sh*tload more tests, this PR also contains functional changes, though nothing which impacts the API.
Functional (non-test) commits
Arrays::isShortArray()/Lists::isShortList(): consolidate the logic [1]
This is step 1 in a refactor to improve short list/short array determination.
In this commit, the pre-existing code from the
Arrays::isShortArray()
andLists::isShortList()
methods is consolidated into a class which handles both determinations.This initial refactor does not contain any functional changes other than to remove (most of) the code which handled BC with PHPCS < 3.7.1.
Includes updating the
@covers
tags. Only the caching related tests are now still testing theArrays::isShortArray()
andLists::isShortList()
methods. Everything else is testing the newPHPCSUtils\Internal\IsShortArrayOrList
class.IsShortArrayOrList::isShortArrayBracket(): account for newly discovered tokenizer issue
Turns out a square bracket after a non-braced control structure is always tokenized as
T_OPEN_SQUARE_BRACKET
, while it should be tokenized asT_OPEN_SHORT_ARRAY
.The fix for this issue will be included in PHPCS 3.7.2, but that version hasn't been released yet.
Refs:
IsShortArrayOrList::isInForeach(): improve logic
This commit improves the stability of the
IsShortArrayOrList::isInForeach()
method by:Context::inForeachCondition()
method to determine which part of aforeach
condition we are looking at.Anything nested should be handled based on the outer set of brackets as otherwise the results will be unreliable.
This also prevents incorrect results for code like this
foreach(["a"=>[]] as $v)
, where the empty array in the "before as" would previously be identified as a short list due to the double arrow before it.Includes additional tests.
Refs:
Arrays::isShortArray()/Lists::isShortList(): consolidate the caching [2]
This is step 2 in a refactor to improve short list/short array determination.
While caching had previously already been implemented, as things were, there could be four caches which would each refer to the same result, i.e.
Arrays::isShortArray()
.Arrays::isShortArray()
.Lists::isShortList()
.Lists::isShortList()
.This also meant that, depending on how the methods were called, the underlying
IsShortArrayOrList
class could be called up to four times to determine what a set of brackets signified, while the result would already be known after the first time and is not subject to change.This commit introduces a wrapper class around the
IsShortArrayOrList
class which will handle the caching and will only create one cache entry for all four potential starting points, making the caching far more efficient and effective.Includes adding a set of dedicated tests to cover the functionality in the
IsShortArrayOrListWithCache
class, replacing the previous caching tests in theIsShortArrayOrListTest
class.IsShortArrayOrList: only accept bracket opener tokens
As the new
IsShortArrayOrListWithCache
class will now be the entry point to theIsShortArrayOrList
class, we can rely upon the token being passed to theIsShortArrayOrList
always being an opener for a complete set of brackets and can remove some redundancy from theIsShortArrayOrList
class.Includes making the exception for when an incorrect token is passed more specific.
Includes updating the
IsShortArrayOrList
tests to no longer pass closer tokens to theIsShortArrayOrList
class.IsShortArrayOrList: refactor for accuracy and improved performance [3]
This is step 3 in a refactor to improve short list/short array determination.
This refactor adds a number of extra interim steps to try and avoid, whenever possible, walking the complete file to find "outer" brackets for nested short lists/arrays.
This should improve performance of the functionality.
Additionally, the interim steps will also provide higher accuracy of the results.
Includes adding a set of dedicated tests to cover the functionality in each method in the
IsShortArrayOrList
class.The generic
PHPCSUtils\Internal\IsShortArrayOrList\IsShortArrayOrListTest
still remains for arbitrary short list vs short array tests (though some tests have moved to the method specific test classes).