Add token index #12
I'm not a maintainer of this project. |
ok sorry |
Thanks for the change. I think this makes sense and it should be BC. Can you add a unit test? |
Thanks for making this change and adding the tests! I will merge a little later and I am going to follow this PR up with another to add Travis CI. |
Unfortunately this change is a BC break and needs to be reverted. :/ Thanks @goetas for spotting this! |
Looking at the code in https://github.com/egeloen/ivory-serializer/tree/master/src/Type, I don't immediately see what it was depending on that caused the break. I will look more later. |
Hi, I think we can drop the $this->tokens[$index] = array( change, but we can keep 'index' => $index, |
@instabledesign If you have time, can you look at https://github.com/egeloen/ivory-serializer and see why it broke after this change so that we can add tests to cover it? |
Yes. |
Investigation report:
Working solution: replace the loop condition
for ($i = 0; ($i < $count) || ($token === $nextToken); ++$i) {
with
for ($i = 0; ($i < $count) || ($token['value'] === $nextToken['value'] && $token['type'] === $nextToken['type'] && $token['position'] === $nextToken['position']); ++$i) {
I tried to fix AbstractLexer::$index so that it increments only when the match is not a capture of the previous one, but without success, and I don't think that is a good solution. |
I tried to fix the two problems above, but the fixtures are built by logic in some private methods, so it is not easy to reproduce and correctly fix what is going on. I'll continue working on it in my free time. |
Tests fixed. I pinged him so that it can be merged. |
egeloen/ivory-serializer no longer looks active, with only one release (from Jan 2017). @jwage, do you plan to create a new version with this modification? |
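The break reported here is consistent with how PHP compares arrays: === is true only when both arrays hold the same key/value pairs, in the same order and with the same types. Two tokens that used to be identical therefore stop matching once each one carries a distinct 'index'. A minimal sketch, assuming tokens shaped like the ones in this PR (the values and the helper name sameToken are made up for illustration and are not taken from ivory-serializer):

```php
<?php
// Hypothetical tokens; the 'index' key is the one this PR adds.
$token     = ['value' => 'price', 'type' => 100, 'position' => 0, 'index' => 0];
$nextToken = ['value' => 'price', 'type' => 100, 'position' => 0, 'index' => 1];

// Strict array comparison fails because the 'index' values differ.
var_dump($token === $nextToken); // bool(false)

// Comparing only the fields that identify the match ignores 'index'.
function sameToken(array $a, array $b): bool
{
    return $a['value'] === $b['value']
        && $a['type'] === $b['type']
        && $a['position'] === $b['position'];
}

var_dump(sameToken($token, $nextToken)); // bool(true)
```

Comparing field by field keeps the consumer independent of any extra keys the lexer may add to a token later.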
Did we figure out a way to make the change in this repo so it doesn't break existing implementations? (even if their regex is "wrong") |
First I tried named groups, but named groups don't work with preg_split:
preg_split('/(?<FOO>=|>|<)|(?<BAR>[a-z]+)|(?<BAZ>\d+)/i', 'price>5', -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_OFFSET_CAPTURE);
/*
array(3) {
[0]=>
array(2) {
[0]=>
string(5) "price"
[1]=>
int(0)
}
[1]=>
array(2) {
[0]=>
string(1) ">"
[1]=>
int(5)
}
[2]=>
array(2) {
[0]=>
string(1) "5"
[1]=>
int(6)
}
}
*/
The second way is to deduplicate the matched elements by offset:
$matches = preg_split('/((=|>|<)|([a-z]+)|(\d+))/i', 'price>5', -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_OFFSET_CAPTURE);
$offset = null;
$matchesDeduplicate = array_filter($matches, function ($item) use (&$offset) {
if (null === $offset) {
$offset = $item[1];
return true;
}
$filter = $offset !== $item[1];
$offset = $item[1];
return $filter;
});
/*
array(3) {
[0]=>
array(2) {
[0]=>
string(5) "price"
[1]=>
int(0)
}
[2]=>
array(2) {
[0]=>
string(1) ">"
[1]=>
int(5)
}
[4]=>
array(2) {
[0]=>
string(1) "5"
[1]=>
int(6)
}
}
*/
With 300 tokens, 10 matches each (9000 tokens), we already consume about 1 MB more memory. |
What do you think about it, @jwage? |
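To make the need for deduplication concrete, the sketch below runs the same pattern without the filter. Because the outer group wraps every alternative, each match is captured once by the outer group and once by the inner group that actually matched, so every element appears twice at the same offset. The variable names $values and $offsets are only for illustration.

```php
<?php
$matches = preg_split(
    '/((=|>|<)|([a-z]+)|(\d+))/i',
    'price>5',
    -1,
    PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_OFFSET_CAPTURE
);

// Each entry is [matched string, offset]; pull the two columns apart.
$values  = array_column($matches, 0);
$offsets = array_column($matches, 1);

// Every match shows up twice: once for the outer group, once for the inner one.
print_r($values);  // price, price, >, >, 5, 5
print_r($offsets); // 0, 0, 5, 5, 6, 6
```

Filtering on the offset, as in the snippet above, keeps one entry per offset, but the duplicates are still allocated by preg_split first, which fits the extra memory use reported here.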
I don't think we can make this change without breaking BC or increasing memory usage as you noted. |
Hi, can you consider applying this change to v2? |
If there is a breaking change, then it should go into v4 I'm afraid. |
I'm not completely sure it is a BC break, because it only adds a new value to the token details. |
If it's not a breaking change then you should target v3.1 |
Hi, I recently started working on a new project, Xpression.
I need to call resetPosition at a token index, but $token['position'] is the string position of the token in the input string. My current workaround is to keep the lexer index in my parser code and reset it each time I need it.
So I think that if the token index were stored in the token, I could get it easily with $token['index'].
Thanks for reading.
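The use case above can be sketched roughly as follows. TinyLexer is a hypothetical stand-in written for this example, not Doctrine's AbstractLexer. It only assumes that each token carries both its string offset ('position') and its token offset ('index'), and that resetPosition takes a token offset.

```php
<?php
// Hypothetical stand-in for a lexer; NOT Doctrine's AbstractLexer.
final class TinyLexer
{
    /** @var array<int, array{value: string, position: int, index: int}> */
    private array $tokens = [];
    private int $cursor = 0;

    public function __construct(string $input)
    {
        // Split on whitespace and record each word's string offset.
        preg_match_all('/\S+/', $input, $m, PREG_OFFSET_CAPTURE);
        foreach ($m[0] as $index => [$value, $offset]) {
            $this->tokens[] = ['value' => $value, 'position' => $offset, 'index' => $index];
        }
    }

    // Return the current token and advance, or null at the end.
    public function next(): ?array
    {
        return $this->tokens[$this->cursor++] ?? null;
    }

    // Move the cursor back to a given token offset.
    public function resetPosition(int $index = 0): void
    {
        $this->cursor = $index;
    }
}

$lexer = new TinyLexer('price > 5');
$lexer->next();              // "price"
$operator = $lexer->next();  // ">"
$lexer->next();              // "5"

// 'position' is the string offset (6); 'index' is the token offset (1).
$lexer->resetPosition($operator['index']);
$again = $lexer->next();
var_dump($again['value']); // string(1) ">"
```

With the index stored on the token, the parser no longer has to keep its own counter in step with the lexer.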