Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IntlPartsIterator key is wrong for KEY_LEFT/KEY_RIGHT #7734

Closed
TRowbotham opened this issue Dec 7, 2021 · 4 comments
Closed

IntlPartsIterator key is wrong for KEY_LEFT/KEY_RIGHT #7734

TRowbotham opened this issue Dec 7, 2021 · 4 comments

Comments

@TRowbotham
Copy link
Contributor

Description

When creating an IntlPartsIterator with a type of IntlPartsIterator::KEY_LEFT or IntlPartsIterator::KEY_RIGHT, the returned key value is off by one after the first key. So for the string "ABC" you would get 0, 2, 3, instead of the expected 0, 1, 2.

https://3v4l.org/O8374
https://3v4l.org/SNUAX

Originally filed as https://bugs.php.net/bug.php?id=78308
Initial fix attempt #4446

PHP Version

8.1.0

Operating System

No response

@iluuu1994
Copy link
Member

Apparently the first example is resolved now on master. https://3v4l.org/O8374/rfc#vgit.master

@iluuu1994
Copy link
Member

The second case is also fixed, except that there's a different issue where index is rewritten to -1 when rewinding the iterator.

iter->index = -1; /* will be set to 0 before using next handler */

Usually the index is computed in get_current_key but in this case we're relying on index as a storage mechanism. One way to solve this is to store an index_right in zoi_break_iter_parts which does not get reset after rewind.

diff --git a/ext/intl/breakiterator/breakiterator_iterators.cpp b/ext/intl/breakiterator/breakiterator_iterators.cpp
index 2027fb45c5..3ea56e80d3 100644
--- a/ext/intl/breakiterator/breakiterator_iterators.cpp
+++ b/ext/intl/breakiterator/breakiterator_iterators.cpp
@@ -127,6 +127,7 @@ typedef struct zoi_break_iter_parts {
 	zoi_with_current zoi_cur;
 	parts_iter_key_type key_type;
 	BreakIterator_object *bio; /* so we don't have to fetch it all the time */
+	zend_ulong index_right;
 } zoi_break_iter_parts;
 
 static void _breakiterator_parts_destroy_it(zend_object_iterator *iter)
@@ -136,8 +137,13 @@ static void _breakiterator_parts_destroy_it(zend_object_iterator *iter)
 
 static void _breakiterator_parts_get_current_key(zend_object_iterator *iter, zval *key)
 {
-	/* the actual work is done in move_forward and rewind */
-	ZVAL_LONG(key, iter->index);
+	zoi_break_iter_parts *zoi_bit = (zoi_break_iter_parts*)iter;
+
+	if (iter->index == 0 && zoi_bit->key_type == PARTS_ITERATOR_KEY_RIGHT) {
+		ZVAL_LONG(key, zoi_bit->index_right);
+	} else {
+		ZVAL_LONG(key, iter->index);
+	}
 }
 
 static void _breakiterator_parts_move_forward(zend_object_iterator *iter)
@@ -163,6 +169,7 @@ static void _breakiterator_parts_move_forward(zend_object_iterator *iter)
 		iter->index = cur;
 	} else if (zoi_bit->key_type == PARTS_ITERATOR_KEY_RIGHT) {
 		iter->index = next;
+		zoi_bit->index_right = next;
 	}
 	/* else zoi_bit->key_type == PARTS_ITERATOR_KEY_SEQUENTIAL
 	 * No need to do anything, the engine increments ->index */
@@ -229,6 +236,7 @@ void IntlIterator_from_BreakIterator_parts(zval *break_iter_zv,
 	assert(((zoi_break_iter_parts*)ii->iterator)->bio->biter != NULL);
 
 	((zoi_break_iter_parts*)ii->iterator)->key_type = key_type;
+	((zoi_break_iter_parts*)ii->iterator)->index_right = 0;
 }
 
 U_CFUNC PHP_METHOD(IntlPartsIterator, getBreakIterator)

But that's not very clean either. I don't see any other ways to solve this.

Fun fact, IntlPartsIterator::KEY_RIGHT is used a total of 0 times on GitHub.
https://sourcegraph.com/search?q=context:global+IntlPartsIterator::KEY_RIGHT+file:%5C.php%24&patternType=literal

@TRowbotham
Copy link
Contributor Author

Feel free to close this and spin off a new bug with the rewind behavior, if you think that would be better. I'm curious to know what fixed this on master as it sounds like it was unintentionally fixed.

@iluuu1994
Copy link
Member

I did some more digging, the reason this works on master is because we changed how get_iterator is inherited.

PHP-8.1:

if (class_type->parent && (class_type->parent->ce_flags & ZEND_ACC_REUSE_GET_ITERATOR)) {
/* Keep the inherited get_iterator handler. */
class_type->ce_flags |= ZEND_ACC_REUSE_GET_ITERATOR;
} else {
class_type->get_iterator = zend_user_it_get_iterator;
}

master:

/* None of the Iterator methods have been overwritten, use inherited get_iterator(). */
if (funcs_ptr->zf_rewind->common.scope != class_type &&
funcs_ptr->zf_valid->common.scope != class_type &&
funcs_ptr->zf_key->common.scope != class_type &&
funcs_ptr->zf_current->common.scope != class_type &&
funcs_ptr->zf_next->common.scope != class_type) {
return SUCCESS;
}
/* One of the Iterator methods has been overwritten,
* switch to zend_user_it_get_iterator. */

ZEND_ACC_REUSE_GET_ITERATOR was not set for the IntlPartsIterator class which caused get_iterator to be reset to zend_user_it_get_iterator. Adding this flag solves the first issue. The issue described above remains and needs to be solved separately.

iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Mar 4, 2022
iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Mar 4, 2022
iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Mar 4, 2022
iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Mar 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants
@iluuu1994 @cmb69 @TRowbotham and others