-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[query] Add enumerate, deprecate zip_with_index #9308
Conversation
@@ -3384,9 +3384,45 @@ def zip(*arrays, fill_missing: bool = False) -> ArrayExpression: | |||
aggregations) | |||
|
|||
|
|||
@typecheck(a=expr_array(), start=expr_int32, index_first=bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this need to be int32?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because currently our arrays store their length as an int32, so that's what hl.range
expects. All the ndarray stuff stores the shape as an int64 because I imagine we will eventually want to blow past 2^31 as an array length limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one q
Parameters | ||
---------- | ||
a : :class:`.ArrayExpression` | ||
start : :class:`.Int32Expression` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say something like "the starting 'index'", index can be first or second.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the reason I worded it the way I did was because I was afraid to give the implication that "index" meant "indexing into a
(i.e., specifying start = 2 would mean to start at position 2 of a
)".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make our language consistent with Python's enumerate
method, or if you feel strongly that this isn't the right direction, make our language internally consistent. We use "index" throughout, and "label" does not indicate that the number included in the tuple is ordinal.
Here is how Python's enumerate is described elsewhere:
"The next() method of the iterator returned by enumerate() returns a tuple containing a count (from start which defaults to 0) and the values obtained from iterating over iterable."
"And there is more! enumerate also accepts an optional argument that allows us to specify the starting index of the counter."
"Start: the index value from which the counter is
to be started, by default it is 0 "
A few of our uses of "index":
"Returns an array of (index, element) tuples."
"Array of (index, element) or (element, index) tuples."
"The number the first element of the array is labeled with.
index_first: :obj:bool
... If True
, the index is the first value of the element tuples"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've convinced me. I'll go with language that includes "index", that seems to be the general understanding. I like "The index value from which the counter is started", I think that's clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like counter, nice John!
46cf260
to
db1008b
Compare
zip_with_index
seems based on Scala'szipWithIndex
. Here I deprecate that in favor of using the more pythonicenumerate
. To match the signature of the python function, I add astart
argument and make our customindex_first
argument keyword only.