Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[query] Add enumerate, deprecate zip_with_index #9308

Merged
merged 7 commits into from Aug 31, 2020

Conversation

johnc1231
Copy link
Contributor

zip_with_index seems based on Scala's zipWithIndex. Here I deprecate that in favor of using the more pythonic enumerate. To match the signature of the python function, I add a start argument and make our custom index_first argument keyword only.

@@ -3384,9 +3384,45 @@ def zip(*arrays, fill_missing: bool = False) -> ArrayExpression:
aggregations)


@typecheck(a=expr_array(), start=expr_int32, index_first=bool)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to be int32?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because currently our arrays store their length as an int32, so that's what hl.range expects. All the ndarray stuff stores the shape as an int64 because I imagine we will eventually want to blow past 2^31 as an array length limit.

Copy link
Contributor

@akotlar akotlar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one q

Parameters
----------
a : :class:`.ArrayExpression`
start : :class:`.Int32Expression`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say something like "the starting 'index'", index can be first or second.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the reason I worded it the way I did was because I was afraid to give the implication that "index" meant "indexing into a (i.e., specifying start = 2 would mean to start at position 2 of a)".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make our language consistent with Python's enumerate method, or if you feel strongly that this isn't the right direction, make our language internally consistent. We use "index" throughout, and "label" does not indicate that the number included in the tuple is ordinal.

Here is how Python's enumerate is described elsewhere:

"The next() method of the iterator returned by enumerate() returns a tuple containing a count (from start which defaults to 0) and the values obtained from iterating over iterable."

"And there is more! enumerate also accepts an optional argument that allows us to specify the starting index of the counter."

"Start: the index value from which the counter is
to be started, by default it is 0 "

A few of our uses of "index":
"Returns an array of (index, element) tuples."
"Array of (index, element) or (element, index) tuples."
"The number the first element of the array is labeled with.
index_first: :obj:bool ... If True, the index is the first value of the element tuples"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've convinced me. I'll go with language that includes "index", that seems to be the general understanding. I like "The index value from which the counter is started", I think that's clear.

Copy link
Contributor

@akotlar akotlar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like counter, nice John!

@danking danking merged commit 37dad7e into hail-is:main Aug 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants