Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add ScalarType and treat bare strings as char arrays #2116

Merged
merged 9 commits into from
Jan 13, 2023

Conversation

agoose77
Copy link
Collaborator

@agoose77 agoose77 commented Jan 12, 2023

Fixes #2096, closes #2070, and adjusts #2098 to treat strings as character arrays.

The new ScalarType object let's us determine whether we have ak.Record or an ak.Array of records. It is the natural counterpart to the "high-level" ArrayType. Importantly, Awkward type objects lose information if they're considered at the low-level, i.e. not wrapped by ScalarType or ArrayType; it is not possible to determine whether one has an Array or a Record.

Note that this PR does not remove the __cast__ implementation that converts string ufunc *args arguments to character arrays. This seems desirable in the main, unless there are well-known ufuncs that take string arguments?

@agoose77 agoose77 requested review from jpivarski and removed request for jpivarski January 12, 2023 22:31
@agoose77 agoose77 marked this pull request as draft January 12, 2023 22:34
@agoose77 agoose77 temporarily deployed to docs-preview January 12, 2023 22:45 — with GitHub Actions Inactive
@agoose77 agoose77 marked this pull request as ready for review January 12, 2023 23:10
@agoose77
Copy link
Collaborator Author

Ah yes, test are now broken because we don't infer high-level / low-level from the array / layout type.

@agoose77 agoose77 temporarily deployed to docs-preview January 12, 2023 23:19 — with GitHub Actions Inactive
Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed it, and it's good to merge once the tests have been updated.

@agoose77
Copy link
Collaborator Author

agoose77 commented Jan 13, 2023

@jpivarski in the test suite, a few places have become high-level tests (i.e., I added ak.types.ArrayType around the existing comparison operand(s). This means that sometimes we're now testing whether an operation produces ArrayType(OtherType) instead of OtherType, which doesn't explicitly test OtherType. However, I think this is fine.

I added a test to cover some of the ak.type Content cases, to ensure that Content.form.type matches ak.type(...).content

@agoose77
Copy link
Collaborator Author

@jpivarski btw, I'm maybe yielding on highlevel=True. It's just a thing users would need to read the docs to understand. If you think we should have the arg, I'll restore it after all!

@jpivarski
Copy link
Member

This means that sometimes we're now testing whether an operation produces ArrayType(OtherType) instead of OtherType, which doesn't explicitly test OtherType.

It does test OtherType, since that's a nested constituent of ArrayType. Any equality test would recurse down the expected and observed Type-trees. So I think this is more than fine.

@jpivarski
Copy link
Member

@jpivarski btw, I'm maybe yielding on highlevel=True. It's just a thing users would need to read the docs to understand. If you think we should have the arg, I'll restore it after all!

I never had a strong reason for it—just symmetry with the other ak.* functions. And it doesn't mean quite the same thing as it does in the other ak.* functions, so that's not a strong argument at all. Just, if a highlevel argument were to exist, this is what it should probably mean.

@agoose77
Copy link
Collaborator Author

It does test OtherType, since that's a nested constituent of ArrayType. Any equality test would recurse down the expected and observed Type-trees. So I think this is more than fine.

Sure, I mean simply that if ak.type is a black-box, we aren't technically testing the low-level, which should most carefully be done using layout.form.type. However, it's fine as you say.

@agoose77 agoose77 temporarily deployed to docs-preview January 13, 2023 00:23 — with GitHub Actions Inactive
@codecov
Copy link

codecov bot commented Jan 13, 2023

Codecov Report

Merging #2116 (e71f54d) into main (54ab9f3) will increase coverage by 0.01%.
The diff coverage is 79.66%.

Additional details and impacted files
Impacted Files Coverage Δ
src/awkward/types/scalartype.py 77.27% <77.27%> (ø)
src/awkward/operations/ak_type.py 83.72% <78.57%> (+6.63%) ⬆️
src/awkward/operations/ak_to_layout.py 89.13% <85.71%> (+0.75%) ⬆️
src/awkward/highlevel.py 76.21% <100.00%> (ø)
src/awkward/types/__init__.py 100.00% <100.00%> (ø)
src/awkward/operations/ak_sort.py 60.00% <0.00%> (ø)
src/awkward/operations/ak_argsort.py 75.00% <0.00%> (ø)

@agoose77 agoose77 enabled auto-merge (squash) January 13, 2023 00:32
@agoose77 agoose77 merged commit 71ada43 into main Jan 13, 2023
@agoose77 agoose77 deleted the agoose77/fix-ak-scalar-type branch January 13, 2023 00:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Simplify & unify ak.type() ak.to_layout(str) does not throw exception
2 participants