Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: add name parameter to mixin_class #1030

Merged
merged 3 commits into from
Aug 9, 2021

Conversation

agoose77
Copy link
Collaborator

@agoose77 agoose77 commented Jul 22, 2021

ak.mixin_class is very useful, but does not provide a straightforward way to specify the name of the behavior; it uses the name of the class by default.

The requirements of a behavior name and a Python class name are somewhat in conflict:

  • Length
    • Python class names tend to be longer and descriptive
    • Behavior names tend to be short and concise
  • Case
    • Python class names are usually CamelCase
    • Type strings tend to be lower case

This PR makes it easier to change the name used by ak.mixin_class.

@jpivarski
Copy link
Member

What do you think, @nsmith-? (@nsmith- is the author of mixin_class.) It looks good to me.

I think most behavior names in applications would be the same as the Python class name. The built-in ones, "string", "bytestring", "char", "byte", "categorical", and proposed "sorted_map" are short and not camel-case because they're "core" names for basic data types. Applications like Vector use traditional Python class naming conventions for the behavior names: "Vector2D", "Vector3D", "Vector4D", "Momentum2D", "Momentum3D", "Momentum4D", which match class names in the Vector hierarchy, though they're not first in the method resolution order. This disparity between the core built-in naming convention and application-level naming convention is reflected in Python itself: basic types are named "str", "list", "dict", etc, but user-defined classes shouldn't be.

Even though the naming conventions wouldn't usually be in conflict, I agree that it's good to at least be able to override the name.

@nsmith-
Copy link
Contributor

nsmith- commented Aug 9, 2021

Looks fine to me. You might also consider defining a repr for the mixin if a name is provided a la
https://github.com/CoffeaTeam/coffea/blob/ea73fc7cd8b27914845db54f296d9e55740d93cf/coffea/nanoevents/methods/nanoaod.py#L20-L25
I think in the case the classname is used it could also work but perhaps too much magic?

@jpivarski
Copy link
Member

Actually, there's already magic to take "__record__": XYZ and make the type string be XYZ[field1: x, field2: y, field3: z] instead of {field1: x, field2: y, field3: z} (regardless of whether any behavior is defined). Overriding "__typestr__" with the classname only would reduce the type string to XYZ, which ought to be opt-in (maybe another argument to mixin_class). For example, Vector does not do that, so that you can see which coordinate system is being used (through the names of the fields) and their types (e.g. float32 vs float64).

This looks good to me, too, so I'll merge it. I think mixin_class has tests, right? I'll do a quick check before the merge.

@jpivarski
Copy link
Member

(Yes, there are 11 hits in tests/test_0355-mixins.py, tests/test_0815-broadcast-union-types-to-all-possibilities.py, and tests/test_0930-bug-in-unionarray-purelist_parameter.py.)

@jpivarski jpivarski merged commit 1c7017d into main Aug 9, 2021
@jpivarski jpivarski deleted the feat-mixin-class-name-parameter branch August 9, 2021 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants