Cannot reproduce documentation groupby (AwkwardExtensionArray object has no attribute all)
#35
Comments
Hi! I have these versions installed:
And I'm unable to reproduce the error you're seeing (the example in the docs runs for me with those versions). Would you be able to spin up a fresh conda/virtual environment with these versions and try again? For completeness, here's what I see locally:
In [20]: data = """
...: - name: Bob\n team: tigers\n goals: [0, 0, 0, 1, 2, 0, 1]\n\n- name: Alice\n team: bears\n goals: [3, 2, 1, 0, 1]\n\n- name: Jack\n team: bears\n goals: [0, 0, 0, 0,
...: 0, 0, 0, 0, 1]\n\n- name: Jill\n team: bears\n goals: [3, 0, 2]\n\n- name: Ted\n team: tigers\n goals: [0, 0, 0, 0, 0]\n\n- name: Ellen\n team: tigers\n goals: [1,
...: 0, 0, 0, 2, 0, 1]\n\n- name: Dan\n team: bears\n goals: [0, 0, 3, 1, 0, 2, 0, 0]\n\n- name: Brad\n team: bears\n goals: [0, 0, 4, 0, 0, 1]\n\n- name: Nancy\n team: ti
...: gers\n goals: [0, 0, 1, 1, 1, 1, 0]\n\n- name: Lance\n team: bears\n goals: [1, 1, 1, 1, 1]\n\n- name: Sara\n team: tigers\n goals: [0, 1, 0, 2, 0, 3]\n\n- name: Ryan
...: \n team: tigers\n goals: [1, 2, 3, 0, 0, 0, 0]\n
...: """
In [21]: import yaml
...:
...: data = yaml.load(data, Loader=yaml.SafeLoader)
...: data = ak.Array(data)
In [22]: s = akpd.from_awkward(data)
In [23]: df = s.ak.to_columns(extract_all=True)
In [24]: (df
...: .set_index('name')
...: .groupby('team', group_keys=True)
...: .apply(lambda x: x.goals.ak.mean(axis=1))
...: )
Out[24]:
team name
bears Alice 1.4
Jack 0.111111
Jill 1.666667
Dan 0.75
Brad 0.833333
Lance 1.0
tigers Bob 0.571429
Ted 0.0
Ellen 0.571429
Nancy 0.571429
Sara 1.0
Ryan 0.857143
dtype: awkward
In [25]: (df
...: .set_index('name')
...: .groupby(['team', 'name'], group_keys=True)
...: .apply(lambda x: x.goals.ak.mean(axis=1))
...: )
Out[32]:
team name name
bears Alice Alice 1.4
Brad Brad 0.833333
Dan Dan 0.75
Jack Jack 0.111111
Jill Jill 1.666667
Lance Lance 1.0
tigers Bob Bob 0.571429
Ellen Ellen 0.571429
Nancy Nancy 0.571429
Ryan Ryan 0.857143
Sara Sara 1.0
Ted Ted 0.0
dtype: awkward
I'm also unable to reproduce this:
In [18]: s.ak.to_columns()
Out[18]:
name team awkward-data
0 Bob tigers {'goals': [0, 0, 0, 1, 2, 0, 1]}
1 Alice bears {'goals': [3, 2, 1, 0, 1]}
2 Jack bears {'goals': [0, 0, 0, 0, 0, 0, 0, 0, 1]}
3 Jill bears {'goals': [3, 0, 2]}
4 Ted tigers {'goals': [0, 0, 0, 0, 0]}
5 Ellen tigers {'goals': [1, 0, 0, 0, 2, 0, 1]}
6 Dan bears {'goals': [0, 0, 3, 1, 0, 2, 0, 0]}
7 Brad bears {'goals': [0, 0, 4, 0, 0, 1]}
8 Nancy tigers {'goals': [0, 0, 1, 1, 1, 1, 0]}
9 Lance bears {'goals': [1, 1, 1, 1, 1]}
10 Sara tigers {'goals': [0, 1, 0, 2, 0, 3]}
11 Ryan tigers {'goals': [1, 2, 3, 0, 0, 0, 0]}
In [19]: s.ak.to_columns(extract_all=True)
Out[19]:
name team goals
0 Bob tigers [0, 0, 0, 1, 2, 0, 1]
1 Alice bears [3, 2, 1, 0, 1]
2 Jack bears [0, 0, 0, 0, 0, 0, 0, 0, 1]
3 Jill bears [3, 0, 2]
4 Ted tigers [0, 0, 0, 0, 0]
5 Ellen tigers [1, 0, 0, 0, 2, 0, 1]
6 Dan bears [0, 0, 3, 1, 0, 2, 0, 0]
7 Brad bears [0, 0, 4, 0, 0, 1]
8 Nancy tigers [0, 0, 1, 1, 1, 1, 0]
9 Lance bears [1, 1, 1, 1, 1]
10 Sara tigers [0, 1, 0, 2, 0, 3]
11 Ryan tigers [1, 2, 3, 0, 0, 0, 0]
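As a sanity check that does not depend on awkward-pandas at all, the per-player means shown in Out[24] can be recomputed with plain Python. This is only a sketch over four of the records above; the helper name `mean_goals_by_team` is made up for illustration and says nothing about how the library computes the result internally.

```python
from collections import defaultdict

# Four of the records from the YAML above (not the full dataset).
records = [
    {"name": "Bob", "team": "tigers", "goals": [0, 0, 0, 1, 2, 0, 1]},
    {"name": "Alice", "team": "bears", "goals": [3, 2, 1, 0, 1]},
    {"name": "Jill", "team": "bears", "goals": [3, 0, 2]},
    {"name": "Ted", "team": "tigers", "goals": [0, 0, 0, 0, 0]},
]


def mean_goals_by_team(rows):
    """Group rows by team and return {team: {name: mean of goals}}."""
    grouped = defaultdict(dict)
    for row in rows:
        goals = row["goals"]
        grouped[row["team"]][row["name"]] = sum(goals) / len(goals)
    return dict(grouped)


result = mean_goals_by_team(records)
for team in sorted(result):
    for name, avg in result[team].items():
        print(f"{team} {name} {avg:.6f}")
```

If the numbers from a working groupby differ from these, the problem is in the data loading rather than in the extension array.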
So I downloaded the exact notebook for your "quickstart", started a new environment with defaults via conda, and used
Here are the versions that gets:
And interestingly the groupby now works, but I do reproduce the
I'll have to go now but I can try to reproduce the main error with older pandas later today, hopefully.
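Since the behavior here seems to depend on which versions are installed, a small helper like the following may make it easier to compare environments. It is a sketch using only the standard library; the function name `report_versions` is hypothetical, and the distribution names in the list are assumptions about how the packages are named in your environment.

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages):
    """Return {distribution name: installed version or 'not installed'}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


# Distribution names below are assumed; adjust them to your setup.
for name, ver in report_versions(["pandas", "awkward", "awkward-pandas"]).items():
    print(f"{name}: {ver}")
```

Pasting that output from both a working and a failing environment should show quickly whether the pandas version is the difference.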
Hi all! Lovely utility here. I was playing with the example from the docs and can't quite seem to find a good workaround for this bug:
This seems to be happening with the .agg operator as well, and the .groupby(['team','name']).apply(...) method I would usually use returns an error complaining about no attribute 'any'.
Here's my version info, as in the docs:
I should mention that the behavior of s.ak.to_columns() appears to have changed as well, since my version returns only a single column named awkward-data, vs. the docs that have a column for every field in the array.