[BUG] Selecting struct column drops field names #7249

Closed
devavret opened this issue Jan 29, 2021 · 0 comments · Fixed by #7271
Assignees
Labels
bug Something isn't working Python Affects Python cuDF API.

Comments

@devavret
Contributor

After reading a DataFrame from this Parquet file and selecting the list-of-structs column "value", cuDF drops the struct field names and shows positional keys ('0', '1', '2') in their place.

In [1]: import cudf

In [2]: df = cudf.read_parquet("maptest.parquet")

In [3]: df
Out[3]: 
                                               value
0  [{'key': {'first': 'John', 'middle': 'Y.', 'la...

In [4]: df["value"]
Out[4]: 
0    [{'0': {'0': 'John', '1': 'Y.', '2': 'Doe'}, '...
Name: value, dtype: list

This is unlike pandas' behaviour, which retains the field names:

In [5]: df.to_pandas()["value"]
Out[5]: 
0    [{'key': {'first': 'John', 'middle': 'Y.', 'la...
Name: value, dtype: object
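
A small check along these lines could catch the regression. This is only a sketch in plain Python: `field_names` is a hypothetical helper, and the two rows are shortened, made-up stand-ins for the outputs above (the real rows are truncated in the display, so they are not reproduced here).

```python
def field_names(value):
    """Recursively collect struct field names (dict keys) from nested lists/dicts."""
    if isinstance(value, dict):
        return {key: field_names(child) for key, child in value.items()}
    if isinstance(value, list):
        return [field_names(child) for child in value]
    return None  # scalar leaf: no field names to record


# Hypothetical, shortened rows (NOT the full contents of maptest.parquet).
expected = [{"key": {"first": "John", "middle": "Y."}}]   # names retained
observed = [{"0": {"0": "John", "1": "Y."}}]              # names replaced by positions

names_preserved = field_names(expected) == field_names(observed)
print(names_preserved)  # False: same values, different field names
```

With a real frame, the two rows would presumably come from something like `df.to_pandas()["value"][0]` and `df["value"].to_pandas()[0]`; that pairing is an assumption about how one would wire the check up, not part of the original report.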
@devavret added the "bug" (Something isn't working) and "Needs Triage" (Need team to review and classify) labels on Jan 29, 2021
@github-actions bot added this to "Needs prioritizing" in Bug Squashing on Jan 29, 2021
@shwina self-assigned this on Jan 29, 2021
@kkraus14 added the "Python" (Affects Python cuDF API.) label and removed the "Needs Triage" label on Jan 29, 2021
@rapids-bot closed this as completed in #7271 on Feb 4, 2021
Bug Squashing automation moved this from "Needs prioritizing" to "Closed" on Feb 4, 2021
@rapids-bot pushed a commit that referenced this issue on Feb 4, 2021
Fixes #7249

Copies dtype metadata after calling `ColumnBase.copy()`. Moves logic for copying dtype metadata after calling libcudf functions from `Frame` to `ColumnBase`.

Authors:
  - Ashwin Srinath (@shwina)

Approvers:
  - Keith Kraus (@kkraus14)
  - GALI PREM SAGAR (@galipremsagar)

URL: #7271
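
To illustrate the idea in the commit message (copy the data, then copy the dtype metadata back onto the result), here is a toy model in plain Python. It is not cuDF's implementation: `ToyStructDtype`, `ToyColumn`, `_raw_copy` and `_restore_metadata` are all hypothetical names invented for this sketch, and the assumption that a low-level copy yields positional field names is inferred from the output shown in the issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyStructDtype:
    """Hypothetical stand-in for a struct dtype that carries field names."""
    field_names: tuple


class ToyColumn:
    """Hypothetical stand-in for a column: values plus dtype metadata."""

    def __init__(self, values, dtype):
        self.values = values
        self.dtype = dtype

    def _raw_copy(self):
        # Assumed behaviour of a low-level copy: the data survives, but the
        # field names come back as positional strings ("0", "1", ...).
        positional = tuple(str(i) for i in range(len(self.dtype.field_names)))
        return ToyColumn(list(self.values), ToyStructDtype(positional))

    def _restore_metadata(self, source):
        # Re-apply the field names from the column we copied from.
        self.dtype = ToyStructDtype(source.dtype.field_names)
        return self

    def copy(self):
        # The fix, in miniature: copy the data, then copy the metadata.
        return self._raw_copy()._restore_metadata(self)


col = ToyColumn([("John", "Y.")], ToyStructDtype(("first", "middle")))
print(col._raw_copy().dtype.field_names)  # ('0', '1')
print(col.copy().dtype.field_names)       # ('first', 'middle')
```

Putting the restore step inside `copy()` on the column, rather than in each caller, mirrors the commit's stated move of that logic from `Frame` to `ColumnBase`, so every code path that copies a column gets the metadata back.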
3 participants