Return schema info from JSON reader #11419

Merged
merged 4 commits into from
Aug 1, 2022

Conversation

vuule
Contributor

@vuule vuule commented Aug 1, 2022

Description

Populate the schema_info structure (in addition to column_names) to match the behavior of a (future) JSON reader that supports nested columns.
Use the schema_info in Cython to set the struct columns' field names (unused until nested type support is added).
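To illustrate the idea above, here is a minimal pure-Python sketch of how a nested schema description could be used to name struct fields. It does not use the cuDF API: `ColumnNameInfo` and `apply_schema` are hypothetical names introduced only for this example, and columns are modeled as plain values (a leaf is a dtype string, a struct is a list of child columns).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ColumnNameInfo:
    """Hypothetical stand-in for one schema_info entry: a name plus child entries."""
    name: str
    children: List["ColumnNameInfo"] = field(default_factory=list)


def apply_schema(columns, schema_info):
    """Pair each column with its schema entry and recurse into struct children.

    A struct column is modeled as a list of child columns; anything else is a leaf.
    This modeling is an assumption made for the sketch, not how cuDF stores columns.
    """
    if len(columns) != len(schema_info):
        raise ValueError("schema_info must have one entry per column")
    named = {}
    for col, info in zip(columns, schema_info):
        if isinstance(col, list):
            # Struct column: the field names come from the entry's children.
            named[info.name] = apply_schema(col, info.children)
        else:
            named[info.name] = col
    return named


# One flat column and one struct column with two fields.
columns = ["int64", ["str", "float64"]]
schema_info = [
    ColumnNameInfo("id"),
    ColumnNameInfo("point", [ColumnNameInfo("label"), ColumnNameInfo("x")]),
]

named = apply_schema(columns, schema_info)
# The flat list of top-level names is still recoverable from the nested schema.
flat_names = [info.name for info in schema_info]
```

A flat list of names can only label top-level columns, which is why a nested structure is needed once struct columns are returned.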

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added Python Affects Python cuDF API. libcudf Affects libcudf (C++/CUDA) code. labels Aug 1, 2022
@vuule vuule self-assigned this Aug 1, 2022
@vuule vuule added this to PR-WIP in v22.10 Release via automation Aug 1, 2022
@vuule vuule added feature request New feature or request non-breaking Non-breaking change cuIO cuIO issue labels Aug 1, 2022
@vuule vuule changed the title Pass schema info from JSON reader Return schema info from JSON reader Aug 1, 2022
@codecov

codecov bot commented Aug 1, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.10@35a7c81).
The diff coverage is n/a.

❗ Current head 6268b5e differs from pull request most recent head 08b0ce6. Consider uploading reports for the commit 08b0ce6 to get more accurate results

@@               Coverage Diff               @@
##             branch-22.10   #11419   +/-   ##
===============================================
  Coverage                ?   86.47%           
===============================================
  Files                   ?      144           
  Lines                   ?    22856           
  Branches                ?        0           
===============================================
  Hits                    ?    19765           
  Misses                  ?     3091           
  Partials                ?        0           

@vuule vuule marked this pull request as ready for review August 1, 2022 21:51
@vuule vuule requested review from a team as code owners August 1, 2022 21:51
Contributor

@vyasr vyasr left a comment

Both Python and C++ changes look fine to me.

return data_from_unique_ptr(move(c_out_table.tbl),
column_names=column_names)
meta_names = [name.decode() for name in c_result.metadata.column_names]
df = cudf.DataFrame._from_data(*data_from_unique_ptr(
Contributor

I guess you moved the _from_data call into the Cython because update_struct_field_names has to happen in Cython? In the long term I think that probably indicates some level of restructuring is required, but we can deal with that when we get around to cuIO/Cython refactoring more broadly.

v22.10 Release automation moved this from PR-WIP to PR-Reviewer approved Aug 1, 2022
@vuule
Contributor Author

vuule commented Aug 1, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 71a5292 into rapidsai:branch-22.10 Aug 1, 2022
v22.10 Release automation moved this from PR-Reviewer approved to Done Aug 1, 2022
@GregoryKimball GregoryKimball added this to the Nested JSON reader milestone Nov 3, 2022
@vuule vuule deleted the fea-json-reader-schema-info branch August 10, 2023 03:28
Labels
cuIO cuIO issue feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Python Affects Python cuDF API.
5 participants