Support reading JSON data with single quotes around attribute names and values #10273

Merged
merged 7 commits into from
Jan 29, 2024


andygrove
Contributor

@andygrove andygrove commented Jan 25, 2024

Closes #10270

Depends on rapidsai/cudf#14729

@andygrove andygrove self-assigned this Jan 25, 2024
Signed-off-by: Andy Grove <andygrove@nvidia.com>
@andygrove andygrove changed the title Json single quote Support reading JSON data with single quotes around attribute names and values Jan 25, 2024
@andygrove
Contributor Author

build

@andygrove andygrove marked this pull request as ready for review January 25, 2024 15:19
@sameerz sameerz added the feature request New feature or request label Jan 25, 2024
jlowe
jlowe previously approved these changes Jan 26, 2024
@jlowe
Member

jlowe commented Jan 26, 2024

build

@andygrove
Contributor Author

build failing due to #10291

revans2
revans2 previously approved these changes Jan 26, 2024
@andygrove
Contributor Author

build

@andygrove andygrove dismissed stale reviews from revans2 and jlowe via c6550f9 January 27, 2024 12:35
@andygrove
Contributor Author

build

@@ -599,7 +599,7 @@ def test_from_json_map_fallback():
 @allow_non_gpu(*non_utc_allow)
 def test_from_json_struct(schema):
     # note that column 'a' does not use leading zeroes due to https://github.com/NVIDIA/spark-rapids/issues/9588
-    json_string_gen = StringGen(r'{"a": [1-9]{0,5}, "b": "[A-Z]{0,5}", "c": 1\d\d\d}') \
+    json_string_gen = StringGen(r'{\'a\': [1-9]{0,5}, "b": \'[A-Z]{0,5}\', "c": 1\d\d\d}') \
Collaborator

nit: could use fewer escape sequences with triple quotes

Suggested change
-    json_string_gen = StringGen(r'{\'a\': [1-9]{0,5}, "b": \'[A-Z]{0,5}\', "c": 1\d\d\d}') \
+    json_string_gen = StringGen(r'''{'a': [1-9]{0,5}, "b": '[A-Z]{0,5}', "c": 1\d\d\d}''') \
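
For readers weighing the nit, here is a minimal sketch of what differs between the two spellings. In a Python raw string the backslash before a quote stays in the string, so the merged pattern carries four extra backslashes that the triple-quoted form avoids. As an assumption for illustration only, the standard-library `re` module stands in for whatever engine `StringGen` uses, and the `sample` row is hypothetical.

```python
import re

# Pattern as merged: a raw string delimited by single quotes, so every inner
# single quote needs a backslash, and the raw prefix keeps that backslash.
escaped = r'{\'a\': [1-9]{0,5}, "b": \'[A-Z]{0,5}\', "c": 1\d\d\d}'

# Pattern as suggested in review: triple quotes, so no escaping is needed.
triple = r'''{'a': [1-9]{0,5}, "b": '[A-Z]{0,5}', "c": 1\d\d\d}'''

# The two strings differ only by the retained backslashes.
backslashes_kept = escaped.count("\\'")
same_after_unescape = escaped.replace("\\'", "'") == triple

# Hypothetical sample row, used only for this illustration.
sample = """{'a': 123, "b": 'ABC', "c": 1999}"""

# Assumption: `re` stands in for StringGen's regex engine. In `re`, an escaped
# quote is just a literal quote, so both patterns accept the same text.
matches_escaped = re.fullmatch(escaped, sample) is not None
matches_triple = re.fullmatch(triple, sample) is not None

print(backslashes_kept, same_after_unescape, matches_escaped, matches_triple)
```

Under that assumption the two forms are interchangeable as regexes, so the suggestion is about readability rather than behavior.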

Contributor Author

Makes sense. I will address this in a follow-up PR, if that is OK, since this change needs to make it into the 24.02 release.

@andygrove andygrove merged commit ad6fde9 into NVIDIA:branch-24.02 Jan 29, 2024
40 checks passed
@andygrove andygrove deleted the json-single-quote branch January 29, 2024 14:53
Development

Successfully merging this pull request may close these issues.

[FEA] Add support for single quotes when reading JSON
5 participants