Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-17847: [C++] Support unquoted decimal in JSON parser #14242

Merged
merged 4 commits into from
Sep 28, 2022

Conversation

js8544
Copy link
Collaborator

@js8544 js8544 commented Sep 26, 2022

Support both quoted and unquoted decimal in JSON parser automatically.

@github-actions
Copy link

@github-actions
Copy link

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@js8544
Copy link
Collaborator Author

js8544 commented Sep 27, 2022

As far as I can tell this PR is ready for review. CI failures are unrelated. @pitrou would you mind having a look?

@pitrou
Copy link
Member

pitrou commented Sep 27, 2022

As far as I can tell this PR is ready for review. CI failures are unrelated.

Yes, unfortunately we're fighting against some CI failures lately.

@pitrou
Copy link
Member

pitrou commented Sep 27, 2022

@benibus Since you've been working on the JSON reader recently, perhaps you want to take a quick look at this.

@js8544
Copy link
Collaborator Author

js8544 commented Sep 27, 2022

@bkietz Hi Ben. Since you are the author of arrow's JSON parser, do you have any suggestions on how to implement this feature? Thanks in advance.

@js8544 js8544 force-pushed the jinshang/parse_unquoted_decimal branch from 91131ad to c8ef16b Compare September 28, 2022 02:23
cpp/src/arrow/json/parser.cc Outdated Show resolved Hide resolved
@js8544
Copy link
Collaborator Author

js8544 commented Sep 28, 2022

@pitrou I added Kind::kNumberOrString and it works like a charm! Thanks for the suggestion!

Copy link
Member

@bkietz bkietz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design looks great, thanks for working on this!

My comments are just nits

cpp/src/arrow/json/test_common.h Outdated Show resolved Hide resolved
cpp/src/arrow/json/reader_test.cc Outdated Show resolved Hide resolved
cpp/src/arrow/json/parser.cc Outdated Show resolved Hide resolved
@js8544 js8544 requested a review from bkietz September 28, 2022 12:26
Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this @js8544 . Just a few nits below.

cpp/src/arrow/json/parser.cc Outdated Show resolved Hide resolved
cpp/src/arrow/json/parser.cc Outdated Show resolved Hide resolved
cpp/src/arrow/json/parser.cc Outdated Show resolved Hide resolved
cpp/src/arrow/json/parser.cc Outdated Show resolved Hide resolved
cpp/src/arrow/json/parser.cc Show resolved Hide resolved
cpp/src/arrow/json/reader_test.cc Outdated Show resolved Hide resolved
cpp/src/arrow/json/reader_test.cc Outdated Show resolved Hide resolved
@js8544 js8544 requested review from pitrou and bkietz and removed request for bkietz and pitrou September 28, 2022 15:52
Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM now. Thanks @js8544 !

@pitrou
Copy link
Member

pitrou commented Sep 28, 2022

Will wait for CI.

@pitrou
Copy link
Member

pitrou commented Sep 28, 2022

CI seems to pass on @js8544 's fork.

@pitrou pitrou merged commit eec94ff into apache:master Sep 28, 2022
@ursabot
Copy link

ursabot commented Sep 29, 2022

Benchmark runs are scheduled for baseline = b672e5c and contender = eec94ff. eec94ff is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.48% ⬆️0.0%] test-mac-arm
[Failed ⬇️0.27% ⬆️0.27%] ursa-i9-9960x
[Finished ⬇️0.11% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] eec94ff4 ec2-t3-xlarge-us-east-2
[Finished] eec94ff4 test-mac-arm
[Failed] eec94ff4 ursa-i9-9960x
[Finished] eec94ff4 ursa-thinkcentre-m75q
[Finished] b672e5c9 ec2-t3-xlarge-us-east-2
[Finished] b672e5c9 test-mac-arm
[Failed] b672e5c9 ursa-i9-9960x
[Finished] b672e5c9 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@js8544 js8544 deleted the jinshang/parse_unquoted_decimal branch October 11, 2022 08:43
stiga-huang pushed a commit to stiga-huang/arrow that referenced this pull request Oct 12, 2022
)

Support both quoted and unquoted decimal in JSON parser automatically.

Merge conflicts to 8.0.0:
Replace std::string_view with util::string_view

Authored-by: Jin Shang <shangjin1997@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
fatemehp pushed a commit to fatemehp/arrow that referenced this pull request Oct 17, 2022
)

Support both quoted and unquoted decimal in JSON parser automatically.

Authored-by: Jin Shang <shangjin1997@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
stiga-huang pushed a commit to stiga-huang/arrow that referenced this pull request Oct 18, 2022
)

Support both quoted and unquoted decimal in JSON parser automatically.

Merge conflicts to 8.0.0:
Replace std::string_view with util::string_view

Authored-by: Jin Shang <shangjin1997@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants