Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-43986][SQL] Create error classes for HyperLogLog function call failures #41486

Closed
wants to merge 12 commits into from

Conversation

dtenedor
Copy link
Contributor

@dtenedor dtenedor commented Jun 7, 2023

What changes were proposed in this pull request?

This PR creates error classes for HyperLogLog function call failures.

Why are the changes needed?

These replace previous Java exceptions or other cases, in order to improve the user experience and bring consistency with other parts of Spark.

Does this PR introduce any user-facing change?

Yes, error messages change slightly.

How was this patch tested?

This PR also adds SQL query test files for the HLL functions.

commit

commit

commit

commit
@dtenedor
Copy link
Contributor Author

dtenedor commented Jun 7, 2023

Hi @MaxGekk @RyanBerti can you please help review this PR to improve error messages and add more test coverage? 🙏

Copy link
Contributor

@mkaravel mkaravel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dtenedor Thank you for this PR. Inline comments are mostly about naming and phrasing.

My high level comment is that we have one more category of errors and this is related to passing BINARY values to hll_union, hll_union_agg, and hll_sketch_estimate that are not valid representations of sketches. At this point we rely on the DataSketches library to throw an exception for these cases, and we let that exception surface. My take on this is that we should add one more error class for these (along the lines of an illegal argument exception/error). What do you think?

@dtenedor
Copy link
Contributor Author

Hi @MaxGekk @mkaravel I have responded to your comments, please take another look :)

Copy link
Contributor

@mkaravel mkaravel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dtenedor Thank you for the fixes.
Posting a couple of minor comments.

core/src/main/resources/error/error-classes.json Outdated Show resolved Hide resolved
core/src/main/resources/error/error-classes.json Outdated Show resolved Hide resolved
@mkaravel
Copy link
Contributor

@dtenedor Thank you for this PR. Inline comments are mostly about naming and phrasing.

My high level comment is that we have one more category of errors and this is related to passing BINARY values to hll_union, hll_union_agg, and hll_sketch_estimate that are not valid representations of sketches. At this point we rely on the DataSketches library to throw an exception for these cases, and we let that exception surface. My take on this is that we should add one more error class for these (along the lines of an illegal argument exception/error). What do you think?

@dtenedor What about the above?

respond to code review comments

respond to code review comments
@dtenedor
Copy link
Contributor Author

@dtenedor Thank you for this PR. Inline comments are mostly about naming and phrasing.
My high level comment is that we have one more category of errors and this is related to passing BINARY values to hll_union, hll_union_agg, and hll_sketch_estimate that are not valid representations of sketches. At this point we rely on the DataSketches library to throw an exception for these cases, and we let that exception surface. My take on this is that we should add one more error class for these (along the lines of an illegal argument exception/error). What do you think?

Sorry for missing this before, I made this suggested update. It was a good idea, I think the errors are clearer now.

Copy link
Contributor Author

@dtenedor dtenedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mkaravel for the review! I implemented your suggestions, please take another look.

@dtenedor dtenedor requested a review from mkaravel June 22, 2023 18:28
Copy link
Contributor

@mkaravel mkaravel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
@dtenedor Thank you for putting the time to work on this and for addressing my comments!

I have a few nits regarding the function names we pass to some error classes (case needs to be lowercase and basically, the function names we pass need to match the pretty name of the function).

@dtenedor
Copy link
Contributor Author

Hi @MaxGekk can we trouble you to help us merge this please when you have a moment :)

@dtenedor
Copy link
Contributor Author

Note the only CI failure is spurious and unrelated to this PR:
image

@dtenedor dtenedor requested a review from MaxGekk June 28, 2023 22:58
Copy link
Contributor Author

@dtenedor dtenedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @MaxGekk, thanks for your careful review. I responded to your comments, please look again :)

@dtenedor dtenedor requested a review from MaxGekk June 29, 2023 21:19
@github-actions github-actions bot removed the CORE label Jun 29, 2023
Copy link
Member

@MaxGekk MaxGekk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the failure below is not related to PR's changes:

systemctl restart cri-docker: Process exited with status 1
stdout:


stderr:
Job for cri-docker.service failed because the control process exited with error code.
See "systemctl status cri-docker.service" and "journalctl -xe" for details.

* 
╭─────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                             │
│    * If the above advice does not help, please let us know:                                 │
│      https://github.com/kubernetes/minikube/issues/new/choose                               │
│                                                                                             │
│    * Please run `minikube logs --file=logs.txt` and attach logs.txt to the GitHub issue.    │
│                                                                                             │
╰─────────────────────────────────────────────────────────────────────────────────────────────╯
Error: Process completed with exit code 90.

@MaxGekk
Copy link
Member

MaxGekk commented Jun 30, 2023

+1, LGTM. Merging to master.
Thank you, @dtenedor and @mkaravel for review.

@MaxGekk MaxGekk closed this in ab67f46 Jun 30, 2023
@dtenedor
Copy link
Contributor Author

Thanks @MaxGekk for your reviews and merging!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants