Enhance Flint error handling with detailed exception messages #348

Draft: wants to merge 9 commits into base: main

@dai-chen (Collaborator) commented May 20, 2024

Description

This PR improves error handling in the FlintSpark class: exceptions thrown during transaction operations now include root cause information, which makes issues easier to diagnose and resolve.

Changes

  • Introduced the withTransaction method to centralize logging and exception handling across different operations within FlintSpark. This method standardizes transaction control and error reporting, reducing code redundancy and enhancing readability.
  • Reworked the error handling within the withTransaction method to include detailed root cause information in exceptions. When an operation fails, the catch block now enriches the thrown exceptions with clear, actionable error messages that explain why the operation failed.
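
The wrapping behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the object name `WithTransactionSketch`, the simplified `withTransaction` signature, and the body of `extractRootCause` are all assumptions made for the example. Only the method names and the message format are taken from this description.

```scala
import scala.annotation.tailrec

// Hypothetical, simplified sketch; the real FlintSpark.withTransaction
// signature and transaction handling are not shown in this PR description.
object WithTransactionSketch {

  // Walk the cause chain down to the innermost exception.
  @tailrec
  def rootCause(e: Throwable): Throwable = {
    val cause = e.getCause
    if (cause == null || (cause eq e)) e else rootCause(cause)
  }

  // Assumed behavior: use the root cause message, falling back to the
  // exception class name when the message is null.
  def extractRootCause(e: Throwable): String = {
    val root = rootCause(e)
    Option(root.getMessage).getOrElse(root.getClass.getName)
  }

  // Run the operation and, on failure, rethrow with the root cause message
  // appended while keeping the original exception as the cause.
  def withTransaction[T](operationName: String)(operation: => T): T = {
    try {
      operation
    } catch {
      case e: Exception =>
        val rootCauseMessage = extractRootCause(e)
        throw new IllegalStateException(
          s"Failed to execute index operation [$operationName] caused by: $rootCauseMessage",
          e)
    }
  }
}
```

Keeping the original exception as the cause of the new `IllegalStateException` means the full stack trace is still available in logs even though the user-facing message only carries the root cause text.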

Testing

Index state doesn't satisfy the precondition

Tested index state exception for #201:

# Before the changes
java.lang.IllegalStateException: Failed to vacuum Flint index

# After the changes
java.lang.IllegalStateException: Failed to execute index operation [Vacuum Flint index
flint_myglue_ds_tables_http_logs_skipping_index] caused by:
Index state [refreshing] doesn't satisfy precondition

OpenSearch exception

Tested OpenSearch exception for #252:

# Simulate by setting cluster to read only mode:
PUT _cluster/settings
{
   "persistent":{
      "cluster.blocks.read_only": true
   }
}

spark-sql> CREATE INDEX all ON myglue.ds_tables.http_logs (status);

# Before the changes
java.lang.IllegalStateException: Failed to create Flint index

# After the changes
java.lang.IllegalStateException: Failed to execute index operation [Create Flint index
flint_myglue_ds_tables_http_logs_all_index with ignoreIfExists false] caused by:
OpenSearch exception [type=cluster_block_exception, reason=blocked by:
[FORBIDDEN/6/cluster read-only (api)];]

Issues Resolved

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Chen Dai <daichen@amazon.com>
@dai-chen added the enhancement (New feature or request) and 0.5 labels May 20, 2024
@dai-chen dai-chen self-assigned this May 20, 2024
} catch {
  case e: Exception =>
    // Extract and add root cause message to final error message
    val rootCauseMessage = extractRootCause(e)

@dai-chen dai-chen May 22, 2024

We need to confirm whether any underlying error messages should be hidden from users for security reasons.

  1. If so, we will hide these messages.
  2. Alternatively, we could extract the root cause message only for known exceptions, such as those from OpenSearch or our own Flint exceptions (a custom exception class).
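
To make option 2 concrete, here is one possible shape for an allowlist check. Everything in it is an assumption for discussion: `FlintException` is a stand-in for a custom exception class that does not exist yet, and matching OpenSearch exceptions by the `org.opensearch.` package prefix is just one way to recognize them without a compile-time dependency.

```scala
import scala.annotation.tailrec

// Hypothetical sketch of option 2; none of these names exist in the PR.
object RootCauseAllowlistSketch {

  // Stand-in for a custom Flint exception class.
  class FlintException(message: String) extends RuntimeException(message)

  // Walk the cause chain down to the innermost exception.
  @tailrec
  def rootCause(e: Throwable): Throwable = {
    val cause = e.getCause
    if (cause == null || (cause eq e)) e else rootCause(cause)
  }

  // Assumption: an exception is "known" if it is a Flint exception or its
  // class lives under the org.opensearch package.
  def isKnown(e: Throwable): Boolean = e match {
    case _: FlintException => true
    case other => other.getClass.getName.startsWith("org.opensearch.")
  }

  // Return the root cause message only for known exceptions; None means the
  // caller should fall back to a generic message.
  def safeRootCauseMessage(e: Throwable): Option[String] = {
    val root = rootCause(e)
    if (isKnown(root)) Option(root.getMessage) else None
  }
}
```

With this shape, the catch block would append the root cause only when `safeRootCauseMessage` returns a value, so messages from unrecognized exception types never reach the user.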

@penghuo @vamsi-amazon
