Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vdk-impala: Handle errors on refresh/invalidate metadata #2511

Merged
merged 2 commits into from
Aug 7, 2023

Conversation

doks5
Copy link
Contributor

@doks5 doks5 commented Aug 2, 2023

Currently, if a data job executes a query, which fails due to metadata issue, the ImpalaErrorHandler
logic tries to refresh or invalidate the metadata, and re-try the query. If the job has only read
access to the table, an authorization error is raised and the job failes with User Error. This should
not happen, as the refresh/invalidate operation is initiated by vdk, not the user.

This change updates the error handler logic by adding try/except blocks around refresh/invalidate
metadata operations, so that errors that arise from these operations do not cause job failures.

Testing Done: Added test

@antoniivanov
Copy link
Collaborator

antoniivanov commented Aug 2, 2023

I do not fully comprehend the problem and it clashes of my (independent) understanding of the overall problem.

Related problems I know about are

A)Impala is eventually consistent and sometimes raises AuthorizationException even if the data job user has been authorized recently. So re-trying with back-off makes sense while waiting for new authorization rules to sync

B)Impala recovery mechanism when querying a table and it detects that the table metadata is out of sync, tries to execute invalidate metadata on that table (to sync it) before retrying the original query. But if it doesn't have "write" permission to that table, invalidate metadata fails. No amount of re-trying would prevent that since the permission would not change. It still would not have "write" permission.

The code seems like a solution for A) but the description seems more like B) So I am confused.

Is there option C I am not seeing ?

@doks5
Copy link
Contributor Author

doks5 commented Aug 3, 2023

I do not fully comprehend the problem and it clashes of my (independent) understanding of the overall problem.

Related problems I know about are

A)Impala is eventually consistent and sometimes raises AuthorizationException even if the data job user has been authorized recently. So re-trying with back-off makes sense while waiting for new authorization rules to sync

B)Impala recovery mechanism when querying a table and it detects that the table metadata is out of sync, tries to execute invalidate metadata on that table (to sync it) before retrying the original query. But if it doesn't have "write" permission to that table, invalidate metadata fails. No amount of re-trying would prevent that since the permission would not change. It still would not have "write" permission.

The code seems like a solution for A) but the description seems more like B) So I am confused.

Is there option C I am not seeing ?

Hi, @antoniivanov

When an authorization error is raised by impala, the message gives indication what type of operation has been attempted.
The case that I'm trying to attempt at solving is when a data job, that has been granted only read access to a table, executes a SELECT * FROM some_table query, the query fails with metadata inconsistency error (maybe some other job which has write access has modified the table just before the "select" query was issued), and the error handling tries to refresh/invalidate the metadata. Because the job has only read access, the refresh/invalidate query would fail with authorization error and trigger a job failure. By just sleeping for a bit and re-trying the initial select query, we can avoid job failures if, in the meantime, the metadata has managed to sync across the impala coordinators.

@doks5 doks5 changed the title [vdk-plugins] vdk-impala: Handle authorization errors on refresh [vdk-plugins] vdk-impala: Handle errors on refresh/invalidate metadata Aug 3, 2023
@doks5 doks5 requested a review from antoniivanov August 3, 2023 15:04
@doks5 doks5 force-pushed the person/andonova/impala-auth-errors branch from b0f3a7c to 3247472 Compare August 3, 2023 19:01
Currently, if a data job executes a select query, which fails due to some metadata-related
error, the error handling logic executes refresh/invalidate metadata query to sync the
table metadata and retry the initial query. This works if the data job has write access
to the table. However, if the job has only read access, the refresh/invalidate metadata
query will fail with authorization error. This will lead to overall job execution failure,
which is not ideal, as vdk, not the job itself is trying to execute an illegal operation.

This change updates the error handling logic, so that when an authorization error on
refresh/invalidate metadata is raised, vdk will re-try the query with a backoff (if the
job does not have write access, we cannot know when the metadata for the table has been
updated, and the best we can do is sleep and re-try).

Testing Done: Added test

Signed-off-by: Andon Andonov <andonova@vmware.com>
Currently, if a data job executes a query, which fails due to metadata issue, the ImpalaErrorHandler
logic tries to refresh or invalidate the metadata, and re-try the query. If the job has only read
access to the table, an authorization error is raised and the job failes with User Error. This should
not happen, as the refresh/invalidate operation is initiated by vdk, not the user.

This change updates the error handler logic by adding try/except blocks around refresh/invalidate
metadata operations, so that errors that arise from these operations do not cause job failures.

Testing Done: Added test

Signed-off-by: Andon Andonov <andonova@vmware.com>
@doks5 doks5 force-pushed the person/andonova/impala-auth-errors branch from 3247472 to d963763 Compare August 3, 2023 19:09
@doks5
Copy link
Contributor Author

doks5 commented Aug 7, 2023

Hi, @antoniivanov

I've addressed the feedback and updated the comments and PR description.

@antoniivanov antoniivanov changed the title [vdk-plugins] vdk-impala: Handle errors on refresh/invalidate metadata vdk-impala: Handle errors on refresh/invalidate metadata Aug 7, 2023
@antoniivanov
Copy link
Collaborator

Looks good to me. Thanks.

I updated the title. The prefix is just the name of the plugin (vdk-impala) without [vdk-plugins]

@doks5 doks5 merged commit a4bce83 into main Aug 7, 2023
12 checks passed
@doks5 doks5 deleted the person/andonova/impala-auth-errors branch August 7, 2023 08:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants