vdk-impala: Add validation for queries that doesn't provide lineage info #1175

Merged
kostoww merged 2 commits into main from topic/kostoww/blacklist-impala-lineage-queries on Sep 6, 2022

Conversation

kostoww
Contributor

@kostoww kostoww commented Sep 6, 2022

Lineage information is gathered from the query profile, which requires one more request to the Impala server per query. The number of these requests can become very large, and many of them may return no lineage data at all.

What?

Previously, the only validation was for keep-alive connections. This commit adds a finer-grained check that also skips queries of DCL type and some DDL types (REFRESH, COMPUTE, EXPLAIN, etc.), which can be used frequently in data jobs and would otherwise put a heavy load on the Impala server.
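
A minimal sketch of the kind of check described above, for readers who have not opened the diff. It is not the code merged in this PR: the function names, the keyword list (only REFRESH, COMPUTE and EXPLAIN come from the description; GRANT and REVOKE are assumed as typical DCL statements), and the `select 1` keep-alive form are all hypothetical.

```python
import re

# Hypothetical keyword list. REFRESH, COMPUTE and EXPLAIN are named in the PR
# description; GRANT and REVOKE are assumed here as representative DCL statements.
NO_LINEAGE_KEYWORDS = frozenset({"refresh", "compute", "explain", "grant", "revoke"})

# Assumed shape of a keep-alive query; the real plugin may use a different one.
KEEPALIVE_QUERY = "select 1"

# Leading whitespace, "--" line comments and "/* */" block comments.
_LEADING_NOISE = re.compile(r"^(?:\s+|--[^\n]*(?:\n|$)|/\*.*?\*/)+", re.DOTALL)


def first_keyword(query: str) -> str:
    """Return the first SQL keyword of the query, lower-cased, or ""."""
    stripped = _LEADING_NOISE.sub("", query)
    match = re.match(r"[A-Za-z]+", stripped)
    return match.group(0).lower() if match else ""


def may_have_lineage(query: str) -> bool:
    """Decide whether it is worth requesting the query profile for lineage."""
    keyword = first_keyword(query)
    if not keyword or keyword in NO_LINEAGE_KEYWORDS:
        return False
    # Normalise whitespace and a trailing semicolon before the keep-alive check.
    stripped = _LEADING_NOISE.sub("", query)
    normalized = " ".join(stripped.lower().rstrip().rstrip(";").split())
    return normalized != KEEPALIVE_QUERY
```

A caller would run `may_have_lineage(query)` before issuing the extra profile request and skip the request when it returns `False`. Matching on the first keyword keeps the check cheap, at the cost of not inspecting the rest of the statement.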

How has this been tested?

Unit and integration tests, as well as runs with production-grade queries.

What type of change are you making?

Improvement

Signed-off-by: Plamen Kostov pkostov@vmware.com

@antoniivanov
Collaborator

Please fix the title of the PR

@kostoww kostoww changed the title from "# Why?" to "vdk-impala: Add validation for queries that doesn't provide lineage info" Sep 6, 2022
@kostoww kostoww merged commit be7a9e7 into main Sep 6, 2022
@kostoww kostoww deleted the topic/kostoww/blacklist-impala-lineage-queries branch September 6, 2022 14:01