Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding support to hive hook for high availability Hive installations #38651

Merged
merged 6 commits into from
Apr 4, 2024

Conversation

amoghrajesh
Copy link
Contributor

Right now there is a limitation where the HiveOperator incorrectly parses and constructs the beeline command if hive "high availability" url is given for host. I was able to work this around by adding this change to extra and changing the HiveHook code but a better fix would be to add a new field called HA, which when selected, parses the connection better in the hook itself.

I would like to propose adding this into the connection form for Hive CLI type in a better way. Recently some change was made to remove extra and integrate this into the form and simplify it: #37043.
Would propose adding similar fields for HA and if enabled, the beeline command construction would vary slightly.

MY HA URL looks somewhat like this btw: jdbc:hive2://host1:port1,host2:port2,host3:port3/default;principal=hive/principal@REALM;serviceDiscoveryMode=zooKeeper;ssl=true;zooKeeperNamespace=hiveserver2

Tested using a HA setting, beeline works as expected.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@amoghrajesh amoghrajesh requested a review from potiuk April 1, 2024 08:47
@amoghrajesh
Copy link
Contributor Author

This is how the form looks now:
image (13)

@romsharon98
Copy link
Collaborator

Looks good :)
Can u just add test with proxy_user in extra_json just to verify that it add ; correctly and to verify someone else not remove it.

@amoghrajesh
Copy link
Contributor Author

Looks good :) Can u just add test with proxy_user in extra_json just to verify that it add ; correctly and to verify someone else not remove it.

Good catch! Updated the test, can you check once?

@amoghrajesh amoghrajesh requested a review from eladkal April 2, 2024 04:18
@amoghrajesh
Copy link
Contributor Author

@potiuk @eladkal could you review this when you have some time? Or who would be the right person for hive related PRs?

Copy link
Member

@hussein-awala hussein-awala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some nits, LGTM

airflow/providers/apache/hive/hooks/hive.py Outdated Show resolved Hide resolved
airflow/providers/apache/hive/hooks/hive.py Outdated Show resolved Hide resolved
airflow/providers/apache/hive/hooks/hive.py Outdated Show resolved Hide resolved
@amoghrajesh
Copy link
Contributor Author

@hussein-awala I have addressed the nits as well, can you take a look if it looks OK now?

@amoghrajesh
Copy link
Contributor Author

@potiuk @eladkal @hussein-awala if all looks good shall we send this one in?

Copy link
Member

@hussein-awala hussein-awala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My previous comments were nits and not blockers, LGTM

@amoghrajesh
Copy link
Contributor Author

Merging this in favour of the reviews from Hussein and Elad.

@amoghrajesh amoghrajesh merged commit c2981fe into apache:main Apr 4, 2024
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants