BIGTOP-3802: Fix Mpack Hive fail to start when kerberos enabled #1000

Merged
merged 2 commits into from Sep 16, 2022


timyuer
Contributor

@timyuer timyuer commented Sep 3, 2022

Description of PR

How was this patch tested?

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'BIGTOP-3638. Your PR title ...')?
  • Make sure that newly added files do not have any licensing issues. When in doubt refer to https://www.apache.org/licenses/

@timyuer
Contributor Author

timyuer commented Sep 3, 2022

[Screenshot: hive_status_20220903091150]

[Screenshot: hive_service_check]

@timyuer
Contributor Author

timyuer commented Sep 3, 2022

The Metastore in Hive 3.1.3 is not compatible with Hadoop 3.3.x when Kerberos is enabled.
Related to #1001

@guyuqi
Member

guyuqi commented Sep 14, 2022

@timyuer Thanks for working on it.

But after I combined #1001 and #1003 with this PR and rebuilt Hive, HiveServer2 failed to start:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/BGTP/1.0/services/HIVE/package/scripts/hive_server.py", line 143, in <module>
    HiveServer().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/BGTP/1.0/services/HIVE/package/scripts/hive_server.py", line 53, in start
    hive_service('hiveserver2', action = 'start', upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/stacks/BGTP/1.0/services/HIVE/package/scripts/hive_service.py", line 101, in hive_service
    wait_for_znode()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/decorator.py", line 62, in wrapper
    return function(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/stacks/BGTP/1.0/services/HIVE/package/scripts/hive_service.py", line 196, in wait_for_znode
    raise Fail(format("ZooKeeper node /{hive_server2_zookeeper_namespace} is not ready yet"))
resource_management.core.exceptions.Fail: ZooKeeper node /hiveserver2 is not ready yet
................
.........
.....
2022-09-14 08:49:51,350 - Will retry 5 time(s), caught exception: ZooKeeper node /hiveserver2 is not ready yet. Sleeping for 10 sec(s)
2022-09-14 08:50:01,361 - call['/usr/lib/zookeeper/bin/zkCli.sh -server ambari-server:2181,ambari-agent-01:2181,ambari-agent-02:2181 ls /hiveserver2 | grep 'serverUri=''] {}
2022-09-14 08:50:02,086 - call returned (1, '')
2022-09-14 08:50:02,087 - Will retry 4 time(s), caught exception: ZooKeeper node /hiveserver2 is not ready yet. Sleeping for 10 sec(s)
2022-09-14 08:50:12,098 - call['/usr/lib/zookeeper/bin/zkCli.sh -server ambari-server:2181,ambari-agent-01:2181,ambari-agent-02:2181 ls /hiveserver2 | grep 'serverUri=''] {}
2022-09-14 08:50:12,926 - call returned (1, '')
2022-09-14 08:50:12,927 - Will retry 3 time(s), caught exception: ZooKeeper node /hiveserver2 is not ready yet. Sleeping for 10 sec(s)
2022-09-14 08:50:22,938 - call['/usr/lib/zookeeper/bin/zkCli.sh -server ambari-server:2181,ambari-agent-01:2181,ambari-agent-02:2181 ls /hiveserver2 | grep 'serverUri=''] {}
2022-09-14 08:50:23,686 - call returned (1, '')
2022-09-14 08:50:23,687 - Will retry 2 time(s), caught exception: ZooKeeper node /hiveserver2 is not ready yet. Sleeping for 10 sec(s)
2022-09-14 08:50:33,688 - call['/usr/lib/zookeeper/bin/zkCli.sh -server ambari-server:2181,ambari-agent-01:2181,ambari-agent-02:2181 ls /hiveserver2 | grep 'serverUri=''] {}
2022-09-14 08:50:34,521 - call returned (1, '')
2022-09-14 08:50:34,522 - Will retry 1 time(s), caught exception: ZooKeeper node /hiveserver2 is not ready yet. Sleeping for 10 sec(s)
2022-09-14 08:50:44,529 - call['/usr/lib/zookeeper/bin/zkCli.sh -server ambari-server:2181,ambari-agent-01:2181,ambari-agent-02:2181 ls /hiveserver2 | grep 'serverUri=''] {}
2022-09-14 08:50:45,282 - call returned (1, '')

Hadoop: 3.3.4
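
For context, the retry pattern visible in the log above (run `zkCli.sh ... ls /hiveserver2`, look for `serverUri=`, sleep 10 seconds, retry 5 times) can be approximated with the sketch below. This is not the Ambari implementation: the function names `znode_has_server_uri` and `wait_for` are hypothetical, and the sketch assumes Python 3.7+ (for `subprocess.run(capture_output=...)`), whereas the agent scripts in the traceback may run on a different Python version.

```python
import subprocess
import time


def znode_has_server_uri(zk_cli, quorum, namespace):
    """Return True if `ls /<namespace>` lists a registered HiveServer2 instance.

    Hypothetical helper: it mirrors the `zkCli.sh ... ls /hiveserver2 | grep 'serverUri='`
    call from the log, but does the substring match in Python instead of grep.
    """
    result = subprocess.run(
        [zk_cli, "-server", quorum, "ls", "/" + namespace],
        capture_output=True,
        text=True,
    )
    return "serverUri=" in result.stdout


def wait_for(check, retries=5, sleep_seconds=10, sleep=time.sleep):
    """Call `check` once, then up to `retries` more times, sleeping between attempts."""
    for remaining in range(retries, -1, -1):
        if check():
            return True
        if remaining > 0:
            print("Will retry %d time(s). Sleeping for %d sec(s)" % (remaining, sleep_seconds))
            sleep(sleep_seconds)
    return False
```

With the values from the log, a call might look like `wait_for(lambda: znode_has_server_uri("/usr/lib/zookeeper/bin/zkCli.sh", "ambari-server:2181,ambari-agent-01:2181,ambari-agent-02:2181", "hiveserver2"))`. A `False` result corresponds to the "ZooKeeper node /hiveserver2 is not ready yet" failure: HiveServer2 never registered itself under the namespace, so the cause usually has to be looked for in the HiveServer2 log rather than in ZooKeeper.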

@guyuqi
Member

guyuqi commented Sep 14, 2022

I first deployed Hadoop, ZooKeeper, Hive, and Tez through the Ambari UI, and then enabled Kerberos:
[Screenshot: 1663146335406]

@kevinw66
Contributor

kevinw66 commented Sep 15, 2022

@guyuqi It works fine on my machine.

[Screenshot]

However, the MapReduce service check failed at first, then succeeded when I ran it separately after Kerberos was enabled.
[Screenshot]

[Screenshot]
[Screenshot]

@timyuer
Contributor Author

timyuer commented Sep 16, 2022

ZhiguoWu's deployment order was:

  1. Deploy Hadoop, ZooKeeper, and Tez through the Ambari UI.
  2. Deploy Hive through the Ambari UI.
  3. Enable Kerberos.

This order changes the size of the /etc/passwd file (the source file that the MapReduce service check uses), which in turn causes the MapReduce service check to fail.

This problem may be caused by a bug in resource_management in Ambari Server 2.7.5. I will find it and fix it.

All in all, I am confident that this Hive PR is correct.

Can you please try again, @guyuqi @kevinw66?
Thanks.

@guyuqi
Member

guyuqi commented Sep 16, 2022

May I ask which Hadoop version you deployed?

@timyuer
Contributor Author

timyuer commented Sep 16, 2022

May I ask which Hadoop version you deployed?

Hadoop 3.3.4, compiled by https://ci.bigtop.apache.org/job/Bigtop-trunk-snapshot/.

@guyuqi
Member

guyuqi commented Sep 16, 2022

@timyuer @kevinw66
Oops, I found that I had not updated my local package repo, so it failed to fetch the newly built Hive RPMs.
After switching to the latest Hadoop and Hive RPMs, everything is OK.
LGTM, +1.
Thank you guys for working on it. :-)
