Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZooKeeper connection refused #1

Closed
coke519 opened this issue Dec 12, 2016 · 19 comments
Closed

ZooKeeper connection refused #1

coke519 opened this issue Dec 12, 2016 · 19 comments
Assignees

Comments

@coke519
Copy link

coke519 commented Dec 12, 2016

I pulled down this repo and ran docker-compose -p docker up to spin up all of the services. They all seem to be working individually. However, when I issue a sql command to the Hive server through a JDBC driver it just hangs. The Hive server logs show the following error connecting to ZooKeeper:

2016-12-12T20:48:18,844 INFO  [HiveServer2-Background-Pool: Thread-42-SendThread(localhost:2181)]: zookeeper.ClientCnxn (:()) - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-12-12T20:48:18,844 WARN  [HiveServer2-Background-Pool: Thread-42-SendThread(localhost:2181)]: zookeeper.ClientCnxn (:()) - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)

Any suggestions?

Thanks

@mcapitanio mcapitanio self-assigned this Dec 13, 2016
@mcapitanio
Copy link
Collaborator

mcapitanio commented Dec 13, 2016

Hi @coke519, which JDBC driver are you using and which is the JDBC connection url?

I'll try to reproduce your error.

@mcapitanio
Copy link
Collaborator

Hi @coke519 , I have updated some updates to the master.
You can download the latest and rebuild docker or pulling the image directly from the Docker Hub.
I have tried connecting with DBeaver using:

url: jdbc:hive2://localhost:10000/postgres
schema: postgres

without problems.

@MattReidArnold
Copy link

MattReidArnold commented Dec 13, 2016

Hi @mcapitanio,

I was pairing with @coke519 yesterday on this. We are using the hive-jdbc 2.1.0 driver to connect. We are using the same url that you posted.

We have successfully connected from DBVisualizer and RubyMine using this driver. However, when we actually try to run a query like show databases;, we are seeing the above ZooKeeper connection error in the Hive server logs. We repeated this from a different workstation today and saw the same results.

Were you able to run queries and get results from DBeaver?

@coke519
Copy link
Author

coke519 commented Dec 15, 2016

Hi @mcapitanio, yes just like @MattReidArnold said, we can successfully connect, but it's when we try to run a query that we get a ZooKeeper connection.

@mcapitanio
Copy link
Collaborator

@coke519 @MattReidArnold sorry for the delay, I am a bit busy in these days, hoping to try in next days. Yesterday I tried but I had also problems to connect witb DbVisualizer, for other reasons so I gave up.

But could you try to make this update to the /hive/conf/hive-site.xml adding this fragment:

<property>
  <name>hive.zookeeper.quorum</name>
  <description>Zookeeper quorum used by Hive's Table Lock Manager</description>
  <value>zookeeper</value>
</property>`

and then rebuilding the Docker Image with:

docker-compose build --no-cache

or if this doesn't solve the problem trying adding this snippet:

<property>
  <name>hive.support.concurrency</name>
  <value>false</value>
  <description>Whether hive supports concurrency or not. A zookeeper instance must be up and running for the default hive lock manager to support read-write locks.</description>
</property>

and rebuilding the image again?

@MattReidArnold
Copy link

@mcapitanio

I have been unable to try your suggestion because I'm now having trouble getting docker-compose -p docker up to work. It is now getting stuck in a loop of retrying this over and over again:

hive_1       | 2016-12-20 19:08:45,556 DEBG fd 8 closed, stopped monitoring <POutputDispatcher at 140411050807240 for <Subprocess at 140411050984800 with name hcat in state RUNNING> (stdout)>
hive_1       | 2016-12-20 19:08:45,556 DEBG fd 10 closed, stopped monitoring <POutputDispatcher at 140411050333176 for <Subprocess at 140411050984800 with name hcat in state RUNNING> (stderr)>
hive_1       | 2016-12-20 19:08:45,556 INFO exited: hcat (exit status 1; not expected)
hive_1       | 2016-12-20 19:08:45,557 DEBG received SIGCLD indicating a child quit
hive_1       | 2016-12-20 19:08:45,558 INFO spawned: 'hcat' with pid 5802
hive_1       | 2016-12-20 19:08:45,576 DEBG 'hcat' stdout output:
hive_1       | Started metastore server init, testing if initialized correctly...
hive_1       |
hive_1       | 2016-12-20 19:08:46,579 INFO success: hcat entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
hive_1       | 2016-12-20 19:09:00,582 DEBG 'hcat' stdout output:
hive_1       | Metastore startup failed, see /opt/hive/logs/hcat.err

I'm a docker noob so I don't know how to troubleshoot this problem very well. I have completely uninstalled docker and tried with a fresh install of everything with the same result.

@mcapitanio
Copy link
Collaborator

@MattReidArnold I have committed the first change I proposed you, but I had no time to test it. To rebuild an image you can pull the changes from master and recreating the docker using:

docker-compose build --no-cache

and then restarting it again:

docker-compose -p docker up

Otherwise you can remove your existing image with:

docker rmi mcapitanio/hive

and pulling the new one from the Docker Hub:

docker pull mcapitanio/hive:2.1.1

@MattReidArnold
Copy link

MattReidArnold commented Dec 21, 2016

@mcapitanio It looks like latest and 2.1.1 resolve to the same image. However, I modified the docker compose file so that its using the tag you suggested.

#docker-compose.yml
...
  hive:
        build:
            context: .
            dockerfile: Dockerfile
        image: mcapitanio/hive:2.1.1
...

It is still getting stuck in the same loop i mentioned above.

hive_1       | 2016-12-21 16:31:53,616 DEBG 'hcat' stdout output:
hive_1       | Metastore startup failed, see /opt/hive/logs/hcat.err
hive_1       |
hive_1       | 2016-12-21 16:31:53,616 DEBG fd 16 closed, stopped monitoring <POutputDispatcher at 139689034144224 for <Subprocess at 139689034323296 with name hcat in state RUNNING> (stderr)>
hive_1       | 2016-12-21 16:31:53,616 DEBG fd 14 closed, stopped monitoring <POutputDispatcher at 139689034143936 for <Subprocess at 139689034323296 with name hcat in state RUNNING> (stdout)>
hive_1       | 2016-12-21 16:31:53,617 INFO exited: hcat (exit status 1; not expected)
hive_1       | 2016-12-21 16:31:53,617 DEBG received SIGCLD indicating a child quit
hive_1       | 2016-12-21 16:31:53,619 INFO spawned: 'hcat' with pid 481
hive_1       | 2016-12-21 16:31:53,637 DEBG 'hcat' stdout output:
hive_1       | Started metastore server init, testing if initialized correctly...
hive_1       |
hive_1       | 2016-12-21 16:31:54,640 INFO success: hcat entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
hive_1       | 2016-12-21 16:32:08,647 DEBG 'hcat' stdout output:
hive_1       | Metastore startup failed, see /opt/hive/logs/hcat.err
hive_1       |
hive_1       | 2016-12-21 16:32:08,647 DEBG fd 11 closed, stopped monitoring <POutputDispatcher at 139689034143936 for <Subprocess at 139689034323296 with name hcat in state RUNNING> (stdout)>
hive_1       | 2016-12-21 16:32:08,647 DEBG fd 15 closed, stopped monitoring <POutputDispatcher at 139689034144224 for <Subprocess at 139689034323296 with name hcat in state RUNNING> (stderr)>
hive_1       | 2016-12-21 16:32:08,647 INFO exited: hcat (exit status 1; not expected)
hive_1       | 2016-12-21 16:32:08,648 DEBG received SIGCLD indicating a child quit
hive_1       | 2016-12-21 16:32:08,649 INFO spawned: 'hcat' with pid 627
hive_1       | 2016-12-21 16:32:08,667 DEBG 'hcat' stdout output:
hive_1       | Started metastore server init, testing if initialized correctly...
hive_1       |
hive_1       | 2016-12-21 16:32:09,670 INFO success: hcat entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

@mcapitanio
Copy link
Collaborator

MattReidArnold after having started the hive docker try stopping with:

docker-compose down -v

This removes the container and the named volumes.

Then try restarting again

@MattReidArnold
Copy link

@mcapitanio I tried that and its still getting stuck in the same place.

@mcapitanio
Copy link
Collaborator

@MattReidArnold in the docher-compose.tml file there is rhe list of the exposed ports to the host.
Could you check that there is no port conflict with the port used when you start the Docker.

Coould you also che the logs that are located in Hive named volume, it should be under:

/var/lib/docker/volumes/docker_hive_logs/_data

@MattReidArnold
Copy link

I'm not seeing a docker directory in /var/lib. Is that the problem?

I ran docker inspect and it included this:

...
           {
                "Name": "docker_hive_logs",
                "Source": "/var/lib/docker/volumes/docker_hive_logs/_data",
                "Destination": "/opt/hive/logs",
                "Driver": "local",
                "Mode": "rw",
                "RW": true,
                "Propagation": "rprivate"
            },
...

@mcapitanio
Copy link
Collaborator

@MattReidArnold You should find that directory as root

@MattReidArnold
Copy link

MattReidArnold commented Dec 22, 2016

@mcapitanio Sorry, I'm a Docker noob. I'm running Docker for Mac and didn't realize I needed to look in the linux VM running under the hood. So this is what is showing up in the hive logs while the container is stuck in that look. Does this help?

2016-12-22T21:20:10,527 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:main(6618)) - Starting hive metastore on port 9083
2016-12-22T21:20:10,560 INFO  [main]: metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(582)) - 0: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
2016-12-22T21:20:10,608 INFO  [main]: metastore.ObjectStore (ObjectStore.java:initializeHelper(397)) - ObjectStore, initialize called
2016-12-22T21:20:11,537 INFO  [main]: metastore.ObjectStore (ObjectStore.java:getPMF(520)) - Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
2016-12-22T21:20:12,518 INFO  [main]: metastore.MetaStoreDirectSql (MetaStoreDirectSql.java:<init>(145)) - Using direct SQL, underlying DB is OTHER
2016-12-22T21:20:12,520 INFO  [main]: metastore.ObjectStore (ObjectStore.java:setConf(311)) - Initialized ObjectStore
2016-12-22T21:20:12,630 ERROR [main]: metastore.HiveMetaStore (HiveMetaStore.java:startMetaStore(6839)) - MetaException(message:Version information not found in metastore. )
        at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7753)
        at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7731)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
        at com.sun.proxy.$Proxy17.verifySchema(Unknown Source)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:565)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:626)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:416)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6484)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6479)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:6737)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:6664)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

2016-12-22T21:20:12,631 ERROR [main]: metastore.HiveMetaStore (HiveMetaStore.java:main(6669)) - Metastore Thrift Server threw an exception...
org.apache.hadoop.hive.metastore.api.MetaException: Version information not found in metastore.
        at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7753) ~[hive-exec-2.1.1.jar:2.1.1]
        at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7731) ~[hive-exec-2.1.1.jar:2.1.1]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_80]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_80]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_80]
        at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_80]
        at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101) ~[hive-exec-2.1.1.jar:2.1.1]
        at com.sun.proxy.$Proxy17.verifySchema(Unknown Source) ~[?:?]
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:565) ~[hive-exec-2.1.1.jar:2.1.1]
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:626) ~[hive-exec-2.1.1.jar:2.1.1]
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:416) ~[hive-exec-2.1.1.jar:2.1.1]
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78) ~[hive-exec-2.1.1.jar:2.1.1]
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84) ~[hive-exec-2.1.1.jar:2.1.1]
        at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6484) ~[hive-exec-2.1.1.jar:2.1.1]
        at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6479) ~[hive-exec-2.1.1.jar:2.1.1]
        at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:6737) ~[hive-exec-2.1.1.jar:2.1.1]
        at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:6664) [hive-exec-2.1.1.jar:2.1.1]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_80]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_80]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_80]
        at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_80]
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221) [hadoop-common-2.7.2.jar:?]
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136) [hadoop-common-2.7.2.jar:?]
2016-12-22T21:20:12,638 INFO  [Thread-4]: metastore.HiveMetaStore (HiveMetaStore.java:run(6634)) - Shutting down hive metastore.
2016-12-22T21:20:12,639 INFO  [Thread-3]: metastore.HiveMetaStore (HiveStringUtils.java:run(711)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down HiveMetaStore at 951f20b72e75/172.18.0.5

@mcapitanio
Copy link
Collaborator

@MattReidArnold It seems that the Hive Metastore in Postgres has not correctly created.
Could you check if you can find the follwing lines in the Docker logs?

hive_1       | 
hive_1       | 2016-12-23 20:29:18,210 DEBG 'bootstrap' stdout output:
hive_1       | Initialization script hive-schema-2.1.0.postgres.sql
hive_1       | 
postgres_1   | WARNING:  hash indexes are not WAL-logged and their use is discouraged
hive_1       | 2016-12-23 20:29:19,972 DEBG 'bootstrap' stdout output:
hive_1       | Initialization script completed
hive_1       | 
hive_1       | 2016-12-23 20:29:19,976 DEBG 'bootstrap' stdout output:
hive_1       | schemaTool completed

If not you probably are not running the last Docker image. So to be sure you should delete the image:

docker rmi mcapitanio/hive:latest

and rebuild from the master

 docker-compose build --no-cache

The try to start again, eventually changing the project name:

docker-compose -p docker2 up

@MattReidArnold
Copy link

@mcapitanio
Great news, it all works. I was using my work machine before having all of the issues. However, running the above commands on my home machine I am able to connect and query the hive server using DBVisualizer.

I will try it again on my work machine next week sometime and let you know if this fixes it there as well.

Thanks for all of you help. Sorry to be such a pain. Have a Merry Christmas!

@MattReidArnold
Copy link

MattReidArnold commented Dec 24, 2016

Also, I noticed that the port for the web url on the zookeeper container is not exposed. To access http://localhost:8099/ in a browser I had to add this.

# zookeeper.yml
...        
       ports:
       ...
            - "8099:8099"
       ...

@mcapitanio
Copy link
Collaborator

@MattReidArnold Great news then! Yes, I exposed that port on the ZooKeeper Image but not in this one, but it would be useful to expose that also in the Hive image.

Merry Christmas to you too!

@MattReidArnold
Copy link

@mcapitanio I verified that this also fixes it on my work machine. Thanks again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants