Hive Metastore: Failed to get table info from metastore #215

Open
jbolesjc opened this issue Mar 18, 2024 · 8 comments
@jbolesjc

Hey all,

We are experiencing an issue where the sink connector cannot get the table info for our table in the Hive metastore. We do have iceberg.tables.auto-create-enabled set to true, but because auto-creation was failing I went and created the table in the Hive metastore manually anyway. So the table exists, but the sink connector is still unable to pull any info from it.

Is there some permissions issue between the local hive:4.0.0-beta-1 container and the local kafka-connect container?

The error we receive in the connector is the following: "Failed to get table info from metastore default.iceberg_events"

Screenshot 2024-03-18 at 4 44 22 PM

org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
	at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:610)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:330)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:232)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:201)
	at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:188)
	at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:237)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.RuntimeException: Failed to get table info from metastore default.iceberg_events
	at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:171)
	at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:97)
	at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:80)
	at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:47)
	at io.tabular.iceberg.connect.data.IcebergWriterFactory.createWriter(IcebergWriterFactory.java:54)
	at io.tabular.iceberg.connect.channel.Worker.lambda$writerForTable$8(Worker.java:242)
	at java.base/java.util.HashMap.computeIfAbsent(HashMap.java:1134)
	at io.tabular.iceberg.connect.channel.Worker.writerForTable(Worker.java:241)
	at io.tabular.iceberg.connect.channel.Worker.lambda$routeRecordStatically$5(Worker.java:197)
	at java.base/java.util.Arrays$ArrayList.forEach(Arrays.java:4390)
	at io.tabular.iceberg.connect.channel.Worker.routeRecordStatically(Worker.java:195)
	at io.tabular.iceberg.connect.channel.Worker.save(Worker.java:184)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
	at io.tabular.iceberg.connect.channel.Worker.save(Worker.java:171)
	at io.tabular.iceberg.connect.IcebergSinkTask.put(IcebergSinkTask.java:150)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:582)
	... 10 more
Caused by: org.apache.thrift.transport.TTransportException
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
	at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
	at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1514)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1500)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1346)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:169)
	at com.sun.proxy.$Proxy58.getTable(Unknown Source)
	at org.apache.iceberg.hive.HiveTableOperations.lambda$doRefresh$0(HiveTableOperations.java:158)
	at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:58)
	at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:51)
	at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:122)
	at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:158)
	... 25 more

@jbolesjc
Author

Wanted to add that we are using the v0.6.12 release of the connector's hive runtime.

@danielcweeks

It looks like that's an exception from the Hive metastore client (a Thrift exception). I would suspect that the issue is related to connectivity to the metastore (host not resolvable, or a port issue?).
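A quick way to test that hypothesis is to try a raw TCP connection from inside the kafka-connect container. This is only a sketch: it assumes Python 3 is available in that container, and the host/port values are placeholders you would substitute from your own metastore URI.

```python
import socket

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refused connections, and timeouts
        return False

# Hypothetical values -- substitute the host and port from your connector's
# metastore URI (e.g. "hive" and 10000 from thrift://hive:10000/).
print(can_connect("hive", 10000))
```

If this prints False, the problem is plain network reachability (DNS or port mapping between the containers) rather than anything in the connector itself.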

@k0bayash1maru

Is your Hive metastore set up for SSL/SASL with PLAIN client auth? Are you running with Minio?

  • Check your hive-site.xml and/or metastore-site.xml on your HMS to see what you have for SASL
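One way to do that check programmatically — a sketch, not an official tool: the helper below scans a hive-site.xml (metastore-site.xml works the same way) for property names that mention auth-related keywords. The keyword list is just a heuristic I chose for illustration.

```python
import xml.etree.ElementTree as ET

def auth_properties(hive_site_xml: str) -> dict:
    """Return properties from a hive-site.xml string whose names mention
    sasl, kerberos, or authentication."""
    root = ET.fromstring(hive_site_xml)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name", "")
        if any(k in name for k in ("sasl", "kerberos", "authentication")):
            props[name] = prop.findtext("value", "")
    return props

# Example: an HMS with SASL enabled would typically carry a property such as
# hive.metastore.sasl.enabled; an empty result suggests no SASL is configured.
sample = """<configuration>
  <property><name>hive.metastore.sasl.enabled</name><value>true</value></property>
  <property><name>hive.execution.engine</name><value>tez</value></property>
</configuration>"""
print(auth_properties(sample))  # → {'hive.metastore.sasl.enabled': 'true'}
```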

@jbolesjc
Author

It does not appear that hive-site.xml has any SASL configured currently.

<configuration>
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.tez.exec.inplace.progress</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/opt/hive/scratch_dir</value>
  </property>
  <property>
    <name>hive.user.install.directory</name>
    <value>/opt/hive/install_dir</value>
  </property>
  <property>
    <name>tez.runtime.optimize.local.fetch</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.exec.submit.local.task.via.child</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>local</value>
  </property>
  <property>
    <name>tez.local.mode</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.execution.engine</name>
    <value>tez</value>
  </property>
  <property>
    <name>metastore.warehouse.dir</name>
    <value>/opt/hive/data/warehouse</value>
  </property>
  <property>
    <name>metastore.metastore.event.db.notification.api.auth</name>
    <value>false</value>
  </property>
</configuration>

@jbolesjc
Author

I am able to ping the IP of the hive container from my kafka-connect container. However, I am not able to ping the URI I am passing to the connector config, thrift://hive:10000/.
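Note that ping resolves hostnames, not URIs, so pinging thrift://hive:10000/ will always fail regardless of connectivity; the host and port have to be extracted first. A small sketch of doing that with the standard library (the URI literal is the one from the comment above):

```python
from urllib.parse import urlparse

# ping does not understand URIs; pull out the host and port and probe those
# (e.g. with a TCP connection) instead.
uri = "thrift://hive:10000/"  # the URI passed to the connector config above
parsed = urlparse(uri)
print(parsed.hostname, parsed.port)  # → hive 10000

# Aside worth double-checking: 9083 is the Hive metastore's default Thrift
# port, while 10000 is HiveServer2's default. If the metastore in the hive
# container is listening on 9083, a URI pointing at port 10000 would produce
# exactly this kind of TTransportException.
```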

@jbolesjc
Author

I am not using Minio; I am using LocalStack. If I go ahead and set up PLAIN client auth, which connector config keys do I use to pass the username and password in order to connect?

@tabmatfournier
Contributor

tabmatfournier commented Mar 19, 2024

@jbolesjc
Author

As I don't see any out-of-the-box SASL setup in my HMS currently, I don't think adding a username and password to my kafka-connect config is the solution here.
