Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to configure the use of HiveCatalog? #197

Open
saimigo opened this issue May 30, 2023 · 3 comments
Open

How to configure the use of HiveCatalog? #197

saimigo opened this issue May 30, 2023 · 3 comments
Labels
dependencies Pull requests that update a dependency file question Further information is requested stale

Comments

@saimigo
Copy link

saimigo commented May 30, 2023

1、How to configure the use of HiveCatalog and What additional dependencies need to be added, please?
2、How to set table property "engine.hive.enabled=true" when creating tables?

error:

{"timestamp":"2023-05-30T03:14:17.889Z","sequence":100,"loggerClassName":"org.jboss.logging.Logger","loggerName":"io.quarkus.runtime.Application","level":"ERROR","message":"Failed to start application (with profile prod)","threadName":"main","threadId":1,"mdc":{},"ndc":"","hostName":"a85fccd0c6e5","processName":"io.debezium.server.Main","processId":1274,"exception":{"refId":1,"exceptionType":"java.lang.NoSuchMethodException","message":"Cannot find constructor for interface org.apache.iceberg.catalog.Catalog\n\tMissing org.apache.iceberg.hive.HiveCatalog [java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/UnknownDBException]","frames":[{"class":"org.apache.iceberg.common.DynConstructors$Builder","method":"buildChecked","line":227},{"class":"org.apache.iceberg.CatalogUtil","method":"loadCatalog","line":180},{"class":"org.apache.iceberg.CatalogUtil","method":"buildIcebergCatalog","line":234},{"class":"io.debezium.server.iceberg.IcebergChangeConsumer","method":"connect","line":128},{"class":"io.debezium.server.iceberg.IcebergChangeConsumer_Bean","method":"create"},{"class":"io.debezium.server.iceberg.IcebergChangeConsumer_Bean","method":"create"},{"class":"io.debezium.server.DebeziumServer","method":"start","line":119},{"class":"io.debezium.server.DebeziumServer_Bean","method":"create"},{"class":"io.debezium.server.DebeziumServer_Bean","method":"create"},{"class":"io.quarkus.arc.impl.AbstractSharedContext","method":"createInstanceHandle","line":111},{"class":"io.quarkus.arc.impl.AbstractSharedContext$1","method":"get","line":35},{"class":"io.quarkus.arc.impl.AbstractSharedContext$1","method":"get","line":32},{"class":"io.quarkus.arc.impl.LazyValue","method":"get","line":26},{"class":"io.quarkus.arc.impl.ComputingCache","method":"computeIfAbsent","line":69},{"class":"io.quarkus.arc.impl.AbstractSharedContext","method":"get","line":32},{"class":"io.quarkus.arc.impl.ClientProxies","method":"getApplicationScopedDelegate","line":18},{"class":"io.debezium.server.DebeziumServer_ClientProxy","method":"arc$delegate"},{"class":"io.debezium.server.DebeziumServer_ClientProxy","method":"arc_contextualInstance"},{"class":"io.debezium.server.DebeziumServer_Observer_Synthetic_d70cd75bf32ab6598217b9a64a8473d65e248c05","method":"notify"},{"class":"io.quarkus.arc.impl.EventImpl$Notifier","method":"notifyObservers","line":323},{"class":"io.quarkus.arc.impl.EventImpl$Notifier","method":"notify","line":305},{"class":"io.quarkus.arc.impl.EventImpl","method":"fire","line":73},{"class":"io.quarkus.arc.runtime.ArcRecorder","method":"fireLifecycleEvent","line":128},{"class":"io.quarkus.arc.runtime.ArcRecorder","method":"handleLifecycleEvents","line":97},{"class":"io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294","method":"deploy_0"},{"class":"io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294","method":"deploy"},{"class":"io.quarkus.runner.ApplicationImpl","method":"doStart"},{"class":"io.quarkus.runtime.Application","method":"start","line":101},{"class":"io.quarkus.runtime.ApplicationLifecycleManager","method":"run","line":103},{"class":"io.quarkus.runtime.Quarkus","method":"run","line":67},{"class":"io.quarkus.runtime.Quarkus","method":"run","line":41},{"class":"io.quarkus.runtime.Quarkus","method":"run","line":120},{"class":"io.debezium.server.Main","method":"main","line":15}]}}

@ismailsimsek
Copy link
Member

@saimigo hive catalog should be configured like below. also added hive-metastore dependency to the release could you try with new release?

debezium.sink.iceberg.type=hive
debezium.sink.iceberg.uri=thrift://hadoop101:9083
debezium.sink.iceberg.clients=5
debezium.sink.iceberg.warehouse=hdfs://example.com:8020/warehouse
debezium.sink.iceberg.hive.metastore.table.owner=xyz
debezium.sink.iceberg.hive.other.configs=xyz
### 
debezium.sink.iceberg.engine.hive.enabled=true
### if above one doesn't work please try following config
debezium.sink.iceberg.iceberg.engine.hive.enabled=true

@ismailsimsek ismailsimsek added question Further information is requested dependencies Pull requests that update a dependency file labels Jun 13, 2023
@soullinuxer
Copy link

soullinuxer commented Sep 6, 2023

Anyone tried this lately? I am not being successful trying to set up the debezium server to update metadata stored in standalone hive metastore.

I am using a config similar to the one you pasted above:
debezium.sink.iceberg.type=hive
debezium.sink.iceberg.uri=thrift://myserver:9083
debezium.sink.iceberg.clients=5
debezium.sink.iceberg.warehouse=s3a://mybucket/
debezium.sink.iceberg.table-namespace=debeziumevents
debezium.sink.iceberg.hive.metastore.table.owner=myowner
debezium.sink.iceberg.engine.hive.enabled=true
2023-09-06 08:43:02,744 DEBUG [org.apa.had.sec.UserGroupInformation] (pool-6-thread-1) Hadoop login
2023-09-06 08:43:02,745 DEBUG [org.apa.had.sec.UserGroupInformation] (pool-6-thread-1) hadoop login commit
2023-09-06 08:43:02,746 DEBUG [org.apa.had.sec.UserGroupInformation] (pool-6-thread-1) Using local user: UnixPrincipal: myuser
2023-09-06 08:43:02,746 DEBUG [org.apa.had.sec.UserGroupInformation] (pool-6-thread-1) Using user: "UnixPrincipal: myuser" with name: myuser
2023-09-06 08:43:02,747 DEBUG [org.apa.had.sec.UserGroupInformation] (pool-6-thread-1) User entry: "myuser"
2023-09-06 08:43:02,747 DEBUG [org.apa.had.sec.UserGroupInformation] (pool-6-thread-1) UGI loginUser: myuser (auth:SIMPLE)
2023-09-06 08:43:03,023 INFO  [org.apa.had.hiv.met.HiveMetaStoreClient] (pool-6-thread-1) Trying to connect to metastore with URI thrift://myserver:9083
2023-09-06 08:43:03,280 INFO  [org.apa.had.hiv.met.HiveMetaStoreClient] (pool-6-thread-1) Opened a connection to metastore, current connections: 1
2023-09-06 08:43:03,285 INFO  [org.apa.had.hiv.met.HiveMetaStoreClient] (pool-6-thread-1) Connected to metastore.
2023-09-06 08:43:03,285 INFO  [org.apa.had.hiv.met.RetryingMetaStoreClient] (pool-6-thread-1) RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.metastore.HiveMetaStoreClient ugi=myuser (auth:SIMPLE) retries=1 delay=1 lifetime=0
2023-09-06 08:43:04,100 WARN  [org.apa.had.hiv.met.RetryingMetaStoreClient] (pool-6-thread-1) MetaStoreClient lost connection. Attempting to reconnect (1 of 1) after 1s. getTable: org.apache.thrift.transport.TTransportException
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
	at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:380)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:230)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_req(ThriftHiveMetastore.java:2079)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table_req(ThriftHiveMetastore.java:2066)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1578)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1570)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:208)
	at com.sun.proxy.$Proxy15.getTable(Unknown Source)
	at org.apache.iceberg.hive.HiveTableOperations.lambda$doRefresh$0(HiveTableOperations.java:158)
	at <unknown class>.run(Unknown Source)
	at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:58)
	at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:51)
	at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:122)
	at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:158)
	at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:97)
	at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:80)
	at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:47)
	at io.debezium.server.iceberg.IcebergUtil.loadIcebergTable(IcebergUtil.java:123)
	at io.debezium.server.iceberg.IcebergChangeConsumer.loadIcebergTable(IcebergChangeConsumer.java:188)
	at io.debezium.server.iceberg.IcebergChangeConsumer.handleBatch(IcebergChangeConsumer.java:166)
	at io.debezium.embedded.ConvertingEngineBuilder.lambda$notifying$2(ConvertingEngineBuilder.java:101)
	at <unknown class>.handleBatch(Unknown Source)
	at io.debezium.embedded.EmbeddedEngine.run(EmbeddedEngine.java:912)
	at io.debezium.embedded.ConvertingEngineBuilder$2.run(ConvertingEngineBuilder.java:229)
	at io.debezium.server.DebeziumServer.lambda$start$1(DebeziumServer.java:170)
	at <unknown class>.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:866)

and finally:

2023-09-06 08:44:12,398 INFO  [org.apa.had.hiv.met.HiveMetaStoreClient] (pool-6-thread-1) Opened a connection to metastore, current connections: 1
2023-09-06 08:44:12,424 INFO  [org.apa.had.hiv.met.HiveMetaStoreClient] (pool-6-thread-1) Connected to metastore.
2023-09-06 08:44:12,866 INFO  [io.deb.emb.EmbeddedEngine] (pool-6-thread-1) Stopping the task and engine
2023-09-06 08:44:12,878 INFO  [org.apa.kaf.con.sto.FileOffsetBackingStore] (pool-6-thread-1) Stopped FileOffsetBackingStore
2023-09-06 08:44:12,880 ERROR [io.deb.ser.ConnectorLifecycle] (pool-6-thread-1) Connector completed: success = 'false', message = 'Stopping connector after error in the application's handler method: Failed to get table info from metastore debeziumevents.debeziumcdc_account', error = 'java.lang.RuntimeException: Failed to get table info from metastore debeziumevents.debeziumcdc_account': java.lang.RuntimeException: Failed to get table info from metastore myschema.mytable
	at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:171)
	at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:97)
	at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:80)
	at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:47)
	at io.debezium.server.iceberg.IcebergUtil.loadIcebergTable(IcebergUtil.java:123)
	at io.debezium.server.iceberg.IcebergChangeConsumer.loadIcebergTable(IcebergChangeConsumer.java:188)
	at io.debezium.server.iceberg.IcebergChangeConsumer.handleBatch(IcebergChangeConsumer.java:166)
	at io.debezium.embedded.ConvertingEngineBuilder.lambda$notifying$2(ConvertingEngineBuilder.java:101)
	at <unknown class>.handleBatch(Unknown Source)
	at io.debezium.embedded.EmbeddedEngine.run(EmbeddedEngine.java:912)
	at io.debezium.embedded.ConvertingEngineBuilder$2.run(ConvertingEngineBuilder.java:229)
	at io.debezium.server.DebeziumServer.lambda$start$1(DebeziumServer.java:170)
	at <unknown class>.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:866)
Caused by: org.apache.thrift.transport.TTransportException
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
	at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:380)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:230)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_req(ThriftHiveMetastore.java:2079)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table_req(ThriftHiveMetastore.java:2066)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1578)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1570)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:208)
	at com.sun.proxy.$Proxy15.getTable(Unknown Source)
	at org.apache.iceberg.hive.HiveTableOperations.lambda$doRefresh$0(HiveTableOperations.java:158)
	at <unknown class>.run(Unknown Source)
	at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:58)
	at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:51)
	at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:122)
	at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:158)
	... 15 more

Do you have maybe some advice?

Copy link

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

@github-actions github-actions bot added the stale label Oct 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file question Further information is requested stale
Projects
None yet
Development

No branches or pull requests

3 participants