Feature: Add storage support for Hive catalog #12407

Closed
PsiACE opened this issue Aug 10, 2023 · 6 comments · Fixed by #12469

Labels
C-feature Category: feature

Comments

@PsiACE
Member

PsiACE commented Aug 10, 2023

Background

Databend's Hive catalog currently has no storage backend configuration of its own and can only use the storage defined in Databend's configuration (the default catalog).

However, the data managed by a user's Hive deployment may live in a different storage backend, such as S3 or HDFS, than the one configured for the Databend instance. When the two differ, Databend cannot read the data from Hive.

Expected

Allow a CONNECTION clause to specify the corresponding storage backend when creating a Hive catalog, as with the Iceberg catalog.

CREATE CATALOG hive_ctl
TYPE=HIVE
HMS_ADDRESS='127.0.0.1:9083'
CONNECTION=(
    URL='s3://warehouse/'
    AWS_KEY_ID='admin'
    AWS_SECRET_KEY='password'
    ENDPOINT_URL='http://localhost:9000'
);
@PsiACE PsiACE added the C-feature Category: feature label Aug 10, 2023
@Xuanwo
Member

Xuanwo commented Aug 10, 2023

Nice catch, let me take this.

@Xuanwo Xuanwo self-assigned this Aug 10, 2023
@PsiACE
Member Author

PsiACE commented Aug 10, 2023

Another issue to note is that creating a catalog via SQL seems to be ineffective for Hive, even when using a release with Hive support.

@Xuanwo
Member

Xuanwo commented Aug 10, 2023

> Another issue to note is that creating a catalog via SQL seems to be ineffective for Hive, even when using a release with Hive support.

Are you talking about creating hive catalog but failed?

@PsiACE
Member Author

PsiACE commented Aug 10, 2023

> Are you talking about creating hive catalog but failed?

Creating the catalog with SQL returns an error, but a catalog specified in the configuration file still takes effect.

root@localhost:8000/> CREATE CATALOG ctl TYPE=HIVE CONNECTION=(HMS_ADDRESS='127.0.0.1:9083');
APIError: ResponseError with 2318: Hive catalog support is not enabled in your databend-query distribution.

[catalogs.hive]
type = "hive"
# hive metastore address, such as 127.0.0.1:9083
address = "127.0.0.1:9083"

@Xuanwo
Member

Xuanwo commented Aug 10, 2023

OK, that is unexpected behavior. Would you like to open a new issue for it?

@PsiACE
Member Author

PsiACE commented Aug 16, 2023

After the merge of #12469, the new usage is:

CREATE CATALOG hive_ctl
TYPE = HIVE
CONNECTION = (
    ADDRESS = '127.0.0.1:9083'
    URL = 's3://warehouse/'
    AWS_KEY_ID = 'admin'
    AWS_SECRET_KEY = 'password'
    ENDPOINT_URL = 'http://localhost:9000/'
)
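
For anyone generating this statement from a script, here is a minimal Python sketch that renders it in the shape shown above, with ADDRESS inside CONNECTION alongside the storage options. The helper names `sql_quote` and `render_create_hive_catalog` are hypothetical and not part of Databend or any client library. The sketch assumes string values can be escaped by doubling single quotes, and it does not validate the catalog name.

```python
def sql_quote(value: str) -> str:
    """Quote a value as a SQL string literal.

    Assumption: doubling embedded single quotes is an acceptable escape.
    """
    return "'" + value.replace("'", "''") + "'"


def render_create_hive_catalog(name: str, address: str, storage: dict) -> str:
    """Render CREATE CATALOG in the post-#12469 shape (hypothetical helper).

    The metastore ADDRESS is placed inside CONNECTION, ahead of the
    storage options. The catalog name is inserted as-is, unvalidated.
    """
    options = {"ADDRESS": address, **storage}
    body = "\n".join(
        f"    {key} = {sql_quote(val)}" for key, val in options.items()
    )
    return f"CREATE CATALOG {name}\nTYPE = HIVE\nCONNECTION = (\n{body}\n);"


statement = render_create_hive_catalog(
    "hive_ctl",
    "127.0.0.1:9083",
    {
        "URL": "s3://warehouse/",
        "AWS_KEY_ID": "admin",
        "AWS_SECRET_KEY": "password",
        "ENDPOINT_URL": "http://localhost:9000/",
    },
)
print(statement)
```

Keeping the address and the storage options in one mapping mirrors the merged syntax, where the metastore address is one more CONNECTION option and not a separate top-level clause as in the original proposal.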
