feat: Iceberg/create-catalog #9017

Merged 67 commits on Feb 21, 2023

ClSlaid
Contributor

@ClSlaid ClSlaid commented Nov 29, 2022

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

This PR implements:

  1. creating an Iceberg catalog from SQL.
  2. listing the databases under an Iceberg catalog (returns a database struct).
  3. listing the tables under an Iceberg database (returns a table struct).

Note:

  • moved catalog option validation from the interpreter to the binder
  • stashed some unrelated logic for reading Iceberg tables.
  • allow_insecure also affects the catalog!
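
To illustrate the first and third notes, here is a minimal sketch of what validating the CONNECTION options at bind time could look like. It is not the code from this PR: `validate_connection`, `OptionError`, the case-insensitive key lookup, and the rule that `fs` and `http` locations count as insecure are all assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical error type for this sketch; not Databend's real error type.
#[derive(Debug, PartialEq)]
enum OptionError {
    MissingUrl,
    MalformedUrl(String),
    InsecureScheme(String),
}

/// Check CREATE CATALOG connection options before any plan is built.
/// `allow_insecure` stands in for the server-level setting of the same name.
fn validate_connection(
    options: &BTreeMap<String, String>,
    allow_insecure: bool,
) -> Result<(), OptionError> {
    // Assumption: option keys are matched case-insensitively.
    let url = options
        .iter()
        .find(|(k, _)| k.eq_ignore_ascii_case("url"))
        .map(|(_, v)| v.as_str())
        .ok_or(OptionError::MissingUrl)?;

    // A usable location needs both a scheme and a path part.
    let (scheme, rest) = url
        .split_once("://")
        .ok_or_else(|| OptionError::MalformedUrl(url.to_string()))?;
    if scheme.is_empty() || rest.is_empty() {
        return Err(OptionError::MalformedUrl(url.to_string()));
    }

    // Assumption: local-filesystem and plain-HTTP locations are "insecure".
    let insecure = matches!(scheme.to_ascii_lowercase().as_str(), "fs" | "http");
    if insecure && !allow_insecure {
        return Err(OptionError::InsecureScheme(scheme.to_string()));
    }
    Ok(())
}

fn main() {
    let mut opts = BTreeMap::new();
    opts.insert("URL".to_string(), "s3://my_bucket/path/to/db".to_string());
    println!("{:?}", validate_connection(&opts, false));
}
```

Rejecting bad options in the binder means a malformed statement fails before an execution plan exists, which is presumably the motivation for the move.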

Example

-- create on aws s3
CREATE CATALOG iceberg_ctl
  TYPE=ICEBERG
  CONNECTION=( 
    URL="s3://my_bucket/path/to/db" 
    AWS_KEY_ID="<access-key>"
    AWS_SECRET_KEY="<secret_key>"
    SESSION_TOKEN="<session_token>"
  );

SHOW DATABASES IN iceberg_ctl;
-- +--------------------------+
-- | databases_in_iceberg_ctl |
-- +--------------------------+
-- | iceberg_db               |
-- +--------------------------+
-- 1 row in set
-- Time: 0.066s

SHOW TABLES IN iceberg_ctl.iceberg_db;
-- +----------------------+
-- | tables_in_iceberg_db |
-- +----------------------+
-- | iceberg_tbl          |
-- +----------------------+
-- 1 row in set
-- Time: 0.144s
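
The two SHOW statements above walk a catalog, then a database, then its tables. A rough in-memory sketch of that hierarchy follows. The `Catalog`, `Database`, and `Table` types and their methods are hypothetical stand-ins, not the structs or traits this PR adds, and the data is hard-coded to mirror the example output.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the "table struct" mentioned in the summary.
#[derive(Debug, Clone, PartialEq)]
struct Table {
    name: String,
}

/// Hypothetical stand-in for the "database struct"; tables are created from it.
struct Database {
    name: String,
    tables: Vec<Table>,
}

impl Database {
    /// Roughly what `SHOW TABLES IN <catalog>.<database>` would ask for.
    fn list_tables(&self) -> Vec<Table> {
        self.tables.clone()
    }
}

struct Catalog {
    name: String,
    databases: BTreeMap<String, Database>,
}

impl Catalog {
    /// Roughly what `SHOW DATABASES IN <catalog>` would ask for.
    fn list_databases(&self) -> Vec<&Database> {
        self.databases.values().collect()
    }

    fn get_database(&self, name: &str) -> Option<&Database> {
        self.databases.get(name)
    }
}

/// Build a catalog holding the names used in the SQL example.
fn sample_catalog() -> Catalog {
    let db = Database {
        name: "iceberg_db".to_string(),
        tables: vec![Table { name: "iceberg_tbl".to_string() }],
    };
    let mut databases = BTreeMap::new();
    databases.insert(db.name.clone(), db);
    Catalog { name: "iceberg_ctl".to_string(), databases }
}

fn main() {
    let catalog = sample_catalog();
    for db in catalog.list_databases() {
        for table in db.list_tables() {
            println!("{}.{}.{}", catalog.name, db.name, table.name);
        }
    }
}
```

In a real catalog the listing calls would read Iceberg metadata from object storage instead of an in-memory map.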

Future Possibility

Metadata deserialization is mostly done with a fork of iceberg-rs. Fully implementing Iceberg inside the Databend repository may add a large amount of code and slow down compilation.

It may be better to have a standalone databend-iceberg crate that works as a query engine and implements validation, reading, writing, and partitioning for Iceberg, and then plug it into Databend.

@mergify mergify bot added the pr-feature this PR introduces a new feature to the codebase label Nov 29, 2022
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
- introduce `flatten` catalog
- all tables now created from database

Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
@ClSlaid ClSlaid marked this pull request as ready for review December 5, 2022 10:17
Review thread on src/meta/app/src/schema/catalog.rs (resolved)
Review thread on src/query/service/src/catalogs/catalog_manager.rs (outdated, resolved)
Review thread on src/query/storages/iceberg/src/catalog.rs (outdated, resolved)
1. since iceberg does not introduce many dependencies, no conditional
   compilation
2. iceberg doesn't have a tenant abstraction, so use the `default` of `TableMeta` and `DatabaseMeta`

Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
@BohuTANG
Member

BohuTANG commented Dec 7, 2022

Is it possible to add a test for this PR?

@ClSlaid
Contributor Author

ClSlaid commented Dec 7, 2022

> Is it possible to add a test for this PR?

Working on it. I'm implementing new SQL statements for listing databases and tables under different catalogs.

Signed-off-by: 蔡略 <cailue@bupt.edu.cn>
1. no awk

Signed-off-by: 蔡略 <cailue@bupt.edu.cn>
1. sql error in sqllogictests
2. upload test data in cluster

Signed-off-by: 蔡略 <cailue@bupt.edu.cn>
Signed-off-by: 蔡略 <cailue@bupt.edu.cn>
@ClSlaid
Contributor Author

ClSlaid commented Feb 20, 2023

Strange failures in the SQL logic tests; I have been working on fixing them...

Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
This PR adds new syntax along with its implementation, resulting in
incompatibility with old versions. The test should be resumed by
follow-up PRs under this topic

Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
@Xuanwo Xuanwo marked this pull request as ready for review February 21, 2023 03:07
Member

@BohuTANG BohuTANG left a comment

👍


4 participants