Skip to content

[#2254] feat: support python client to create table#4824

Closed
Lanznx wants to merge 25 commits intoapache:mainfrom
Lanznx:feat/#2254
Closed

[#2254] feat: support python client to create table#4824
Lanznx wants to merge 25 commits intoapache:mainfrom
Lanznx:feat/#2254

Conversation

@Lanznx
Copy link
Contributor

@Lanznx Lanznx commented Aug 30, 2024

What changes were proposed in this pull request?

  • This PR aims to provide the Python client with the ability to send a create table API request.
  • Originally, the goal of [Subtask] Add Table APIs in Gravitino Python client #2254 was to implement the methods listTables, loadTable, alterTable, and dropTable.
  • However, to avoid a large codebase variation, I have minimized the scope of the issue by focusing on the createTable method.
  • Currently, the createTable method supports only the required arguments. Optional components such as partitions, distribution, sort_orders, and indexes are not implemented.

Why are the changes needed?

To satisfy the requirements of #2254.

Does this PR introduce any user-facing change?

Yes, it introduces the create_table API.

How was this patch tested?

  1. I added an integration test in test_relational_catalog.py, which requires MySQL to be running during the test.

@jerryshao
Copy link
Contributor

Hi @Lanznx thanks a lot for your contribution. I think the current PR is still super large for review. For this kind of large PR, I would suggest we have a design doc and break down into small subtasks, which will be easy to review and track. CC @noidname01 @xunliu to also take a look.

@xunliu xunliu requested a review from noidname01 August 30, 2024 23:04
@xunliu
Copy link
Member

xunliu commented Sep 1, 2024

@Lanznx Thank you for your contribustions.

Hi @Lanznx thanks a lot for your contribution. I think the current PR is still super large for review. For this kind of large PR, I would suggest we have a design doc and break down into small subtasks, which will be easy to review and track. CC @noidname01 @xunliu to also take a look.

There is no new design for the python client, it is just rewritten in the python programming language following the java client implementation. If we follow the initial java client way, one small PR step by step commit, then it may take us about the same time as the java client to finish the python client development, which I guess will be very long. And the java client keeps adding new features, then the python client keeps catching up.
So I would suggest that the python client should be completed as soon as possible with the same features as the java client.

@jerryshao
Copy link
Contributor

jerryshao commented Sep 1, 2024

@Lanznx Thank you for your contribustions.

Hi @Lanznx thanks a lot for your contribution. I think the current PR is still super large for review. For this kind of large PR, I would suggest we have a design doc and break down into small subtasks, which will be easy to review and track. CC @noidname01 @xunliu to also take a look.

There is no new design for the python client, it is just rewritten in the python programming language following the java client implementation. If we follow the initial java client way, one small PR step by step commit, then it may take us about the same time as the java client to finish the python client development, which I guess will be very long. And the java client keeps adding new features, then the python client keeps catching up. So I would suggest that the python client should be completed as soon as possible with the same features as the java client.

I don't agree that we should be in a hurry to get everything merged in one PR and catch up with the Java client ASAP.

  1. Though Python API is similar to Java, but there's still some things that need to be thought of, whether the design is Pythonic or not, how to express the Java implementation into Python smoothly, and whether we need a Python counterpart actually. Take a simple question about "error" or "exception", that's the difference between Python and Java, shall we follow the Java way, or we define it more Pythonic, that's not just a simple mimic of Java. Besides, the design doc also helps to figure out how much work we need to do.
  2. Remember last time we check in the Python fileset support without review. There are many follow-up works done by @noidname01 to make it relatively workable. I don't want this to happen again, we need to review each PR one by one to make sure it is OK before merging.
  3. We can never catch up with the Java API counterpart, because most of the engineers are not familiar with Python, they only implement the Java part and leave the Python part as a follow-up work, unless we have someone dedicated to the Python part. So even if we catch up for now, we will be delayed somehow in future.
  4. A deep question is that do we need to have all the APIs implemented in Python. For example, like Table API, what is the typical usage scenario in Python world, how would people use it in Python? These problems should be carefully weighed before we actually do so. So that's why we need a design doc.

Anyway, from my point, this Python table API work should have a design doc, a community discussion and vote, and a plan before the code works. That's also how the community works.

@xunliu
Copy link
Member

xunliu commented Dec 9, 2024

@Lanznx Thank you for your contributions.
The community guys help split this PR into EPC #5198
We can finish this issue one by one.
So, I will close this PR.
@Lanznx Thank you again!

@xunliu xunliu closed this Dec 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants