feat(storage): update interpreter and storage support #9261

Merged 32 commits on Jan 16, 2023

Member

@zhyass zhyass commented Dec 15, 2022

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

mysql> create table t(a int, b int);
Query OK, 0 rows affected (0.01 sec)

mysql> insert into t values(2,2);
Query OK, 1 row affected (0.01 sec)

mysql> insert into t values(1,1);
Query OK, 1 row affected (0.02 sec)

mysql> update t set a=2 where b=1;
Query OK, 1 row affected (0.03 sec)

mysql> select * from t;
+------+------+
| a    | b    |
+------+------+
|    2 |    1 |
|    2 |    2 |
+------+------+
2 rows in set (0.01 sec)
Read 2 rows, 16.00 B in 0.007 sec., 272.95 rows/sec., 2.13 KiB/sec.

The pipeline flow is as follows:

+---------------+      +-----------------------+
|MutationSource1| ---> |SerializeDataTransform1|   ------
+---------------+      +-----------------------+         |      +-----------------+      +------------+
|     ...       | ---> |          ...          |   ...   | ---> |MutationTransform| ---> |MutationSink|
+---------------+      +-----------------------+         |      +-----------------+      +------------+
|MutationSourceN| ---> |SerializeDataTransformN|   ------
+---------------+      +-----------------------+

UpdateSource: checks whether a block needs to be updated, generates the new block, and writes the object. The process is similar to DeletionSource.
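A minimal Python sketch of the flow above, under stated assumptions: blocks are modeled as lists of row dicts, each inserted row is assumed to live in its own block, and "writing the object" is reduced to `json.dumps`. The names `mutation_source`, `serialize_block`, and `mutation_transform` are hypothetical stand-ins for the processors in the diagram, not Databend APIs. The sketch only illustrates that each source emits a new block when at least one row matches, and that a single downstream step gathers the results from all sources.

```python
import json
from typing import Any, Callable, Dict, List, Optional, Tuple

Row = Dict[str, Any]
Block = List[Row]


def mutation_source(
    block: Block,
    condition: Callable[[Row], bool],
    update: Callable[[Row], Row],
) -> Optional[Block]:
    """Return the rewritten block, or None when no row matches the condition."""
    if not any(condition(row) for row in block):
        return None  # block is untouched, so there is nothing to write
    return [update(row) if condition(row) else dict(row) for row in block]


def serialize_block(block: Block) -> str:
    """Stand-in for serializing and writing the new block as an object."""
    return json.dumps(block, sort_keys=True)


def mutation_transform(
    results: List[Tuple[int, Optional[str]]],
) -> Dict[str, List[int]]:
    """Fan-in step: gather the per-block outcomes into one mutation summary."""
    summary: Dict[str, List[int]] = {"replaced": [], "kept": []}
    for index, payload in results:
        summary["replaced" if payload is not None else "kept"].append(index)
    return summary


# Assumption: the two inserts from the summary produced two separate blocks.
blocks: List[Block] = [[{"a": 2, "b": 2}], [{"a": 1, "b": 1}]]
condition = lambda row: row["b"] == 1   # WHERE b = 1
update = lambda row: {**row, "a": 2}    # SET a = 2

results = []
for index, block in enumerate(blocks):
    new_block = mutation_source(block, condition, update)
    payload = serialize_block(new_block) if new_block is not None else None
    results.append((index, payload))

summary = mutation_transform(results)
print(summary)
```

In this model the sources are independent of each other, which is why they can run in parallel before the single fan-in step.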

UPDATE table SET column = expression WHERE condition

The column is replaced with the result of the following expression:

if(condition, CAST(expression, type), column)
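As a rough illustration of this rewrite (a sketch of the semantics only, not Databend's implementation), the snippet below applies `if(condition, CAST(expression, type), column)` row by row to the table from the summary. The function name `apply_update` and the representation of rows as dicts are assumptions made for the example, and `int` stands in for the cast to the column type.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def apply_update(
    rows: List[Row],
    column: str,
    expression: Callable[[Row], Any],
    condition: Callable[[Row], bool],
    cast: Callable[[Any], Any],
) -> List[Row]:
    """Return new rows where `column` = if(condition, cast(expression), column)."""
    updated = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not mutated
        if condition(row):
            new_row[column] = cast(expression(row))
        # otherwise the column keeps its current value
        updated.append(new_row)
    return updated


# Table t after the two inserts in the summary.
t = [{"a": 2, "b": 2}, {"a": 1, "b": 1}]

# UPDATE t SET a = 2 WHERE b = 1
result = apply_update(
    t,
    column="a",
    expression=lambda row: 2,
    condition=lambda row: row["b"] == 1,
    cast=int,
)
print(result)
```

Expressing the update as a single conditional expression per column means every row passes through the same projection, with non-matching rows simply keeping their old value.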

@zhyass zhyass marked this pull request as draft December 15, 2022 14:42
@zhyass zhyass mentioned this pull request Dec 15, 2022
@mergify mergify bot added the pr-feature this PR introduces a new feature to the codebase label Dec 15, 2022
@zhyass zhyass force-pushed the feature_update branch 2 times, most recently from 8bcdc4b to 7d9beaf Compare December 22, 2022 15:42
@zhyass zhyass marked this pull request as ready for review January 3, 2023 14:32
@sundy-li
Member

sundy-li commented Jan 3, 2023

This PR may conflict heavily with the new expression work; maybe we can wait a little until that is merged?

I think it's better to create a PR based on the expression branch.

@BohuTANG
Member

BohuTANG commented Jan 4, 2023

OK.
I've converted this PR to a draft.
cc @zhyass @dantengsky

@BohuTANG BohuTANG marked this pull request as draft January 4, 2023 00:03
@dantengsky
Member

dantengsky commented Jan 4, 2023

@zhyass DeletionSource and UpdateSource share some common code; it would be better if we could de-duplicate them. It is OK with me to do the refactoring in a dedicated PR.

@BohuTANG BohuTANG mentioned this pull request Jan 15, 2023
5 tasks
@BohuTANG
Member

👍

@dantengsky
Member

@zhyass just a reminder:

After PR #9460 is merged, I think the update process needs to:

  • adjust the use of "column identity"
    Not quite sure, but it seems bind_update and UpdateInterpreter::execute2 should be adjusted.
  • adjust the logic of conflict detection
    The simplest way I can come up with for now is to abort if the update runs in parallel with a schema evolution and the update is committed later. Clearly, though, the strategy could be better in some cases (e.g., when the fields being updated are not changed).
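
The two conflict-detection strategies mentioned above might be sketched as follows. This is only an illustration under assumptions: a schema is modeled as a plain mapping from column name to type name, and `can_commit_strict` and `can_commit_relaxed` are hypothetical names, not functions from the codebase. Whether the relaxed check is actually safe depends on how blocks and column identities are handled, which this sketch does not model.

```python
from typing import Dict, Iterable

Schema = Dict[str, str]  # column name -> type name (simplified model)


def can_commit_strict(base: Schema, latest: Schema) -> bool:
    """Simplest strategy: abort if the schema changed at all since the update began."""
    return base == latest


def can_commit_relaxed(base: Schema, latest: Schema, updated: Iterable[str]) -> bool:
    """Allow the commit when every updated column is unchanged in the latest schema."""
    return all(
        col in latest and latest[col] == base.get(col)
        for col in updated
    )


base = {"a": "Int32", "b": "Int32"}
latest = {"a": "Int32", "b": "Int32", "c": "String"}  # a column was added concurrently

strict_ok = can_commit_strict(base, latest)
relaxed_ok = can_commit_relaxed(base, latest, ["a"])
print(strict_ok, relaxed_ok)
```

Here the strict check rejects the commit because a column was added in parallel, while the relaxed check accepts it because the updated column `a` is the same in both schemas.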

@BohuTANG BohuTANG merged commit f014d56 into datafuselabs:main Jan 16, 2023
@BohuTANG
Member

@soyeric128 looking forward to adding the update feature to the documentation, thanks.

@BohuTANG BohuTANG mentioned this pull request Jan 21, 2023
9 tasks