[BUG] Found duplicate data when using Iceberg #44753
Comments
same here #41267 (comment)
Hi, what's your StarRocks version?
Thanks! I will go study it.
The StarRocks version is 3.1.10
@mlbzssk Hi, how do you upsert your table? Could you share the reproduction steps? I can't reproduce the issue on 3.1.10.
CREATE TABLE flink_table_to_spark ( |
resolved |
Hi~
I am new to StarRocks.
I ran into a problem: I use an Iceberg v2 table and upsert into it with Spark. The primary key is (id, ts).
```sql
CREATE TABLE upsert_demo (
  id   bigint,
  addr string,
  ts   string
) PARTITIONED BY (ts)
TBLPROPERTIES (
  'current-schema'='{"type":"struct","schema-id":0,"fields":[{"id":1,"name":"id","required":false,"type":"long"},{"id":2,"name":"addr","required":false,"type":"string"},{"id":3,"name":"ts","required":false,"type":"string"}]}',
  'current-snapshot-id'='5208014445283775116',
  'current-snapshot-summary'='{"added-data-files":"1","deleted-data-files":"2","removed-equality-delete-files":"1","removed-delete-files":"1","added-records":"9","deleted-records":"15","added-files-size":"1049719","removed-files-size":"3148815","removed-equality-deletes":"3","changed-partition-count":"1","total-records":"9","total-files-size":"2101152","total-data-files":"1","total-delete-files":"2","total-position-deletes":"3","total-equality-deletes":"9"}',
  'current-snapshot-timestamp-ms'='1713261667810',
  'default-partition-spec'='{"spec-id":0,"fields":[{"name":"ts","transform":"identity","source-id":3,"field-id":1000}]}',
  'dlc.ao.data.govern.sorted.keys'='id',
  'smart-optimizer.inherit'='default',
  'snapshot-count'='3',
  'table_type'='ICEBERG',
  'uuid'='b4be3ec8-de16-4107-a11c-a8e69d1feec3',
  'write.distribution-mode'='hash',
  'write.merge.mode'='merge-on-read',
  'write.metadata.delete-after-commit.enabled'='true',
  'write.metadata.metrics.default'='full',
  'write.metadata.previous-versions-max'='100',
  'write.parquet.bloom-filter-enabled.column.id'='true',
  'write.update.mode'='merge-on-read',
  'write.upsert.enabled'='true',
  'format-version'='2'
)
```
I get 9 rows when I query the table with Spark.
[Screenshot: Spark query result showing 9 rows]
But I get 18 rows when I query the same table with StarRocks.
[Screenshot: StarRocks query result showing 18 rows]
Can anyone help me figure out where the problem lies?
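For context: the snapshot summary above reports total-records 9 and total-equality-deletes 9, which would match the 18 rows StarRocks returns if the equality delete files were not being applied at read time. The sketch below is illustrative Python, not StarRocks or Iceberg code, showing how a merge-on-read reader is expected to apply equality deletes (per the Iceberg v2 spec, an equality delete only removes rows from data files with a lower sequence number than the delete file). All row data is made up.

```python
# Illustrative sketch of Iceberg v2 merge-on-read equality-delete
# application. A reader that skips this step returns both the stale
# and the upserted copy of each key, doubling the row count.

def apply_equality_deletes(rows, deletes, key_cols=("id", "ts")):
    """Drop rows whose key matches an equality delete written at a
    higher sequence number than the row's data file."""
    kept = []
    for row in rows:
        deleted = any(
            all(row[c] == d[c] for c in key_cols) and row["seq"] < d["seq"]
            for d in deletes
        )
        if not deleted:
            kept.append(row)
    return kept

# After one upsert of key (1, "2024-04-16"): the stale copy (seq 1),
# the new copy (seq 2), and an equality delete written at seq 2.
rows = [
    {"id": 1, "addr": "old", "ts": "2024-04-16", "seq": 1},
    {"id": 1, "addr": "new", "ts": "2024-04-16", "seq": 2},
]
deletes = [{"id": 1, "ts": "2024-04-16", "seq": 2}]

print(len(rows))                                   # 2 rows if deletes are ignored
print(len(apply_equality_deletes(rows, deletes)))  # 1 row when deletes are applied
```

If the engine ignores the delete files entirely, every upserted key is returned twice, which is consistent with seeing 18 rows instead of 9.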