fix restore with partitions cause some query fail #8245
fe/fe-core/src/main/java/org/apache/doris/analysis/Analyzer.java
if (partition == null) {
    continue;
}
partition.setState(PartitionState.RESTORE);
You also need to modify the replayCheckAndPrepareMeta() method.
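To illustrate the reviewer's point, here is a minimal sketch of keeping the normal path and the replay path in sync. Everything in it is a hypothetical stand-in rather than the actual Doris code: `RestoreJobSketch`, the simplified `Partition` and `PartitionState` types, the helper `markPartitionsRestoring`, and the assumption that a `checkAndPrepareMeta()` counterpart exists next to `replayCheckAndPrepareMeta()`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; not the real Doris classes.
enum PartitionState { NORMAL, RESTORE }

final class Partition {
    private final String name;
    private PartitionState state = PartitionState.NORMAL;

    Partition(String name) { this.name = name; }

    String getName() { return name; }
    PartitionState getState() { return state; }
    void setState(PartitionState state) { this.state = state; }
}

final class RestoreJobSketch {
    private final List<String> restoredPartitionNames;

    RestoreJobSketch(List<String> restoredPartitionNames) {
        this.restoredPartitionNames = new ArrayList<>(restoredPartitionNames);
    }

    // Shared by both paths so the two cannot drift apart.
    private void markPartitionsRestoring(Map<String, Partition> tablePartitions) {
        for (String name : restoredPartitionNames) {
            Partition partition = tablePartitions.get(name);
            if (partition == null) {
                continue; // partition is not present in this table's metadata
            }
            partition.setState(PartitionState.RESTORE);
        }
    }

    // Path taken when the job runs for the first time (assumed name).
    void checkAndPrepareMeta(Map<String, Partition> tablePartitions) {
        markPartitionsRestoring(tablePartitions);
    }

    // Path taken when the same step is replayed from the log.
    void replayCheckAndPrepareMeta(Map<String, Partition> tablePartitions) {
        markPartitionsRestoring(tablePartitions);
    }
}
```

If only the first method set the state, metadata rebuilt through replay would show the partition as NORMAL while the original shows RESTORE, which is presumably the inconsistency the comment is guarding against.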
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
…partitions are restoring (#39595)

If a broker load or stream load task runs against a table that is restoring data, the load task fails with an exception. Exception info: "Table [xxx] is under restore" or "Table [xxx] is in restore process, can't load into it".

In most cases, however, a restore job affects only some partitions of the table, not all of them, so loads into the other partitions should still succeed. To achieve this, check the partition state before checking the OLAP table state.

PS: the restore status for partitions is set in this PR: #8245

## test case for this pr

### restore tbl's partition p202408

    $ RESTORE SNAPSHOT db.tbl_p202408_test FROM repo ON ( `tbl` PARTITION (p202408) ) PROPERTIES ( "backup_timestamp"="2024-08-22-20-32-37", "replication_num" = "1" );

### check restore job state

    $ SHOW RESTORE\G
    *************************** 1. row ***************************
    JobId: 21741
    Label: tbl_p202408_test
    Timestamp: 2024-08-22-20-32-37
    State: DOWNLOADING
    RestoreObjs: { "name": "tbl_p202408_test", "database": "db", "olap_table_list": [ { "name": "tbl", "partition_names": ["p202408"] } ]

### load to partition p202408, failed with exception

    $ curl --location-trusted -u root:"" \
    > -H "label:tbl_test_load_19" \
    > -H "timeout:300" \
    > -H "format: parquet" \
    > -T data_for_p202408.parquet \
    > -XPUT http://fe_ip:8030/api/db/tbl/_stream_load
    {
      "TxnId": 3042,
      "Label": "tbl_test_load_19",
      "Comment": "",
      "TwoPhaseCommit": "false",
      "Status": "Fail",
      "Message": "[ANALYSIS_ERROR]TStatus: errCode = 2, detailMessage = Table [zt_order_detail_v3], Partition [p202408] is in restore process. Can not load into it.etc.",
      "NumberTotalRows": 682,
      "NumberLoadedRows": 682,
      "NumberFilteredRows": 0,
      "NumberUnselectedRows": 0,
      "LoadBytes": 82025,
      "LoadTimeMs": 48,
      "BeginTxnTimeMs": 0,
      "StreamLoadPutTimeMs": 7,
      "ReadDataTimeMs": 0,
      "WriteDataTimeMs": 38,
      "CommitAndPublishTimeMs": 0
    }

### load to partition p202407, successfully

    $ curl --location-trusted -u root:"" \
    > -H "timeout:300" \
    > -H "format: json" \
    > -H "read_json_by_line:true" \
    > -T data_for_p202407.json \
    > -XPUT http://fe_ip:8030/api/db/tbl/_stream_load
    {
      "TxnId": 3043,
      "Label": "2f2dae38-a495-4c22-9492-419ea70b724e",
      "Comment": "",
      "TwoPhaseCommit": "false",
      "Status": "Success",
      "Message": "OK",
      "NumberTotalRows": 1,
      "NumberLoadedRows": 1,
      "NumberFilteredRows": 0,
      "NumberUnselectedRows": 0,
      "LoadBytes": 1128,
      "LoadTimeMs": 51,
      "BeginTxnTimeMs": 0,
      "StreamLoadPutTimeMs": 6,
      "ReadDataTimeMs": 0,
      "WriteDataTimeMs": 30,
      "CommitAndPublishTimeMs": 13
    }

Co-authored-by: shenshoucheng <shenshoucheng@jd.com>
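
The "check partition state first" ordering described in this commit message could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from that commit: `LoadCheckSketch`, `checkLoadable`, the nested enums, and the idea that the caller already knows the target partitions are all hypothetical. The fallback rule (apply the table-level check only when no partition carries the restore flag) is likewise an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and rules are assumptions, not the Doris implementation.
final class LoadCheckSketch {
    enum TableState { NORMAL, RESTORE }
    enum PartitionState { NORMAL, RESTORE }

    // Throws if the load must be rejected, returns normally otherwise.
    static void checkLoadable(String table,
                              TableState tableState,
                              Map<String, PartitionState> partitionStates,
                              List<String> targetPartitions) {
        // 1. Partition-level check first: reject only loads that touch a restoring partition.
        for (String partition : targetPartitions) {
            if (partitionStates.get(partition) == PartitionState.RESTORE) {
                throw new IllegalStateException("Table [" + table + "], Partition ["
                        + partition + "] is in restore process. Can not load into it.");
            }
        }
        // 2. Table-level check as a fallback (assumption): it applies only when
        //    no partition is flagged, i.e. the whole table is being restored.
        boolean anyPartitionRestoring = partitionStates.containsValue(PartitionState.RESTORE);
        if (!anyPartitionRestoring && tableState == TableState.RESTORE) {
            throw new IllegalStateException("Table [" + table + "] is under restore");
        }
    }

    public static void main(String[] args) {
        Map<String, PartitionState> states = new HashMap<>();
        states.put("p202407", PartitionState.NORMAL);
        states.put("p202408", PartitionState.RESTORE);

        // Load into an untouched partition passes even though the table is in RESTORE.
        checkLoadable("tbl", TableState.RESTORE, states, Arrays.asList("p202407"));

        try {
            checkLoadable("tbl", TableState.RESTORE, states, Arrays.asList("p202408"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this ordering, the two stream loads in the test case behave differently even though they target the same table: the one touching p202408 is rejected and the other goes through.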
…partitions are restoring (#39411)
Proposed changes
Issue Number: close #8244