Closed
Description
Describe the bug
We have a table with many replicas.
Because a replica of the base tablet may be deleted by tablet balancing after the schema change job has been created, creating the new tablet may fail.
To Reproduce
Steps to reproduce the behavior:
1. Create a schema change job:
ALTER TABLE test_db.test_table ADD COLUMN name varchar(100) comment 'xxx'
2. FE creates the schema change job:
2021-06-11 14:17:57,739 INFO (thrift-server-pool-150|359) [SchemaChangeHandler.createJob():1383] finished to create schema change job: 64494481
This step generates the partitionIndexTabletMap of SchemaChangeJobV2, so the locations of the new tablet replicas are fixed at this point.
3. The job waits for the table to become stable:
2021-06-11 14:18:12,816 INFO (schema change|25) [OlapTable.isStable():1391] table 23196651 is not stable because tablet 60422768 status is REDUNDANT. replicas: [[replicaId=60422770, BackendId=10003], [replicaId=60422771, BackendId=10006], [replicaId=62269543, BackendId=61958307, version=2], [replicaId=64494415, BackendId=61958301]]
Tablet 60422768 is REDUNDANT.
4. TabletScheduler removes the replica of tablet 60422768 on backend 10006 from FE meta:
2021-06-11 14:18:26,495 INFO (tablet scheduler|38) [TabletScheduler.deleteReplicaInternal():982] delete replica. tablet id: 60422768, backend id: 10006. reason: DECOMMISSION state, force: false
5. When backend 10006 sends its tablet report, the replica is deleted from the backend because it is no longer in FE meta:
2021-06-11 14:19:12,757 WARN (Thread-33|79) [ReportHandler.deleteFromBackend():677] failed add to meta. tablet[60422768], backend[10006]. errCode = 2, detailMessage = replica is enough[3-3]
2021-06-11 14:19:12,757 WARN (Thread-33|79) [ReportHandler.deleteFromBackend():690] delete tablet[60422768 - 118915135] from backend[10006] because not found in meta
6. The table becomes stable and the job starts creating tablets:
2021-06-11 14:20:12,947 INFO (schema change|25) [AlterJobV2.checkTableStable():209] table 23196651 is stable, start SCHEMA_CHANGE job {}
7. BE fails to create the new tablet because it cannot find base tablet 60422768:
W0611 14:20:46.569319 425891 tablet_manager.cpp:244] fail to create tablet(change schema), base tablet does not exist. new_tablet_id=64530888, new_schema_hash=1683434764, base_tablet_id=60422768, base_schema_hash=118915135
8. The schema change fails:
2021-06-11 14:20:46,628 WARN (schema change|25) [SchemaChangeJobV2.runPendingJob():309] failed to create replicas for job: 64494481, 10006: []
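The sequence above can be modelled with a small toy simulation. This is only a sketch of the race, not Doris code: `Cluster`, `SchemaChangeJob`, `create_job` and `run_pending_job` are hypothetical names, and it assumes the job plans one new replica on every backend that held the base tablet when the job was created.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Set


@dataclass
class Cluster:
    # tablet id -> backend ids holding a replica (toy stand-in for FE meta)
    replicas: Dict[int, Set[int]] = field(default_factory=dict)


@dataclass
class SchemaChangeJob:
    base_tablet_id: int
    # Backends chosen for the new tablet replicas, frozen at job creation
    # (assumption: one new replica per backend that had the base tablet).
    planned_backends: FrozenSet[int]


def create_job(cluster: Cluster, base_tablet_id: int) -> SchemaChangeJob:
    # Snapshot the replica locations; later meta changes are not reflected.
    return SchemaChangeJob(base_tablet_id, frozenset(cluster.replicas[base_tablet_id]))


def run_pending_job(cluster: Cluster, job: SchemaChangeJob) -> List[int]:
    """Return the backends where creating the new tablet would fail
    because the base tablet replica no longer exists there."""
    current = cluster.replicas[job.base_tablet_id]
    return sorted(b for b in job.planned_backends if b not in current)


# Replica layout taken from the isStable() log line in step 3.
cluster = Cluster({60422768: {10003, 10006, 61958307, 61958301}})
job = create_job(cluster, 60422768)              # step 2: locations are fixed
cluster.replicas[60422768].discard(10006)        # steps 4-5: redundant replica removed
failed_backends = run_pending_job(cluster, job)  # steps 6-8: creation fails
print(failed_backends)  # [10006]
```

In this model the job fails because `planned_backends` is never re-checked against the current replica set after the table becomes stable.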