
Make ddl operations for placement rule atomic #28159

Closed
Tracked by #18030
lcwangchao opened this issue Sep 17, 2021 · 12 comments · Fixed by #30590

@lcwangchao
Collaborator

lcwangchao commented Sep 17, 2021

Enhancement

Currently, the DDL operations for placement rules are not atomic. In the current implementation, if sending the request to PD fails, the meta operations are rolled back:

tidb/ddl/partition.go

Lines 1131 to 1135 in 680de92

err = infosync.PutRuleBundles(context.TODO(), bundles)
if err != nil {
job.State = model.JobStateCancelled
return ver, errors.Wrapf(err, "failed to notify PD the placement rules")
}

This works well most of the time. However, it is possible that PD has received and processed the request successfully but the response packet is lost in the network. In that case, the meta operations are rolled back, leaving the state inconsistent with PD.

Some ways to resolve the problem:

  • Roll back the PD state when the DDL job fails, even if we got a failed response. In this case, the PD rollback should be idempotent and should be retried infinitely if it keeps failing (see the sketch after this list).
  • Leave it to users. In this case, we should allow the user to retry the operation to make the state consistent manually. It would also be good to provide commands to check the consistency.
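
A rough Go sketch of the first option (the names below are placeholders, not the real infosync API): because the rollback payload is the complete previous bundle set, resending it is idempotent, so it can be retried with backoff until PD acknowledges it.

package placement

import (
	"context"
	"errors"
	"time"
)

// Bundle is a placeholder for the placement rule bundle type kept in PD.
type Bundle struct {
	ID    string
	Rules []string
}

// putRuleBundles is a hypothetical stand-in for the real PD request
// (infosync.PutRuleBundles in TiDB).
func putRuleBundles(ctx context.Context, bundles []*Bundle) error {
	return errors.New("not implemented in this sketch")
}

// rollbackPDBundles restores the previous bundles on PD when the DDL job
// fails. Because the payload is the complete previous state, resending it is
// idempotent, so it can be retried with backoff until PD acknowledges it.
func rollbackPDBundles(ctx context.Context, prev []*Bundle) error {
	backoff := time.Second
	for {
		if err := putRuleBundles(ctx, prev); err == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err() // give up only if the caller cancels
		case <-time.After(backoff):
		}
		if backoff < 30*time.Second {
			backoff *= 2
		}
	}
}
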
@xhebox
Contributor

xhebox commented Sep 17, 2021

You should add another option: retry sending requests to PD until they succeed. The API is not only atomic but also idempotent. That means we can retry the DDL job forever until it succeeds, i.e. blocking.

It can be achieved with a middle state, and that is what we wanted to do at the very beginning of the previous work plan. But we decided not to, because in the previous version there was no persistence on TiKV: the infoschema is merely a cache, so it cannot become inconsistent.

But since we now have persistence, this plan becomes useful again.

The reason I did not choose rollback: our DDL cannot be canceled in many cases anyway. We are not good at rollback; it is hard.

BTW, please add it to the tracking issue, maybe in the rule engine section.

@morgo
Contributor

morgo commented Sep 20, 2021

Yes, I agree. We should retry requests n times before giving up and rolling back.

@morgo
Contributor

morgo commented Sep 27, 2021

We discussed this today in our weekly meeting. For direct placement options, it is more straightforward because we can retry the placement option, except that we need to be careful that the retrying does not happen in the DDL queue (which would block other DDL jobs).

For altering a PLACEMENT POLICY, we need to consider that it could cascade to many options, which may fail or require retrying.

@AilinKid can you please share your suggestion?

@AilinKid
Contributor

AilinKid commented Sep 27, 2021

How about moving the PD sync logic out of the DDL txn? I mean, do the sync after the DDL txn is committed for sure.

This follows the same logic as what GC does now. For example, when dropping a table, after the DDL is already committed, the following action is to write the table ID into the gc_delete_range physical table, which is done in finishDDLJob.

For the case here, syncing PD is a kind of afterward action, which requires the DDL txn to be done for sure. When the DDL is done, the bundle can be found in the meta, which is already the source of truth; syncing PD is then just a matter of time. (This avoids retrying the whole DDL job, which is time-costly.)
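
A rough sketch of this idea (the function and type names below are placeholders, not the real TiDB code): the PD request is issued only after the meta txn has committed, so a lost response only delays convergence instead of contradicting a rolled-back meta.

package placement

import "context"

// Job and Bundle are placeholders for the DDL job and placement bundle types.
type Job struct{ ID int64 }
type Bundle struct{ ID string }

// syncBundlesToPD is a hypothetical stand-in for the PD request.
func syncBundlesToPD(ctx context.Context, bundles []*Bundle) error { return nil }

// finishDDLJob sketches the proposal: the bundles are pushed to PD only after
// the DDL meta txn has already committed, the same way gc_delete_range entries
// are written for DROP TABLE. If the push fails, the meta (already the source
// of truth) is untouched and the sync can simply be retried later.
func finishDDLJob(ctx context.Context, job *Job, bundles []*Bundle) error {
	// ... the meta txn for this job has been committed before we get here ...
	if len(bundles) == 0 {
		return nil
	}
	// On failure this is retried later; it never rolls back the committed meta.
	return syncBundlesToPD(ctx, bundles)
}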

@xhebox
Contributor

xhebox commented Sep 27, 2021

How about moving the PD sync logic out of the DDL txn? I mean, do the sync after the DDL txn is committed for sure.

This follows the same logic as what GC does now. For example, when dropping a table, after the DDL is already committed, the following action is to write the table ID into the gc_delete_range physical table, which is done in finishDDLJob.

For the case here, syncing PD is a kind of afterward action, which requires the DDL txn to be done for sure. When the DDL is done, the bundle can be found in the meta, which is already the source of truth; syncing PD is then just a matter of time. (This avoids retrying the whole DDL job, which is time-costly.)

I like the idea, but the order of DDL jobs still needs to be kept. Bundles do not come with a monotonic ID like key ranges, which means the jobs either cannot be executed in parallel or must take care of the ordering. Otherwise, drop policy for table -> create policy for table may be synchronized as create policy for table -> drop policy for table.

Serial execution does not make a difference, and taking care of the order may introduce new inconsistencies. BTW, TiFlash now works by pulling meta from TiDB regularly (one persistence).

@lcwangchao
Collaborator Author

Summary of some candidates:

  • Retry infinitely until success. This achieves eventual consistency. However, it will block other DDL operations if the PD request continually fails. Moreover, the PD request may fail for non-network reasons; for example, we leave some parameter checks to PD instead of checking them in the TiDB layer.

  • Retry some number of times and then roll back. @xhebox has commented on this approach: "The reason I did not choose rollback: our DDL cannot be canceled in many cases anyway. We are not good at rollback; it is hard." But I think it is not necessary to add a rollback stage for every DDL operation. The first stage of a DDL operation can always be rolled back because no changes have been committed yet. So we can choose the "last stage the DDL can roll back from" and do the PD requests there (see the sketch after this list). This approach still has a small probability of leaving the state inconsistent when there is severe network jitter (the request succeeds but the response packet is lost).

  • Sync with PD outside of DDL operations, retrying infinitely in a standalone process. In this way the DDL operation will never be blocked, but we have to consider ordering more carefully, and it makes the system more complex.
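
A rough Go sketch of the second candidate (placeholder names, not the real infosync API): the PD request is issued at the last stage that can still roll back and retried a bounded number of times; if it keeps failing, the caller cancels the job and rolls back the meta changes.

package placement

import (
	"context"
	"time"
)

// Bundle is a placeholder for the placement rule bundle type.
type Bundle struct{ ID string }

// putRuleBundles is a hypothetical stand-in for infosync.PutRuleBundles.
func putRuleBundles(ctx context.Context, bundles []*Bundle) error { return nil }

// putBundlesWithRetry issues the PD request at the last DDL stage that can
// still be rolled back, retrying a bounded number of times. If every attempt
// fails, the caller cancels the job and the meta changes made so far are
// rolled back (the second candidate above).
func putBundlesWithRetry(ctx context.Context, bundles []*Bundle, maxRetry int) error {
	var err error
	for i := 0; i < maxRetry; i++ {
		if err = putRuleBundles(ctx, bundles); err == nil {
			return nil
		}
		time.Sleep(time.Duration(i+1) * time.Second) // simple linear backoff
	}
	return err // caller sets the job state to cancelled and rolls back
}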

@AilinKid
Contributor

AilinKid commented Nov 13, 2021

Summary of some candidates:

  • Retry infinitely until success. This achieves eventual consistency. However, it will block other DDL operations if the PD request continually fails. Moreover, the PD request may fail for non-network reasons; for example, we leave some parameter checks to PD instead of checking them in the TiDB layer.
  • Retry some number of times and then roll back. @xhebox has commented on this approach: "The reason I did not choose rollback: our DDL cannot be canceled in many cases anyway. We are not good at rollback; it is hard." But I think it is not necessary to add a rollback stage for every DDL operation. The first stage of a DDL operation can always be rolled back because no changes have been committed yet. So we can choose the "last stage the DDL can roll back from" and do the PD requests there. This approach still has a small probability of leaving the state inconsistent when there is severe network jitter (the request succeeds but the response packet is lost).
  • Sync with PD outside of DDL operations, retrying infinitely in a standalone process. In this way the DDL operation will never be blocked, but we have to consider ordering more carefully, and it makes the system more complex.

My version of the third way is: do it in finishDDLJob, not in a standalone goroutine but in the main flow. The benefit is that it can be done after the KV meta txn is committed for sure (strictly in the same sequence as the DDL jobs are done).

Or we can make it async (in a standalone goroutine) and strictly control the syncing sequence to match the order in which the DDL jobs are done (another subtask queue?).

@xhebox
Contributor

xhebox commented Nov 14, 2021

My version of the third way is: do it in finishDDLJob, not in a standalone goroutine but in the main flow. The benefit is that it can be done after the KV meta txn is committed for sure (strictly in the same sequence as the DDL jobs are done).

Or we can make it async (in a standalone goroutine) and strictly control the syncing sequence to match the order in which the DDL jobs are done (another subtask queue?).

Where the third method is done is a trivial detail, but it will be complex to ensure order and consistency.

Moreover, the PD request may fail for non-network reasons; for example, we leave some parameter checks to PD instead of checking them in the TiDB layer.
This approach still has a small probability of leaving the state inconsistent when there is severe network jitter (the request succeeds but the response packet is lost).

We can tell whether it is a timeout or not. In case you lost the response, you can send the request again to get the correct response.
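
For example, a small sketch of such a check (a hypothetical helper, not an existing TiDB function): only timeouts and other network errors are treated as "the response may have been lost" and resent; a definitive answer from PD is not retried.

package placement

import (
	"context"
	"errors"
	"net"
)

// isRetryable sketches the distinction above: a timeout or other network
// error means the response may simply have been lost, so the idempotent PD
// request can be resent; a definitive answer from PD (e.g. a parameter
// validation error) should not be retried.
func isRetryable(err error) bool {
	if errors.Is(err, context.DeadlineExceeded) {
		return true
	}
	var netErr net.Error
	if errors.As(err, &netErr) {
		return true // connection reset, timeout, etc.
	}
	return false // PD answered; retrying will not change the result
}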

But I think it is not necessary to add a rollback stage for every DDL operation. The first stage of a DDL operation can always be rolled back because no changes have been committed yet.

The hard part is determining the state to roll back to.

@lcwangchao
Collaborator Author

We can tell whether it is a timeout or not. In case you lost the response, you can send the request again to get the correct response.

Yes, we can tell whether it is a network error or a request error. But what I mean is that if it is a request error, it is meaningless to retry because it will always fail. So we still need to roll back.

The hard part is determining the state to roll back to.

Actually, we do not need to add an extra rollback stage for placement. If a DDL operation does not already support rollback, we just need to send the PD request in the first stage.

@xhebox
Contributor

xhebox commented Nov 15, 2021

Yes, we can tell whether it is a network error or a request error. But what I mean is that if it is a request error, it is meaningless to retry because it will always fail. So we still need to roll back.

So you can just retry if it is a network error, but fail if it is, for example, an HTTP 200 response (PD actually answered).

Actually, we do not need to add an extra rollback stage for placement. If a DDL operation does not already support rollback, we just need to send the PD request in the first stage.

What if it does support rollback? What is the original state?

@lcwangchao
Collaborator Author

What if it does support rollback? What is the original state?
It's easier to handle a job that supports rollback: put the PD request operation in the last stage.

@xhebox
Contributor

xhebox commented Nov 15, 2021

What if it does support rollback? What is the original state?
It's easier to handle a job that supports rollback: put the PD request operation in the last stage.

Yeah, you could do so, if you can construct the old placement rule bundles.
