disttask: add finish flow for distribute framework #43676

Merged
merged 17 commits into from May 15, 2023

Conversation

@GMHDBJD GMHDBJD commented May 10, 2023

What problem does this PR solve?

Issue Number: close #43675

Problem Summary:

What is changed and how it works?

  • The scheduler updates subtask results via OnSubtaskFinished.
  • The dispatcher fetches the subtask results and passes them to the business handle via processFinishFlow.
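
The two changes above can be sketched roughly as follows. This is a minimal, in-memory illustration and not the PR's implementation: the `Subtask` and `store` types, their fields, and the callback-style `handle` parameter are hypothetical; only the names OnSubtaskFinished and processFinishFlow come from the description.

```go
package main

import "fmt"

// Subtask is a hypothetical, simplified stand-in for the framework's subtask record.
type Subtask struct {
	ID   int64
	Step int64
	Meta []byte // result written back when the subtask finishes
}

// store is a hypothetical in-memory stand-in for the subtask table.
type store struct {
	subtasks []*Subtask
}

// OnSubtaskFinished models the scheduler side: it records the result of a finished subtask.
func (s *store) OnSubtaskFinished(id int64, result []byte) error {
	for _, st := range s.subtasks {
		if st.ID == id {
			st.Meta = result
			return nil
		}
	}
	return fmt.Errorf("subtask %d not found", id)
}

// processFinishFlow models the dispatcher side: it collects the results of one
// step and hands them to the business handle.
func (s *store) processFinishFlow(step int64, handle func(prevMetas [][]byte) error) error {
	var metas [][]byte
	for _, st := range s.subtasks {
		if st.Step == step {
			metas = append(metas, st.Meta)
		}
	}
	return handle(metas)
}

func main() {
	s := &store{subtasks: []*Subtask{{ID: 1, Step: 1}, {ID: 2, Step: 1}}}
	for _, st := range s.subtasks {
		result := []byte(fmt.Sprintf("result-of-%d", st.ID))
		if err := s.OnSubtaskFinished(st.ID, result); err != nil {
			panic(err)
		}
	}
	err := s.processFinishFlow(1, func(prevMetas [][]byte) error {
		for _, m := range prevMetas {
			fmt.Println(string(m))
		}
		return nil
	})
	if err != nil {
		panic(err)
	}
}
```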

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@GMHDBJD GMHDBJD requested a review from a team as a code owner May 10, 2023 04:34

ti-chi-bot bot commented May 10, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • D3Hunter
  • zimulala

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot bot added release-note-none size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 10, 2023
@D3Hunter D3Hunter left a comment

15/19 done

// ProcessNormalFlow processes the normal flow.
// It receives the previous subtask metas to do some post-processing.
// returns the new subtask metas and whether the error is retryable.
ProcessNormalFlow(ctx context.Context, h TaskHandle, gTask *proto.Task, prevSubtaskMetas [][]byte) (subtaskMetas [][]byte, retryable bool, err error)
Contributor

Maybe just add a new API to this TaskFlowHandle to check whether the error is retryable.

Contributor Author

I prefer to do the retry in business logic, so this retryable variable is just a temporary solution.

Contributor

@dhysum dhysum May 15, 2023

Whether a step can be retried should be an attribute of each step, and the dispatcher or scheduler decides the follow-up action according to that attribute.

The user story could be:

  1. The framework provides the capability to set the attribute.
  2. Each task / job / business (or whatever you call it) has such attributes or definitions, which define how the framework triggers the steps and controls the flow.
  3. A task that needs to be paused on failure defines the value of the attribute.
  4. The framework takes actions according to the defined attributes.

An API may not be a good idea, because we may need another system or mechanism to call that API, which breaks cohesion.

disttask/framework/dispatcher/dispatcher.go (outdated, resolved)
disttask/framework/framework_test.go (outdated, resolved)
args := m.Called(ctx, subtask)
return args.Error(0)
if args.Error(1) != nil {
Contributor

is this file manually mocked?

Contributor Author

@GMHDBJD GMHDBJD May 11, 2023

Yes, I will change it to a generated mock in another PR.

Contributor

@D3Hunter D3Hunter left a comment

rest lgtm

if ver >= version145 {
	return
}
doReentrantDDL(s, "ALTER TABLE mysql.tidb_background_subtask ADD COLUMN `step` INT", infoschema.ErrColumnExists)
Contributor

This will add step as the last column of the table, but in row2SubTask we use Step: r.GetInt64(1).
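
The mismatch can be illustrated with a small sketch. The column layout and row values below are hypothetical (the real table schema is not shown in this conversation), and `columnIndex` is an illustrative helper rather than anything in the codebase. It shows why a positional read at index 1 returns a different column once step is appended at the end.

```go
package main

import "fmt"

// columnIndex returns the position of name in cols, or -1 if it is absent.
func columnIndex(cols []string, name string) int {
	for i, c := range cols {
		if c == name {
			return i
		}
	}
	return -1
}

func main() {
	// Hypothetical layout after ALTER TABLE ... ADD COLUMN appended step last.
	cols := []string{"id", "task_key", "state", "step"}
	row := []int64{7, 42, 1, 3}

	// A positional read at index 1 returns task_key, not step.
	fmt.Println("positional read, index 1:", row[1])

	// Resolving the position by column name returns the step value.
	if i := columnIndex(cols, "step"); i >= 0 {
		fmt.Println("lookup by name, index", i, ":", row[i])
	}
}
```

An alternative with the same effect is to list the columns explicitly in the SELECT statement, so that the position is fixed by the query rather than by the table definition.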

Contributor

@D3Hunter D3Hunter left a comment

rest lgtm

disttask/framework/dispatcher/dispatcher_test.go (outdated, resolved)
@ti-chi-bot ti-chi-bot bot added the status/LGT1 Indicates that a PR has LGTM 1. label May 11, 2023
for _, subtask := range prevSubtasks {
	prevSubtaskMetas = append(prevSubtaskMetas, subtask.Meta)
}
metas, retryable, err := handle.ProcessNormalFlow(d.ctx, d, gTask, prevSubtaskMetas)
if err != nil {
	logutil.BgLogger().Warn("gen dist-plan failed", zap.Error(err))
Contributor

Add retryable to this log?

for _, subtask := range prevSubtasks {
	prevSubtaskMetas = append(prevSubtaskMetas, subtask.Meta)
}
metas, retryable, err := handle.ProcessNormalFlow(d.ctx, d, gTask, prevSubtaskMetas)
Contributor

What does prevSubtasks do? I don't think it's in use at the moment. In addition, this quantity may be very large, is there any other way to deal with it? For example, put it in gTask to reshard the task.

Rest LGTM

Contributor Author

prevSubtaskMetas contains the results of the previous step.
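
As an illustration of how a business handle might consume those results, here is a minimal sketch. The JSON encoding, the `stepOneResult` and `stepTwoMeta` types, and the `nextStepMetas` function are all assumptions made for the example; the PR does not specify how metas are encoded or how a handle combines them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stepOneResult is a hypothetical result written by each subtask of the previous step.
type stepOneResult struct {
	Rows int64 `json:"rows"`
}

// stepTwoMeta is a hypothetical meta for the single subtask of the next step.
type stepTwoMeta struct {
	TotalRows int64 `json:"total_rows"`
}

// nextStepMetas decodes the previous step's results and builds the metas for
// the next step, in the spirit of the prevSubtaskMetas parameter.
func nextStepMetas(prevSubtaskMetas [][]byte) ([][]byte, error) {
	var total int64
	for _, raw := range prevSubtaskMetas {
		var r stepOneResult
		if err := json.Unmarshal(raw, &r); err != nil {
			return nil, fmt.Errorf("decode previous subtask meta: %w", err)
		}
		total += r.Rows
	}
	meta, err := json.Marshal(stepTwoMeta{TotalRows: total})
	if err != nil {
		return nil, err
	}
	return [][]byte{meta}, nil
}

func main() {
	prev := [][]byte{[]byte(`{"rows":2}`), []byte(`{"rows":3}`)}
	metas, err := nextStepMetas(prev)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(metas[0]))
}
```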

Contributor

@dhysum dhysum May 15, 2023

This is not a good idea.

If we need to record all the previous subtasks, we may need to record the whole subtask context, because the flow could fail on any subtask. The subtasks should not depend on each other; they should all depend on something else, such as a context or meta.

Refer to the Dependency Inversion Principle.

@GMHDBJD GMHDBJD changed the title disttask: add finish flow for distribute framework disttask: pass previous subtask result to next step in dispatcher May 15, 2023
@GMHDBJD GMHDBJD changed the title disttask: pass previous subtask result to next step in dispatcher disttask: add finish flow for distribute framework May 15, 2023
if stepIsFinished && len(errStr) == 0 && gTask.State == proto.TaskStateRunning {
	logutil.BgLogger().Info("detect task, this task finished a step",
		zap.Int64("taskID", gTask.ID), zap.String("state", gTask.State), zap.Int64("step", gTask.Step))
	if err := d.processFinishFlow(gTask); err != nil {
Contributor

I think we can call processFinishFlow directly, and only import step 2 or other special steps need to do this.

GMHDBJD commented May 15, 2023

/retest

GMHDBJD commented May 15, 2023

/retest

GMHDBJD commented May 15, 2023

/test build

Contributor

@zimulala zimulala left a comment

LGTM

@ti-chi-bot ti-chi-bot bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels May 15, 2023

GMHDBJD commented May 15, 2023

/merge

ti-chi-bot bot commented May 15, 2023

This pull request has been accepted and is ready to merge.

Commit hash: 76f8c68

@ti-chi-bot ti-chi-bot bot added the status/can-merge Indicates a PR has been approved by a committer. label May 15, 2023
@ti-chi-bot ti-chi-bot bot removed the status/can-merge Indicates a PR has been approved by a committer. label May 15, 2023

GMHDBJD commented May 15, 2023

/merge

ti-chi-bot bot commented May 15, 2023

This pull request has been accepted and is ready to merge.

Commit hash: b85b9a2

@ti-chi-bot ti-chi-bot bot added the status/can-merge Indicates a PR has been approved by a committer. label May 15, 2023
@ti-chi-bot ti-chi-bot bot merged commit a60cf97 into pingcap:master May 15, 2023
8 checks passed
Labels
release-note-none size/L Denotes a PR that changes 100-499 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Development

Successfully merging this pull request may close these issues.

disttask: add finish flow for distribute framwork
5 participants