Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

disttask: support detectTask in parallel and do some updates for dispatcher #42454

Merged
merged 8 commits into from Apr 3, 2023

Conversation

zimulala
Copy link
Contributor

@zimulala zimulala commented Mar 21, 2023

What problem does this PR solve?

Issue Number: close #42383, close #42369

Problem Summary:

What is changed and how it works?

  • detectTask in parallel

    • Every gTask using a goroutine when detecting a task state(call detectTask)
  • Using spool.Pool instead of tidbutil.WaitGroupWrapper to new a goroutine.

  • Handle the TODO of Consider get all unfinished tasks.

  • If updating task/subtasks to the storage fails, ensure that the gTask state is not changed in cache.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot
Copy link
Member

ti-chi-bot commented Mar 21, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • hawkingrei
  • wjhuang2016

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added do-not-merge/needs-triage-completed release-note-none size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 21, 2023
@zimulala
Copy link
Contributor Author

Unstable test: #42461

@zimulala
Copy link
Contributor Author

/test unit-test

@GMHDBJD
Copy link
Contributor

GMHDBJD commented Mar 22, 2023

/retest

// TODO: UpdateTask and AddNewTask in a txn.
prevState := gTask.State
defer func() {
if err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better way to handle the error is discard the task info in the memory and refetch it from storage because we do not know either the operation is succeed or not. Morever, the refetch operation still can return an error, so we need retry it until success.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, the reason for the error here is to update data to the storage layer error, the result is still to obtain information from the storage layer, feeling that the probability of failure is relatively large. It feels better to do a copy before this operation, and then do a copy after the operation is successful. But I'm only checking state so I'm just going to do it right now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider one scene the network request succeed but failed to response to client. If we revert the state in memory, this reverted state is wrong.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Mar 28, 2023
@hawkingrei
Copy link
Member

/test unit-test

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Mar 28, 2023
@hawkingrei
Copy link
Member

/test all

@zimulala zimulala force-pushed the zimuxia/dispatcher-parallel-1 branch from b741c93 to 4afe50c Compare April 2, 2023 16:54
@pingcap pingcap deleted a comment from ti-chi-bot Apr 2, 2023
@zimulala
Copy link
Contributor Author

zimulala commented Apr 2, 2023

/test check-dev2

@zimulala
Copy link
Contributor Author

zimulala commented Apr 3, 2023

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: 4afe50c

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Apr 3, 2023
@ti-chi-bot ti-chi-bot merged commit edd4f26 into pingcap:master Apr 3, 2023
8 checks passed
@zimulala zimulala deleted the zimuxia/dispatcher-parallel-1 branch April 3, 2023 00:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

detectTask in parallel for dispatcher DATA RACE in the dispatcher_test.TestSimple
6 participants