no task is generated after sink factory is rebuilt #10091

Closed
CharlesCheung96 opened this issue Nov 14, 2023 · 2 comments · Fixed by #10132
Labels
affects-6.5 affects-7.1 affects-7.5 area/ticdc Issues or PRs related to TiCDC. severity/major This is a major bug. type/bug This is a bug.

Comments

CharlesCheung96 commented Nov 14, 2023

What did you do?

  1. Create a changefeed with a MySQL sink.
  2. Inject a sink error to trigger sink reconstruction.

What did you expect to see?

The changefeed returns to normal after some time.

What did you see instead?

The changefeed is stuck, and handleTasks is blocked:

goroutine 64349 [select, 83215 minutes]:
github.com/pingcap/tiflow/cdc/processor/sinkmanager.(*sinkWorker).handleTasks(0xc04e812360?, {0x42f9ae8, 0xc036bf9700}, 0xc03a5e0f60)
        github.com/pingcap/tiflow/cdc/processor/sinkmanager/table_sink_worker.go:76 +0x9f
github.com/pingcap/tiflow/cdc/processor/sinkmanager.(*SinkManager).startSinkWorkers.func1()
        github.com/pingcap/tiflow/cdc/processor/sinkmanager/manager.go:355 +0x30
golang.org/x/sync/errgroup.(*Group).Go.func1()
        golang.org/x/sync@v0.1.0/errgroup/errgroup.go:75 +0x64
created by golang.org/x/sync/errgroup.(*Group).Go
        golang.org/x/sync@v0.1.0/errgroup/errgroup.go:72 +0xa5

goroutine 64350 [select, 83215 minutes]:
github.com/pingcap/tiflow/cdc/processor/sinkmanager.(*sinkWorker).handleTasks(0xc04e8123c0?, {0x42f9ae8, 0xc036bf9700}, 0xc03a5e0f60)
        github.com/pingcap/tiflow/cdc/processor/sinkmanager/table_sink_worker.go:76 +0x9f
github.com/pingcap/tiflow/cdc/processor/sinkmanager.(*SinkManager).startSinkWorkers.func1()
        github.com/pingcap/tiflow/cdc/processor/sinkmanager/manager.go:355 +0x30
golang.org/x/sync/errgroup.(*Group).Go.func1()
        golang.org/x/sync@v0.1.0/errgroup/errgroup.go:75 +0x64
created by golang.org/x/sync/errgroup.(*Group).Go
        golang.org/x/sync@v0.1.0/errgroup/errgroup.go:72 +0xa5
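
The `[select, 83215 minutes]` state in the traces above is what a worker looks like when it waits on a task channel that never receives anything. The following is a minimal sketch of that pattern, not the tiflow implementation: `sinkTask`, `handleTasks`, and `runIdleWorker` here are hypothetical stand-ins, and a timeout is added only so the example terminates.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sinkTask is a hypothetical placeholder for the real task type.
type sinkTask struct{ tableID int64 }

// handleTasks consumes tasks until the context is cancelled.
// If no producer ever sends a task, the goroutine parks in select.
func handleTasks(ctx context.Context, tasks <-chan sinkTask) (int, error) {
	handled := 0
	for {
		select {
		case <-ctx.Done():
			return handled, ctx.Err()
		case <-tasks:
			handled++
		}
	}
}

// runIdleWorker starts a worker on a channel that nobody writes to and
// returns once the timeout expires.
func runIdleWorker(timeout time.Duration) (int, error) {
	tasks := make(chan sinkTask) // no sender: models "no task is generated"
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return handleTasks(ctx, tasks)
}

func main() {
	handled, err := runIdleWorker(50 * time.Millisecond)
	fmt.Println("tasks handled:", handled, "err:", err)
}
```

Without the timeout, the worker would stay in `select` indefinitely, which matches the goroutine dump: the worker itself is healthy, but nothing upstream is producing tasks for it.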

Versions of the cluster

Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

(paste TiDB cluster version here)

Upstream TiKV version (execute tikv-server --version):

(paste TiKV version here)

TiCDC version (execute cdc version): 485c8ff

(paste TiCDC version here)

CharlesCheung96 commented Nov 16, 2023

  1. Suppose there is a transaction txn1 on table gf.t1 whose startTs is 445677261719339010 and commitTs is 445677261719339011.
  2. SinkManager generates a sink task that includes txn1.
  3. TableSink handles this task. After txn1 is sent to the backend sink, nextLowerBoundPos is pushed to a position greater than 445677261719339011 (445677262531985408 in the log).
  4. syncPoint is enabled, and the next syncPoint is also 445677262531985408 in the log. Note that syncPoint and upperBound will not advance until checkpointTs reaches 445677262531985408.
  5. However, the backend sink encountered a duplicate entry error while executing txn1. The backend sink is then rebuilt, and checkpointTs will never advance until a new task is scheduled.
  6. Since lowerBound equals upperBound, no task is ever generated for this table.
  7. Eventually a deadlock appears: checkpointTs waits for a new task, task generation waits for upperBound to advance, and upperBound waits for checkpointTs.
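
The circular wait in the steps above can be reproduced with a toy model. This is only a sketch under stated assumptions: `tableState` and `step` are hypothetical names rather than tiflow types, the scheduling rules are reduced to the two conditions described in the steps, and the `checkpointTs` value is illustrative (any value below the syncPoint behaves the same).

```go
package main

import "fmt"

// tableState is a hypothetical, simplified view of one table's progress.
type tableState struct {
	lowerBound   uint64 // next position to fetch from (nextLowerBoundPos)
	upperBound   uint64 // capped by the pending syncPoint
	checkpointTs uint64 // advances only when a task flushes data
}

// step runs one scheduling round and reports whether anything changed.
func step(s *tableState, syncPoint, resolvedTs uint64) bool {
	progressed := false
	// upperBound may move past the syncPoint only after checkpointTs reaches it.
	if s.checkpointTs >= syncPoint && s.upperBound < resolvedTs {
		s.upperBound = resolvedTs
		progressed = true
	}
	// A task is generated only for a non-empty range [lowerBound, upperBound).
	if s.lowerBound < s.upperBound {
		s.checkpointTs = s.upperBound
		s.lowerBound = s.upperBound
		progressed = true
	}
	return progressed
}

func main() {
	const syncPoint = 445677262531985408
	// State after the sink rebuild: lowerBound was already pushed to the
	// syncPoint, but txn1 failed, so checkpointTs is still behind it.
	stuck := tableState{
		lowerBound:   syncPoint,
		upperBound:   syncPoint,
		checkpointTs: 445677261719339010, // illustrative value below syncPoint
	}
	for round := 0; round < 3; round++ {
		fmt.Println("round", round, "progressed:", step(&stuck, syncPoint, syncPoint+1000))
	}
}
```

In this model every round reports no progress, because neither condition can become true without the other firing first. If `lowerBound` still matched `checkpointTs` after the rebuild, the range would be non-empty and the first round would generate a task.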

fubinzh commented Nov 20, 2023

/severity major
