[Bug]: Routine leaks after long runs #12731

Closed
1 task done
nnsgmsone opened this issue Nov 14, 2023 · 10 comments
Labels: kind/bug (Something isn't working), resolved/v1.0.1, severity/s0 (Extreme impact: Cause the application to break down and seriously affect the use)

Comments

@nnsgmsone
Contributor

nnsgmsone commented Nov 14, 2023

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Environment

- Version or commit-id (e.g. v0.1.0 or 8b23a93): dd43a976f540229eefba47d5a8f83722db1b729c
- Hardware parameters:
- OS type:
- Others:
Stability-test machine

Actual Behavior

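After mo has been running on the stability-test machine for more than a day, the number of goroutines jumps from a few hundred to several thousand, and the leaked goroutines in turn leak a large amount of other resources. This shows up on the branch that contains the cache refactor, and it is currently a serious blocker for long-running operation.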

Expected Behavior

No response

Steps to Reproduce

No response

Additional information

No response

@nnsgmsone nnsgmsone added kind/bug Something isn't working needs-triage severity/s0 Extreme impact: Cause the application to break down and seriously affect the use labels Nov 14, 2023
@nnsgmsone nnsgmsone added this to the 1.1.0 milestone Nov 14, 2023
@nnsgmsone nnsgmsone self-assigned this Nov 14, 2023
@nnsgmsone
Contributor Author

Still locating the cause and working on it.

@nnsgmsone
Contributor Author

@gavinyue could you please fix the large number of leaked SQL goroutines? I have already explained the specific cause to you.

@nnsgmsone
Contributor Author

@gavinyue gavinyue self-assigned this Nov 22, 2023
@gavinyue
Contributor

#12899

This PR adds a timeout to the connection, which should fix the DB connection leak.

@xzxiong
Contributor

xzxiong commented Nov 23, 2023

@gavinyue

There is a clear goroutine leak here. (Thanks to @daviszhen for pointing it out.)

goroutine-full.txt contains thousands of goroutines like the following:

goroutine 2669528 [select, 6733 minutes]:
github.com/go-sql-driver/mysql.(*mysqlConn).startWatcher.func1()
    /go/pkg/mod/github.com/go-sql-driver/mysql@v1.7.1/connection.go:614 +0xaa
created by github.com/go-sql-driver/mysql.(*mysqlConn).startWatcher
    /go/pkg/mod/github.com/go-sql-driver/mysql@v1.7.1/connection.go:611 +0x10a

This looks like a usage problem:
dbConn is itself a connection pool, so it makes more sense to manage it globally in one place.
[image]
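For anyone who wants to get a count like the one above from a live process rather than by reading a dump by hand, here is a minimal sketch that uses only the Go standard library. `leakyWatcher`, `spawn`, and `countGoroutines` are hypothetical names made up for this example; they are not MatrixOne or driver code. Against a real process you would search for `startWatcher` instead of `leakyWatcher`.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"strings"
	"sync"
)

// leakyWatcher stands in for a goroutine that blocks until its channel is
// closed (hypothetical; not the driver's code).
func leakyWatcher(started *sync.WaitGroup, closech <-chan struct{}) {
	started.Done()
	<-closech
}

// spawn starts n leakyWatcher goroutines and waits until all are running.
func spawn(n int, closech <-chan struct{}) {
	var started sync.WaitGroup
	started.Add(n)
	for i := 0; i < n; i++ {
		go leakyWatcher(&started, closech)
	}
	started.Wait()
}

// countGoroutines dumps every goroutine stack and counts the goroutines
// whose stack mentions funcName.
func countGoroutines(funcName string) (int, error) {
	var buf bytes.Buffer
	// debug=2 writes one blank-line-separated block per goroutine.
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
		return 0, err
	}
	n := 0
	for _, block := range strings.Split(buf.String(), "\n\n") {
		if strings.Contains(block, funcName) {
			n++
		}
	}
	return n, nil
}

func main() {
	closech := make(chan struct{})
	spawn(5, closech)
	n, err := countGoroutines("leakyWatcher")
	if err != nil {
		panic(err)
	}
	fmt.Println("goroutines blocked in leakyWatcher:", n)
	close(closech) // closing the channel lets every watcher return
}
```

Logging this count periodically would show whether the number keeps growing over a long run.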

@daviszhen
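Contributor

[image]

Ways to shut down the watcher:
1. Call the Close method.
[image]

2. Use the context-aware functions: PingContext, ExecContext, QueryContext, PrepareContext, and so on.

When the ctx is canceled, for whatever reason, the watcher exits as well.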
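The two exit routes above can be sketched with a toy model. This is a simplified illustration, not the driver's actual code: the `conn` type, its fields, and the helpers `exitsAfterClose` and `exitsAfterCancel` are all hypothetical, and the toy watcher serves a single call instead of looping.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// conn is a toy connection that owns one watcher goroutine.
type conn struct {
	watcher   chan context.Context // carries the ctx of an in-flight call
	closech   chan struct{}        // closed once when the connection closes
	exited    chan struct{}        // closed when the watcher goroutine returns
	closeOnce sync.Once
}

func newConn() *conn {
	c := &conn{
		watcher: make(chan context.Context, 1),
		closech: make(chan struct{}),
		exited:  make(chan struct{}),
	}
	go c.watch()
	return c
}

// watch blocks until the connection is closed or a registered ctx is canceled.
func (c *conn) watch() {
	defer close(c.exited)
	var ctx context.Context
	select {
	case ctx = <-c.watcher: // a context-aware call started
	case <-c.closech: // route 1: Close was called on an idle connection
		return
	}
	select {
	case <-ctx.Done(): // route 2: the call's ctx was canceled
		c.Close()
	case <-c.closech:
	}
}

// Close is safe to call more than once.
func (c *conn) Close() {
	c.closeOnce.Do(func() { close(c.closech) })
}

// QueryContext models a slow query: it registers ctx with the watcher and
// then blocks until the connection is torn down.
func (c *conn) QueryContext(ctx context.Context) error {
	c.watcher <- ctx
	<-c.closech
	return ctx.Err()
}

// exitsAfterClose returns once the watcher has exited because of Close.
func exitsAfterClose() {
	c := newConn()
	c.Close()
	<-c.exited
}

// exitsAfterCancel returns the query error once the watcher has exited
// because the ctx was canceled.
func exitsAfterCancel() error {
	c := newConn()
	ctx, cancel := context.WithCancel(context.Background())
	errc := make(chan error, 1)
	go func() { errc <- c.QueryContext(ctx) }()
	cancel()
	<-c.exited
	return <-errc
}

func main() {
	exitsAfterClose()
	fmt.Println("watcher exited after Close")
	fmt.Println("watcher exited after cancel:", exitsAfterCancel())
}
```

If neither route is taken, the toy watcher stays parked in its first `select` forever, which matches the `[select, 6733 minutes]` state in the dump.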

@gavinyue
Contributor

Working on it

@xzxiong
Contributor

xzxiong commented Nov 30, 2023

@gavinyue

There is a clear goroutine leak here. (Thanks to @daviszhen for pointing it out.)

goroutine-full.txt contains thousands of goroutines like the following:

goroutine 2669528 [select, 6733 minutes]:
github.com/go-sql-driver/mysql.(*mysqlConn).startWatcher.func1()
    /go/pkg/mod/github.com/go-sql-driver/mysql@v1.7.1/connection.go:614 +0xaa
created by github.com/go-sql-driver/mysql.(*mysqlConn).startWatcher
    /go/pkg/mod/github.com/go-sql-driver/mysql@v1.7.1/connection.go:611 +0x10a

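This looks like a usage problem:
dbConn is itself a connection pool, so it makes more sense to manage it globally in one place.
...

It is already managed globally.
So the problem is still that Close is not being called properly, which leaves the connections unreleased.

[image: WeCom screenshot]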
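To illustrate why a missing Close matters even with a single global pool, here is a rough sketch built on `database/sql` with a stub driver, so it runs without a MySQL server. The thread does not say which Close call was missing; this example assumes connections checked out with `db.Conn` and never returned, which is only one possible shape of the bug. `stubDriver`, `stubConn`, `openConns`, and `openAfter` are hypothetical names.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"sync/atomic"
)

// openConns counts physical connections that are open and not yet closed.
var openConns int64

var errUnsupported = errors.New("stub: not supported")

// stubDriver is a stand-in for a real driver (assumption for this sketch).
type stubDriver struct{}

func (stubDriver) Open(dsn string) (driver.Conn, error) {
	atomic.AddInt64(&openConns, 1)
	return &stubConn{}, nil
}

type stubConn struct{}

func (*stubConn) Prepare(query string) (driver.Stmt, error) { return nil, errUnsupported }
func (*stubConn) Begin() (driver.Tx, error)                 { return nil, errUnsupported }
func (*stubConn) Close() error {
	atomic.AddInt64(&openConns, -1)
	return nil
}

func init() { sql.Register("stub", stubDriver{}) }

// openAfter checks out n connections from one shared pool and reports how
// many physical connections the pool had to open to serve them.
func openAfter(n int, closeEach bool) (int64, error) {
	db, err := sql.Open("stub", "")
	if err != nil {
		return 0, err
	}
	defer db.Close()
	before := atomic.LoadInt64(&openConns)
	for i := 0; i < n; i++ {
		c, err := db.Conn(context.Background())
		if err != nil {
			return 0, err
		}
		if closeEach {
			c.Close() // hands the connection back to the pool for reuse
		}
	}
	return atomic.LoadInt64(&openConns) - before, nil
}

func main() {
	leaked, err := openAfter(4, false)
	if err != nil {
		panic(err)
	}
	reused, err := openAfter(4, true)
	if err != nil {
		panic(err)
	}
	fmt.Println("physical connections without Close:", leaked)
	fmt.Println("physical connections with Close:", reused)
}
```

With a real MySQL driver, every physical connection that is never released would also keep its watcher goroutine alive, which is consistent with the dump quoted above.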

@gavinyue
Contributor

#13122

@gavinyue gavinyue assigned nnsgmsone and unassigned gavinyue Nov 30, 2023
@gavinyue
Contributor

@nnsgmsone please take a look and close.


6 participants