-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: profileReporter internal event circulation #77
Conversation
另外,channel本来是解决速差的,如果选择发送是否可以在send的地方选择select适当丢弃? |
你说服我了。 |
holmes.go
Outdated
func (h *Holmes) ReportProfile(pType string, buf []byte, reason string, eventID string) { | ||
if atomic.LoadInt64(&h.stopped) == 1 { | ||
close(h.rptEventsCh) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个地方是不是有bug会导致panic。
场景如下:
timeline1: ticke.c-> grDump() -> close(rptEventsCh) -> memDump -> close(rptEventsCh) ...
timeline2: stop holmes->
rptEventsCh 似乎会被close多次? 因为stop标识位不会在Dump 过程中检查
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
close(h.rptEventsCh)
可以移到 h.Stop() 中?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
嗯,应该放 stop 里,最好保留 stopReporter 这个函数名,语义更清晰
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
确实这里可能会有异常
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
嗯,应该放 stop 里,最好保留 stopReporter 这个函数名,语义更清晰
如果单独开一个stop方法,是否我们的start和stop的粒度就变了呢,我更倾向在发送端去close,当然我现在的做法可能有点问题,我需要修复一下
能否增加一个单元测试:不stop holmes的同时只关闭reporter,然后再开启。 |
好的,我尝试一下 |
holmes.go
Outdated
h.rptEventsChLock.Lock() | ||
h.rptEventsCh = make(chan rptEvent, 32) | ||
h.rptEventsChLock.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
好多个锁...
我有个小建议,Start 里初始化这些 channel,stop 里关闭这些 channel
同时,start 和 stop 用 CompareAndSwap 保护一下重入
这样 race 应该足够小了,除非是 start 和 stop 的并发 race 了,这个我觉得这个很小 的race 是可以接受的了
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
如果在stop上去close,有可能在发送端的时候发送消息的时候发送到close channel中
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
嗯,也对,close 得放在 sender 这里
我们搞成一个锁吧,fin 和 reporter 都放在 h.mutex 里,复用一下
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lines 284 to 289 in 17ac08a
h.goroutineCheckAndDump(gNum) | |
h.memCheckAndDump(mem) | |
h.cpuCheckAndDump(cpu) | |
h.threadCheckAndDump(tNum) | |
h.threadCheckAndShrink(tNum) | |
} |
判断并关闭的操作放在 每次dumpLoop函数的最后面如何,这样就不会有多次关闭channe的问题了,锁也可以删掉。
然后每个类型只需要判断stopped 然后 return。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
想法不错,不过,gcheap 是另外一个循环,不在这个 loop 里呢
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
想法不错,不过,gcheap 是另外一个循环,不在这个 loop 里呢
gcheap 也做类似判断吧。
2处判断 比 6处判断+锁竞争好一些。
@songzhibin97 你觉得呢
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
放外面的话感觉别人阅读代码是不是有点抽象
holmes.go
Outdated
ch := gc.h.getGcEventCh() | ||
if ch == nil { | ||
return | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这段代码没必要了?
holmes.go
Outdated
if ch != nil { | ||
close(gc.h.gcEventsCh) | ||
gc.h.setGcEventCh(nil) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这段也可以删了?
holmes.go
Outdated
@@ -189,7 +229,6 @@ func (h *Holmes) Start() { | |||
// Stop the dump loop. | |||
func (h *Holmes) Stop() { | |||
atomic.StoreInt64(&h.stopped, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
在这里 close channel,set channel = nil 就可以?在锁的保护下
holmes.go
Outdated
@@ -161,6 +200,7 @@ type gcHeapFinalizer struct { | |||
} | |||
|
|||
func (h *Holmes) startGCCycleLoop() { | |||
h.setGcEventCh(make(chan struct{}, 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这些 channel 初始化,可以统一放到 Start 函数里?
holmes.go
Outdated
continue | ||
} | ||
opts.reporter.Report(evt.PType, evt.Buf, evt.Reason, evt.EventID) // nolint: errcheck | ||
ch := h.getRptEventsCh() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里直接通过函数参数传参,更好
holmes.go
Outdated
return | ||
} | ||
|
||
if h.opts.rptOpts.allowDiscarding { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个配置很鸡肋呢,要是不允许丢失,就阻塞主流程了,这个是更坏的影响
设计上来说,就不应该允许这种破坏主流程的行为
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
要想不丢失消息,有的是办法,消费者直接起协程来处理就可以
holmes.go
Outdated
return | ||
} | ||
// Waiting to be sent | ||
ch <- rptEvent{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这种有问题的,如果发生了阻塞写,此时再 close 就 panic 了
holmes_test.go
Outdated
@@ -102,6 +102,9 @@ func createThread(n int, blockTime time.Duration) { | |||
} | |||
|
|||
func TestWithShrinkThread(t *testing.T) { | |||
h.Stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
为啥要在这里加呢?单独起个用例更好一些
go.mod
Outdated
@@ -6,5 +6,5 @@ require ( | |||
github.com/gin-gonic/gin v1.7.7 | |||
github.com/shirou/gopsutil v3.20.11+incompatible | |||
github.com/stretchr/testify v1.7.0 | |||
mosn.io/pkg v0.0.0-20211217101631-d914102d1baf | |||
mosn.io/pkg v0.0.0-20220308091858-ea728aacbe63 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
请不要更新这个
pType, | ||
buf, | ||
reason, | ||
eventID} | ||
eventID}: | ||
default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
channel 满了的时候,最好加个 infof log
holmes.go
Outdated
} | ||
|
||
// startReporter starts a background goroutine to consume event channel, | ||
// and finish it at after receive from cancel channel. | ||
func (h *Holmes) startReporter() { | ||
func (h *Holmes) startReporter(ch chan rptEvent) { | ||
if ch == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个不能是 nil 了吧?
holmes.go
Outdated
@@ -191,17 +222,35 @@ func (h *Holmes) Start() { | |||
h.Errorf("Holmes has started, please don't start it again.") | |||
return | |||
} | |||
h.setGcEventCh(make(chan struct{}, 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- 这里的 channel 也可以当做参数往下传,可以简化很多逻辑
- 这两个 set channel,不用搞封装了,直接在这里显示 lock 就行了
holmes.go
Outdated
// getGcEventCh get gcEventsCh | ||
func (h *Holmes) getGcEventCh() chan struct{} { | ||
h.Lock() | ||
defer h.Unlock() | ||
return h.gcEventsCh | ||
} | ||
|
||
// setGcEventCh set gcEventsCh | ||
func (h *Holmes) setGcEventCh(ch chan struct{}) { | ||
h.Lock() | ||
defer h.Unlock() | ||
h.gcEventsCh = ch | ||
} | ||
|
||
// getRptEventsCh get rptEventsCh | ||
func (h *Holmes) getRptEventsCh() chan rptEvent { | ||
h.Lock() | ||
defer h.Unlock() | ||
return h.rptEventsCh | ||
} | ||
|
||
// setRptEventsCh set rptEventsCh | ||
func (h *Holmes) setRptEventsCh(ch chan rptEvent) { | ||
h.Lock() | ||
defer h.Unlock() | ||
h.rptEventsCh = ch | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这些应该都没什么必要了,你可以捋一下的,传参可以简化很多
提了一些改动在这个 PR:#85 |
No description provided.