A panic occurred when using HTTP probe #222
Comments
@mazing80 Thanks a lot for reporting the issue. This panic is very strange, though. The goroutine that you pasted above is running the prometheus surfacer and is in bufio.Writer.Flush(). I can't imagine a reason for a panic in that function, as the data is already prepared by the time that function is called (unless there is a bug in net/http or bufio -- extremely unlikely). I think there must be other goroutines in the stack trace. Can you please paste them too? The panic may be occurring in some other goroutine. Thanks once again.
@manugarg Thank you for the quick reply!
@mazing80 If you could share your config (privately if you'd prefer that), I can keep a cloudprober running with that config, or even look at the code for possible sources of panics. Also, any other info will help, for example whether adding a specific probe or surfacer triggers it.
@manugarg Sorry for the late reply.
BTW, I found the cause. I have created a PR (#229) to fix this; the details are in the code there, so please take a look.
@mazing80 That's a great find. Excellent work :) This bug is quite hard to reproduce because, even when serving multiple requests in parallel, it depends a bit on chance -- which handler reads from the "done" channel first. The logic is clearly flawed, though. I'll review your PR. Thank you!
Using a shared "done" channel may result in panic if an ongoing HTTP handler reads the channel before the handler for which the request has actually finished. See #222 for more details. PiperOrigin-RevId: 242071124
Fixed by: e086f8e.
Perfect! I also encountered the same problem. I will verify whether this fix solves it.
@MrDragon1122 Thanks for reporting. Please do let us know what you find :)
I run multiple HTTP probes in a single cloudprober instance.
However, after some time (about 30 minutes to 1 hour), a panic occurs.
Environment:
OS: CentOS 7.6.1810
Docker: 18.09.3
Cloudprober: v0.10.1