Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collector Shutdown will not be honored if called to early #4843

Closed
cpheps opened this issue Feb 10, 2022 · 6 comments · Fixed by #4878
Closed

Collector Shutdown will not be honored if called to early #4843

cpheps opened this issue Feb 10, 2022 · 6 comments · Fixed by #4878
Labels
bug Something isn't working

Comments

@cpheps
Copy link
Contributor

cpheps commented Feb 10, 2022

Describe the bug
There's a race condition in when calling Collector.Shutdown right after calling Collector.Run . Since the shutdownChan isn't initialized until startup has finished here. Any call to Shutdown while Run is in this section of execution will cause run to attempt to close a nil channel. This causes a panic which is caught but the shutdown call is effectively ignored and the collector will run as normal even though shutdown was called.

Steps to reproduce
Because this is a race condition it's hard to time but possibly rapidly calling Run then Shutdown could trigger it.

What did you expect to see?
Calling Shutdown at any point after calling Run should cause the collector to shutdown.

What did you see instead?
Again race condition but if you get the timing right you can call Shutdown and the collector will continue to run.

What version did you use?
v0.44.0

What config did you use?
Any valid config will work

Environment
OS: macOS Monterey 12.2

Additional context

I think a full solution would add a new state to collector states that indicate the collector has been created but never run. Maybe just a Created state. With that you can init the shutdownChan with the collector and Shutdown can check to close the channel only if the collector is in a Started or Running state, else it's a noop.

@cpheps cpheps added the bug Something isn't working label Feb 10, 2022
@buptubuntu
Copy link

Why don't we create shutdownChan in New

@cpheps
Copy link
Contributor Author

cpheps commented Feb 11, 2022

Why don't we create shutdownChan in New

I think we should but I also think we might need a new state to indicate the collector has been created but not started. If Shutdown is called in this new state it should not closed the channel or else the collector will never be able to be run. I think Shutdown should be a noop if the collector is not in a Starting or Running state.

@buptubuntu
Copy link

Why don't we create shutdownChan in New

I think we should but I also think we might need a new state to indicate the collector has been created but not started. If Shutdown is called in this new state it should not closed the channel or else the collector will never be able to be run. I think Shutdown should be a noop if the collector is not in a Starting or Running state.

That's right, we should also recreate shutdownChan after shutdown,in case of Running collector after shutdown

@cpheps
Copy link
Contributor Author

cpheps commented Feb 14, 2022

That's right, we should also recreate shutdownChan after shutdown,in case of Running collector after shutdown

I think I might be missing the reasoning for this. The collector is meant to be one time use.

Recreating the shutdownChan could cause a race condition and not shutdown if it's recreated before it's processed by the collector in runAndWaitForShutdownEvent. Mocked up this playground example to point out the race condition.

Please let me know if I'm missing something on the reasoning for recreating the shutdownChan. I can get started on this work but I don't want to omit that if we do need it.

@buptubuntu
Copy link

I think I might be missing the reasoning for this. The collector is meant to be one time use.

Recreating the shutdownChan could cause a race condition and not shutdown if it's recreated before it's processed by the collector in runAndWaitForShutdownEvent. Mocked up this playground example to point out the race condition.

Please let me know if I'm missing something on the reasoning for recreating the shutdownChan. I can get started on this work but I don't want to omit that if we do need it.

If the collector is meant to be one time use, it's unnecessary to recreate shutdownChan. But we should check state in Run

@cpheps
Copy link
Contributor Author

cpheps commented Feb 16, 2022

Makes sense. I'll try to start on a solution this week for this. Thanks for discussing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants