PushMeterRegistry metrics lost on close if scheduled publish is in progress #3872
Comments
After looking into #3863, I think the issue you describe only happens when the registry is closed while a scheduled publish is in progress, which should be a somewhat rare case. Otherwise, we block on the thread that calls |
Can you confirm if my assumption in my previous comment is true for what you're seeing?
This is in line with what you wrote above. I originally saw the "Duplicate call to publish" message, meaning the scheduled publish was in progress while I tried to shut the app down. I have not had problems with unfinished exports when the shutdown-publish is responsible.
Edit: A bit more background:
We would really like to be able to ensure the last export (independent of whether it is triggered by periodic export or shutdown signal) completes before the app shuts down. I understand that blocking the app from shutting down is against Micrometer's philosophy, but I could see this as a config option (i.e., an opt-in like: I am willing to wait for the export to finish before my app shuts down). Any thoughts on this?
@shakuzen I am thinking about how this could best be tackled since it potentially can have quite the impact (e.g. losing data about how many background queues were shut down before the app shut down because the publishing does not complete). I was able to come up with two ways of dealing with this:
Do you think adding a (customizable?) timeout is a viable way forward? This could be set to |
Since this is pretty serious for some of our customers, I have investigated further. Some of them are pretty firm on not upgrading to newer Micrometer versions until this is addressed, so I would be really happy if we could get this moving before 1.12. I have created a reproducer here: d0154b1 It shows that when Are there any objections to adding such a setting? Otherwise, I will open a PR to add such a feature. CC @shakuzen @jonatan-ivanov |
We have moved the reproducer from a single commit to an actual branch and opened a PR (#4062). The PR contains two test cases, one that shows the current behavior (and how the default behavior does not change when merging this PR; the test is called |
Just leaving a comment to say I'll be looking at this again. Thanks for the investigation and test cases and pull request so far. Much appreciated.
Thank you, please let me know if you need any more information!
Sorry for the late response to this. I thought I left a comment regarding this somewhere else, but I haven't been able to find it (there are a lot of comments in a lot of different threads related to this issue, so I may have missed it or I never made it). Given that we do a final publish in a blocking manner on |
Thanks for taking a look! I actually prefer this to be the default, but wasn't sure if there was appetite for it by others as well. That's why I made it optional in my proposal. The only feedback I have for #4287 is that this will now block indefinitely if there is a long-running request. I don't assume this to be a problem, since HTTP requests usually time out in a reasonable timeframe compared to an application shutdown, but I wanted to point it out.
Thanks for taking a look and sharing your thoughts. I appreciate it. We can of course consider feedback if users have some issue related to this behavior, but I don't remember anything being reported against the behavior of blocking for publish on close, which we've had since, I believe, Micrometer 1.1.
As pointed out in #3832 (comment):
Therefore, in some cases, the last export is not finished because the JVM shuts down before the export is completed. This can happen for exports triggered by a timer (periodic exports), as well as for exports triggered by a shutdown signal (e.g. hitting the actuator shutdown endpoint in a Spring Boot app).
I think adding a configuration option to wait for the exporter thread to finish its export would be helpful. This could be combined with a timeout to define the maximum time to wait until the app shuts down even if the data is not exported, in order to not block the shutdown indefinitely in case of network issues. By setting this timeout to 0, Micrometer would behave in the same way as it does today.
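To make the proposal concrete, here is a minimal sketch of the idea using only the JDK's `ScheduledExecutorService`. It is not Micrometer code: the class `WaitingPublisher`, the method `closeAndAwait`, and the `shutdownTimeout` parameter are hypothetical names chosen for illustration. The scheduler uses a daemon thread, on the assumption that this is what allows the JVM to exit while an export is still running.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names are Micrometer API.
class WaitingPublisher implements AutoCloseable {

    // Daemon thread: on its own it does not keep the JVM alive during shutdown.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "sketch-publisher");
                t.setDaemon(true);
                return t;
            });

    private final Duration shutdownTimeout;

    WaitingPublisher(Runnable publish, Duration step, Duration shutdownTimeout) {
        this.shutdownTimeout = shutdownTimeout;
        // Periodic export, first run after one step.
        scheduler.scheduleAtFixedRate(
                publish, step.toMillis(), step.toMillis(), TimeUnit.MILLISECONDS);
    }

    /**
     * Stops scheduling further exports and waits up to shutdownTimeout for an
     * export that is already running. Returns true if the scheduler terminated
     * within the timeout. A timeout of zero does not wait at all.
     */
    boolean closeAndAwait() {
        // shutdown() cancels future periodic runs but does not interrupt a running one.
        scheduler.shutdown();
        try {
            return scheduler.awaitTermination(
                    shutdownTimeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    @Override
    public void close() {
        closeAndAwait();
    }
}
```

With `Duration.ZERO` the call returns immediately, which corresponds to the "timeout of 0" case described above. A positive timeout bounds how long shutdown can be delayed if the backend is slow or unreachable.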
Rationale
From my research on #3832, I realized that it is possible that apps shut down without exporting data collected between the previous export and the shutdown signal. For certain metrics, this might be relevant information. I believe it should be the choice of the person running the application whether to have the application run for a few extra cycles in order to ensure data completeness. Additionally, for apps running only for a short period of time, there is a chance that no data is exported if the first export would only happen after the app is shut down. By ensuring the export is complete before shutting down, even short-lived apps will always be able to export their data.