Auto machine removal timing does not take into account if Octopus Server is offline #3924
Comments
I wonder what the right interpretation is here. Octopus isn't guaranteeing that the machine has been down for the whole period of the window, just that it saw it go down and hasn't seen it alive since? Or something else?
For example, say the health check window is 1hr and the removal window is 2hrs. If a machine goes down just before a health check but comes back straight after, and then has a network blip 1hr later during the next health check, we'd remove it even though it's been up for all of the two hours except a couple of blips.
Also, what about this: the machine goes down, a health check runs, the machine comes back, and then long deployments start that max out the tasks on the server (so health checks aren't running, because they're waiting for the deployments to finish). Auto removal isn't held up by deployments, so it will delete the machine, again even though it's been up the whole time.
These are all edge cases, but what's the promise Octopus is trying to make here?
Release Note: Auto machine removal now happens as part of health checks. Minor breaking change: the API endpoint for machine removal logs is removed, and machine removal logs are no longer stored on the Octopus server.
Pulled from 4.1.5 because it requires a change in the API. Planned for 4.2.0.
@MichaelJCompton does this mean that there will now be no log entry anywhere for machines being removed?
Hi @JesseNaranjo, testing for machine removal now happens as part of health checks, and the health check logs record both that testing and any decisions to remove machines. If a machine is removed, a machine removal event is added to the audit events.
Issue
Proposed resolution
The automated machine removal process should check that the last health check run is after the latest server restart before triggering machine removal.
Reference: https://secure.helpscout.net/conversation/467497373/20922?folderId=1465198
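The check proposed above could be sketched roughly as follows. This is only an illustration of the idea, not Octopus code: the `Machine` type, the `should_remove` function, and all parameter names are hypothetical, and the assumption is that the server can tell when it last started and when a health check last completed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Machine:
    # Hypothetical model: when the server last saw this machine healthy.
    name: str
    last_seen_healthy: datetime


def should_remove(
    machine: Machine,
    now: datetime,
    removal_window: timedelta,
    server_started_at: datetime,
    last_health_check_at: Optional[datetime],
) -> bool:
    """Decide whether a machine is eligible for automatic removal."""
    # If no health check has completed since the server came back up,
    # "last seen healthy" may be stale only because the server was offline,
    # so do not remove anything yet.
    if last_health_check_at is None or last_health_check_at < server_started_at:
        return False
    # Otherwise apply the usual rule: not seen healthy for the whole window.
    return now - machine.last_seen_healthy >= removal_window
```

With this guard, a server that was offline for longer than the removal window would wait for at least one health check after restarting before it removes any machine. It does not address the other edge cases raised in the comments, such as health checks being queued behind long deployments.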