Make tests resilient to Windows service manager errors #5608
Merged
nimanch approved these changes on Oct 1, 2021
yophilav approved these changes on Oct 1, 2021
kodiakhq bot pushed a commit that referenced this pull request on Oct 20, 2021:
A recent PR (#5608) added retry logic to the end-to-end tests on Windows when they try to stop the IoT Edge service but the service manager isn't ready. This PR expands that one to include another case: when the tests try to stop the IoT Edge service but the service is already stopped.

```
X QuickstartCerts [7s 488ms]
Error Message:
  System.InvalidOperationException : Cannot stop iotedge service on computer '.'.
  ----> System.ComponentModel.Win32Exception : The service has not been started.
```

The code path that stops the service first checks its status, and only issues the stop command if the service isn't already stopped. However, checking the service status + stopping the service is not an atomic operation, so there is a small window of opportunity to call "stop" on an already-stopped service. This change handles that window by checking the service status on every retry, not just the first time through.

I was unable to get the condition to repro again after several runs in the pipeline, but I at least confirmed that these changes don't disrupt the happy path.

## Azure IoT Edge PR checklist:

This checklist is used to make sure that common guidelines for a pull request are followed.

### General Guidelines and Best Practices

- [x] I have read the [contribution guidelines](https://github.com/azure/iotedge#contributing).
- [x] Title of the pull request is clear and informative.
- [x] Description of the pull request includes a concise summary of the enhancement or bug fix.

### Testing Guidelines

- [x] Pull request includes test coverage for the included changes.
- Description of the pull request includes
  - [x] concise summary of tests added/modified
  - [x] local testing done.

### Draft PRs

- Open the PR in `Draft` mode if it is:
  - Work in progress or not intended to be merged.
  - Encountering multiple pipeline failures and working on fixes.

_Note: We use the kodiakhq bot to merge PRs once the necessary checks and approvals are in place. When it merges a PR, kodiakhq converts the PR title to the commit title, PR description to the commit description, and squashes all the commits in the PR to a single commit. The net effect is that entire PR becomes a single commit. Please follow the best practices mentioned [here](https://chris.beams.io/posts/git-commit/#:~:text=The%20seven%20rules%20of%20a%20great%20Git%20commit,what%20and%20why%20vs.%20how%20For%20example%3A%20) for the PR title and description_
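The check-then-stop race described in the commit message above can be sketched roughly as follows. This is an illustration in Python, not the code from the PR (the end-to-end tests themselves are .NET). `FakeService`, `ServiceError`, and `stop_with_retry` are hypothetical names, and the fake service is rigged so that it reaches the stopped state while the first stop request still reports a failure.

```python
import time


class ServiceError(Exception):
    """Hypothetical stand-in for the error raised when a stop request is rejected."""


class FakeService:
    """Hypothetical in-memory service used only to simulate the race."""

    def __init__(self):
        self.status = "running"
        self.stop_calls = 0

    def refresh_status(self):
        return self.status

    def stop(self):
        self.stop_calls += 1
        if self.status == "stopped":
            # Mirrors the failure quoted in the commit message.
            raise ServiceError("The service has not been started.")
        # Simulate the race: the service ends up stopped, but the request fails.
        self.status = "stopped"
        raise ServiceError("service manager not ready")


def stop_with_retry(service, attempts=5, delay_seconds=0.0):
    """Re-check the status before every stop attempt, not only the first one."""
    for attempt in range(1, attempts + 1):
        if service.refresh_status() == "stopped":
            return True  # nothing left to do, so never call stop() again
        try:
            service.stop()
        except ServiceError as error:
            if attempt == attempts:
                raise
            print(f"stop attempt {attempt} failed: {error}; retrying")
            time.sleep(delay_seconds)
    return service.refresh_status() == "stopped"
```

Because the status is read at the top of each iteration, a retry that finds the service already stopped returns without issuing a second stop request. If the check ran only once before the loop, the second attempt in this simulation would hit the "has not been started" error.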
In a failed end-to-end test run on Windows earlier today, I noticed that all test failures looked like this:
I've noticed this before as well. Sometimes, the service manager in Windows isn't ready when you stop a service, but waiting a little while and trying again tends to solve it (e.g., see this). I noticed we weren't retrying when we get this exception, so this change adds the retry logic.
The error is not very common, so I was unable to confirm that my change works around it. But I ran the Windows jobs in the pipeline 7 times (with several jobs running in parallel too), and the stop logic still works, so at least generally I know I didn't make it worse. I added a verbose log so that we can gather more data if we still see this error in the future.
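The retry-and-log idea described here might look something like the sketch below. It is written in Python purely for illustration and is not the change made in this PR. `ServiceManagerNotReady`, `retry`, and `make_flaky_stop` are hypothetical names, and the flaky stop function only simulates a service manager that is not ready for the first few calls.

```python
import time


class ServiceManagerNotReady(Exception):
    """Hypothetical stand-in for the transient service manager error."""


def retry(action, attempts=3, delay_seconds=0.0, log=print):
    """Call action until it succeeds or attempts run out; re-raise the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except ServiceManagerNotReady as error:
            # Verbose log so that repeated failures leave a trace for later analysis.
            log(f"attempt {attempt}/{attempts} failed: {error}")
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)


def make_flaky_stop(failures):
    """Build a stop function that fails `failures` times, then succeeds."""
    state = {"remaining": failures}

    def stop():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise ServiceManagerNotReady("service manager not ready")
        return "stopped"

    return stop
```

A call such as `retry(make_flaky_stop(2), attempts=3)` logs two failed attempts and then returns `"stopped"`. With fewer attempts than simulated failures, the last error propagates to the caller instead of being swallowed.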