Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Azure helper: Increase Azure Endpoint HTTP retries #619

Conversation

johnsonshi
Copy link
Contributor

@johnsonshi johnsonshi commented Oct 20, 2020

Proposed Commit Message

Azure helper: Increase Azure Endpoint HTTP retries

Increase Azure Endpoint HTTP retries to handle
occasional platform network blips.

Introduce a common method http_with_retries
in the azure.py helper, which will serve as
the common HTTP request handler for
all HTTP requests with the Azure endpoint.
This method has builtin retries and
reporting diagnostics logic.

Additional Context

Additional context above.

Test Steps

Ensure no regressions in pre-existing Azure endpoint interaction for happy path code

  • Create an Azure VM from the latest Ubuntu 18.04 LTS image.
  • Install this PR's cloud-init.
  • Capture the VM into a custom image.
  • Deploy a VM from this custom image. Deployment succeeds.
  • No warnings/errors. Manually inspected cloud-init logs and ensured interaction with the Azure endpoint was seamless.

Ensure that long delay log message is logged, if there is a long delay (>5 sec) that eventually succeeds during interaction with the Azure endpoint.

  • Same steps as above. Modify http_with_retries to fail during the first 15 attempts, and only succeed at the 16th attempt.
  • Capture the VM into a custom image.
  • Deploy a VM from this custom image. Deployment succeeds.
  • No warnings/errors. Manually inspected cloud-init logs and ensured interaction with the Azure endpoint succeeded but with the new success diagnostic message present.

Ensure that error messages are logged upon failure despite max attempt retries

  • Same as above, except modify http_with_retries so that it fails after max attempts.
  • Inspect cloud-init logs to ensure that the new failure diagnostic message is present.

Checklist:

  • My code follows the process laid out in the documentation
  • I have updated or added any unit tests accordingly
  • I have updated or added any documentation accordingly

@johnsonshi
Copy link
Contributor Author

Pending internal discussions on what the error messages should be.

@mitechie
Copy link
Contributor

mitechie commented Nov 2, 2020

@johnsonshi is this ready for review? Are we happy with the error messages and such at this point?

Thanks!

@johnsonshi
Copy link
Contributor Author

@mitechie yes we've discussed internally and these will be the final error messages.

@TheRealFalcon TheRealFalcon self-assigned this Nov 3, 2020
Copy link
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much @johnsonshi for this branch. Minor nits on review. If this is acceptable, I can apply them or you can then we will land it.

cloudinit/sources/helpers/azure.py Outdated Show resolved Hide resolved
cloudinit/sources/helpers/azure.py Outdated Show resolved Hide resolved
cloudinit/sources/helpers/azure.py Outdated Show resolved Hide resolved
cloudinit/sources/helpers/azure.py Outdated Show resolved Hide resolved
cloudinit/sources/helpers/azure.py Show resolved Hide resolved
cloudinit/sources/helpers/azure.py Outdated Show resolved Hide resolved
cloudinit/sources/helpers/azure.py Outdated Show resolved Hide resolved
@johnsonshi
Copy link
Contributor Author

johnsonshi commented Nov 18, 2020

Thanks you so much @blackboxsw! I've addressed the requested changes above. Let's wait for #594 to land before merging this.

#594 is also about to be merged. Just addressed OddBloke's final set of minor requested changes and NITs in that other PR, so it should be mergeable once he reviews it.

Given that #594 is more complicated than this PR, it will be easier to fix this PR if there are any merge conflicts.

Also as a final validation, I built a deb package from this and performed Azure deployment tests. Confirmed that everything works fine with live deployment tests.

Copy link
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @johnsonshi this it good to land today.

@blackboxsw blackboxsw merged commit 6df0230 into canonical:master Nov 18, 2020
@johnsonshi johnsonshi deleted the azure-increase-wireserver-interaction-retries branch November 18, 2020 18:12
This was referenced May 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants