Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use retry logic for HTTP communication with IMDS #107

Merged
merged 1 commit into from
Aug 21, 2024

Conversation

dongsupark
Copy link
Collaborator

@dongsupark dongsupark commented Jul 18, 2024

Use retry logic and max duration for HTTP communication with IMDS, creating a generic wrapper retry_loop.
Doing that, it is possible for azure-init to retry to send HTTP requests, when requests failed for some reason.

For now we define a fixed const for timeout, max retries, for simplicity.

Make imds::query() capable of receiving input URL, so that both main binary and unit tests can send requests to different endpoints.

Fixes #63

Testing done

TBD

@dongsupark dongsupark force-pushed the imds-http-retry branch 3 times, most recently from 10f8dae to 7b58355 Compare July 19, 2024 11:48
@anhvoms
Copy link
Contributor

anhvoms commented Jul 19, 2024

Do we know how much this increases the final build size of azure-init?

src/main.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@anhvoms anhvoms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how much we gain from using azure sdk vs just implementing our own retry policy here. Can you verify the final code size?

src/main.rs Outdated Show resolved Hide resolved
@dongsupark
Copy link
Collaborator Author

I'm not sure how much we gain from using azure sdk vs just implementing our own retry policy here. Can you verify the final code size?

  1. debug:
  • without Azure SDK
69084 -rwxrwxr-x 2 dpark dpark 70734488 Jul 19 14:17 target/debug/azure-init*
  • with Azure SDK
114888 -rwxrwxr-x 2 dpark dpark 117639624 Jul 19 13:48 target/debug/azure-init*

+66%

  1. release:
  • without Azure SDK
6904 -rwxrwxr-x 2 dpark dpark 7069624 Jul 19 14:19 target/release/azure-init*
  • with Azure SDK
10356 -rwxrwxr-x 2 dpark dpark 10603568 Jul 19 14:15 target/release/azure-init*

+49%

Tried also turning off default features of Azure SDK crates, but not much difference.

Copy link
Member

@jeremycline jeremycline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on the fence about the SDK - the lack of error codes to retry seems like it might be a deal-breaker. Since we want a fixed retry policy without jitter and so on, the implementation could be a loop with a constant sleep in it, wrapped in https://docs.rs/tokio/latest/tokio/time/fn.timeout.html.

src/main.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
@dongsupark
Copy link
Collaborator Author

I'm on the fence about the SDK - the lack of error codes to retry seems like it might be a deal-breaker. Since we want a fixed retry policy without jitter and so on, the implementation could be a loop with a constant sleep in it, wrapped in https://docs.rs/tokio/latest/tokio/time/fn.timeout.html.

Ok, let me rewrite this PR based on https://docs.rs/tokio/latest/tokio/time/fn.timeout.html or any simpler ones.

@dongsupark dongsupark changed the title Use Azure SDK for HTTP communication with IMDS Use retry logic for HTTP communication with IMDS Jul 25, 2024
src/main.rs Outdated Show resolved Hide resolved
libazureinit/src/util.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
Copy link
Member

@jeremycline jeremycline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few very minor notes, but it looks good to me. You should also be able to write a unit test for the timeout case, I think, with https://docs.rs/tokio/latest/tokio/time/fn.advance.html although I've not personally done this before so I'm just guessing based on the documentation.

libazureinit/src/error.rs Outdated Show resolved Hide resolved
libazureinit/src/imds.rs Outdated Show resolved Hide resolved
libazureinit/src/imds.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
@dongsupark dongsupark force-pushed the imds-http-retry branch 2 times, most recently from 0cf0ba6 to 86085a1 Compare August 8, 2024 14:05
@dongsupark dongsupark marked this pull request as ready for review August 8, 2024 14:05
@dongsupark
Copy link
Collaborator Author

A few very minor notes, but it looks good to me. You should also be able to write a unit test for the timeout case, I think, with https://docs.rs/tokio/latest/tokio/time/fn.advance.html although I've not personally done this before so I'm just guessing based on the documentation.

Added unit tests using tokio::time::advance etc.

Copy link
Member

@jeremycline jeremycline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks! I had two little optional notes, but this is good to go from my perspective.

libazureinit/src/imds.rs Outdated Show resolved Hide resolved
libazureinit/Cargo.toml Outdated Show resolved Hide resolved
Copy link
Member

@vmarcella vmarcella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the most part this LGTM, but I do want @anhvoms thoughts on exponentially increasing the retry interval when making repeated calls to IMDS before approving these changes.

libazureinit/src/imds.rs Outdated Show resolved Hide resolved
let imds_body = response.text().await?;
let metadata: InstanceMetadata = serde_json::from_str(&imds_body)?;
if statuscode.is_success() && statuscode == StatusCode::OK {
return response.error_for_status();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would error_for_status() return here given that the status code is a success?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, it would just return the inner response of type Response itself, where the outer response would be Ok.

libazureinit/src/imds.rs Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
Use retry logic and max duration for HTTP communication with IMDS,
creating a retry loop over getting response from IMDS.
Doing that, it is possible for azure-init to retry to send HTTP
requests, when requests failed for some reason.

For now we define a fixed const for timeout, max retries for simplicity.

Make imds::query() capable of receiving input URL, so that both main
binary and unit tests can send requests to different endpoints.
@dongsupark dongsupark merged commit 1f64f1c into Azure:main Aug 21, 2024
5 checks passed
@dongsupark dongsupark deleted the imds-http-retry branch August 21, 2024 08:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RFE] azure-init should add retries around IMDS and Wireserver operations
4 participants