
Use kube-rs resource watcher instead of Api::watch #378

Merged

merged 6 commits into project-akri:main from use-watcher on Sep 13, 2021

Conversation

kate-goldenring
Contributor

@kate-goldenring kate-goldenring commented Sep 9, 2021

What this PR does / why we need it:
closes #374

Special notes for your reviewer:
We updated to using the Api::watch object in kube-rs v0.59.0 in #361, expecting it to behave like the informer we were previously using to watch for actions on Pods, Nodes, Instances, and Configurations. However, we have observed different behavior (ref #372), caused by using Api::watch instead of kube::runtime::watcher. I was able to reproduce the unexpected behavior locally with the current implementation: if there are pre-existing Configurations, handle_config_add is called twice, once from where we handle pre-existing configs and once from Api::watch. Api::watch appears to pull events from before the Agent starts, which is fairly unpredictable. Using watcher should resolve this.
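The difference in behavior can be sketched with a minimal standalone model. This is not the real kube-rs type: `WatchEvent` below is a local stand-in for `kube_runtime::watcher::Event`, which on (re)connect emits a single `Restarted` event carrying the full current object list, followed by `Applied`/`Deleted` deltas, so pre-existing objects are handled exactly once rather than replayed as historical events.

```rust
// Local stand-in for kube_runtime::watcher::Event (illustrative only).
#[derive(Debug, PartialEq)]
enum WatchEvent {
    Restarted(Vec<String>), // full resync: all objects that currently exist
    Applied(String),        // an object was added or modified
    Deleted(String),        // an object was removed
}

/// Collect the "add" handler invocations a stream of events would trigger.
fn handled_adds(events: &[WatchEvent]) -> Vec<String> {
    let mut adds = Vec::new();
    for event in events {
        match event {
            // Pre-existing objects arrive once, inside the initial Restarted event.
            WatchEvent::Restarted(objs) => adds.extend(objs.iter().cloned()),
            WatchEvent::Applied(obj) => adds.push(obj.clone()),
            WatchEvent::Deleted(_) => {}
        }
    }
    adds
}

fn main() {
    let events = vec![
        WatchEvent::Restarted(vec!["config-a".to_string()]),
        WatchEvent::Applied("config-b".to_string()),
        WatchEvent::Deleted("config-a".to_string()),
    ];
    // Each object triggers handle_config_add exactly once, unlike the
    // double invocation seen with Api::watch plus the pre-existing-config path.
    assert_eq!(handled_adds(&events), vec!["config-a", "config-b"]);
    println!("adds handled once each: {:?}", handled_adds(&events));
}
```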

Note: currently not handling Restarted events other than logging them. This is because:
1. We previously handled WatchEvent::Error just by logging.
2. A Restarted event also occurs on startup (i.e., it doubles as a start event). We would need to know whether a Restarted event is the first one and handle it differently from later ones. We could pass in a first_event bool and error so long as it is not the first event.

In 19a44f9, I made the watchers throw an error if they receive a Restarted event that is not the initial start event. This causes the controller or agent container to error, restart, and properly reconcile.
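That restart handling can be sketched as a small standalone model (the `Event` enum and `handle_event` function are illustrative stand-ins for the actual Akri/kube-rs code, not the real implementation):

```rust
// Illustrative model of the first_event logic described above.
#[derive(Debug)]
enum Event {
    Applied,
    Deleted,
    Restarted,
}

fn handle_event(event: &Event, first_event: &mut bool) -> Result<(), String> {
    if let Event::Restarted = event {
        if !*first_event {
            // A Restarted event mid-run means state may have been missed;
            // erroring lets the container restart and reconcile from scratch.
            return Err("watcher restarted - throwing error to restart agent".to_string());
        }
    }
    // Per the review discussion: unconditionally clear the flag rather than branching.
    *first_event = false;
    Ok(())
}

fn main() {
    let mut first_event = true;
    // The initial Restarted event is just startup and is fine.
    assert!(handle_event(&Event::Restarted, &mut first_event).is_ok());
    assert!(handle_event(&Event::Applied, &mut first_event).is_ok());
    // A later Restarted event errors so the pod restarts and reconciles.
    assert!(handle_event(&Event::Restarted, &mut first_event).is_err());
    println!("restart-after-start errors as expected");
}
```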

If applicable:

  • this PR has an associated PR with documentation in akri-docs
  • this PR contains unit tests
  • added code adheres to standard Rust formatting (cargo fmt)
  • code builds properly (cargo build)
  • code is free of common mistakes (cargo clippy)
  • all Akri tests succeed (cargo test)
  • inline documentation builds (cargo doc)
  • version has been updated appropriately (./version.sh)

@@ -228,7 +227,7 @@ async fn handle_config_delete(
     kube_interface: &impl KubeInterface,
     config: &Configuration,
     config_map: ConfigMap,
-) -> Result<(), Box<dyn std::error::Error + Send + Sync + 'static>> {
+) -> anyhow::Result<()> {
Collaborator
yay!

}
if *first_event {
Collaborator

nit: not sure what is more expensive, branching logic or setting logic, but it might be simpler to just always set *first_event = false here?

Contributor Author

Good question. From a quick search I can't find an exploration someone has done on that -- might be fun to test. In the meantime, I think you're right that it makes more sense to just set it.

    info!("handle_config - watcher started");
} else {
    return Err(anyhow::anyhow!(
        "Configuration watcher restarted - throwing error"
Collaborator
maybe say "Configuration watcher restarted - throwing error to restart agent"

@kate-goldenring kate-goldenring merged commit c635430 into project-akri:main Sep 13, 2021
@kate-goldenring kate-goldenring deleted the use-watcher branch September 13, 2021 17:02
vincepnguyen pushed a commit that referenced this pull request Nov 23, 2021
Signed-off-by: vincepnguyen <70007233+vincepnguyen@users.noreply.github.com>