This repository has been archived by the owner on Nov 1, 2023. It is now read-only.

If a supervisor fails after starting a task, it will pick up another task and execute it. #96

Closed
bmc-msft opened this issue Oct 3, 2020 · 1 comment
Labels
bug Something isn't working

bmc-msft commented Oct 3, 2020

Information

  • Onefuzz version: 1.0.0
  • OS: linux

Provide detailed reproduction steps (if any)

  1. Start a task.
  2. Wait until it is running on a node.
  3. Log in to the node (debug SSH works here).
  4. Update the service to raise an Exception in the agent callback URL handlers.
  5. Wait until the supervisor exits due to HTTP errors.
  6. Undo the change from step 4.
  7. Watch the supervisor restart and pick up a new task.

Expected result

Supervisor is resilient to HTTP communication issues and makes multiple retry attempts with backoff logic.

Actual result

Supervisor picks up a new task for the node, even though the node is already running one.
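
For illustration only, the "multiple retry attempts with backoff logic" behavior could look roughly like the sketch below. This is not the OneFuzz supervisor's actual code: `retry_with_backoff`, its parameters, the use of `ConnectionError` as the retryable failure, and the jitter strategy are all assumptions made for the example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `max_attempts` is exhausted.

    Hypothetical helper: only ConnectionError is treated as retryable here.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff, capped at max_delay, with random jitter
            # so that many nodes do not retry in lockstep.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, delay))

    raise AssertionError("unreachable")
```

A caller would wrap each callback to the service, for example `retry_with_backoff(lambda: send_heartbeat(node_id))`, where `send_heartbeat` is likewise a hypothetical function. The point of the sketch is that a transient service error delays the supervisor instead of making it exit and restart while its current task is still running.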

@bmc-msft bmc-msft added the bug Something isn't working label Oct 3, 2020
bmc-msft commented

This is done.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 14, 2020