increment calculation on release can hide failures #6475

Closed
wouterdb opened this issue Sep 8, 2023 · 0 comments

wouterdb commented Sep 8, 2023

Problem

  1. a version v1 is deploying
  2. a new version v2 is released
  3. resource a[k=x],v=2 is marked as deployed by the increment calculation, because of its known good state
  4. resource a[k=x],v=1 fails

Now, the good state of v2 has masked the bad state of v1.
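
The sequence above can be replayed with a toy model. This is only a sketch: the `states` dictionary, the status strings and `reported_status` are hypothetical names, not the orchestrator's data model, and it assumes that the status reported for a resource is the one stored on the latest released version.

```python
# Toy replay of the masking problem; all names here are illustrative assumptions.
states: dict[tuple[str, int], str] = {}
latest_released = 1


def reported_status(resource_id: str) -> str:
    """Status as seen from the outside: the one stored on the latest released version."""
    return states[(resource_id, latest_released)]


# 1. version 1 is deploying
states[("a[k=x]", 1)] = "deploying"
# 2. version 2 is released
latest_released = 2
# 3. the increment calculation marks the resource on v2 as deployed
states[("a[k=x]", 2)] = "deployed"
# 4. the deploy of v1 fails
states[("a[k=x]", 1)] = "failed"

print(reported_status("a[k=x]"))  # prints "deployed"
```

The failure recorded on v1 is still in `states`, but nothing that looks only at the latest released version will ever see it.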

Current solution proposal

  1. when a deploy fails
  2. on a version that is not the latest
  3. and the corresponding resource on the latest version is set to deployed
    • this can only be due to the increment calculation: at any time the agent only starts deploying the latest released version, so the newer version has never been deployed by the agent and its status must have been set by the increment. We should be able to assert that the attribute hash is identical in both versions
  4. then we set it to failed instead (with a message: "setting to failed because of failure in previous version xyz")

The first two conditions are fairly cheap to check.
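
A minimal sketch of the proposed check, assuming a simplified in-memory record. `ResourceState`, `patch_latest_on_failure` and the status strings are hypothetical and do not mirror the actual code base; the `attribute_hash` field is an assumed stand-in for the attribute comparison the proposal mentions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceState:
    """Hypothetical, simplified view of one resource in one version."""

    version: int
    status: str
    attribute_hash: str
    message: str = ""


def patch_latest_on_failure(
    failed: ResourceState, latest_version: int, latest: Optional[ResourceState]
) -> bool:
    """Propagate a failure on an older version to the latest version.

    Returns True if the latest version's resource was patched.
    """
    # Conditions 1 and 2: a deploy failed, on a version that is not the latest.
    if failed.status != "failed" or failed.version >= latest_version:
        return False
    # Condition 3: the corresponding resource on the latest version is deployed.
    if latest is None or latest.status != "deployed":
        return False
    # The status can only come from the increment, so the attributes must match.
    assert failed.attribute_hash == latest.attribute_hash
    # Step 4: set it to failed instead, with an explanatory message.
    latest.status = "failed"
    latest.message = (
        f"setting to failed because of failure in previous version {failed.version}"
    )
    return True
```

The two cheap checks come first, so the lookup of the resource on the latest version is only needed when a failure on an older version has actually occurred.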

Transactional behavior

  • The idea would be to execute this code in a separate transaction, at the end of the 'deploy_done' method
  • However, it races with the increment calculation done by the actual release_version (and with other full increment calculations)
  • We can, however, retry this write until we catch up
  • Question: is it sufficient to make this a retryable serializable transaction, so that it retries on this race?
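
The retry part could look roughly like the sketch below. It does not answer the question above; it only shows the shape of a retry loop around a transaction. `SerializationFailure` and `run_with_retry` are hypothetical names: the exception stands in for whatever error the database driver raises when PostgreSQL reports a serialization failure (SQLSTATE 40001) under SERIALIZABLE isolation.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class SerializationFailure(Exception):
    """Hypothetical stand-in for the driver error raised on SQLSTATE 40001."""


async def run_with_retry(
    transaction: Callable[[], Awaitable[T]], max_attempts: int = 5
) -> T:
    """Run `transaction`, retrying it when it loses a serialization race."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            # Each call is expected to open and commit its own transaction.
            return await transaction()
        except SerializationFailure:
            if attempt == max_attempts:
                raise
            # Yield to the event loop; a real loop might back off here.
            await asyncio.sleep(0)
    raise AssertionError("unreachable")
```

Each retry has to re-read the latest released version and its resource state inside the new transaction, otherwise the write would be based on the stale snapshot that caused the conflict.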
wouterdb self-assigned this Sep 11, 2023
wouterdb added a commit that referenced this issue Sep 15, 2023
…yments for older versions (Issue #6475, PR #6486)

I try to solve the race condition between deploy and release by patching up the latest released version

closes #6475

Strike through any lines that are not applicable (`~~line~~`) then check the box

- [x] Attached issue to pull request
- [x] Changelog entry
- [x] Type annotations are present
- [x] Code is clear and sufficiently documented
- [x] No (preventable) type errors (check using make mypy or make mypy-diff)
- [x] Sufficient test cases (reproduces the bug/tests the requested feature)
- [x] Correct, in line with design
- [ ] End user documentation is included or an issue is created for end-user documentation (add ref to issue here: )
- [ ] If this PR fixes a race condition in the test suite, also push the fix to the relevant stable branch(es) (see [test-fixes](https://internal.inmanta.com/development/core/tasks/build-master.html#test-fixes) for more info)
inmantaci pushed a commit that referenced this issue Sep 19, 2023
…yments for older versions (Issue #6475, PR #6514)

# Description

cherry pick of #6486