increment calculation on release can hide failures #6475

wouterdb · 2023-09-08T14:28:16Z

Problem

1. version v1 is deploying
2. a new version v2 is released
3. the increment calculation marks resource a[k=x],v=2 as deployed, based on the known good state
4. the deploy of resource a[k=x],v=1 fails

Now, the good state of v2 has masked the bad state of v1.
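The sequence above can be replayed on a toy status table. This is only an illustration: the dictionary layout and the status strings are assumptions made for the sketch, not the server's actual data model.

```python
# Status per (resource id, version). All names here are illustrative.
status = {}

status[("a[k=x]", 1)] = "deploying"  # v1 is deploying
status[("a[k=x]", 2)] = "deployed"   # v2 is released; the increment marks it as deployed
status[("a[k=x]", 1)] = "failed"     # the deploy of v1 finishes with a failure

# Anything that only looks at the latest version never sees the failure.
latest_version = max(version for (_, version) in status)
reported = status[("a[k=x]", latest_version)]
```

Here `reported` ends up as `"deployed"` even though the only deploy that actually ran has failed.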

Current solution proposal

When a deploy fails
- on a version that is not the latest
- and the corresponding resource on the latest version is set to deployed
  - this can only be due to the increment calculation: the agent can only start deploying the latest released version at any time, so the newer version has never been deployed by the agent, so its state was set by the increment. We should be able to assert that the attribute hash MUST be identical.

then we set the resource on the latest version to failed instead (with a message: setting to failed because of failure in previous version xyz).

The first two conditions are fairly cheap to check
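A minimal sketch of the proposed check, to make the conditions concrete. `ResourceState`, `propagate_failure` and the status strings are hypothetical names invented for this sketch, and it assumes the identity check is done on a hash of the resource attributes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceState:
    """Hypothetical stand-in for one resource at one model version."""

    version: int
    status: str  # e.g. "deploying", "deployed", "failed"
    attribute_hash: str  # assumed: hash over the resource's attributes


def propagate_failure(failed: ResourceState, latest: ResourceState) -> Optional[str]:
    """Flip `latest` to failed if it masks `failed`; return the message, else None."""
    # Condition 1: the failed deploy is on a version that is not the latest.
    if failed.version >= latest.version:
        return None
    # Condition 2: the same resource on the latest version is marked as deployed.
    if latest.status != "deployed":
        return None
    # Expected invariant: the increment only carries state over for unchanged resources.
    assert failed.attribute_hash == latest.attribute_hash
    latest.status = "failed"
    return f"setting to failed because of failure in previous version {failed.version}"


old = ResourceState(version=1, status="failed", attribute_hash="abc")
new = ResourceState(version=2, status="deployed", attribute_hash="abc")
message = propagate_failure(old, new)
```

Both early returns only compare values that are already at hand, which matches the remark that the first two conditions are cheap.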

Transactional behavior

The idea would be to execute this code in a separate transaction at the end of the 'deploy_done' method.
However, it races with the actual release_version (and other full) increment calculations.
We can, however, retry this write until we catch up.
Question: is it sufficient to make this a retryable serializable transaction to make it retry on this race?
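The retry shape could look roughly like the sketch below. It does not answer whether serializable isolation actually detects this particular conflict. It only shows the loop, assuming a PostgreSQL-style database that aborts the losing transaction with a serialization failure (SQLSTATE 40001). `SerializationFailure`, `retry_serializable` and `patch_latest_version` are hypothetical names; a real driver raises its own exception type.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class SerializationFailure(Exception):
    """Hypothetical stand-in for the driver's serialization error (SQLSTATE 40001)."""


async def retry_serializable(txn: Callable[[], Awaitable[T]], max_attempts: int = 5) -> T:
    """Run `txn` again whenever it is aborted by a serialization conflict."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        attempt += 1
        try:
            return await txn()
        except SerializationFailure:
            if attempt >= max_attempts:
                raise
            # Back off briefly so the concurrent writer can commit first.
            await asyncio.sleep(0.01 * attempt)


calls = {"n": 0}


async def patch_latest_version() -> str:
    """Simulated transaction body that loses the race twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise SerializationFailure()
    return "patched"


result = asyncio.run(retry_serializable(patch_latest_version))
```

Each retry has to rerun the whole transaction body, including the reads, so that the write is based on the state left behind by the concurrent release.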


…yments for older versions (Issue #6475, PR #6486)

I try to solve the race condition between deploy and release by patching up the latest released version

closes #6475

Strike through any lines that are not applicable (`~~line~~`) then check the box

- [x] Attached issue to pull request
- [x] Changelog entry
- [x] Type annotations are present
- [x] Code is clear and sufficiently documented
- [x] No (preventable) type errors (check using make mypy or make mypy-diff)
- [x] Sufficient test cases (reproduces the bug/tests the requested feature)
- [x] Correct, in line with design
- [ ] End user documentation is included or an issue is created for end-user documentation (add ref to issue here: )
- [ ] If this PR fixes a race condition in the test suite, also push the fix to the relevant stable branche(s) (see [test-fixes](https://internal.inmanta.com/development/core/tasks/build-master.html#test-fixes) for more info)


wouterdb self-assigned this Sep 11, 2023

wouterdb mentioned this issue Sep 11, 2023

Issue/6475 release masking failure #6486

Closed

9 tasks

inmantaci closed this as completed in 69d362b Sep 15, 2023

inmantaci pushed a commit that referenced this issue Sep 19, 2023

Ensure releasing a new version can not hide failures in ongoing deplo…

d91b020

…yments for older versions (Issue #6475, PR #6514) # Description cherry pick of #6486
