
(HTTP code 409) conflict - conflict: unable to delete (cannot be forced) - image is being used by running container #841

Open
willswire opened this issue Dec 11, 2018 · 13 comments
Labels
Needs more investigation, type/bug

Comments

@willswire

willswire commented Dec 11, 2018

When implementing the delete-then-download application update strategy, devices fail to delete the existing image because it is still in use by the running container, producing the following error. Confirmed the supervisor version running is >= v2.5.1.

10.12.18 14:46:53 (-0500) Killing service 'main'
10.12.18 14:46:53 (-0500) Deleting image
10.12.18 14:46:53 (-0500) Failed to delete image due to '(HTTP code 409) conflict - conflict: unable to delete (cannot be forced) - image is being used by running container'

@willswire
Author

It is possible to bypass this issue by first stopping the container, then pushing the update. The supervisor, under the delete-then-download strategy, should do this itself rather than requiring the user to do so manually.
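
For illustration, the ordering being suggested (stop the container, wait for it to exit, then delete the image) looks roughly like this against the engine API. This is a minimal sketch assuming dockerode and the balenaEngine socket path, not the Supervisor's actual code:

```typescript
// Minimal sketch (assumption: dockerode against the balenaEngine socket) of
// the ordering suggested above: stop the old container and wait for it to
// exit before trying to delete its image. Not the Supervisor's actual code.
import Docker from 'dockerode';

const docker = new Docker({ socketPath: '/var/run/balena-engine.sock' });

async function stopThenDeleteImage(containerId: string, imageRef: string): Promise<void> {
  const container = docker.getContainer(containerId);

  await container.stop({ t: 10 }); // SIGTERM, then SIGKILL after 10 s
  await container.wait();          // block until the container has actually exited
  await container.remove();        // drop the stopped container

  // Only now will the engine accept the delete without a 409 "image in use".
  await docker.getImage(imageRef).remove();
}
```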

@willswire
Author

Same error on supervisor version 9.0.1:

17.01.19 12:12:52 (-0500) Killing service 'main sha256:08251b0aa8d6c66bd1b240ea4cf172963e81cbd8bac252fb144ba7eaaa0b41a0'
17.01.19 12:12:52 (-0500) Deleting image 'registry2.balena-cloud.com/v2/0f74bea475cd7a34f257da6be491baa8@sha256:d8b60a410da5ab912d725ed19eba5624e71359814dc09f9f83b8b71ba77fc98f'
17.01.19 12:12:52 (-0500) Failed to delete image 'registry2.balena-cloud.com/v2/0f74bea475cd7a34f257da6be491baa8@sha256:d8b60a410da5ab912d725ed19eba5624e71359814dc09f9f83b8b71ba77fc98f' due to '(HTTP code 409) conflict - conflict: unable to delete 08251b0aa8d6 (cannot be forced) - image is being used by running container c3bc0919b429 '

@CameronDiver CameronDiver added the type/bug, High priority, and Needs more investigation labels Jan 17, 2019
@CameronDiver
Contributor

Hey @willswire, thanks for the report. I'll try to reproduce this soon and see what we can do. I imagine it's something like the supervisor not giving the container time to exit, so that's where I'll start looking.

@willswire
Author

@CameronDiver thanks! If it helps at all, our current situation is:

  • Deploying a single container image (base image: balenalib/amd64-node:jessie) with a total payload of 955.08 MB
  • Device Type: WYSE Zx0 (AMD64 Architecture) with 2GB flash storage

The device will stay in a constant loop, reporting the same error.

@CameronDiver
Contributor

Thanks for the extra info.

Does the container catch and act upon signals, for example the SIGTERM that docker will send to ask a container to stop running?

I mean even if it does, this is still a bug, because the supervisor shouldn't be trying to remove the image until the container has stopped.
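
For reference, whether a container stops promptly depends on its main process handling SIGTERM. A minimal Node/TypeScript sketch of such a handler (illustrative only, not taken from willswire's project):

```typescript
// Illustrative only (not from the reporter's project): a minimal Node service
// that reacts to the SIGTERM Docker sends when asked to stop a container.
// If the process ignores SIGTERM, the engine falls back to SIGKILL after the
// stop timeout, and a hung process can leave the container "running".
import { createServer } from 'http';

const server = createServer((_req, res) => {
  res.end('ok\n');
});
server.listen(8080);

process.on('SIGTERM', () => {
  console.log('SIGTERM received, shutting down');
  // Stop accepting new connections, then exit once in-flight requests finish.
  server.close(() => process.exit(0));
});
```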

@willswire
Author

Any SIGTERM commands sent via the console, prior to initiating an update, are successful. Once the nightmarish update loop starts, however, there's no response to any 'restart', 'stop' or 'start' commands.

@CameronDiver
Contributor

CameronDiver commented Feb 5, 2019

Hey @willswire, sorry for the delay. I finally got some time to do some investigation here. I didn't manage to reproduce it, but a colleague of mine did find a potential problem in the way that the state engine handles the delete-then-download strategy.

If possible, would you be able to try a new supervisor image which should fix this, or alternatively provide me with the source code for your project (and I'll try to dig out a device of the same type)?

The changes are implemented in this PR: #893

CameronDiver pushed a commit that referenced this issue Feb 5, 2019
In the original implementation it was possible that the delete did not
wait for the kill step to be finished, so it would not be deleted.

We separate this process into two steps, to allow for the container to have stopped before proceeding.

Change-type: patch
Closes: #841
Signed-off-by: Cameron Diver <cameron@balena.io>
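
The commit message above describes splitting the work into an explicit kill step and a separate image-removal step. A simplified sketch of that ordering, with step names and shapes assumed for illustration rather than taken from the Supervisor's state engine:

```typescript
// Abstract sketch of the change described in the commit message above: the
// composite action is split into an ordered 'kill' step and a separate
// 'removeImage' step, and the second is only attempted once the first has
// fully completed. Step names and shapes are assumptions, not the
// Supervisor's actual state-engine code.
type Step =
  | { action: 'kill'; containerId: string }
  | { action: 'removeImage'; imageRef: string };

async function applySteps(
  steps: Step[],
  execute: (step: Step) => Promise<void>,
): Promise<void> {
  for (const step of steps) {
    // Awaiting each step before starting the next prevents the image removal
    // from racing with a container that is still shutting down.
    await execute(step);
  }
}

// The delete-then-download strategy now yields two ordered steps instead of
// one combined action (ids taken from the log above, for illustration).
const steps: Step[] = [
  { action: 'kill', containerId: 'c3bc0919b429' },
  { action: 'removeImage', imageRef: 'registry2.balena-cloud.com/v2/0f74bea475cd7a34f257da6be491baa8' },
];

void applySteps(steps, async (step) => {
  console.log('executing step:', step); // the corresponding engine calls would go here
});
```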
@ghost ghost assigned CameronDiver Feb 5, 2019
@ghost ghost added the flow/in-progress label Feb 5, 2019
@ghost ghost assigned balena-ci Feb 5, 2019
CameronDiver pushed a commit that referenced this issue Feb 6, 2019
@ghost ghost removed the flow/in-progress label Feb 6, 2019
@willswire
Author

@CameronDiver we can try the new supervisor image to test! How would we go about deploying the latest image to our machine?

@CameronDiver
Contributor

Thanks @willswire, I'm pretty sure it should fix your issue (hence the closing), but finding out before release is certainly better.

To do this, open a host OS terminal on your device and run update-resin-supervisor -t v9.7.1 -i balena/amd64-supervisor.

@willswire
Author

@CameronDiver thanks! The issue has been resolved.

@CameronDiver
Contributor

Really happy to hear :)

@jellyfish-bot

[cywang117] This issue has attached support thread https://jel.ly.fish/e74a1106-b0eb-4f02-8f46-b78732db1ef9

@cywang117 cywang117 removed the High priority label Feb 16, 2023
@cywang117
Contributor

For context, this error message originates from the Engine and is surfaced by the Supervisor during updates, when an image in the current release needs to be deleted in favor of an image in the target release. The Supervisor should wait for containers to stop before attempting to remove images, but if a container fails to stop even with a balena kill, then this error may still appear. Before commenting on or linking to this issue, please check whether any processes in a service fail to exit, even with a kill -9. If this error occurs in the absence of such zombie user container processes, then it is potentially a Supervisor bug.
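
For anyone hitting this, one way to check is to ask the engine which containers still reference the image the Supervisor is trying to delete. A small diagnostic sketch, assuming dockerode and the balenaEngine socket (adjust the socket path and image reference for your device):

```typescript
// Diagnostic sketch (assumptions: dockerode, balenaEngine socket): before
// filing a Supervisor bug, check whether some container that uses the image
// is genuinely still running, i.e. its process never exited even after a kill.
import Docker from 'dockerode';

const docker = new Docker({ socketPath: '/var/run/balena-engine.sock' });

async function containersUsingImage(imageRef: string) {
  const image = await docker.getImage(imageRef).inspect();
  const containers = await docker.listContainers({ all: true });
  return containers
    .filter((c) => c.ImageID === image.Id || c.Image === imageRef)
    .map((c) => ({ id: c.Id.slice(0, 12), names: c.Names, state: c.State }));
}

// Image reference taken from the log earlier in this thread, for illustration.
containersUsingImage('registry2.balena-cloud.com/v2/0f74bea475cd7a34f257da6be491baa8')
  .then((rows) => console.table(rows)) // any row with state "running" explains the 409
  .catch(console.error);
```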
