Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: disabe auto remove if debug is enabled #757

Merged

Conversation

Haegi
Copy link
Contributor

@Haegi Haegi commented Dec 21, 2022

fixes: #748

Changes

This PR changes the behaviour of the WithDebugMode option. If it is specified, the container will not be auto removed.

Tests
  • Stop container without WithDebugMode -> should be gone
  • Stop container with WithDebugMode -> should be still visible

@orlangure
Copy link
Owner

This is awesome!

I see one issue with this change: gnomock.Stop() will now not remove the container, if it was created with debug flag, and this isn't the behaviour I would expect in this case. Containers started with debug mode should remain for debugging purposes in case something doesn't work as expected, but if the test completes and calls Stop explicitly, the container should be killed completely. Do you agree?

@codecov-commenter
Copy link

codecov-commenter commented Dec 23, 2022

Codecov Report

Base: 86.32% // Head: 78.52% // Decreases project coverage by -7.80% ⚠️

Coverage data is based on head (64532b3) compared to base (8658ea5).
Patch coverage: 75.00% of modified lines in pull request are covered.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #757      +/-   ##
==========================================
- Coverage   86.32%   78.52%   -7.80%     
==========================================
  Files          49       48       -1     
  Lines        2325     2324       -1     
==========================================
- Hits         2007     1825     -182     
- Misses        165      319     +154     
- Partials      153      180      +27     
Impacted Files Coverage Δ
docker.go 68.44% <73.33%> (-23.77%) ⬇️
gnomock.go 69.33% <100.00%> (-15.34%) ⬇️
parallel.go 0.00% <0.00%> (-100.00%) ⬇️
internal/errors/errors.go 20.00% <0.00%> (-80.00%) ⬇️
internal/gnomockd/stop.go 43.75% <0.00%> (-56.25%) ⬇️
internal/health/health.go 62.50% <0.00%> (-37.50%) ⬇️
internal/cleaner/cleaner.go 66.66% <0.00%> (-33.34%) ⬇️
options.go 69.87% <0.00%> (-30.13%) ⬇️
internal/registry/registry.go 71.42% <0.00%> (-28.58%) ⬇️
... and 6 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@Haegi
Copy link
Contributor Author

Haegi commented Dec 31, 2022

This is awesome!

I see one issue with this change: gnomock.Stop() will now not remove the container, if it was created with debug flag, and this isn't the behaviour I would expect in this case. Containers started with debug mode should remain for debugging purposes in case something doesn't work as expected, but if the test completes and calls Stop explicitly, the container should be killed completely. Do you agree?

Good point. I will take a look next week.
I guess just the container removal logic when calling gnomock.Stop() is missing, right?

@orlangure
Copy link
Owner

I guess just the container removal logic when calling gnomock.Stop() is missing, right?

@Haegi correct, because if the AutoRemove flag is always set, there is no need to implement container removal; it is enough to just stop it.

@Haegi
Copy link
Contributor Author

Haegi commented Jan 5, 2023

I guess just the container removal logic when calling gnomock.Stop() is missing, right?

@Haegi correct, because if the AutoRemove flag is always set, there is no need to implement container removal; it is enough to just stop it.

I implemented it in Haegi@7f3b016.
BUT super rarely I got some message that the deletion is already in progress. But unfortunately, I didn't find a way to reproduce it.

Copy link
Owner

@orlangure orlangure left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Haegi, thank you for taking care of container removal. There are a couple of small things I think are still missing. Otherwise this is great😻

docker.go Outdated
d.lock.Lock()
defer d.lock.Unlock()

err := d.client.ContainerRemove(ctx, id, types.ContainerRemoveOptions{})
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to use Force: true here, just in case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it here: a2b478c

@@ -252,7 +252,7 @@ func (g *g) stop(c *Container) error {
}
}

return nil
return cli.removeContainer(context.Background(), id)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please handle the "in progress" error gracefully (return without error), otherwise - fail with error. The error happens most likely due to the AutoRemove flag triggering container removal before the code reaches this line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take care of it in a2b478c

gnomock_test.go Show resolved Hide resolved
@Haegi
Copy link
Contributor Author

Haegi commented Jan 9, 2023

@Haegi, thank you for taking care of container removal. There are a couple of small things I think are still missing. Otherwise this is great😻

Thanks :) I also noticed also two other things.

  1. The stopContainer currently doesn't ignore if the container can not be found. Should I also add a line to ignore that?
  2. I noticed that the gnomock-test-image is not stopping without gnomock.Stop() and we therefore don't test the auto remove. My idea would be to add some /shutdown endpoint to the test image and call it before calling gnomock.Stop(). This way wecheck if it still works with the auto remove. What do you think?

@orlangure
Copy link
Owner

Hi @Haegi,

Thanks for the updates and sorry for taking long to reply.

I like both ideas, but I think investing in the second one is a bit of an overkill right now. The first point is much more important. It would be great if you could take care of it.

Waiting for the final fixes to finally release a new version with your fixes 😼

@Haegi
Copy link
Contributor Author

Haegi commented Jan 24, 2023

Hi @Haegi,

Thanks for the updates and sorry for taking long to reply.

I like both ideas, but I think investing in the second one is a bit of an overkill right now. The first point is much more important. It would be great if you could take care of it.

Waiting for the final fixes to finally release a new version with your fixes 😼

No worries :)
I changed the stopContainer logic now to not thrown an error when the container is already gone.
I also had to adapt one test to not use the is already gone error but check it with the client.
But still the TestPreset_duplicateContainerName is failing and I am not sure why the originalContainer is supposed to fail when trying to stop it.

Copy link
Owner

@orlangure orlangure left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

The test you are mentioning fails because of the same (or very similar) reason that the other test you fixed was failing: the original test expected an error because it was a "not found" error when trying to stop a container that doesn't exist anymore (it gets replaced with a new container with the same name). Now that we ignore the "not found" error the test doesn't get an expected error, so we basically change the expectations and should not expect an error anymore.

I also left one last comment, and unless I'm missing something, fixing it together with the test will complete that work, and I'll prepare a new release. This is awesome!

docker.go Outdated
defer d.lock.Unlock()

err := d.client.ContainerRemove(ctx, id, types.ContainerRemoveOptions{Force: true})
if err != nil && !client.IsErrNotFound(err) && isDeletionAlreadyInProgessError(err, id) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think isDeletionAlreadyInProgessError should be negated just like "not found" error, we want to ignore it instead of failing container removal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the hint! I had to change the logic of isDeletionAlreadyInProgessError. It seems like it was not really checking for the error. Now I was able to get it right and negate it.

@Haegi
Copy link
Contributor Author

Haegi commented Feb 21, 2023

Thank you!

The test you are mentioning fails because of the same (or very similar) reason that the other test you fixed was failing: the original test expected an error because it was a "not found" error when trying to stop a container that doesn't exist anymore (it gets replaced with a new container with the same name). Now that we ignore the "not found" error the test doesn't get an expected error, so we basically change the expectations and should not expect an error anymore.

I also left one last comment, and unless I'm missing something, fixing it together with the test will complete that work, and I'll prepare a new release. This is awesome!

Thanks for the explanation. I removed the error expectation and instead check with the docker client if the container is done.

Unfortunately, I noticed some other side effect. gnomockd doesn't return 500 anymore, when trying to stop an already stopped container. Should I try to change it to 404 instead of 500 or is 200 also fine? But 200 might be a bit misleading.

@orlangure
Copy link
Owner

I think 200 is fine, the goal is to make sure a container doesn't exist in the end after this endpoint is called, and it really won't exist. Since cleaner was added, Stop and /stop are optional anyway most of the time. I think this is the last fix, and your hard work will be merged and released. Thank you!😻

@Haegi
Copy link
Contributor Author

Haegi commented Feb 22, 2023

I think 200 is fine, the goal is to make sure a container doesn't exist in the end after this endpoint is called, and it really won't exist. Since cleaner was added, Stop and /stop are optional anyway most of the time. I think this is the last fix, and your hard work will be merged and released. Thank you!😻

I just changed the test and I also extracted the container listing by id into some helper function called ListContainerByID.
It seems like the github action test is a little bit flaky but it is green now.
Looks like we are finally ready to merge this PR :)

@Haegi Haegi requested a review from orlangure February 24, 2023 15:13
Copy link
Owner

@orlangure orlangure left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing job, thank you for the contribution!

@orlangure orlangure merged commit 5356544 into orlangure:master Feb 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: Nearly impossible to debug failing containers
3 participants