Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

up: fix various race/deadlock conditions on exit #10934

Merged
merged 2 commits into from Aug 31, 2023
Merged

Conversation

milas
Copy link
Member

@milas milas commented Aug 25, 2023

What I did
If running up in foreground mode (i.e. not -d), when exiting via Ctrl-C, Compose stops all the
services it launched directly as part of that up command.

In one of the E2E tests (TestUpDependenciesNotStopped), this was occasionally flaking because the stop
behavior was racy: the return might not block on
the stop operation because it gets added to the
error group in a goroutine. As a result, it was
possible for no services to get terminated on exit.

There were a few other related pieces here that
I uncovered and tried to fix while stressing this. For example, the printer could cause a deadlock if an event was sent to it after it stopped.

Also, an error group wasn't really appropriate here; each goroutine is a different operation for printing, signal-handling, etc. If one part fails, we don't
actually want printing to stop, for example. This has been switched to a multierror.Group, which has the same API but coalesces errors instead of canceling a context the moment the first one fails and returning that single error.

(not mandatory) A picture of a cute animal, if possible in relation to what you did
chinchilla

@milas milas requested a review from a team August 25, 2023 22:00
@milas milas self-assigned this Aug 25, 2023
@milas milas requested review from nicksieger, ndeloof, StefanScherer, ulyssessouza, glours and laurazard and removed request for a team August 25, 2023 22:00
@milas milas force-pushed the fix-stop-up branch 3 times, most recently from b931158 to 4347d31 Compare August 25, 2023 22:04
If running `up` in foreground mode (i.e. not `-d`),
when exiting via `Ctrl-C`, Compose stops all the
services it launched directly as part of that `up`
command.

In one of the E2E tests (`TestUpDependenciesNotStopped`),
this was occasionally flaking because the stop
behavior was racy: the return might not block on
the stop operation because it gets added to the
error group in a goroutine. As a result, it was
possible for no services to get terminated on exit.

There were a few other related pieces here that
I uncovered and tried to fix while stressing this.
For example, the printer could cause a deadlock if
an event was sent to it after it stopped.

Also, an error group wasn't really appropriate here;
each goroutine is a different operation for printing,
signal-handling, etc. If one part fails, we don't
actually want printing to stop, for example. This has
been switched to a `multierror.Group`, which has the
same API but coalesces errors instead of canceling a
context the moment the first one fails and returning
that single error.

Signed-off-by: Milas Bowman <milas.bowman@docker.com>
Signed-off-by: Milas Bowman <milas.bowman@docker.com>
@codecov
Copy link

codecov bot commented Aug 28, 2023

Codecov Report

Patch coverage: 85.54% and project coverage change: +0.04% 🎉

Comparison is base (41682ac) 58.24% compared to head (a387e64) 58.28%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10934      +/-   ##
==========================================
+ Coverage   58.24%   58.28%   +0.04%     
==========================================
  Files         123      123              
  Lines       10864    10882      +18     
==========================================
+ Hits         6328     6343      +15     
  Misses       3916     3916              
- Partials      620      623       +3     
Files Changed Coverage Δ
pkg/compose/printer.go 81.08% <78.43%> (+4.69%) ⬆️
pkg/compose/up.go 83.33% <96.87%> (-1.97%) ⬇️

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants