Skip to content

bug: container cleanup defer does not run on signal, leaving orphaned containers #2799

@bfirsh

Description

@bfirsh

Description

In pkg/cli/predict.go:297 and pkg/cli/train.go:142, there are FIXME comments noting that the deferred predictor.Stop() calls will not execute when the process receives a signal (e.g. SIGINT from Ctrl+C). This means interrupting cog predict or cog train can leave orphaned Docker containers running.

Affected code

pkg/cli/predict.go:297-304:

// FIXME: will not run on signal
defer func() {
    console.Debugf("Stopping container...")
    if err := predictor.Stop(context.Background()); err != nil {
        console.Warnf("Failed to stop container: %s", err)
    }
}()

pkg/cli/train.go:142-149:

// FIXME: will not run on signal
defer func() {
    console.Debugf("Stopping container...")
    if err := predictor.Stop(context.Background()); err != nil {
        console.Warnf("Failed to stop container: %s", err)
    }
}()

Note: predict.go does have a separate goroutine (lines 260-270) that catches SIGINT and calls predictor.Stop(), but train.go does not have equivalent handling.

Expected behavior

When the user hits Ctrl+C (or the process receives SIGTERM), the Docker container should be stopped and cleaned up before the process exits.

Suggested fix

Install a signal handler (e.g. signal.NotifyContext or signal.Notify on a channel) that triggers container cleanup on SIGINT/SIGTERM, similar to the existing goroutine in predict.go but applied consistently to both commands.

Metadata

Metadata

Assignees

No one assigned

    Labels

    type/bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions