
Recursively start a pod if a container is run in it #2295

Merged

Conversation

haircommander
Collaborator

@haircommander haircommander commented Feb 8, 2019

Previously, a pod had to be started immediately when created, leading to confusion about what a pod's state should be immediately after creation. The problem was that podman run --pod ... would error out if the infra container wasn't started (as it is a dependency). Fix this by allowing recursive start, where each of the container's dependencies is started prior to the new container. Only apply this when a container is run in a "created" pod.

Also rework the container_api Start, StartAndAttach, and Init functions, as there was some duplicated code, which made the --recursive problem easier to address.

Signed-off-by: Peter Hunt pehunt@redhat.com
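The dependency-ordered start the commit message describes can be sketched as a toy model; `Ctr` and `StartRecursive` are illustrative names for this sketch, not libpod's actual API:

```go
package main

import "fmt"

// Ctr is a toy stand-in for a container with dependencies.
type Ctr struct {
	ID      string
	Deps    []string
	Running bool
}

// StartRecursive starts every dependency of the container before the
// container itself, skipping anything already running.
func StartRecursive(ctrs map[string]*Ctr, id string, order *[]string) {
	c := ctrs[id]
	if c.Running {
		return
	}
	// Mark as running up front so a dependency cycle cannot recurse forever;
	// the real implementation detects cycles in the dependency graph instead.
	c.Running = true
	for _, dep := range c.Deps {
		StartRecursive(ctrs, dep, order)
	}
	*order = append(*order, id)
}

func main() {
	ctrs := map[string]*Ctr{
		"infra": {ID: "infra"},
		"app":   {ID: "app", Deps: []string{"infra"}},
	}
	var order []string
	StartRecursive(ctrs, "app", &order)
	fmt.Println(order) // the infra container comes first
}
```

This mirrors the `podman run --pod` case: the app container depends on the infra container, so the infra container is started first.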

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 8, 2019
@rhatdan
Member

rhatdan commented Feb 9, 2019

I don't like the idea that this is an option. If this is something a user would expect, then it should just happen. If I start a container which is in a pod, then the pod should start. I would rather we spend our time figuring out what happens when the container exits. It should be one of three things:

1. Pod stops
2. Pod restarts the container
3. Nothing

@haircommander
Collaborator Author

@rhatdan We can talk about the stop case. In the interest of resolving the problems prompting #2291, I have updated this PR to default to a recursive start when a new container is run in a pod (podman run --pod ...). In all other cases recursive start is disabled, but there's infrastructure to allow it if that is desired in the future. I still need to add tests, but PTAL.

@haircommander haircommander force-pushed the recursive-start branch 2 times, most recently from 2e6062c to 108488a Compare February 11, 2019 16:11
@haircommander haircommander changed the title WIP Added --recursive option to run and start Recursively start a pod if a container is run in it Feb 11, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 11, 2019
@haircommander
Collaborator Author

@mheon @rhatdan Okay PR should be ready PTAL

@mheon
Member

mheon commented Feb 11, 2019

bot, retest this please

@@ -74,7 +65,8 @@ func (c *Container) Init(ctx context.Context) (err error) {
// started
// Stopped containers will be deleted and re-created in runc, undergoing a fresh
// Init()
func (c *Container) Start(ctx context.Context) (err error) {
// if recursive is set, Start will also start all containers this ctr depends on
Member

Can you add periods at the end of all the sentences in this comment? I think godoc will mangle the newlines, so we need periods to preserve sentence flow.

Also, can you capitalize the if at the beginning here?

// attach call.
// The channel will be closed automatically after the result of attach has been
// sent
// if recursive is set, StartAndAttach will also start all containers this ctr depends on
Member

Same as above - capitalize the if, add periods

}

// helper function that handles dependencies and prints a helpful message
func (c *Container) checkDependenciesAndHandleError(ctx context.Context) error {
Member

Helper methods into container_internal.go please - we try to keep only public API stuff in container_api.go

// Or initialize it if necessary
if err := c.init(ctx); err != nil {
return nil, err
depCtrs := make([]*Container, 0) //, len(depCtrIDs))
Member

Why the commented-out bit?

}

if len(ctrErrors) > 0 {
return errors.Wrapf(ErrCtrExists, "error stopping some containers")
Member

We should make some sort of attempt to get the errors out here - maybe logrus.Errorf them and then return?

Also, swap ErrCtrExists to... probably ErrInternal is better?

@@ -0,0 +1,11 @@
mode: atomic
Member

Do we need to .gitignore this file?

Collaborator Author

apparently, will add

}

attachChan := make(chan error)
// Build a dependency graph of containers
graph, err := buildContainerGraph(depCtrs)
Member

I don't know how well this works if the dependencies have dependencies - it was originally written assuming every container it would ever touch would be passed in as part of the []*Container

Collaborator Author

ah shoot, yeah, I forgot to mention that. I haven't tested it myself, but I'm pretty sure things would get funky. Looking into a better solution...

@haircommander
Collaborator Author

@mheon updated for comments. PTAL now (especially https://github.com/containers/libpod/blob/af5a41a3d7b4fd69480214102956ebc399585d12/libpod/container_internal.go#L669 which is the meat of the changes I made)

// in looking up dependencies.
// Note: this function is currently meant as a robust solution to a narrow problem: start an infra-container when
// a container in the pod is run. It has not been tested for performance past one level, so expansion of recursive start
// must be tested first.
Member

I'm fundamentally OK with recursive start being really slow when people make absurd dependency chains. Our primary target is infra containers, with one, maybe two levels of dependencies. Someone will have to show a reaaaally convincing usecase for me to think that anything more than 2 levels is desirable. We need to support these configurations, but I don't intend to optimize for them.

Collaborator Author

Should the note be removed then? It feels worth mentioning

Collaborator Author

currently recursive start is only allowed in the infra case, and only for a podman run

Member

Eh, leave the note. If anyone uses the API and hits this case, they'll have an explanation for what's going on.

Member

I don't see this as recursive; I don't even understand what recursive means in this case. This simply means that if I start a container in a pod and the pod is not running, then start the pod rather than the container. Then the container will get started as part of the pod. If the pod is joining a container, then it will start the pod and join the container to the pod.

Member

This isn't an actual pod start - we're hitting the container's dependencies, and starting the bare minimum this container needs to come up.

@haircommander
Collaborator Author

tests are happy @rhatdan @baude PTAL :)

@rh-atomic-bot
Collaborator

☔ The latest upstream changes (presumably #2319) made this pull request unmergeable. Please resolve the merge conflicts.

@@ -70,10 +70,17 @@ func runCmd(c *cliconfig.RunValues) error {
}
}

var recursive bool
if c.IsSet("pod") {
Member

Why not just pass c.IsSet("pod") to the function?

Or if you want the variable, simply write:
recursive = c.IsSet("pod")

@@ -108,7 +108,7 @@ func startCmd(c *cliconfig.StartValues) error {
}

// attach to the container and also start it not already running
- err = startAttachCtr(ctr, os.Stdout, os.Stderr, inputStream, c.DetachKeys, sigProxy, !ctrRunning)
+ err = startAttachCtr(ctr, os.Stdout, os.Stderr, inputStream, c.DetachKeys, sigProxy, !ctrRunning, false)
Member

I think you should check if the container is in a pod and then set this to true?

Collaborator Author

My thinking is podman pod start should be used in this case, so it should error out if the pod isn't started. The advantage of setting all of this up for run is that the container is supposed to be created, then started in one command, so there isn't time to run pod start in between. Adding this functionality to container start and having the infra container start makes less sense to me, and it also requires a query to the state that otherwise isn't needed.

Member

Agree here

Collaborator Author

@mheon concurrent commenting makes this confusing. Who are you agreeing with?

Member

@haircommander I think it's sensible to recursively start here if the container is in a pod

Collaborator Author

@mheon @rhatdan fixed. Also made the change for podman attach, but opted to not change play_kube

Member

Agree with play kube, it should be doing ordered start already.

Member

(And, honestly, if it fails, it's because the dependency containers just don't exist yet, nevermind they aren't started)

@@ -130,7 +130,7 @@ func startCmd(c *cliconfig.StartValues) error {
continue
}
// Handle non-attach start
- if err := ctr.Start(ctx); err != nil {
+ if err := ctr.Start(ctx, false); err != nil {
Member

I think you should check if the container is in a pod and then set this to true?

Collaborator Author

see comment above^

Collaborator Author

fixed

@@ -20,7 +20,7 @@ type RawTtyFormatter struct {
}

// Start (if required) and attach to a container
- func startAttachCtr(ctr *libpod.Container, stdout, stderr, stdin *os.File, detachKeys string, sigProxy bool, startContainer bool) error {
+ func startAttachCtr(ctr *libpod.Container, stdout, stderr, stdin *os.File, detachKeys string, sigProxy bool, startContainer bool, recursive bool) error {
Member

I am not crazy about the term recursive. Really this is just saying startPod, correct?

Member

Negative - this is an actual recursive start. We don't hit the entire pod, just the dependencies of the specific container in question.

Collaborator Author

I called it that for extensibility (in case we actually want a recursive start independent of pods). I guess I can change it for now though.

Member

Leave it as-is - we're not actually starting pods

Collaborator Author

ah yeah that's good reasoning to keep it; I like that @mheon

@@ -449,6 +449,7 @@ Tune the container's pids limit. Set `-1` to have unlimited pids for the contain

Run the container in an existing pod. If you want podman to make the pod for you, prefix the pod name with `new:`.
To make a pod with more granular options, use the `podman pod create` command before creating a container.
If a container is run with a pod, and the pod has an infra-container, the infra-container will be started before the container is.
Member

Do any pods not have an infra-container?
Wouldn't this be more understandable for normal users if the sentence said:

If a container is run with a pod, the entire pod will be started if it was not already running. 

Member

Podman pod allows you to skip making one

Member

And we don't hit the entire pod, just our dependencies - probably the infra container

Collaborator Author

yeah pods have an infra container by default, but they don't need to have one.

@@ -205,6 +120,24 @@ func (c *Container) StartAndAttach(ctx context.Context, streams *AttachStreams,
return attachChan, nil
}

// RestartWithTimeout restarts a running container and takes a given timeout in uint
Member

Why did you move this function?

Collaborator Author

Moved it closer to the other functions that have to do with starting containers.

@@ -169,3 +172,97 @@ func detectCycles(graph *containerGraph) (bool, error) {

return false, nil
}

// Visit a node on a container graph and start the container, or set an error if
Member

Do we need all of this code if we simply checked whether a container is in a pod and the pod is not running, then started the pod?

Collaborator Author

I made it generalized for other cases where starting a container's dependencies with the container makes sense. Otherwise it feels like a hack to me.
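Generalizing over a dependency graph also means guarding against cycles. A minimal sketch of dependency-cycle detection, in the spirit of the detectCycles function mentioned in this PR (the real libpod implementation differs):

```go
package main

import "fmt"

// hasCycle runs a depth-first search over a dependency map and reports
// whether any back edge (a node reached while still in progress) exists.
func hasCycle(deps map[string][]string) bool {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := make(map[string]int)
	var visit func(n string) bool
	visit = func(n string) bool {
		switch state[n] {
		case inProgress:
			return true // back edge: we looped back to an in-progress node
		case done:
			return false
		}
		state[n] = inProgress
		for _, d := range deps[n] {
			if visit(d) {
				return true
			}
		}
		state[n] = done
		return false
	}
	for n := range deps {
		if visit(n) {
			return true
		}
	}
	return false
}

func main() {
	acyclic := map[string][]string{"app": {"infra"}, "infra": nil}
	cyclic := map[string][]string{"a": {"b"}, "b": {"a"}}
	fmt.Println(hasCycle(acyclic), hasCycle(cyclic)) // false true
}
```

The same three-state marking (unvisited, in progress, done) is the standard way to make a recursive start safe on arbitrary dependency graphs, not just the infra-container case.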


@haircommander
Collaborator Author

/retest

@rh-atomic-bot
Collaborator

☔ The latest upstream changes (presumably #2335) made this pull request unmergeable. Please resolve the merge conflicts.

…arted, or attached.

Previously, a pod had to be started immediately when created, leading to confusion about what a pod's state should be immediately after creation. The problem was that podman run --pod ... would error out if the infra container wasn't started (as it is a dependency). Fix this by allowing recursive start, where each of the container's dependencies is started prior to the new container. This is only applied to the case where a new container is attached to a pod.

Also rework the container_api Start, StartAndAttach, and Init functions, as there was some duplicated code, which made the problem easier to address.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
@rhatdan
Member

rhatdan commented Feb 17, 2019

LGTM
@baude @mheon @vrothberg @giuseppe PTAL

@mheon
Member

mheon commented Feb 17, 2019

/lgtm

@rhatdan
Member

rhatdan commented Feb 17, 2019

/approve

@openshift-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: haircommander, rhatdan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 17, 2019
@rhatdan
Member

rhatdan commented Feb 17, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 17, 2019
@openshift-merge-robot openshift-merge-robot merged commit 6aaf8d3 into containers:master Feb 17, 2019
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 27, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 27, 2023
6 participants