Refactor Container State #20355

Closed
wants to merge 1 commit into
from

Projects

None yet

8 participants

@WeiZhang555
Contributor

Using enum value format to represent container's state as:

type Cstate int64
const (
    Stopped Cstate = 0
    Running Cstate = (1 << iota)
    Paused
    Restarting
    Dead
)

so as to replace redundant bool values such as Running, Paused bool

Currently we have some places didn't handle state restarting right, which is caused by confused container state from my point of view. . IsRunning() includes IsPaused() and IsRestarting(), this is ridiculous and I guess this is for easier code implementation before.

This PR will end this design, it will confine container in a single and precise state, a container can only be either running or restarting but can't be both at the same time, it will make code more clear, and easier to control.

As docker inspect makes use of the container state, I have done some translation in code so it WON'T break backward compatibility.

Unfinished:

  • Fix test cases.
  • Check whether expecting container state is right what we really want.
  • Clean some unused daemon error, maybe throw same error ErrorCodeInvalidState instead of specific error for each.

This is a initial commit for showing the design

Signed-off-by: Zhang Wei zhangwei555@huawei.com

@thaJeztah
Member

Looks like there's some related failures

@WeiZhang555
Contributor

@thaJeztah Yes, I'm fixing. 😄

Can we turn off Gorden's notice, feel a little annoying.

@thaJeztah
Member

@WeiZhang555 yeah, they can be; they'e automatically removed though if the failure is resolved

@duglin
Contributor
duglin commented Feb 16, 2016

We may need a "Stopping" state too for when we issue a kill but its taking a while to happen.

I was wondering if we need a "Created" state that's separate from "Stopped" but we may not need it as long as there are other fields to tell us whether or not the container has even been executed - e.g. start time and there are good util funcs to help. What do others think?

@WeiZhang555
Contributor

Ohh, I want to mention one more thing. Previous docker inspect result will return state like:

 "State": {
            "Status": "restarting",
            "Running": true,
            "Paused": false,
            "Restarting": true,
            "OOMKilled": false,
            "Dead": false,
           ...
        },

See that both "Running" and "Restarting" are true, after applying this PR:

 "State": {
            "Status": "restarting",
            "Running": false,
            "Paused": false,
            "Restarting": true,
            "OOMKilled": false,
            "Dead": false,
          ......
        },

Only "Restarting" is true, "Running" is false for a restarting container.

I'm doing this on purpose, I think it won't have bad impact but not sure if this will break some kind of backward compatibility.
I can change it back to previous bahaviour If this is problematic.

@jhowardmsft jhowardmsft commented on an outdated diff Feb 16, 2016
daemon/commit.go
@@ -105,11 +106,11 @@ func (daemon *Daemon) Commit(name string, c *types.ContainerCommitConfig) (strin
}
// It is not possible to commit a running container on Windows
- if runtime.GOOS == "windows" && container.IsRunning() {
+ if runtime.GOOS == "windows" && container.InStateLocking(state.Running|state.Paused|state.Restarting) {
@jhowardmsft
jhowardmsft Feb 16, 2016 Contributor

Why Paused (Windows does not support pausing containers currently)

Edit - but not a problem keeping it as it to support if/when we do support it.

@WeiZhang555
Contributor

docker-py tests failed in this line: https://github.com/docker/docker-py/blob/master/tests/integration/container_test.py#L989

As I said, I'm doing this on purpose(for a paused contaner, docker inspect shows state Paused=true and Running=false, previous result will show Running=true).

Other test cases pass In addition to docker-py error, so it works.

@thaJeztah
Member

As I said, I'm doing this on purpose(for a paused contaner, docker inspect shows state Paused=true and Running=false, previous result will show Running=true).

I fully agree that having both "Paused=true" and "Running=true" is confusing. Having said that, it is possible that people rely on this weird behavior; if this is part of the API response; should we make the dependent on the API version?

@WeiZhang555
Contributor

@thaJeztah I understand your worries, it seems possible.
I'll make it API version dependent. For API version <= 1.22(?) return "Running=true" for paused and restarting container, but for API version > 1.22, return "Running=false" for them.

@thaJeztah
Member

@WeiZhang555 SGTM! (just my 0.02c 😄)

@WeiZhang555
Contributor

@thaJeztah Updated. inspect will get same result for API version < 1.23, see https://github.com/docker/docker/pull/20355/files#diff-9d7e088e8ca8f96f2ac0af044001ae04R137

@duglin Not sure about Stopping and Created state, guess there may be some obstacles to implementing stopping state, for example, if stop failed, which state should container restore to? We may need to keep old state somewhere. Created seems good. And I think it's better to add new state in a separate PR.

And what do other maintainers think of this?

@thaJeztah
Member

ping @calavera ptal

@thaJeztah thaJeztah and 1 other commented on an outdated diff Feb 18, 2016
daemon/inspect_unix.go
@@ -21,7 +22,7 @@ func setPlatformSpecificContainerFields(container *container.Container, contJSON
}
// containerInspectPre120 gets containers for pre 1.20 APIs.
-func (daemon *Daemon) containerInspectPre120(name string) (*v1p19.ContainerJSON, error) {
+func (daemon *Daemon) containerInspectPre120(name string, version version.Version) (*v1p19.ContainerJSON, error) {
@thaJeztah
thaJeztah Feb 18, 2016 Member

Bit unsure about having both a "versioned" function here (containerInspectPre120), and a version passed. Perhaps someone has a better/cleaner suggestion?

@thaJeztah
thaJeztah Feb 18, 2016 Member

Possibly passing the container as an argument?

@WeiZhang555
WeiZhang555 Feb 19, 2016 Contributor

How about NOT passing a version to containerInspectPre120 and containerInspect120, but give them a fake version number?

containerInspectPre120(name string)   // Make a assumption that the version is '1.19'
containerInspect120(name string)        // version is '1.20'
containerInspectCurrent(name string, size bool, version version.Version)   // pass the real version
@calavera
Contributor

a lot of change makes sense, but there are two things that I don't think we should do:

  1. Replace well defined functions like IsRunning with more abstracts like IsInLockingState(Running|Pause).
  2. Change the inspect behavior.
@WeiZhang555
Contributor

@calavera

a lot of change makes sense, but there are two things that I don't think we should do:

1.Replace well defined functions like IsRunning with more abstracts like IsInLockingState(Running|Pause).

OK, I can wrap several functions e.g. IsRunning/IsRunningLocking, IsPausedLocking/IsPaused, something like this.

2.Change the inspect behavior.

I think it's very confusing to show that a Paused container is also Running, I've changed this part API denpendent, the behavior before API version 1.23 will keep the same but I hope we can do something for later API. Let Paused container to be only Paused but set Running=false.

What do you think of this?

@WeiZhang555 WeiZhang555 commented on an outdated diff Feb 21, 2016
daemon/exec.go
@@ -206,8 +194,8 @@ func (d *Daemon) ContainerExecStart(name string, stdin io.ReadCloser, stdout io.
}
// Maybe the container stopped while we were trying to exec
- if !c.IsRunning() {
- return derr.ErrorCodeExecContainerStopped
+ if !c.IsRunningLocking() {
@WeiZhang555
WeiZhang555 Feb 21, 2016 Contributor

Self comment:
exec can only work on running container, exclude paused and restarting

@thaJeztah
Member

ping @calavera can you have another look?

@WeiZhang555 looks like this needs a rebase

@WeiZhang555
Contributor

Rebased, it conflicts a lot with #20699, but I believe this PR is rebased properly. 😄

@WeiZhang555
Contributor

Windows tests are weired, can't be sure whether it's real or not.

@jhowardmsft
Contributor

windowsTP4 just took too long. Hopefully someone can restart - I'm not in front of a computer to do that from here.

@thaJeztah
Member

Restarted Windows CI

@calavera
Contributor
calavera commented Mar 1, 2016

Tbh, I always found the Paused == Running behavior strange, although I understand why it's that.

This change LGTM, but I'd like to hear from other maintainers about that topic.

@thaJeztah
Member

@calavera yes, the paused==running is super confusing, but I think we should keep it for older API versions? Or do you disagree there?

@calavera
Contributor
calavera commented Mar 2, 2016

Or do you disagree there?

I agree. I think that's already fixed here.

@WeiZhang555
Contributor

ping @duglin @cpuguy83 @vdemeester @docker/maintainers
Could you take a look at this? Thank you!

@cpuguy83 cpuguy83 commented on the diff Mar 7, 2016
container/state/state.go
+ return s.InState(Restarting)
+}
+
+// IsRestartingLocking locks State and checks whether restarting flag is set
+func (s *State) IsRestartingLocking() bool {
+ return s.InStateLocking(Restarting)
+}
+
+// IsDead returns whether the dead flag is set.
+// Used by Container to check whether a container is dead
+func (s *State) IsDead() bool {
+ return s.InState(Dead)
+}
+
+// IsDeadLocking locks State and checks whether dead flag is set
+func (s *State) IsDeadLocking() bool {
@cpuguy83
cpuguy83 Mar 7, 2016 Contributor

This fn name sounds horrible :)

@WeiZhang555
WeiZhang555 Mar 8, 2016 Contributor

Yes, I also dislike it :-(. It sounds like some "dead lock" is gonna happen.
I once considered to use "IsDeadL", "IsRunningL" but they are also horrible, hard to know its meaning from single function name.
So I have to leave it here :(

@cpuguy83 cpuguy83 commented on the diff Mar 7, 2016
integration-cli/docker_api_exec_test.go
@@ -56,10 +56,10 @@ func (s *DockerSuite) TestExecApiCreateContainerPaused(c *check.C) {
dockerCmd(c, "pause", name)
status, body, err := sockRequest("POST", fmt.Sprintf("/containers/%s/exec", name), map[string]interface{}{"Cmd": []string{"true"}})
c.Assert(err, checker.IsNil)
- c.Assert(status, checker.Equals, http.StatusConflict)
+ c.Assert(status, checker.Equals, http.StatusInternalServerError)
@cpuguy83
cpuguy83 Mar 7, 2016 Contributor

This resp code should not be changed.

@WeiZhang555
WeiZhang555 Mar 8, 2016 Contributor

You're right, it's my bad. I'll modify this.

I remember why I'm doing this now...
https://github.com/docker/docker/pull/20355/files#diff-3ed197c0a6f7a9ca73a3ee424e448ac1R51

This should be regarded as a reasonable change, because Exec can only work on a Running container, before this refactor, for example, exec on a stopped/created/dead container will return http.StatusInternalServerError, but on a Paused/Restarting container, it returns http.StatusConflict.

I have seperated IsPaused and IsRestarting from IsRunning in this commit, so I think it's not necessary to distinguish IsPaused/Restarting from IsStopped/IsDead, they are all "not running", no one is special from my point of view.

What do you think?

@cpuguy83 cpuguy83 commented on the diff Mar 7, 2016
integration-cli/docker_api_exec_test.go
@@ -80,7 +80,7 @@ func (s *DockerSuite) TestExecApiStart(c *check.C) {
// make sure exec is created before pausing
id = createExec(c, "test")
dockerCmd(c, "pause", "test")
- startExec(c, id, http.StatusConflict)
+ startExec(c, id, http.StatusInternalServerError)
@cpuguy83
cpuguy83 Mar 7, 2016 Contributor

same as above

@cpuguy83 cpuguy83 commented on the diff Mar 7, 2016
container/state/state.go
@@ -1,4 +1,4 @@
-package container
@cpuguy83
cpuguy83 Mar 7, 2016 Contributor

Is there some benefit of moving this to a separate package?

@WeiZhang555
WeiZhang555 Mar 8, 2016 Contributor

I'm doing this for better code readability.
In our existing code, we usually call state checking function like this:

var container *container.Container
if container.IsRunningLocking() {...}

Before I move the state.go to a separate directory, on a first sight, container.IsRunningLocking() can be a global function under package container, or a function belonging to container.Container, that confused me from time to time.

@thaJeztah
Member

hm weird permission failures on janky https://jenkins.dockerproject.org/job/Docker-PRs/24740/console, experimental https://jenkins.dockerproject.org/job/Docker-PRs-experimental/15950/console, and gccgo https://jenkins.dockerproject.org/job/Docker-PRs-gccgo/2846/console

08:04:39 
08:04:47 [Docker-PRs] $ /bin/sh -xe /tmp/hudson7055858508936919284.sh
08:04:47 + tar -czf bundles.tar.gz bundles
08:04:59 tar: bundles/1.11.0-dev/test-integration-cli/d72922152/root/volumes/test/opts.json: Cannot open: Permission denied
08:05:01 tar: bundles/1.11.0-dev/test-integration-cli/d73340976/docker.sock: socket ignored
08:05:03 tar: bundles/1.11.0-dev/test-integration-cli/d5312935/docker.sock: socket ignored
08:05:24 tar: bundles/1.11.0-dev/test-integration-cli/d29825342/docker.sock: socket ignored
08:05:29 tar: Exiting with failure status due to previous errors
08:05:30 Build step 'Execute shell' marked build as failure
08:05:30 Performing Post build task...
08:05:30 Match found for : : True
08:05:30 Logical operation result is TRUE
08:05:30 Running script  : docker run --rm -v "$(pwd):/workspace" busybox chown -R "$(id -u):$(id -g)" 
@thaJeztah
Member

@WeiZhang555 looks like this needs another rebase 😢

@WeiZhang555 WeiZhang555 Refactor Container State
Using enum value format to represent container's state as:

```
type Cstate int64
const (
    Stopped Cstate = 0
    Running Cstate = (1 << iota)
    Paused
    Restarting
    Dead
)
```

so as to replace redundant bool values such as `Running, Paused bool`

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
488734d
@WeiZhang555
Contributor

@thaJeztah Rebased. WIndows tests are broken.

ping @cpuguy83 , could you take another look at this? Thank you.

@thaJeztah
Member

Really weird indeed; only on win2lin (also seen that on another PR)
https://jenkins.dockerproject.org/job/Docker-PRs-Win2Lin/23356/console

15:00:10 Building: bundles/1.11.0-dev/binary/docker-1.11.0-dev.exe
15:00:24 # github.com/docker/docker/cliconfig/credentials
15:00:24 cliconfig\credentials\native_store.go:82: auth.IdentityToken undefined (type types.AuthConfig has no field or method IdentityToken)
15:00:24 cliconfig\credentials\native_store.go:82: creds.IdentityToken undefined (type types.AuthConfig has no field or method IdentityToken)
15:00:24 cliconfig\credentials\native_store.go:96: ac.IdentityToken undefined (type types.AuthConfig has no field or method IdentityToken)
15:00:24 cliconfig\credentials\native_store.go:96: creds.IdentityToken undefined (type types.AuthConfig has no field or method IdentityToken)
15:00:24 cliconfig\credentials\native_store.go:110: authConfig.IdentityToken undefined (type types.AuthConfig has no field or method IdentityToken)
15:00:24 cliconfig\credentials\native_store.go:125: config.IdentityToken undefined (type types.AuthConfig has no field or method IdentityToken)
15:00:24 cliconfig\credentials\native_store.go:127: config.IdentityToken undefined (type types.AuthConfig has no field or method IdentityToken)
15:00:24 cliconfig\credentials\native_store.go:173: ret.IdentityToken undefined (type types.AuthConfig has no field or method IdentityToken)
15:00:24 # github.com/docker/docker/runconfig/opts
15:00:24 runconfig\opts\parse.go:413: unknown container.HostConfig field 'UsernsMode' in struct literal
15:00:29 # github.com/docker/docker/registry
15:00:29 registry\auth.go:87: lcs.authConfig.IdentityToken undefined (type *types.AuthConfig has no field or method IdentityToken)
15:00:29 registry\auth.go:91: lcs.authConfig.IdentityToken undefined (type *types.AuthConfig has no field or method IdentityToken)
15:00:29 registry\auth.go:167: credentialAuthConfig.IdentityToken undefined (type types.AuthConfig has no field or method IdentityToken)
15:00:34 # github.com/docker/docker/api/server/router/system
15:00:34 api\server\router\system\system_routes.go:124: unknown types.AuthResponse field 'IdentityToken' in struct literal
15:01:40 + ec=1
15:01:40 + set +x
15:01:40 ERROR: Build of binary on Windows failed
@thaJeztah
Member

aw, hm, looks like this needs a rebase now 😢

@WeiZhang555
Contributor

I'm closing this temporarily because this will need a huge change based on new code, will reopen it once I get enough time to handle this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment