Skip to content
This repository was archived by the owner on May 6, 2020. It is now read-only.

Conversation

@dlespiau
Copy link
Contributor

@dlespiau dlespiau commented May 5, 2017

A somewhat long-winded PR that should remove the need to start the shim as stopped. We finish the work to started to synchronise runtime, shim and the process inside the VM without queuing data. It can be a bit fiddly so the code is somewhat heavily documented and tested.

@dlespiau dlespiau force-pushed the 20170504-sync-stdin-newcontainer-execcmd branch from 862b9b1 to af0fec4 Compare May 5, 2017 16:50
@coveralls
Copy link

coveralls commented May 5, 2017

Coverage Status

Coverage increased (+1.3%) to 71.872% when pulling af0fec4 on dlespiau:20170504-sync-stdin-newcontainer-execcmd into 70866ba on clearcontainers:master.

1 similar comment
@coveralls
Copy link

Coverage Status

Coverage increased (+1.3%) to 71.872% when pulling af0fec4 on dlespiau:20170504-sync-stdin-newcontainer-execcmd into 70866ba on clearcontainers:master.

@coveralls
Copy link

coveralls commented May 5, 2017

Coverage Status

Coverage increased (+1.7%) to 72.292% when pulling af0fec4 on dlespiau:20170504-sync-stdin-newcontainer-execcmd into 70866ba on clearcontainers:master.

vm.go Outdated

var waitForExecTimeout = 30 * time.Second

// WaitForExec will wait until the process inside the VM is fully started. If
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/WaitForExec/WaitForProcess/

@sboeuf
Copy link
Contributor

sboeuf commented May 5, 2017

One comment about function comment not in sync with the function name, but other than that LGTM.

@dlespiau dlespiau force-pushed the 20170504-sync-stdin-newcontainer-execcmd branch from af0fec4 to 5eb191d Compare May 7, 2017 13:55
@coveralls
Copy link

coveralls commented May 7, 2017

Coverage Status

Coverage increased (+1.3%) to 71.872% when pulling 5eb191d on dlespiau:20170504-sync-stdin-newcontainer-execcmd into 70866ba on clearcontainers:master.

@dlespiau
Copy link
Contributor Author

dlespiau commented May 7, 2017

argh, I can't merge it because the LGTM isn't at the start of the line. Grmbl (against tooling getting in the way, of course).

vm.go Outdated
return nil
}

var waitForExecTimeout = 30 * time.Second

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const? Also, should this name refer to the process (waitForProcessTimeout)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't make this one const, I override the timeout in tests. But sure will rename it.

// smallWaitForShimTimeout overrides the default timeout for the tests
const smallWaitForShimTimeout = 20 * time.Millisecond
// smallWaitTimeout overrides the default timeout for the tests
const smallWaitTimeout = 20 * time.Millisecond

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice :)

vm.go Outdated
// timeout.
func (session *ioSession) WaitForProcess(shouldReset bool) error {
session.vm.infof(1, "session",
"waiting for runtime to execute the process for token %s (timeout %s)",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that %s correct? Even if it is, it might look less odd if the format was specified as %v.

Copy link
Contributor Author

@dlespiau dlespiau May 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup that's correct (It'll call String() on any type that satisfies the Stringer interface).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went digging a bit in the go source code. %s and %v will end up doing the same thing when the argument implements the stringer interface:

@dlespiau dlespiau force-pushed the 20170504-sync-stdin-newcontainer-execcmd branch from 5eb191d to 1124a6f Compare May 8, 2017 17:07
@dlespiau
Copy link
Contributor Author

dlespiau commented May 8, 2017

PR updated:

  • Renamed waitForExecTimeout to waitForProcessTimeout.

@coveralls
Copy link

coveralls commented May 8, 2017

Coverage Status

Coverage increased (+1.2%) to 72.292% when pulling 1124a6f on dlespiau:20170504-sync-stdin-newcontainer-execcmd into 4db5a15 on clearcontainers:master.

if newClient.session != nil {
proxy.Lock()
info := proxy.tokenToVM[newClient.token]
info.state = tokenStateAllocated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this session being kept round forever. If the shim disconnects, we should wait for a reconnection for a specific time, after which we should discard the session.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but that's not a problem introduced in this PR, we've always had it. It's related to #24. Currently we need a DisconnectShim to clean up shim-related data structures. I've extended issue #24 to cover this.

vm.go Outdated
// tokens. Starpod isn't handled as it's not currently use to start processes
// and indicated as deprecated in the hyperstart API.
func (vm *vm) relocateHyperCommand(hyper *api.Hyper) error {
// RelocateHyperCommand will return the Session of the process in question if

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: The comment should refer to relocateHyperCommand.

@dlespiau
Copy link
Contributor Author

PR updated:

  • Renamed RelocateHyperCommand to relocateHyperCommand in comment.

@dlespiau dlespiau force-pushed the 20170504-sync-stdin-newcontainer-execcmd branch 2 times, most recently from ba3b9b5 to b9948be Compare May 17, 2017 11:28
@coveralls
Copy link

coveralls commented May 17, 2017

Coverage Status

Coverage increased (+1.7%) to 72.241% when pulling b9948be on dlespiau:20170504-sync-stdin-newcontainer-execcmd into 800cc89 on clearcontainers:master.

@coveralls
Copy link

coveralls commented May 17, 2017

Coverage Status

Coverage increased (+1.3%) to 71.823% when pulling ba3b9b5 on dlespiau:20170504-sync-stdin-newcontainer-execcmd into 800cc89 on clearcontainers:master.

That's useful only in tests really at this point as the shim is written
in C. But one never knows!

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
@dlespiau
Copy link
Contributor Author

This needs to be rebased on top of the logrus port before merging.

Damien Lespiau added 4 commits May 18, 2017 15:01
Turns out it's really useful to know when we are hitting timeouts when
debugging!

Updates: clearcontainers#37

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
We close the shim connection when something bad happens:
  - We receive an error trying to write to the socket (most likely
  because the shim died or exited).
  - We have an other kind of unrecoverable error for that client (we
  don't have one right now but we will in the near future)

We now want to progress a bit on the recovery side of things. If the
shim dies, we want to allow it to reconnect and re-claim the session.
This commit does just that.

This is tested by a subsequent unit test: TestShimSendStdinAfterExeccmd

Updates: clearcontainers#4

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
As we don't queue any data in the proxy today, we need the process
inside the VM to exist before we can send stdin data to it.

Updates: clearcontainers#39

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Just like:

    commit 3d897a7
    Author: Damien Lespiau <damien.lespiau@intel.com>
    Date:   Fri May 5 15:57:01 2017 +0100

        proxy: Don't forward stdin data until the process is started

but for signals.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
@dlespiau dlespiau force-pushed the 20170504-sync-stdin-newcontainer-execcmd branch from b9948be to 70af5ec Compare May 18, 2017 14:10
@dlespiau
Copy link
Contributor Author

And now rebased on top of the logrus changes.

@coveralls
Copy link

coveralls commented May 18, 2017

Coverage Status

Coverage increased (+1.6%) to 72.904% when pulling 70af5ec on dlespiau:20170504-sync-stdin-newcontainer-execcmd into 2925d0a on clearcontainers:master.

@dlespiau
Copy link
Contributor Author

@amshinde mind having another look at this one?

@amshinde
Copy link
Contributor

amshinde commented May 19, 2017

lgtm

Approved with PullApprove Approved with PullApprove

@dlespiau dlespiau merged commit 0c71a01 into clearcontainers:master May 19, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants