New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[full ci] Attach support for exec'ed session #4339
Conversation
cmd/tether/attach.go
Outdated
@@ -148,6 +154,19 @@ func (t *attachServerSSH) start() error { | |||
} | |||
|
|||
if t.Enabled() { | |||
// HACK | |||
if t.serverConn.ServerConn != nil { | |||
log.Infof("asking portlayer to get container ids") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all these three infos has to be debugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subject says WIP :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha, I get that, I just want to make sure it is not missed.
cmd/tether/attach.go
Outdated
for req := range reqchan { | ||
var pendingFn func() | ||
var payload []byte | ||
payload := []byte{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this causes allocation of a byte array pointer using var payload []byte
is actually better especially if there is a chance that no data will be added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
cmd/tether/attach.go
Outdated
i++ | ||
for k, v := range t.config.Sessions { | ||
if v.Active { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary whitespace line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
cmd/tether/attach.go
Outdated
ok := true | ||
|
||
log.Infof("received global request type %v", req.Type) | ||
|
||
switch req.Type { | ||
case msgs.ContainersReq: | ||
keys := make([]string, len(t.config.Sessions)) | ||
keys := make([]string, len(t.config.Sessions)+len(t.config.Execs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you sure you want to allocate all strings? you will end up with an array that has many empty elements at the end if there are not active elements exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
err = c.containerProxy.AttachStreams(context.Background(), vc, eid, stdin, stdout, stderr, ca) | ||
if err != nil { | ||
if _, ok := err.(DetachError); ok { | ||
log.Infof("Detach detected, tearing down connection") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This chunk of logic wants to be moved out into a separate method, current method is heavily overloaded already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -1583,7 +1630,7 @@ func (c *Container) containerAttach(name string, ca *backend.ContainerAttachConf | |||
|
|||
handle, ok = unbind.Payload.Handle.(string) | |||
if !ok { | |||
return InternalServerError("type assertion failed") | |||
return InternalServerError("type assOertion failed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want to rollback this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
lib/communication/interactor.go
Outdated
@@ -12,13 +12,12 @@ | |||
// See the License for the specific language governing permissions and | |||
// limitations under the License. | |||
|
|||
package attach | |||
package communication |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please name it comm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? Comm sounds like common
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
communication is too long. using word common for naming packages is very bad, so comm can lead to communication only. I just want it, nothing else. Feel free to resist, but not too much ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass review - added comments, but likely easiest to talk about it in the morning.
cmd/tether/attach.go
Outdated
@@ -148,6 +154,19 @@ func (t *attachServerSSH) start() error { | |||
} | |||
|
|||
if t.Enabled() { | |||
// HACK | |||
if t.serverConn.ServerConn != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
presume this is going to be replaced with a portlayer symmetric container-ids call on the global command channel?
cmd/tether/attach.go
Outdated
serverConn.Unlock() | ||
t.serverConn.Lock() | ||
// set serverConn to nil if not an exec session | ||
if _, ok = t.config.Sessions[sessionid]; ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't recall why we were clearing this - do we need ref counting if we ever have multiple sessions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so that we can see whether the connection is established or not, this value being nil is the looping condition
cmd/tether/msgs/messages.go
Outdated
@@ -142,3 +142,6 @@ func (s *ContainersMsg) Marshal() []byte { | |||
func (s *ContainersMsg) Unmarshal(payload []byte) error { | |||
return ssh.Unmarshal(payload, s) | |||
} | |||
|
|||
// AskContainersReq | |||
const AskContainersReq = "ask-container-ids" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presume this is going to be switched out with the direct reporting of current ID set when attach.Reload is called rather than this solicitation of a request for the updated set?
lib/portlayer/exec/commit.go
Outdated
@@ -172,6 +172,7 @@ func Commit(ctx context.Context, sess *session.Session, h *Handle, waitTime *int | |||
if err != nil { | |||
// NOTE: not sure how to handle this error - the change is already applied, it's just not picked up in the container | |||
} | |||
h.reload = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember you asking about this - but handles should never be reused, so I do not understand why this is present.
We've got worse problems if we're somehow attempting to reuse a handle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a debugging artifact, will remove it (and see what will happen)
@@ -74,6 +76,7 @@ func (handler *TaskHandlersImpl) JoinHandler(params tasks.JoinParams) middleware | |||
Tty: params.Config.Tty, | |||
Attach: params.Config.Attach, | |||
OpenStdin: params.Config.OpenStdin, | |||
RunBlock: params.Config.Attach, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check out this change @sflxn had to make in a pending PR. compose
sets all of config.AttachStdin || config.AttachStdout || config.AttachStderr
to false if running with -d
, which prevents us from attaching later.
We need a different mechanism of populating RunBlocking - we were using the ordering of the attach and start calls to determine if this was required previously - has that changed or is this just how it was being communicated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. This is lacking a FIXME comment and we need to fix Bind to handle Exec'ed sessions
@@ -146,7 +147,7 @@ func (i *InteractionHandlersImpl) UnbindHandler(params interaction.InteractionUn | |||
// ContainerResizeHandler calls resize | |||
func (i *InteractionHandlersImpl) ContainerResizeHandler(params interaction.ContainerResizeParams) middleware.Responder { | |||
// See whether there is an active session to the container | |||
session, err := i.attachServer.Get(context.Background(), params.ID, 0) | |||
session, err := i.server.Interaction(context.Background(), params.ID, interactionTimeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timeout here was set to 0
previously because we were finding resize getting called either at somewhat arbitrary times and it was adding a 30s delay on the CLI end (IIRC it was the stop path - this may have changed given the 1.13 use of events, but needs confirming).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grabbed this change from Hasan's PR :)
if !ok { | ||
return nil, fmt.Errorf("Type assertion failed for %#+v", handle) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to pull in the changes from the vSPC work:
https://github.com/vmware/vic/commit/23ed0b0a41dec1c7e91288db2c37e072595cb931
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I already rebased on top of that, what's missing?
var ids []string | ||
ids, err = SSHls(client) | ||
// ask the IDs | ||
ids, err := ContainerIDs(clientConn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be below the registration of the mux handlers - I'm uncertain whether it's significant in this case but I'm very wary of potentially making a call that expects a reply without having the mux's running - I seem to recall that leads to potential deadlocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i believe it blocks until someone handles the request
go c.chans(chans) | ||
|
||
// create the connections | ||
c.ids(clientConn, ids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah - this may be why the runblocking on all sessions isn't impact us currently. I'm unsure what happens if we don't drain the stdout streams of the "un-attached" connections in this model
We should think about, then document, what the behaviour should be in the case of multiple sessions where we only care about output from one of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#4950 created to track this
@@ -470,6 +470,8 @@ func (t *tether) Start() error { | |||
return err | |||
} | |||
|
|||
// Danger, Will Robinson! There is a strict ordering here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately t.extensions is a map - with no ordering guarantees. What is the impact of this ordering being incorrect?
We cannot merge this unless either the comment is wrong, we resolve the dependency, or we enforce the ordering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ordering is between attach server and ProcessSession, will clarify it
@@ -172,3 +166,27 @@ func (t *attachSSH) Resize(cols, rows, widthpx, heightpx uint32) error { | |||
} | |||
return nil | |||
} | |||
|
|||
// Ping checks the liveleness of the connection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"liveliness"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vendor changes approved. Still requires code review approval.
} | ||
|
||
// URL returns the listener's URL | ||
func (c *Connector) URL() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this being used anywhere any more - it's would/should be called by the vSPC code if it's going to be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kreamyx do you remember how you are constructing the URL in the vps code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@caglar10ur are you talking about this?
https://github.com/vmware/vic/blob/master/lib/vspc/helpers.go#L85
c.mutex.RUnlock() | ||
|
||
if ok { | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should there be a ping test here? we'll never reject a dead connection at this point, is this covered elsewhere?
connection := &Connection{ | ||
spty: si, | ||
id: id, | ||
if err := si.Ping(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's viable to trust that the connection is solid if we've only just created it - if we release the runblock after the reader/writers are bound in channel setup instead of via an explicit message then we can move this to be validation only.
CloseStdin: true, | ||
} | ||
|
||
err = c.containerProxy.AttachStreams(context.Background(), ac, stdin, stdout, stderr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't know if we should call this if all of the UseX are false (-d
case). AttachStreams
shuts down stdin on return which will kill the transport to the client. I don't know if we need that transport to return launch errors - my guess is we do.
Suggest we check for successful launch immediately after taskinspect (ec.Running
for success/failure, ec.ProcessConfig.ErrMessage
for detail) and only perform the attach prep if launch was successful.
I know this PR is big so here are the main changes. I'm also willing to answer any questions you may have.
Those changes live in lib/apiservers/engine/backends/ directory. There are changes in container_proxy functions since exec and attach shares some common code paths. We started to pass ids instead of viccontainer structs around to them to support both workflows. AttachStreams function splits the cancelation logic into different parts. Closing stdin cancels other active streams. We also ensure that the API client connection is shutdown on exit.
Those changes live in lib/apiservers/portlayer/ directory. One important change there is context changes. In the past we were calculating the timeouts manually and then passing timeouts as a parameter to Get call. now each stream is creating its own context and passing it to the function - eliminating manual calculations and redundant timeout value.
Those changes live in lib/portlayer/attach/. Some of them just renames and splitting monolithic file (they are now bind.go/join.go etc.) into pieces. Connector has new set of features. We now have a different unblock signal for signaling tether (called unblock) for starting the processes inside the container. We also now have a SSH channel mux running on client side to receive container ids from tether. Normally portlayer initiates the connection and asks container ids. This is still the case for the primary session. But after that primary session start event, any exec'ed session will cause tether to restart and as a part of that tether will push container ids to the portlayer. Additionally we now started to use channel mux to evict closed connections from the connection map (#4367). Lastly we started to use singleflight to make sure that there is only one call in-flight for each ID at any given time.
As noted above, tether now pushes container ids to portlayer when it receives reload signal. It supports new unblock signal. Fixes a bug caused by wrong usage of cleanup functions. Fixes races causing incorrect behaviors.
dio reader now has a PropagateEOF function toggles whether to return EOF when all readers return EOF. Setting this to true will result in an EOF if there are no readers available when Read is next called. dio writer now calls CloseWrite if the writes implements CloseWriter. |
cvm, exists := h.vspc.cvmFromTelnetConnUnlocked(tc) | ||
if exists { | ||
cvm.Lock() | ||
log.Debugf("(vspc) detected closed connection for VM %s", cvm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use cvm.vmUUID instead of cvm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a String() method on cvm instead
log.Debugf("(vspc) detected closed connection for VM %s", cvm) | ||
log.Debugf("(vspc) closing connection to the AttachServer") | ||
cvm.remoteConn.Close() | ||
log.Debugf("(vspc) deleting vm records from the vm manager %s", cvm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use cvm.vmUUID instead of cvm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see ^
|
||
addr, err := net.ResolveTCPAddr("tcp", fmt.Sprintf("%s:%d", n.ip, n.port)) | ||
if err != nil { | ||
return fmt.Errorf("Attach server error %s: %s", addr, errors.ErrorStack(err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably print n.ip and n.port here
* Attach support for exec'ed session Includes @hmahmood's and @hickeng's various fixed to dio/persona and portlayer Fixes vmware#4263 Fixes vmware#4367 Fixes vmware#4598 Fixes vmware#4410 Fixes vmware#4339 Fixes vmware#4881 Requires vmware#4288 Implements exec functionality for personality Those changes live in lib/apiservers/engine/backends/ directory. There are changes in container_proxy functions since exec and attach shares some common code paths. We started to pass ids instead of viccontainer structs around to them to support both workflows. AttachStreams function splits the cancelation logic into different parts. Closing stdin cancels other active streams. We also ensure that the API client connection is shutdown on exit. Implements exec support for portlayer rest api Those changes live in lib/apiservers/portlayer/ directory. One important change there is context changes. In the past we were calculating the timeouts manually and then passing timeouts as a parameter to Get call. session, err := attachServer.Get(context.Background(), params.ID, timeout) now each stream is creating its own context and passing it to the function - eliminating manual calculations and redundant timeout value. session, err := i.server.Interaction(ctx, params.ID) Portlayer attach subsystem re-organization Those changes live in lib/portlayer/attach/. Some of them just renames and splitting monolithic file (they are now bind.go/join.go etc.) into pieces. Connector has new set of features. We now have a different unblock signal for signaling tether (called unblock) for starting the processes inside the container. We also now have a SSH channel mux running on client side to receive container ids from tether. Normally portlayer initiates the connection and asks container ids. This is still the case for the primary session. But after that primary session start event, any exec'ed session will cause tether to restart and as a part of that tether will push container ids to the portlayer. Additionally we now started to use channel mux to evict closed connections from the connection map (vmware#4367). Lastly we started to use singleflight to make sure that there is only one call in-flight for each ID at any given time. tether support for exec As noted above, tether now pushes container ids to portlayer when it receives reload signal. It supports new unblock signal. Fixes a bug caused by wrong usage of cleanup functions. Fixes races causing incorrect behaviors. dio changes dio reader now has a PropagateEOF function toggles whether to return EOF when all readers return EOF. Setting this to true will result in an EOF if there are no readers available when Read is next called. dio writer now calls CloseWrite if the writes implements CloseWriter.
* Attach support for exec'ed session Includes @hmahmood's and @hickeng's various fixed to dio/persona and portlayer Fixes vmware#4263 Fixes vmware#4367 Fixes vmware#4598 Fixes vmware#4410 Fixes vmware#4339 Fixes vmware#4881 Requires vmware#4288 Implements exec functionality for personality Those changes live in lib/apiservers/engine/backends/ directory. There are changes in container_proxy functions since exec and attach shares some common code paths. We started to pass ids instead of viccontainer structs around to them to support both workflows. AttachStreams function splits the cancelation logic into different parts. Closing stdin cancels other active streams. We also ensure that the API client connection is shutdown on exit. Implements exec support for portlayer rest api Those changes live in lib/apiservers/portlayer/ directory. One important change there is context changes. In the past we were calculating the timeouts manually and then passing timeouts as a parameter to Get call. session, err := attachServer.Get(context.Background(), params.ID, timeout) now each stream is creating its own context and passing it to the function - eliminating manual calculations and redundant timeout value. session, err := i.server.Interaction(ctx, params.ID) Portlayer attach subsystem re-organization Those changes live in lib/portlayer/attach/. Some of them just renames and splitting monolithic file (they are now bind.go/join.go etc.) into pieces. Connector has new set of features. We now have a different unblock signal for signaling tether (called unblock) for starting the processes inside the container. We also now have a SSH channel mux running on client side to receive container ids from tether. Normally portlayer initiates the connection and asks container ids. This is still the case for the primary session. But after that primary session start event, any exec'ed session will cause tether to restart and as a part of that tether will push container ids to the portlayer. Additionally we now started to use channel mux to evict closed connections from the connection map (vmware#4367). Lastly we started to use singleflight to make sure that there is only one call in-flight for each ID at any given time. tether support for exec As noted above, tether now pushes container ids to portlayer when it receives reload signal. It supports new unblock signal. Fixes a bug caused by wrong usage of cleanup functions. Fixes races causing incorrect behaviors. dio changes dio reader now has a PropagateEOF function toggles whether to return EOF when all readers return EOF. Setting this to true will result in an EOF if there are no readers available when Read is next called. dio writer now calls CloseWrite if the writes implements CloseWriter.
Fixes #4263
Fixes #4367
Fixes #4598
Fixes #4410
Fixes #4339
Fixes #4881
Requires #4288