x/build: fix gomote ssh to COS-based Linux Docker images #26969

Closed
bradfitz opened this issue Aug 13, 2018 · 5 comments

@bradfitz
Member

commented Aug 13, 2018

When I moved the Linux Docker-based container images from Kubernetes to COS I accidentally broke gomote ssh support.

We used to proxy SSH to the pod's port 22, which worked, but now we connect to the COS node's own SSH server, which is the wrong one. We're also running an SSH server inside the container (also listening on port 22, in its private network namespace), and it's only that inner SSH server that's authenticated.

We need to either configure the COS node's konlet YAML to forward a different port (e.g. host 2200 to container 22) or make the container image listen on 2200 instead. Then we need to configure that port in x/build/dashboard/builders.go and make the coordinator respect it in its remote.go when it calls rb.buildlet.ConnectSSH.

@bradfitz bradfitz added the NeedsFix label Aug 13, 2018

@gopherbot gopherbot added this to the Unreleased milestone Aug 13, 2018

@gopherbot gopherbot added the Builders label Aug 13, 2018

@bradfitz

Member Author

commented Aug 13, 2018

The current konlet config is:

        if hconf.IsContainer() {
                addMeta("gce-container-declaration", fmt.Sprintf(`spec:
  containers:
    - name: buildlet
      image: 'gcr.io/%s/%s'
      volumeMounts:
        - name: tmpfs-0
          mountPath: /workdir
      securityContext:
        privileged: true
      stdin: false
      tty: false
  restartPolicy: Always
  volumes:
    - name: tmpfs-0
      emptyDir:
        medium: Memory
`, opts.ProjectID, hconf.ContainerImage))
        }

I see nothing about even mapping port 80, so maybe every port is mapped unless the host is already using it.

Note that konlet and its YAML config are basically not documented at all. I guess it's new.

You'll need to read the code to see if it's configurable:

https://github.com/GoogleCloudPlatform/konlet/blob/master/gce-containers-startup/gce-containers-startup.go

But I think the answer is the network is not configurable:

https://github.com/GoogleCloudPlatform/konlet/blob/master/gce-containers-startup/types/api.go

So we should probably just make our OpenSSH sshd in the container listen on a high port.
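
As a rough illustration of that idea, here is a minimal sketch of picking the port and building the sshd command line. The names sshdPort, sshdArgs, and the hostOwnsPort22 flag are hypothetical and not taken from cmd/buildlet; only the port numbers (22 and 2200) and the standard OpenSSH sshd flags -D and -p come from known sources.

```go
package main

import (
	"fmt"
	"strconv"
)

// Port numbers from the discussion: 22 is the usual sshd port, and 2200 is
// the proposed high port for hosts whose own sshd already occupies 22.
const (
	defaultSSHPort = 22
	highSSHPort    = 2200
)

// sshdPort returns the port the container's sshd should listen on.
// hostOwnsPort22 is a hypothetical flag; how it is detected is out of scope.
func sshdPort(hostOwnsPort22 bool) int {
	if hostOwnsPort22 {
		return highSSHPort
	}
	return defaultSSHPort
}

// sshdArgs builds arguments for OpenSSH sshd: -D keeps it in the
// foreground and -p selects the listening port.
func sshdArgs(port int) []string {
	return []string{"-D", "-p", strconv.Itoa(port)}
}

func main() {
	fmt.Println(sshdArgs(sshdPort(true)))  // host sshd already on 22
	fmt.Println(sshdArgs(sshdPort(false))) // port 22 is free
}
```

Keeping the decision in one small function means whatever dials the in-container sshd can ask the same function for the port instead of hard-coding 22.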

@gopherbot


commented Aug 14, 2018

Change https://golang.org/cl/129335 mentions this issue: cmd/buildlet: use a high ssh port on Linux when running under COS

@gopherbot


commented Aug 14, 2018

Change https://golang.org/cl/129356 mentions this issue: cmd/buildlet: add optional X-Go-Ssh-Port header to control sshd port

gopherbot pushed a commit to golang/build that referenced this issue Aug 14, 2018

cmd/buildlet: use a high ssh port on Linux when running under COS
When running in GCE's Container-Optimized OS (COS), we can't use
port 22, as the system's sshd is already using it. Our container
runs in the system network namespace, not isolated as is typical
in Docker or Kubernetes. So use port 2200 instead.

Remove an unnecessary type conversion.

Updates golang/go#26969.

Change-Id: Ic85e1f14529175106b9c7397186d3e9b5cb39c1c
Reviewed-on: https://go-review.googlesource.com/129356
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
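
The conflict described in the commit message can be reproduced on any machine. The sketch below is only an illustration, not code from cmd/buildlet: it binds a loopback port and then tries to bind the same address again from the same network namespace, which is the situation a container's sshd is in when the host's sshd already holds port 22. The function name tryListenTwice is hypothetical.

```go
package main

import (
	"fmt"
	"net"
)

// tryListenTwice binds an ephemeral loopback port, then attempts a second
// bind on the exact same address. conflict is the error from the second
// attempt (nil if it unexpectedly succeeded); err reports a setup failure.
func tryListenTwice() (conflict error, err error) {
	first, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	defer first.Close()

	second, conflict := net.Listen("tcp", first.Addr().String())
	if conflict == nil {
		second.Close()
	}
	return conflict, nil
}

func main() {
	conflict, err := tryListenTwice()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	// On Linux this is typically an "address already in use" error.
	fmt.Println("second listener:", conflict)
}
```

With a private network namespace per container, both listeners could use the same port number without colliding, which is why the problem only appeared after the move to COS.
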
@bradfitz

Member Author

commented Aug 14, 2018

Turns out CL 129356 was all that was needed.

@bradfitz bradfitz closed this Aug 14, 2018

@dmitshur

Member

commented Aug 14, 2018

CL 129356 and deploying the new version of cmd/buildlet were sufficient to resolve the issue. We just needed to move the SSH server being started to a non-22 port to avoid overlapping with the host's sshd. The communication to the new port happens completely inside cmd/buildlet's /connect-ssh HTTP handler. cmd/coordinator doesn't need to know about the new port, and hence there's nothing more to do.
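
To show why the caller never needs the port, here is a minimal sketch of a handler in that style. It is not the actual cmd/buildlet code: connectSSHHandler, demo, the "101 Switching Protocols" reply, and the fake banner server are all assumptions made for illustration. The handler hijacks the HTTP connection and pipes bytes to a local sshd address that only the handler knows.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
)

// connectSSHHandler turns the HTTP connection into a raw byte pipe to the
// sshd at sshdAddr (for example "127.0.0.1:2200"). Hypothetical sketch.
func connectSSHHandler(sshdAddr string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		hj, ok := w.(http.Hijacker)
		if !ok {
			http.Error(w, "hijacking not supported", http.StatusInternalServerError)
			return
		}
		backend, err := net.Dial("tcp", sshdAddr)
		if err != nil {
			http.Error(w, "cannot reach sshd: "+err.Error(), http.StatusBadGateway)
			return
		}
		defer backend.Close()

		client, buf, err := hj.Hijack()
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		defer client.Close()

		// Assumed reply format: announce the switch, then stop speaking HTTP.
		fmt.Fprint(buf, "HTTP/1.1 101 Switching Protocols\r\nUpgrade: ssh\r\nConnection: Upgrade\r\n\r\n")
		if err := buf.Flush(); err != nil {
			return
		}

		done := make(chan struct{}, 2)
		go func() { io.Copy(backend, buf.Reader); done <- struct{}{} }() // client -> sshd
		go func() { io.Copy(client, backend); done <- struct{}{} }()     // sshd -> client
		<-done // once either direction ends, the deferred closes stop the other
	})
}

// demo runs a stand-in for sshd that only sends a banner, serves the handler
// on a test server, and returns the banner as seen by the HTTP client side.
func demo() (string, error) {
	fake, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer fake.Close()
	go func() {
		c, err := fake.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		io.WriteString(c, "SSH-2.0-fake\r\n")
	}()

	srv := httptest.NewServer(connectSSHHandler(fake.Addr().String()))
	defer srv.Close()

	conn, err := net.Dial("tcp", srv.Listener.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := io.WriteString(conn, "POST /connect-ssh HTTP/1.1\r\nHost: buildlet\r\n\r\n"); err != nil {
		return "", err
	}
	br := bufio.NewReader(conn)
	for { // skip the HTTP response header, which ends at the first blank line
		line, err := br.ReadString('\n')
		if err != nil {
			return "", err
		}
		if line == "\r\n" {
			break
		}
	}
	return br.ReadString('\n')
}

func main() {
	banner, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Printf("banner via proxy: %q\n", banner)
}
```

The client in demo only knows the HTTP endpoint; the sshd address is a parameter of the handler, so changing it from 22 to 2200 stays invisible to callers.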

I tested, and gomote ssh now works for COS-based Linux images (e.g., linux-amd64).

Closing since the issue is resolved. Huge thanks to @bradfitz for the help with this.
