Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(*): add readiness & liveness probes #149

Merged
merged 36 commits into from
Feb 16, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
64cde22
fix(pkg/healthz): add readiness & liveness probes
Feb 8, 2016
21ad624
fix(pkg/healthsrv/server.go): only access port
Feb 8, 2016
557bf6e
feat(boot.go,pkg/sshd/config.go): wire up the health server
Feb 8, 2016
0dbe716
fix(pkg/healthsrv/healthz_handler.go): pass in address of ListBuckets…
Feb 8, 2016
78ae20d
fix(pkg/healthsrv/healthz_handler.go): use internal log package
Feb 8, 2016
c375e8a
fix(boot.go,fetcher/fetcher.go,pkg/sshd/config.go): unify all configs…
Feb 8, 2016
20c7317
fix(manifests/*): bring manifests up to date with those in the deis-d…
Feb 8, 2016
d733f4c
fix(manifests/deis-builder-rc.yaml): add health server port and env var
Feb 8, 2016
8891583
fix(manifests/deis-builder-rc.yaml): use proper port name
Feb 8, 2016
9214a24
fix(pkg/healthsrv/server.go): build the right host string
Feb 8, 2016
c1aebfc
fix(boot.go,pkg/healthsrv): add kubernetes API checks in the healthz …
Feb 9, 2016
f58ac45
ref(pkg/healthsrv): refactor *s3.S3 out of the health server
Feb 9, 2016
f3755f6
feat(pkg/sshd/circuit*): add a circuit primitive
Feb 9, 2016
96ec946
fix(*): don't report the server up until the SSH server is running
Feb 9, 2016
e35609a
fix(pkg/sshd/circuit_test.go): add concurrency test for the circuit
Feb 9, 2016
e35e855
fix(pkg/sshd/server_test.go): pass circuit to the server
Feb 9, 2016
b3dba97
fix(pkg/healthsrv): add dummy bucket and namespace listers
Feb 9, 2016
0dae532
fix(glide.yaml,glide.lock): add shameless plug assert package for tes…
Feb 9, 2016
b039013
fix(pkg/healthsrv): add tests to complete all failure cases in the he…
Feb 9, 2016
4f1d853
doc(pkg/healthsrv/healthz_handler.go): clarifying docs
Feb 9, 2016
b6a29a5
fix(pkg/healthsrv/healthz_handler_test.go): adding success case test
Feb 9, 2016
4808fe1
fix(manifests/deis-builder-rc.yaml): add probes to the RC manifest
Feb 9, 2016
b3fbb2b
fix(pkg/healthsrv): do the 3 health checks concurrently
Feb 10, 2016
928a064
fix(glide.lock,glide.yaml,pkg/healthsrv/healthz_handler.go): use pkg/log
Feb 11, 2016
99f9d81
fix(boot.go): pass correct parameters to the right place
Feb 11, 2016
c9113b1
ref(pkg/sshd/server_test.go): make the server address constant
Feb 11, 2016
d774e5f
ref(pkg/sshd/server_test.go): remove unused func
Feb 11, 2016
5081ba5
fix(rootfs/Dockerfile): move the git home and user env vars up
Feb 11, 2016
5a209b6
fix(boot.go): use standard log package for startup
Feb 11, 2016
f8fa37a
fix(manifests/deis-builder-rc.yaml): shorten the delay and timeout
Feb 11, 2016
ac6e83b
fix(boot.go): fix additional debug message
Feb 11, 2016
666bb24
ref(pkg/healthsrv/healthz_handler.go): simplify the healthcheck handler
Feb 11, 2016
c925148
fix(Makefile): speed up all makefile builds
Feb 11, 2016
2b8fc3b
fix(pkg/healthsrv/server.go): remove unused constants
Feb 12, 2016
8952b25
doc(pkg/sshd/circuit.go): add more godocs
Feb 12, 2016
958e0b1
fix(manifests/deis-builder-rc.yaml): remove fetcher port
Feb 12, 2016
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 1 addition & 3 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -27,8 +27,6 @@ RC := manifests/deis-${SHORT_NAME}-rc.yaml
SVC := manifests/deis-${SHORT_NAME}-service.yaml
IMAGE := ${DEIS_REGISTRY}${IMAGE_PREFIX}/${SHORT_NAME}:${VERSION}

TEST_PACKAGES := $(shell ${DEV_ENV_CMD} glide nv)

all:
@echo "Use a Makefile to control top-level building of the project."

Expand All @@ -46,7 +44,7 @@ build:
@$(call check-static-binary,$(BINARY_DEST_DIR)/boot)

test:
${DEV_ENV_CMD} go test ${TEST_PACKAGES}
${DEV_ENV_CMD} sh -c 'go test $$(glide nv)'

docker-build:
docker build --rm -t ${IMAGE} rootfs
Expand Down
46 changes: 41 additions & 5 deletions boot.go
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
package main

import (
"log"
"os"
"runtime"

Expand All @@ -9,8 +10,11 @@ import (
"github.com/deis/builder/pkg"
"github.com/deis/builder/pkg/conf"
"github.com/deis/builder/pkg/gitreceive"
"github.com/deis/builder/pkg/gitreceive/storage"
"github.com/deis/builder/pkg/healthsrv"
"github.com/deis/builder/pkg/sshd"
pkglog "github.com/deis/pkg/log"
kcl "k8s.io/kubernetes/pkg/client/unversioned"
)

const (
Expand All @@ -26,8 +30,8 @@ func main() {
if os.Getenv("DEBUG") == "true" {
pkglog.DefaultLogger.SetDebug(true)
cookoolog.Level = cookoolog.LogDebug
log.Printf("Running in debug mode")
}
pkglog.Debug("Running in debug mode")

app := cli.NewApp()

Expand All @@ -42,8 +46,40 @@ func main() {
pkglog.Err("getting config for %s [%s]", serverConfAppName, err)
os.Exit(1)
}
pkglog.Info("starting SSH server on %s:%d", cnf.SSHHostIP, cnf.SSHHostPort)
os.Exit(pkg.Run(cnf.SSHHostIP, cnf.SSHHostPort, "boot"))
circ := sshd.NewCircuit()

s3Client, err := storage.GetClient(cnf.HealthSrvTestStorageRegion)
if err != nil {
log.Printf("Error getting s3 client (%s)", err)
os.Exit(1)
}
kubeClient, err := kcl.NewInCluster()
if err != nil {
log.Printf("Error getting kubernetes client [%s]", err)
os.Exit(1)
}
log.Printf("Starting health check server on port %d", cnf.HealthSrvPort)
healthSrvCh := make(chan error)
go func() {
if err := healthsrv.Start(cnf.HealthSrvPort, kubeClient.Namespaces(), s3Client, circ); err != nil {
healthSrvCh <- err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to check if namepsace is present or not ? builder won't start if namespace is not present or k8s cluster is not working

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the goal here is to check that the k8s API is accessible. Given that the builder has started, we can assume that the namespace under which it runs (also specified in POD_NAMESPACE) exists. The call that the health server makes is to just get a list of namespaces, and the goal there is only to see if the master is accessible.

I think @aledbf had an idea for a cheaper method of determining whether the API server was accessible, but if you have ideas I'd like to hear them too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arschles if the goal is just check the status of API server why not just run:

    res := c.Get().AbsPath("/healthz").Do()
    if res.Error() != nil{
        // not possible to reach the server
    }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aledbf that's much better - thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aledbf here's the rendered error when that code runs:

couldn't get version/kind; json parse error: invalid character 'o' looking for beginning of value

I'd like to keep this code as-is for now, and put in an issue to use the /healthz way later. Objections?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Objections?

None

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}
}()

log.Printf("Starting SSH server on %s:%d", cnf.SSHHostIP, cnf.SSHHostPort)
sshCh := make(chan int)
go func() {
sshCh <- pkg.RunBuilder(cnf.SSHHostIP, cnf.SSHHostPort, circ)
}()

select {
case err := <-healthSrvCh:
log.Printf("Error running health server (%s)", err)
os.Exit(1)
case i := <-sshCh:
log.Printf("Unexpected SSH server stop with code %d", i)
os.Exit(i)
}
},
},
{
Expand All @@ -53,13 +89,13 @@ func main() {
Action: func(c *cli.Context) {
cnf := new(gitreceive.Config)
if err := conf.EnvConfig(gitReceiveConfAppName, cnf); err != nil {
pkglog.Err("Error getting config for %s [%s]", gitReceiveConfAppName, err)
log.Printf("Error getting config for %s [%s]", gitReceiveConfAppName, err)
os.Exit(1)
}
cnf.CheckDurations()

if err := gitreceive.Run(cnf); err != nil {
pkglog.Err("running git receive hook [%s]", err)
log.Printf("Error running git receive hook [%s]", err)
os.Exit(1)
}
},
Expand Down
27 changes: 19 additions & 8 deletions glide.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions glide.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ import:
version: eca94c41d994ae2215d455ce578ae6e2dc6ee516
- package: github.com/pborman/uuid
- package: github.com/deis/pkg
version: b8679a4d7fe2fe0393c66c620eff7df86c9f7982
subpackages:
- time
- log
Expand All @@ -35,3 +36,5 @@ import:
- service/s3
- package: k8s.io/kubernetes
version: ~1.1
- package: github.com/arschles/assert
version: 6882f85ccdc7c1822b146d1a6b0c2c48f91b5140
42 changes: 25 additions & 17 deletions manifests/deis-builder-rc.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ apiVersion: v1
kind: ReplicationController
metadata:
name: deis-builder
namespace: deis
labels:
heritage: deis
spec:
Expand All @@ -15,43 +16,50 @@ spec:
spec:
containers:
- name: deis-builder
imagePullPolicy: Always
image: quay.io/deisci/builder:v2-beta
imagePullPolicy: Always
ports:
- containerPort: 2223
- containerPort: 3000
name: ssh
- containerPort: 8092
name: healthsrv
env:
- name: BUILDER_FETCHER_PORT
value: "3000"
- name: BUILDER_SSH_HOST_IP
value: "0.0.0.0"
- name: BUILDER_SSH_HOST_PORT
value: "2223"
- name: "HEALTH_SERVER_PORT"
value: "8092"
- name: "EXTERNAL_PORT"
value: "2223"
- name: POD_NAMESPACE
# This var needs to be passed so that the minio client (https://github.com/minio/mc) will work in Alpine linux
- name: "DOCKERIMAGE"
value: "1"
- name: "DEBUG"
value: "true"
- name: "POD_NAMESPACE"
valueFrom:
fieldRef:
fieldPath: metadata.namespace
livenessProbe:
httpGet:
path: /healthz
port: 8092
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you mind just removing this code entirely for now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, it's removed here ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I meant like removing the commented-out code entirely rather than leaving it as a comment

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, it's removed entirely. Here is the file on my branch, which is represented in the PR - notice no mention of SSL

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow. I was really derping hard there. Carry on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it happens ;)

initialDelaySeconds: 2
timeoutSeconds: 1
readinessProbe:
httpGet:
path: /healthz
port: 8092
initialDelaySeconds: 2
timeoutSeconds: 1
volumeMounts:
- name: minio-user
mountPath: /var/run/secrets/object/store
readOnly: true
- name: builder-key-auth
mountPath: /var/run/secrets/api/auth
readOnly: true
# not currently running minio with SSL support. see https://github.com/deis/minio/pull/22 for more detail
# - name: minio-ssl
# mountPath: /var/run/secrets/object/ssl
# readOnly: true
volumes:
- name: minio-user
secret:
secretName: minio-user
- name: builder-key-auth
secret:
secretName: builder-key-auth
# not currently running minio with SSL support. see https://github.com/deis/minio/pull/22 for more detail
# - name: minio-ssl
# secret:
# secretName: minio-ssl
10 changes: 3 additions & 7 deletions manifests/deis-builder-service.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,17 +2,13 @@ apiVersion: v1
kind: Service
metadata:
name: deis-builder
namespace: deis
labels:
heritage: deis
release: 2.0.0
spec:
ports:
- port: 2222
- name: ssh
port: 2222
targetPort: 2223
name: ssh
protocol: TCP
- port: 3000
name: fetcher
protocol: TCP
selector:
app: deis-builder
9 changes: 4 additions & 5 deletions pkg/builder.go
Original file line number Diff line number Diff line change
Expand Up @@ -5,13 +5,12 @@ package pkg

import (
"fmt"
"log"
"os"

"github.com/Masterminds/cookoo"
clog "github.com/Masterminds/cookoo/log"
"github.com/deis/builder/pkg/sshd"

"log"
"os"
)

// Return codes that will be sent to the shell.
Expand All @@ -28,7 +27,7 @@ const (
// Git.
//
// Run returns on of the Status* status code constants.
func Run(sshHostIP string, sshHostPort int, cmd string) int {
func RunBuilder(sshHostIP string, sshHostPort int, sshServerCircuit *sshd.Circuit) int {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason we're renaming this from builder.Run to builder.RunBuilder? Not too sure how this relates to healthchecks

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed cmd and changed the name to add specificity to this function. Objections to leaving it in?

For context, cmd's purpose was to tell Run what cookoo route to run, but none were supported except for "boot" (which corresponds to running the builder SSH server).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the reason I had to touch this function in the first place was to add the sshServerCircuit parameter. That's the part I missed in my last comment, and that's what related to health checks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. No objections here, just curious :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Thanks for letting me slip this sorta-not-related change in ;)

reg, router, ocxt := cookoo.Cookoo()
log.SetFlags(0) // Time is captured elsewhere.

Expand Down Expand Up @@ -59,7 +58,7 @@ func Run(sshHostIP string, sshHostPort int, cmd string) int {
// Start the SSH service.
// TODO: We could refactor Serve to be a command, and then run this as
// a route.
if err := sshd.Serve(reg, router, cxt); err != nil {
if err := sshd.Serve(reg, router, sshServerCircuit, cxt); err != nil {
clog.Errf(cxt, "SSH server failed: %s", err)
return StatusLocalError
}
Expand Down
Loading