- graceful-shutdown: implement configurable shutdown delay (e5c13c8df0144c30e308cc833da8f6f0caac38ef)
Prevents in-flight requests from landing on a pod that has already stopped.
- shutdown: implement 5-second shutdown period (cef5f2effeb1de8ee41602c74a20d8d57e5f450d)
This already helps prevent false positives from other kubenurse pods that try to reach me_ingress through the ingress controller during teardown. Without this 5-second wait, in-flight requests from e.g. the ingress controller can reach a pod that is already terminated. It might not be sufficient for the similar "path" errors, as there is no filter for terminating pods.
- shutdown: make shutdown duration configurable (a9d101a4fc3607ebd914b462776b43462d354e0d)
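The shutdown delay described in the entries above can be sketched roughly as follows. This is a minimal illustration, not the kubenurse implementation: the function names `serveWithShutdownDelay` and `run`, the `stop` channel, and the 300 ms delay are assumptions made for the example. A real binary would typically close `stop` when SIGTERM arrives (for instance via `os/signal`).

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"time"
)

// serveWithShutdownDelay serves on ln until stop is closed, then keeps
// serving for delay so in-flight requests still find a running server,
// and only afterwards shuts the server down gracefully.
// Name and signature are hypothetical, chosen for this sketch.
func serveWithShutdownDelay(srv *http.Server, ln net.Listener, stop <-chan struct{}, delay time.Duration) error {
	errc := make(chan error, 1)
	go func() { errc <- srv.Serve(ln) }()

	select {
	case err := <-errc:
		return err // the server failed before shutdown was requested
	case <-stop:
	}

	time.Sleep(delay) // grace period: requests are still answered

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return err
	}
	// Serve returns http.ErrServerClosed after a successful Shutdown.
	if err := <-errc; err != http.ErrServerClosed {
		return err
	}
	return nil
}

// run starts a server on a random loopback port, requests shutdown, and
// checks that a request sent during the grace period is still served.
func run() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	srv := &http.Server{Handler: http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "ok")
	})}

	stop := make(chan struct{})
	done := make(chan error, 1)
	go func() { done <- serveWithShutdownDelay(srv, ln, stop, 300*time.Millisecond) }()

	close(stop) // pretend the termination signal just arrived

	resp, err := http.Get("http://" + ln.Addr().String())
	if err != nil {
		return err
	}
	resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	return <-done
}

func main() {
	if err := run(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("clean shutdown")
}
```

The delay only helps if it is long enough for the ingress controller and other clients to stop routing to the pod, which is why the duration is worth making configurable.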
- shutdown: stop querying pending/terminating neighbors (3d6050c652bd6115a8be7f7879fa316c4bbad8d5)
Prevents false-positive path_error results when checks target pending or terminating pods.
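A neighbour filter of this kind could look roughly like the sketch below. The `Neighbour` type and the `reachable` function are hypothetical stand-ins that avoid a Kubernetes client dependency. The one fact relied on is that a terminating pod usually still reports the phase `Running` and is recognised by its deletion timestamp, so both conditions need checking.

```go
package main

import "fmt"

// Neighbour is a hypothetical, simplified view of a kubenurse pod.
type Neighbour struct {
	Name        string
	Phase       string // pod phase, e.g. "Pending" or "Running"
	Terminating bool   // true once the pod has a deletion timestamp
}

// reachable keeps only neighbours that are running and not being deleted,
// so that checks are never sent to pods that cannot answer reliably.
func reachable(all []Neighbour) []Neighbour {
	var out []Neighbour
	for _, n := range all {
		if n.Phase != "Running" || n.Terminating {
			continue
		}
		out = append(out, n)
	}
	return out
}

func main() {
	pods := []Neighbour{
		{Name: "a", Phase: "Running"},
		{Name: "b", Phase: "Pending"},
		{Name: "c", Phase: "Running", Terminating: true},
	}
	for _, n := range reachable(pods) {
		fmt.Println("checking", n.Name)
	}
}
```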
- helm: make shutdown duration configurable (a518f56288836c025346d63ef154239bcfafdc3e)
- common: Use current main branch naming for the helm releaser (4dd5eded)
- common: use new ingress specification (#52) (8b896f4c)
- helm: chart should respect -n <namespace> flag (#53) (a5a3a792)
- helm: parse error when using extraEnvs (#48) (3a56edbb)
- common: Implement helm chart releaser (#47) (7f52b474)
- helm: add dnsConfig option (#50) (3fed2690)
- helm: add support for volumes and volumeMounts (#49) (986d3dc9)
- helm: make KUBENURSE_INSECURE configurable (#51) (4d4dc397)
- common: enforce timeouts in the kubenurse http.Server to avoid possible goroutine/memory leaks (d07df3bc)
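For context on the timeout entry above: an `http.Server` without timeouts lets slow or stalled clients hold connections, and the goroutines serving them, open indefinitely. The sketch below sets the standard timeout fields. The helper name `newServer`, the address, and the concrete durations are illustrative assumptions, not the values kubenurse uses.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newServer returns an http.Server with explicit timeouts.
// All durations here are assumed example values.
func newServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           h,
		ReadHeaderTimeout: 5 * time.Second,  // limit for reading request headers
		ReadTimeout:       10 * time.Second, // limit for reading the whole request
		WriteTimeout:      10 * time.Second, // limit for writing the response
		IdleTimeout:       60 * time.Second, // limit for idle keep-alive connections
	}
}

func main() {
	srv := newServer(":8080", http.NewServeMux())
	fmt.Println("read:", srv.ReadTimeout, "write:", srv.WriteTimeout, "idle:", srv.IdleTimeout)
}
```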
- common: expose metrics from the kubenurse httpclient (#31) (ebb07646)
The following new metrics were added:
- kubenurse_httpclient_requests_total - Total issued requests by kubenurse, partitioned by http code/method.
- kubenurse_httpclient_trace_request_duration_seconds - Latency histogram for requests from the kubenurse httpclient, partitioned by event.
- httpclient_request_duration_seconds - Latency histogram of request latencies from the kubenurse httpclient.
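One common way to collect per-request client metrics like these is to wrap the client's `http.RoundTripper`. The sketch below illustrates that idea with plain in-memory counters in place of a metrics library, so it is not how kubenurse records its metrics. The names `countingTransport`, `fakeOK`, and `issue`, and the URL, are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
	"time"
)

// roundTripperFunc adapts a plain function to http.RoundTripper.
type roundTripperFunc func(*http.Request) (*http.Response, error)

func (f roundTripperFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// label partitions the counters by HTTP method and status code.
type label struct {
	method string
	code   int
}

// countingTransport wraps another RoundTripper and records how many
// requests were issued per label and how long they took in total.
type countingTransport struct {
	next http.RoundTripper

	mu       sync.Mutex
	requests map[label]int
	elapsed  map[label]time.Duration
}

func newCountingTransport(next http.RoundTripper) *countingTransport {
	return &countingTransport{
		next:     next,
		requests: make(map[label]int),
		elapsed:  make(map[label]time.Duration),
	}
}

func (t *countingTransport) RoundTrip(r *http.Request) (*http.Response, error) {
	start := time.Now()
	resp, err := t.next.RoundTrip(r)
	if err != nil {
		return nil, err
	}
	l := label{r.Method, resp.StatusCode}
	t.mu.Lock()
	t.requests[l]++
	t.elapsed[l] += time.Since(start)
	t.mu.Unlock()
	return resp, nil
}

// count reports how many requests were seen for a method/code pair.
func (t *countingTransport) count(method string, code int) int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.requests[label{method, code}]
}

// fakeOK is a stand-in transport that answers every request with 200,
// so the example runs without any network access.
func fakeOK() http.RoundTripper {
	return roundTripperFunc(func(r *http.Request) (*http.Response, error) {
		return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Request: r}, nil
	})
}

// issue sends one request through a client that uses the given transport.
func issue(t *countingTransport, method, url string) error {
	req, err := http.NewRequest(method, url, nil)
	if err != nil {
		return err
	}
	resp, err := (&http.Client{Transport: t}).Do(req)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

func main() {
	ct := newCountingTransport(fakeOK())
	for i := 0; i < 2; i++ {
		if err := issue(ct, http.MethodGet, "http://kubenurse.invalid/alive"); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("GET 200 requests:", ct.count(http.MethodGet, http.StatusOK))
}
```

Wrapping the transport keeps the instrumentation out of the calling code: every request made through the client is counted without changes at the call sites.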
- common: 7beac307: rewrite and cleanup kubenurse server code (#29)
  - refactor!: rewrite and cleanup kubenurse server code
    By using a package and multiple separate files the code is easier to understand and test. A new /ready handler was added so we can configure a readiness probe to allow seamless updates of kubenurse.
  - build: update golangci-lint version
  - build: update golangci-lint timeout, default is too short
  - build: extract lint step and use go version 1.17
  - feat: configure new readinessprobe in kustomize and helm templates
  - fix: linter errors
  - chore: cleanup, remove not needed WaitGroup
  - refactor!: move pkg/kubediscovery to internal/kubediscovery
  - refactor!: move pkg/checker to internal/servicecheck
  - refactor!: incorporate pkg/metrics in internal/servicecheck
  - refactor!: more refactorings to allow easier unit testing
  - feat: more unit tests and coverage calculation in workflows
  - docs: include ci and coverage badges in readme
  - docs: fix coverage status URL
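To illustrate how a /ready handler can support seamless updates, the sketch below returns 200 while the process is serving and 503 once shutdown has begun, so that a readiness probe takes the pod out of rotation. That switch to 503 during shutdown is an assumption for this example, not confirmed behaviour of the kubenurse handler. The `readiness` type and `statusOf` helper are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// readiness is a hypothetical handler for the /ready endpoint.
type readiness struct {
	shuttingDown atomic.Bool // set to true once shutdown has started
}

// ServeHTTP answers 200 while ready and 503 after shutdown has begun.
func (r *readiness) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	if r.shuttingDown.Load() {
		http.Error(w, "shutting down", http.StatusServiceUnavailable)
		return
	}
	fmt.Fprintln(w, "ready")
}

// statusOf calls the handler in memory and returns the HTTP status code.
func statusOf(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/ready", nil))
	return rec.Code
}

func main() {
	r := &readiness{}
	fmt.Println("before shutdown:", statusOf(r))
	r.shuttingDown.Store(true)
	fmt.Println("during shutdown:", statusOf(r))
}
```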
- examples: Bump kubenurse version to v1.4.0 (6f1228c0)
- examples: Bump kubenurse version to v1.3.4 (4e0a4c33)
- discovery: Prevent panic when checking for schedulable nodes only (2243226b)
- examples: Bump kubenurse version to v1.3.3 (c13ebc11)
- common: CI improvements and RBAC fixes (394daf19)
- common: Flag to consider kubenurses on unschedulable nodes (cd9ac29b)
- common: remove unwanted linter configuration (d9284394)
- common: exclude nodes which are not schedulable from neighbour checks (b6acb939)
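The node filtering described in the entries above, together with the flag that opts back in to unschedulable nodes, might be sketched like this. The `Node` type, the `neighbourNodes` function, and the `allowUnschedulable` parameter are hypothetical names for illustration. Only the notion of a node being marked unschedulable (for example after `kubectl cordon`) comes from Kubernetes itself.

```go
package main

import "fmt"

// Node is a hypothetical, reduced stand-in for a Kubernetes node.
type Node struct {
	Name          string
	Unschedulable bool // mirrors the node's unschedulable setting
}

// neighbourNodes returns the nodes whose kubenurse pods should be checked.
// Unschedulable nodes are skipped unless allowUnschedulable is true.
func neighbourNodes(nodes []Node, allowUnschedulable bool) []Node {
	out := make([]Node, 0, len(nodes))
	for _, n := range nodes {
		if n.Unschedulable && !allowUnschedulable {
			continue
		}
		out = append(out, n)
	}
	return out
}

func main() {
	nodes := []Node{
		{Name: "node-1"},
		{Name: "node-2", Unschedulable: true},
	}
	fmt.Println("default:", len(neighbourNodes(nodes, false)), "node(s)")
	fmt.Println("with flag:", len(neighbourNodes(nodes, true)), "node(s)")
}
```

Returning an empty, non-nil slice for empty input keeps callers from having to special-case a cluster where every node is cordoned.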