Jittering the periods of some of the kubelet's sync loops #20726

Merged
13 changes: 11 additions & 2 deletions pkg/kubelet/pod_workers.go
@@ -28,6 +28,7 @@ import (
 	"k8s.io/kubernetes/pkg/kubelet/util/queue"
 	"k8s.io/kubernetes/pkg/types"
 	"k8s.io/kubernetes/pkg/util/runtime"
+	"k8s.io/kubernetes/pkg/util/wait"
 )

@@ -39,6 +40,14 @@ type PodWorkers interface {

 type syncPodFnType func(*api.Pod, *api.Pod, *kubecontainer.PodStatus, kubetypes.SyncPodType) error

+const (
+	// jitter factor for resyncInterval
+	workerResyncIntervalJitterFactor = 0.5
+
+	// jitter factor for backOffPeriod
+	workerBackOffPeriodJitterFactor = 0.5
+)
+
 type podWorkers struct {
 	// Protects all per worker fields.
 	podLock sync.Mutex

Review thread on the jitter factor constants:

Member: So I think @yujuhong is the expert on which timers to mod here.

Contributor: Can we use a smaller factor, such as 0.5? I think that should be enough to distribute the sync times.

Contributor (author): Both factors decreased to 0.5.
@@ -209,10 +218,10 @@ func (p *podWorkers) wrapUp(uid types.UID, syncErr error) {
 	switch {
 	case syncErr == nil:
 		// No error; requeue at the regular resync interval.
-		p.workQueue.Enqueue(uid, p.resyncInterval)
+		p.workQueue.Enqueue(uid, wait.Jitter(p.resyncInterval, workerResyncIntervalJitterFactor))
 	default:
 		// Error occurred during the sync; back off and then retry.
-		p.workQueue.Enqueue(uid, p.backOffPeriod)
+		p.workQueue.Enqueue(uid, wait.Jitter(p.backOffPeriod, workerBackOffPeriodJitterFactor))
 	}
 	p.checkForUpdates(uid)
 }
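For context on the helper this hunk introduces: wait.Jitter returns a duration drawn uniformly from [d, d + maxFactor*d], so a factor of 0.5 spreads a 10s resync interval across 10–15s. Below is a minimal standalone sketch of that behavior; the jitter function is an illustration of the semantics, not the upstream implementation.

package main

import (
	"fmt"
	"math/rand"
	"time"
)

// jitter sketches wait.Jitter's behavior: it returns a duration in
// [d, d+maxFactor*d), so with a factor of 0.5 a 10s interval becomes
// anywhere from 10s up to 15s.
func jitter(d time.Duration, maxFactor float64) time.Duration {
	if maxFactor <= 0.0 {
		maxFactor = 1.0
	}
	return d + time.Duration(rand.Float64()*maxFactor*float64(d))
}

func main() {
	for i := 0; i < 3; i++ {
		fmt.Println(jitter(10*time.Second, 0.5))
	}
}

Because each requeue picks a fresh random offset, workers that happened to sync at the same moment drift apart over successive resyncs instead of staying in lockstep.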
8 changes: 7 additions & 1 deletion pkg/kubelet/prober/worker.go
@@ -17,6 +17,7 @@ limitations under the License.
 package prober

 import (
+	"math/rand"
 	"time"

 	"github.com/golang/glog"
@@ -93,7 +94,8 @@ func newWorker(

 // run periodically probes the container.
 func (w *worker) run() {
-	probeTicker := time.NewTicker(time.Duration(w.spec.PeriodSeconds) * time.Second)
+	probeTickerPeriod := time.Duration(w.spec.PeriodSeconds) * time.Second
+	probeTicker := time.NewTicker(probeTickerPeriod)

 	defer func() {
 		// Clean up.
@@ -105,6 +107,10 @@ func (w *worker) run() {
 		w.probeManager.removeWorker(w.pod.UID, w.container.Name, w.probeType)
 	}()

+	// If the kubelet restarted, the probes could be started in rapid succession.
+	// Let the worker wait for a random portion of probeTickerPeriod before probing.
+	time.Sleep(time.Duration(rand.Float64() * float64(probeTickerPeriod)))
+
 probeLoop:
 	for w.doProbe() {
 		// Wait for next probe tick.
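The same idea in isolation: stagger each worker's first run by a random fraction of its period, so workers created at the same moment (e.g. after a kubelet restart) do not all fire on the same ticks thereafter. A minimal sketch; startWorker and its parameters are hypothetical, not kubelet code.

package main

import (
	"fmt"
	"math/rand"
	"time"
)

// startWorker launches a hypothetical periodic worker that sleeps a
// uniform random delay in [0, period) before its first run. Without
// that sleep, tickers created together would all share the same phase.
func startWorker(id int, period time.Duration) {
	go func() {
		time.Sleep(time.Duration(rand.Float64() * float64(period)))
		ticker := time.NewTicker(period)
		defer ticker.Stop()
		for range ticker.C {
			fmt.Printf("worker %d fired at %s\n", id, time.Now().Format("15:04:05.000"))
		}
	}()
}

func main() {
	for i := 0; i < 5; i++ {
		startWorker(i, 2*time.Second)
	}
	time.Sleep(5 * time.Second)
}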