distributed-system-testing-for-kubernetes

Research project to explore distributed system testing techniques in Kubernetes.

Workflow

For now, we seek to statically extract dependencies between reflectors/informers in Kubernetes. The approach has only been tested with the scheduler so far.

Collector

Kubetorch first collects all calls to AddEventHandler and uses the corresponding handlers as the starting points for the tracker. In the example below, the collector recognizes sched.addPodToSchedulingQueue as the handler for the ADD event of the podInformer.

podInformer.Informer().AddEventHandler(
    cache.FilteringResourceEventHandler{
        FilterFunc: ...,
        Handler: cache.ResourceEventHandlerFuncs{
            AddFunc:    sched.addPodToSchedulingQueue,
            UpdateFunc: sched.updatePodInSchedulingQueue,
            DeleteFunc: sched.deletePodFromSchedulingQueue,
        },
    },
)
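
One plausible way to collect these call sites is a plain AST scan. The sketch below is illustrative only: it assumes the collector is built on go/ast and golang.org/x/tools/go/packages, and the package pattern it loads is an example, not the project's actual configuration.

package main

import (
	"fmt"
	"go/ast"

	"golang.org/x/tools/go/packages"
)

func main() {
	// Load the scheduler packages with syntax trees and type information.
	cfg := &packages.Config{
		Mode: packages.NeedFiles | packages.NeedSyntax | packages.NeedTypes | packages.NeedTypesInfo,
	}
	pkgs, err := packages.Load(cfg, "k8s.io/kubernetes/pkg/scheduler/...")
	if err != nil {
		panic(err)
	}
	for _, pkg := range pkgs {
		for _, file := range pkg.Syntax {
			ast.Inspect(file, func(n ast.Node) bool {
				call, ok := n.(*ast.CallExpr)
				if !ok {
					return true
				}
				sel, ok := call.Fun.(*ast.SelectorExpr)
				if !ok || sel.Sel.Name != "AddEventHandler" {
					return true
				}
				// The handler literal passed here carries the AddFunc/UpdateFunc/
				// DeleteFunc fields that become starting points for the tracker.
				fmt.Printf("AddEventHandler call at %s\n", pkg.Fset.Position(call.Pos()))
				return true
			})
		}
	}
}

A type-checked variant (resolving the callee to the informer's AddEventHandler method) would avoid false matches on unrelated methods of the same name; the purely name-based match keeps the sketch short.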

Tracker

The tracker starts by analyzing the handlers.

For each handler, we identify the write points of all non-local variables. That is, the tracker pushes every non-local variable that is modified inside the handler onto a work queue.

In the example below, sched.SchedulingQueue will be pushed onto the queue.

func (sched *Scheduler) addPodToSchedulingQueue(obj interface{}) {
	pod := obj.(*v1.Pod)
	klog.V(3).Infof("add event for unscheduled pod %s/%s", pod.Namespace, pod.Name)
	if err := sched.SchedulingQueue.Add(pod); err != nil {
		utilruntime.HandleError(fmt.Errorf("unable to queue %T: %v", obj, err))
	}
}
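
The write points themselves could be found with a simple walk over the handler body. The helper below is a hedged sketch in the same hypothetical go/ast-based setup as the collector sketch above: it treats any selector chain rooted at the method receiver that is assigned to, or used as the receiver of a call (such as sched.SchedulingQueue.Add), as a write point. It over-approximates, since a read-only method call on the same chain would also match.

// Sketch: collect selector chains rooted at the method receiver that look like
// write points inside a handler. Over-approximates: read-only method calls on
// the same chain are also reported. Assumes `import "go/ast"`.
func writeTargets(handler *ast.FuncDecl) []string {
	if handler.Recv == nil || len(handler.Recv.List[0].Names) == 0 {
		return nil
	}
	recv := handler.Recv.List[0].Names[0].Name // e.g. "sched"
	var targets []string
	ast.Inspect(handler.Body, func(n ast.Node) bool {
		switch node := n.(type) {
		case *ast.AssignStmt:
			// Direct assignment to a field of the receiver.
			for _, lhs := range node.Lhs {
				if name, ok := rootedAt(lhs, recv); ok {
					targets = append(targets, name)
				}
			}
		case *ast.CallExpr:
			// Method call on a field of the receiver, e.g. sched.SchedulingQueue.Add(pod).
			if sel, ok := node.Fun.(*ast.SelectorExpr); ok {
				if name, ok := rootedAt(sel.X, recv); ok {
					targets = append(targets, name)
				}
			}
		}
		return true
	})
	return targets
}

// rootedAt reports whether expr is a selector chain starting at the receiver
// identifier, returning it as a dotted string such as "sched.SchedulingQueue".
func rootedAt(expr ast.Expr, recv string) (string, bool) {
	switch e := expr.(type) {
	case *ast.Ident:
		if e.Name == recv {
			return recv, true
		}
	case *ast.SelectorExpr:
		if prefix, ok := rootedAt(e.X, recv); ok {
			return prefix + "." + e.Sel.Name, true
		}
	}
	return "", false
}

For addPodToSchedulingQueue above, writeTargets would report a single entry, "sched.SchedulingQueue".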

For each variable in the queue, we then identify its read points, i.e., the tracker finds where the variable is read.

In the example below, sched.NextPod() reads from sched.SchedulingQueue (some details are omitted here).

func (sched *Scheduler) scheduleOne(ctx context.Context) {
    podInfo := sched.NextPod()
    // pod could be nil when schedulerQueue is closed
    if podInfo == nil || podInfo.Pod == nil {
        return
    }
    pod := podInfo.Pod
    ...
}
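
Read-point detection can follow the same pattern: scan the other methods of the scheduler for occurrences of a queued selector chain. The sketch below reuses rootedAt from the previous snippet and is deliberately naive; an indirection such as sched.NextPod, which is wired to the scheduling queue when the scheduler is constructed, needs extra resolution that is omitted here.

// Sketch: report methods (other than the writing handler) whose bodies read a
// queued field path such as "SchedulingQueue". Indirect reads via closures or
// function-valued fields are not resolved by this naive version.
func readPoints(methods []*ast.FuncDecl, fieldPath string, writer *ast.FuncDecl) []*ast.FuncDecl {
	var readers []*ast.FuncDecl
	for _, fn := range methods {
		if fn == writer || fn.Body == nil || fn.Recv == nil || len(fn.Recv.List[0].Names) == 0 {
			continue
		}
		recv := fn.Recv.List[0].Names[0].Name
		found := false
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			if expr, ok := n.(ast.Expr); ok {
				if name, ok := rootedAt(expr, recv); ok && name == recv+"."+fieldPath {
					found = true
				}
			}
			return true
		})
		if found {
			readers = append(readers, fn)
		}
	}
	return readers
}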

We then perform taint analysis starting from the read point. In the example above, we start from podInfo and track all the variables tainted by it.
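
A minimal version of this taint step could be a fixpoint over the assignments in the function body. The helper below is only a sketch (it ignores aliasing and flow into other functions) and again assumes the go/ast setup from the earlier snippets.

// Sketch: naive intra-procedural taint propagation. Starting from a seed
// variable (e.g. "podInfo"), any variable assigned from a tainted expression
// becomes tainted; aliasing and interprocedural flow are ignored.
func taintedVars(body *ast.BlockStmt, seed string) map[string]bool {
	tainted := map[string]bool{seed: true}
	for changed := true; changed; { // iterate to a fixpoint
		changed = false
		ast.Inspect(body, func(n ast.Node) bool {
			assign, ok := n.(*ast.AssignStmt)
			if !ok {
				return true
			}
			// Is any identifier on the right-hand side already tainted?
			rhsTainted := false
			for _, rhs := range assign.Rhs {
				ast.Inspect(rhs, func(m ast.Node) bool {
					if id, ok := m.(*ast.Ident); ok && tainted[id.Name] {
						rhsTainted = true
					}
					return true
				})
			}
			if !rhsTainted {
				return true
			}
			// If so, taint every identifier on the left-hand side.
			for _, lhs := range assign.Lhs {
				if id, ok := lhs.(*ast.Ident); ok && !tainted[id.Name] {
					tainted[id.Name] = true
					changed = true
				}
			}
			return true
		})
	}
	return tainted
}

Seeding with podInfo in scheduleOne above taints pod through the assignment pod := podInfo.Pod.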

The tracker keeps tracking until it hits a predefined termination point. Termination points are the methods that issue a RESTful API call to change some resource on the apiserver. In the example below, pod is tainted by the earlier podInfo, and extender.Bind issues a RESTful POST that changes the pod resource. (More details on how we identify these termination points will follow.)

func (sched *Scheduler) extendersBinding(pod *v1.Pod, node string) (bool, error) {
    for _, extender := range sched.Algorithm.Extenders() {
        if !extender.IsBinder() || !extender.IsInterested(pod) {
            continue
        }
        return true, extender.Bind(&v1.Binding{
            ObjectMeta: metav1.ObjectMeta{Namespace: pod.Namespace, Name: pod.Name, UID: pod.UID},
            Target:     v1.ObjectReference{Kind: "Node", Name: node},
        })
    }
    return false, nil
}
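
How termination points are identified is still to be described; one plausible approach is to match callee names against a hand-maintained table of REST-issuing methods and the HTTP verbs they map to. Both the table contents and the name-based matching below are illustrative assumptions, not the project's actual criteria.

// Sketch: a hand-maintained table mapping method names that are assumed to
// issue REST requests (directly or via client/extender objects) to HTTP verbs.
var restVerbs = map[string]string{
	"Bind":   "POST",
	"Create": "POST",
	"Update": "PUT",
	"Patch":  "PATCH",
	"Delete": "DELETE",
}

// isTermination reports whether a call such as extender.Bind(...) looks like a
// termination point, and which HTTP verb it corresponds to.
func isTermination(call *ast.CallExpr) (string, bool) {
	sel, ok := call.Fun.(*ast.SelectorExpr)
	if !ok {
		return "", false
	}
	verb, ok := restVerbs[sel.Sel.Name]
	return verb, ok
}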

At this point we have a chain starting from the handler addPodToSchedulingQueue and ending at the RESTful POST call in extendersBinding. Combined with the collector's result, we know that the ADD event of the podInformer can lead to extendersBinding, which changes the pod resource in the system.
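
The end result can be recorded as one dependency edge per discovered chain. The struct below is only a sketch of what such a record might look like; the field names are illustrative.

// Sketch: one dependency edge per discovered chain.
type Dependency struct {
	Informer    string // "podInformer"
	Event       string // "ADD"
	Handler     string // "addPodToSchedulingQueue"
	Termination string // "extendersBinding"
	Verb        string // "POST"
}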

How to run
