Write AppsV1DaemonSet resource lifecycle test - +5 endpoint coverage #90877

Closed
1 of 5 tasks
riaankleinhans opened this issue May 8, 2020 · 17 comments
Labels
area/conformance Issues or PRs related to kubernetes conformance tests sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/testing Categorizes an issue or PR as relevant to SIG Testing.

Comments

@riaankleinhans
Contributor

riaankleinhans commented May 8, 2020

This issue was created to allow edits by @Riaankl

/wip
/hold

Identifying an untested feature using APISnoop

According to this APISnoop query, several DaemonSet endpoints remain untested.

SELECT
  operation_id,
  -- k8s_action,
  -- path,
  -- description,
  kind
  -- FROM untested_stable_core_endpoints
  FROM untested_stable_endpoints
  where path not like '%volume%'
  and kind like 'DaemonSet'
  -- and operation_id ilike '%%'
 ORDER BY kind,operation_id desc
 -- LIMIT 25
       ;
               operation_id                |   kind    
-------------------------------------------+-----------
 replaceAppsV1NamespacedDaemonSetStatus    | DaemonSet
 replaceAppsV1NamespacedDaemonSet          | DaemonSet
 readAppsV1NamespacedDaemonSetStatus       | DaemonSet
 readAppsV1NamespacedDaemonSet             | DaemonSet
 patchAppsV1NamespacedDaemonSetStatus      | DaemonSet
 patchAppsV1NamespacedDaemonSet            | DaemonSet
 listAppsV1DaemonSetForAllNamespaces       | DaemonSet
 deleteAppsV1NamespacedDaemonSet           | DaemonSet
 deleteAppsV1CollectionNamespacedDaemonSet | DaemonSet
 createAppsV1NamespacedDaemonSet           | DaemonSet
(10 rows)

API Reference and feature documentation

The mock test

Test outline

  1. Create a DaemonSet with a static label
  2. Patch the DaemonSet with a new Label and updated data
  3. Get the DaemonSet to ensure it's patched
  4. Update the DaemonSet
  5. List all DaemonSets in all Namespaces, find the DaemonSet created in step 1, and ensure that it is found and has been updated
  6. Delete the DaemonSet created in step 1 via DeleteCollection with a LabelSelector

Test the functionality in Go

package main

import (
  "encoding/json"
  "fmt"
  "flag"
  "os"
  v1 "k8s.io/api/core/v1"
  appsv1 "k8s.io/api/apps/v1"
  // "k8s.io/client-go/dynamic"
  // "k8s.io/apimachinery/pkg/runtime/schema"
  metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
  "k8s.io/client-go/kubernetes"
  "k8s.io/apimachinery/pkg/types"
  "k8s.io/client-go/tools/clientcmd"
  watch "k8s.io/apimachinery/pkg/watch"
)

func main() {
  // uses the current context in kubeconfig
  kubeconfig := flag.String("kubeconfig", fmt.Sprintf("%v/%v/%v", os.Getenv("HOME"), ".kube", "config"), "(optional) absolute path to the kubeconfig file")
  flag.Parse()
  config, err := clientcmd.BuildConfigFromFlags("", *kubeconfig)
  if err != nil {
      fmt.Println(err)
      return
  }
  // make our work easier to find in the audit_event queries
  config.UserAgent = "live-test-writing"
  // creates the clientset
  ClientSet, _ := kubernetes.NewForConfig(config)
  // DynamicClientSet, _ := dynamic.NewForConfig(config)
  // podResource := schema.GroupVersionResource{Group: "", Version: "v1", Resource: "pods"}

  // TEST BEGINS HERE

  testDaemonSetName := "testdaemonset"
  testDaemonSetImageInitial := "nginx"
  testDaemonSetImagePatch := "alpine"
  testDaemonSetImageUpdate := "httpd"
  testDaemonSetStaticLabel := map[string]string{"test-static": "true"}
  testDaemonSetStaticLabelFlat := "test-static=true"
  testDaemonSetSelector := map[string]string{"app": testDaemonSetName}
  testNamespaceName := "default"

  fmt.Println("creating a DaemonSet")
  testDaemonSet := appsv1.DaemonSet{
      ObjectMeta: metav1.ObjectMeta{
          Name: testDaemonSetName,
          Labels: testDaemonSetStaticLabel,
      },
      Spec: appsv1.DaemonSetSpec{
          Selector: &metav1.LabelSelector{
              MatchLabels: testDaemonSetSelector,
          },
          Template: v1.PodTemplateSpec{
              ObjectMeta: metav1.ObjectMeta{
                  Labels: testDaemonSetSelector,
              },
              Spec: v1.PodSpec{
                  Containers: []v1.Container{{
                      Name: testDaemonSetName,
                      Image: testDaemonSetImageInitial,
                  }},
              },
          },
      },
  }
  _, err = ClientSet.AppsV1().DaemonSets(testNamespaceName).Create(&testDaemonSet)
  if err != nil {
      fmt.Println(err, "failed to create DaemonSet")
      return
  }

  fmt.Println("watching for the DaemonSet to be added")
  resourceWatchTimeoutSeconds := int64(180)
  resourceWatch, err := ClientSet.AppsV1().DaemonSets(testNamespaceName).Watch(metav1.ListOptions{LabelSelector: testDaemonSetStaticLabelFlat, TimeoutSeconds: &resourceWatchTimeoutSeconds})
  if err != nil {
      fmt.Println(err, "failed to setup watch on newly created DaemonSet")
      return
  }

  resourceWatchChan := resourceWatch.ResultChan()
  for watchEvent := range resourceWatchChan {
      if watchEvent.Type == watch.Added {
          break
      }
  }
  fmt.Println("watching for DaemonSet readiness count to be equal to the desired count")
  for watchEvent := range resourceWatchChan {
      daemonset, ok := watchEvent.Object.(*appsv1.DaemonSet)
      if !ok {
          fmt.Println("failed to convert watchEvent.Object type")
          return
      }
      if daemonset.Status.NumberReady == daemonset.Status.DesiredNumberScheduled {
          break
      }
  }
  defer func() {
      fmt.Println("deleting the DaemonSet")
      err = ClientSet.AppsV1().DaemonSets(testNamespaceName).DeleteCollection(&metav1.DeleteOptions{}, metav1.ListOptions{LabelSelector: testDaemonSetStaticLabelFlat})
      if err != nil {
          fmt.Println(err)
          return
      }
      for watchEvent := range resourceWatchChan {
          daemonset, ok := watchEvent.Object.(*appsv1.DaemonSet)
          if !ok {
              fmt.Println("unable to convert watchEvent.Object type")
              return
          }
          if watchEvent.Type == watch.Deleted && daemonset.ObjectMeta.Name == testDaemonSetName {
              break
          }
      }
  }()

  fmt.Println("patching the DaemonSet")
  resourcePatch, err := json.Marshal(map[string]interface{}{
      "metadata": map[string]interface{}{
          "labels": map[string]string{"test-resource": "patched"},
      },
      "spec": map[string]interface{}{
          "template": map[string]interface{}{
              "spec": map[string]interface{}{
                  "containers": []map[string]interface{}{{
                      "name": testDaemonSetName,
                      "image": testDaemonSetImagePatch,
                      "command": []string{"/bin/sleep", "100000"},
                  }},
              },
          },
      },
  })
  if err != nil {
      fmt.Println(err, "failed marshal resource patch")
      return
  }
  _, err = ClientSet.AppsV1().DaemonSets(testNamespaceName).Patch(testDaemonSetName, types.StrategicMergePatchType, resourcePatch)
  if err != nil {
      fmt.Println(err, "failed to patch resource")
      return
  }
  for watchEvent := range resourceWatchChan {
     if watchEvent.Type == watch.Modified {
         break
     }
  }
  fmt.Println("watching for DaemonSet readiness count to be equal to the desired count")
  for watchEvent := range resourceWatchChan {
      daemonset, ok := watchEvent.Object.(*appsv1.DaemonSet)
      if !ok {
          fmt.Println("failed to convert watchEvent.Object type")
          return
      }
      if daemonset.Status.NumberReady == daemonset.Status.DesiredNumberScheduled {
          break
      }
  }

  fmt.Println("fetching the DaemonSet")
  ds, err := ClientSet.AppsV1().DaemonSets(testNamespaceName).Get(testDaemonSetName, metav1.GetOptions{})
  if err != nil {
      fmt.Println(err, "failed fetch resource")
      return
  }
  if ds.ObjectMeta.Labels["test-resource"] != "patched" {
      fmt.Println("failed to patch resource - missing patched label")
      return
  }
  if ds.Spec.Template.Spec.Containers[0].Image != testDaemonSetImagePatch {
      fmt.Println("failed to patch resource - missing patched image")
      return
  }
  if ds.Spec.Template.Spec.Containers[0].Command[0] != "/bin/sleep" {
      fmt.Println("failed to patch resource - missing patched command")
      return
  }

  fmt.Println("updating the DaemonSet")
  dsUpdate := ds
  dsUpdate.ObjectMeta.Labels["test-resource"] = "updated"
  dsUpdate.Spec.Template.Spec.Containers[0].Image = testDaemonSetImageUpdate
  dsUpdate.Spec.Template.Spec.Containers[0].Command = []string{}
  _, err = ClientSet.AppsV1().DaemonSets(testNamespaceName).Update(dsUpdate)
  if err != nil {
      fmt.Println(err, "failed to update resource")
      return
  }
  fmt.Println("watching for DaemonSet readiness count to be equal to the desired count")
  for watchEvent := range resourceWatchChan {
      daemonset, ok := watchEvent.Object.(*appsv1.DaemonSet)
      if !ok {
          fmt.Println("failed to convert watchEvent.Object type")
          return
      }
      if daemonset.Status.NumberReady == daemonset.Status.DesiredNumberScheduled {
          break
      }
  }

  fmt.Println("listing DaemonSets")
  dss, err := ClientSet.AppsV1().DaemonSets("").List(metav1.ListOptions{LabelSelector: testDaemonSetStaticLabelFlat})
  if err != nil {
      fmt.Println(err, "failed to list DaemonSets")
      return
  }
  if len(dss.Items) == 0 {
      fmt.Println("there are no DaemonSets found")
      return
  }
  for _, ds := range dss.Items {
      if ds.ObjectMeta.Labels["test-resource"] != "updated" {
          fmt.Println("failed to update resource - missing updated label")
          return
      }
      if ds.Spec.Template.Spec.Containers[0].Image != testDaemonSetImageUpdate {
          fmt.Println("failed to update resource - missing updated image")
          return
      }
      if len(ds.Spec.Template.Spec.Containers[0].Command) != 0 {
          fmt.Println("failed to update resource - command was not cleared")
          return
      }
  }

  // TEST ENDS HERE

  fmt.Println("[status] complete")

}
creating a DaemonSet
watching for the DaemonSet to be added
watching for DaemonSet readiness count to be equal to the desired count
patching the DaemonSet
watching for DaemonSet readiness count to be equal to the desired count
fetching the DaemonSet
updating the DaemonSet
watching for DaemonSet readiness count to be equal to the desired count
listing DaemonSets
[status] complete
deleting the DaemonSet

Verifying the increase in coverage with APISnoop

Discover useragents:

select distinct useragent from audit_event where bucket='apisnoop' and useragent not like 'kube%' and useragent not like 'coredns%' and useragent not like 'kindnetd%' and useragent like 'live%';
     useragent     
-------------------
 live-test-writing
(1 row)

List endpoints hit by the test:

select * from endpoints_hit_by_new_test where useragent like 'live%'; 
     useragent     |               operation_id                | hit_by_ete | hit_by_new_test 
-------------------+-------------------------------------------+------------+-----------------
 live-test-writing | createAppsV1NamespacedDaemonSet           | f          |               1
 live-test-writing | deleteAppsV1CollectionNamespacedDaemonSet | f          |               1
 live-test-writing | listAppsV1DaemonSetForAllNamespaces       | f          |               1
 live-test-writing | listAppsV1NamespacedDaemonSet             | t          |               1
 live-test-writing | patchAppsV1NamespacedDaemonSet            | f          |               1
 live-test-writing | readAppsV1NamespacedDaemonSet             | f          |               1
 live-test-writing | replaceAppsV1NamespacedDaemonSet          | f          |               1
(7 rows)

Display endpoint coverage change:

select * from projected_change_in_coverage;
   category    | total_endpoints | old_coverage | new_coverage | change_in_number 
---------------+-----------------+--------------+--------------+------------------
 test_coverage |             445 |          181 |          187 |                6
(1 row)

Final notes

If a test with these calls gets merged, test coverage will go up by 6 points

This test is also created with the goal of conformance promotion.

/sig testing

/sig architecture

/area conformance

@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label May 8, 2020
@riaankleinhans
Contributor Author

Notes from Conformance Meeting:

  • It's not necessarily one Pod per node; the DaemonSet has a target set of schedulable nodes, so count and compare against that
  • Taints are considered destructive; there is a utility to get a list of schedulable nodes (see the sketch after these notes)
  • The number of Pods ready should equal the number scheduled
  • Compare what the DaemonSet is actually doing vs. what the user expects
  • As a user, I expect it to deploy to all the nodes the Pod is allowed to schedule on
  • Not so much which nodes it should be on vs. which nodes we don't schedule on
  • Keep track of the number of nodes it should be scheduled on
  • Look at scheduling tests, which may use [Serial] to ensure we don't get stepped on
  • You don't have control over the number of nodes that are schedulable
  • Control plane nodes may be tainted
  • Need some way to describe or express how many nodes it should run on

#89637 (comment)
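
To make the counting idea concrete, here is a rough sketch in the same context-free client-go style as the Go test above. The filtering rules (skip cordoned nodes and nodes carrying a NoSchedule or NoExecute taint) are a simplified assumption rather than the scheduler's full logic, and the e2e framework has its own helper for finding ready, schedulable nodes; the returned count would then be compared against ds.Status.DesiredNumberScheduled.

// countSchedulableNodes returns the number of nodes a DaemonSet Pod could
// plausibly land on: not cordoned and not carrying a NoSchedule/NoExecute taint.
// This is a simplified sketch, not the authoritative scheduling logic.
func countSchedulableNodes(clientSet *kubernetes.Clientset) (int, error) {
    nodes, err := clientSet.CoreV1().Nodes().List(metav1.ListOptions{})
    if err != nil {
        return 0, err
    }
    schedulable := 0
    for _, node := range nodes.Items {
        if node.Spec.Unschedulable {
            continue
        }
        tainted := false
        for _, taint := range node.Spec.Taints {
            if taint.Effect == v1.TaintEffectNoSchedule || taint.Effect == v1.TaintEffectNoExecute {
                tainted = true
                break
            }
        }
        if !tainted {
            schedulable++
        }
    }
    return schedulable, nil
}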

@riaankleinhans
Contributor Author

/assign @raymonddeng99

@k8s-ci-robot
Contributor

@Riaankl: GitHub didn't allow me to assign the following users: raymonddeng99.

Note that only kubernetes members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @raymonddeng99

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@riaankleinhans
Contributor Author

This mock test needs to be updated with the above for review at the next conformance meeting.

@riaankleinhans
Contributor Author

/sig testing

/sig architecture

/area conformance

@k8s-ci-robot k8s-ci-robot added sig/testing Categorizes an issue or PR as relevant to SIG Testing. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. area/conformance Issues or PRs related to kubernetes conformance tests and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 8, 2020
@hh hh added this to Issues To Triage in conformance-definition May 8, 2020
@raymonddeng99

/assign

Thanks @Riaankl !

@raymonddeng99 raymonddeng99 removed their assignment May 30, 2020
@riaankleinhans riaankleinhans moved this from Issues To Triage to Sorted Backlog / On hold Issues in conformance-definition Jun 17, 2020
@riaankleinhans
Contributor Author

/assign @BobyMCbobs

@riaankleinhans
Contributor Author

Notes from Conformance Meeting:
6 May 2020

  • It's not necessarily one Pod per node; the DaemonSet has a target set of schedulable nodes, so count and compare against that
  • Taints are considered destructive; there is a utility to get a list of schedulable nodes
  • The number of Pods ready should equal the number scheduled
  • Compare what the DaemonSet is actually doing vs. what the user expects
  • As a user, I expect it to deploy to all the nodes the Pod is allowed to schedule on
  • Not so much which nodes it should be on vs. which nodes we don't schedule on
  • Keep track of the number of nodes it should be scheduled on
  • Look at scheduling tests, which may use [Serial] to ensure we don't get stepped on
  • You don't have control over the number of nodes that are schedulable
  • Control plane nodes may be tainted
  • Need some way to describe or express how many nodes it should run on

@BobyMCbobs
Member

Uncertainties:

  • how many DaemonSet Pods should be expected for the test?
  • should Nodes be labeled to cap the number of expected DaemonSet Pods? Is there a risk to this? Is it behaviour that is allowed in conformance? (a rough nodeSelector-based sketch follows below)
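
If labelling Nodes turns out to be acceptable, the nodeSelector-based sketch below shows one possible shape, reusing the client-go style of the test above. The label key daemonset-test-target is invented for this sketch, and whether relabelling Nodes is safe in conformance is exactly the open question noted here.

// capDaemonSetToLabeledNodes labels a single node and returns a copy of the
// DaemonSet restricted to that node via a nodeSelector, so exactly one Pod is
// expected. The label key is an assumption made up for this sketch.
func capDaemonSetToLabeledNodes(clientSet *kubernetes.Clientset, ds *appsv1.DaemonSet) (*appsv1.DaemonSet, error) {
    nodes, err := clientSet.CoreV1().Nodes().List(metav1.ListOptions{})
    if err != nil {
        return nil, err
    }
    if len(nodes.Items) == 0 {
        return nil, fmt.Errorf("no nodes found")
    }
    node := nodes.Items[0]
    if node.Labels == nil {
        node.Labels = map[string]string{}
    }
    node.Labels["daemonset-test-target"] = "true"
    if _, err := clientSet.CoreV1().Nodes().Update(&node); err != nil {
        return nil, err
    }
    capped := ds.DeepCopy()
    capped.Spec.Template.Spec.NodeSelector = map[string]string{"daemonset-test-target": "true"}
    return capped, nil
}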

@riaankleinhans riaankleinhans changed the title Write AppsV1DaemonSet resource lifecycle test+promote - +6 endpoint coverage Write AppsV1DaemonSet resource lifecycle test - +6 endpoint coverage Jun 28, 2020
@hh
Member

hh commented Jun 29, 2020

Asked for feedback in Slack via #k8s-conformance

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 27, 2020
@riaankleinhans
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 27, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 27, 2020
@riaankleinhans
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 6, 2021
@riaankleinhans riaankleinhans moved this from Sorted Backlog / On hold Issues to In Progress /Active Issues in conformance-definition Jan 13, 2021
@riaankleinhans riaankleinhans moved this from In Progress /Active Issues to Sorted Backlog / On hold Issues in conformance-definition Jan 25, 2021
@riaankleinhans
Contributor Author

Feedback from the Conformance meeting, 26 Jan 2021:
Looks like a good test to write once it is updated as proposed around watch events (see the watch-based sketch below).
Once updated, a PR will be created for the test.
@spiffxp is a little concerned that this type of test might cause flakes, but will comment if more solid data becomes available.
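
A minimal sketch of the watch-event-driven readiness check being proposed, in the same context-free client-go style as the test above. The three-minute timeout and the readiness condition are assumptions; on top of the test's existing imports it would also need "context", "time" and watchtools ("k8s.io/client-go/tools/watch").

// waitForDaemonSetReady consumes watch events until NumberReady matches
// DesiredNumberScheduled (and at least one Pod is desired), or the timeout hits.
func waitForDaemonSetReady(clientSet *kubernetes.Clientset, namespace, labelSelector string) error {
    ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute)
    defer cancel()
    w, err := clientSet.AppsV1().DaemonSets(namespace).Watch(metav1.ListOptions{LabelSelector: labelSelector})
    if err != nil {
        return err
    }
    _, err = watchtools.UntilWithoutRetry(ctx, w, func(event watch.Event) (bool, error) {
        ds, ok := event.Object.(*appsv1.DaemonSet)
        if !ok {
            return false, fmt.Errorf("unexpected object type %T", event.Object)
        }
        return ds.Status.DesiredNumberScheduled > 0 &&
            ds.Status.NumberReady == ds.Status.DesiredNumberScheduled, nil
    })
    return err
}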

@riaankleinhans
Contributor Author

/assign @heyste

@riaankleinhans riaankleinhans changed the title Write AppsV1DaemonSet resource lifecycle test - +6 endpoint coverage Write AppsV1DaemonSet resource lifecycle test - +5 endpoint coverage Feb 3, 2021
@riaankleinhans riaankleinhans moved this from Sorted Backlog / On hold Issues to In Progress /Active Issues in conformance-definition Mar 16, 2021
@riaankleinhans
Contributor Author

/close
Changed into a "Status"-only test
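
For reference, a hedged sketch of what a "Status"-only flow could look like, slotted into the test program above and reusing its ClientSet, namespace and DaemonSet name. The annotation payload and the ObservedGeneration tweak are illustrative assumptions, not the merged conformance test.

  // patchAppsV1NamespacedDaemonSetStatus: note the trailing "status" subresource argument
  statusPatch, err := json.Marshal(map[string]interface{}{
      "metadata": map[string]interface{}{
          "annotations": map[string]string{"patchedstatus": "true"},
      },
  })
  if err != nil {
      fmt.Println(err, "failed to marshal status patch")
      return
  }
  patchedStatus, err := ClientSet.AppsV1().DaemonSets(testNamespaceName).Patch(testDaemonSetName, types.StrategicMergePatchType, statusPatch, "status")
  if err != nil {
      fmt.Println(err, "failed to patch DaemonSet status")
      return
  }

  // replaceAppsV1NamespacedDaemonSetStatus: send the object back through UpdateStatus
  patchedStatus.Status.ObservedGeneration = patchedStatus.Generation
  _, err = ClientSet.AppsV1().DaemonSets(testNamespaceName).UpdateStatus(patchedStatus)
  if err != nil {
      fmt.Println(err, "failed to update DaemonSet status")
      return
  }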

conformance-definition automation moved this from In Progress /Active Issues to Done Mar 22, 2021