Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: Fix TestCheckHostIsVMContainerCapable unstablity issue #8389

Merged
merged 1 commit into from Nov 8, 2023

Conversation

justxuewei
Copy link
Member

@justxuewei justxuewei commented Nov 7, 2023

TestCheckHostIsVMContainerCapable removes sysModuleDir to simulate a
case that the kernel modules are not loaded. However,
checkKernelModules() executes modprobe <module> if a module not
found in that directory. Loading those modules is required to be denied
temporarily.

Fixes: #8390

Signed-off-by: Xuewei Niu niuxuewei.nxw@antgroup.com

@katacontainersbot katacontainersbot added the size/small Small and simple task label Nov 7, 2023
@justxuewei justxuewei force-pushed the vm-capable-test branch 3 times, most recently from 2f2709d to 558780d Compare November 7, 2023 08:06
@justxuewei justxuewei changed the title [WIP] runtime: Fix TestCheckHostIsVMContainerCapable unstablity issue runtime: Fix TestCheckHostIsVMContainerCapable unstablity issue Nov 7, 2023
Copy link
Member

@fidencio fidencio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks @justxuewei!

Copy link
Member

@studychao studychao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@studychao
Copy link
Member

/test

Copy link
Member

@gkurz gkurz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking this temporarily as it already has two approvals.

@@ -322,6 +325,30 @@ func TestCheckHostIsVMContainerCapable(t *testing.T) {
err = hostIsVMContainerCapable(details)
assert.Nil(err)

// Remove required kernel modules and add them to denylist
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two remarks here :

  1. the "failure" test already removes the files on disk. This means they won't be loadable anymore and that we don't need to bother with a deny list,
  2. making sure that the modules are unloaded from memory is rather a requirement of the "failure" test. For clarity, I'd move this block under the // remove the modules to force a failure comment.

Copy link
Member Author

@justxuewei justxuewei Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing those simulated module files is not enough for passing the "failure" test. Let me give a more details about that:

  1. sysModuleDir is replaced with a temp dir, e.g. "/tmp/tmp.xxGmSePqj2".
  2. Call createModules(). The function just creates regular files to simulate system modules.
    if !d.isDir {
    err = createFile(d.path, d.contents)
    assert.NoError(err)
    }
  3. Remove sysModuleDir under the "// remove the modules to force a failure" comment.
  4. Call haveKernelModule(). This function checks existence of a kernel module. Three steps are involved:
    1. If a kernel module exists at sysModuleDir, return true. In this case, it won't return as the dir was removed.
      path := filepath.Join(sysModuleDir, module)
      if katautils.FileExists(path) {
      return true
      }
    2. Check if the current user is root. (Our test is started with sudo, it won't return here.)
    3. Try to load the module. Returns false if it fails. Please note that if a module was loaded in the memory or wasn't added to blacklist, loading module will be succeed.
      cmd := exec.Command(modProbeCmd, module)
      if output, err := cmd.CombinedOutput(); err != nil {
      kmodLog.WithError(err).WithField("output", string(output)).Warnf("modprobe insert module failed")
      return false
      }
    4. Return true.

I admit this patch introduces dirty works. But I have no idea how to pass the test without this. Wdyt? @gkurz

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plus, the blacklist file is restored after the function is completed. It won't poison environment.

@gkurz
Copy link
Member

gkurz commented Nov 7, 2023

@justxuewei any idea how this test can even pass without your fix ?

@justxuewei
Copy link
Member Author

justxuewei commented Nov 8, 2023

Things are getting complicated now. Commit dd530ba removes the failure test. Runtime's CI tests, make test and sudo -E PATH="$PATH" make test, will pass at the expense of declining coverage. This patch is still valuable to be reviewed and merged.

@fidencio
Copy link
Member

fidencio commented Nov 8, 2023

Folks, right now we have all the PRs being blocked on this issue.
We have 2 options here, from my personal point of view.

  1. Merge it, unblock other PRs, and figure out what's going on.
  2. Mark the static-checks that are failing as non-required, and have this properly fixed on this PR.

I'm fine with both options, but I'd like to unblock main ASAP, so I need the opinion from the reviewers on what to do.

@gkurz
Copy link
Member

gkurz commented Nov 8, 2023

Folks, right now we have all the PRs being blocked on this issue. We have 2 options here, from my personal point of view.

  1. Merge it, unblock other PRs, and figure out what's going on.
  2. Mark the static-checks that are failing as non-required, and have this properly fixed on this PR.

I'm fine with both options, but I'd like to unblock main ASAP, so I need the opinion from the reviewers on what to do.

Sorry for the delay. @justxuewei's clarifications are enough for now. Let's go with option 1 for the sake of unblocking everyone, but I really want this to be reworked again afterwards.

Copy link
Member

@gkurz gkurz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for unblocking the CI @justxuewei !

@gkurz
Copy link
Member

gkurz commented Nov 8, 2023

/test

TestCheckHostIsVMContainerCapable removes sysModuleDir to simulate a
case that the kernel modules are not loaded. However,
checkKernelModules() executes modprobe <module> if a module not
found in that directory. Loading those modules is required to be denied
temporarily.

Fixes: kata-containers#8390

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
@justxuewei
Copy link
Member Author

/test

@amshinde amshinde merged commit 268d4d6 into kata-containers:main Nov 8, 2023
135 of 142 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ok-to-test size/small Small and simple task
Projects
None yet
Development

Successfully merging this pull request may close these issues.

runtime: TestCheckHostIsVMContainerCapable is unstable
6 participants