vma stressors are still running after the test #343

Closed
Cypresslin opened this issue Dec 12, 2023 · 0 comments

stress-ng: V0.17.03

This issue was spotted on a dgx2 bare-metal server, with both Ubuntu Bionic (4.15.0-213-generic) and Ubuntu Focal (5.4.0-169-generic). I haven't tested newer releases yet.

Steps to reproduce:

  1. Exercise the vma stressor: ./stress-ng -v -t 5 --vma 4 --vma-ops 3000 --ignite-cpu --syslog --verbose --verify --oomable
  2. After the test finishes, wait about 5 minutes, then check for stress-ng-vma processes.
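
The check in step 2 can be scripted. This is only a minimal sketch: it assumes a ps that accepts -eo pid=,args= (procps does), and the helper name leftover_pids is made up for illustration; it is not part of stress-ng.

```shell
#!/bin/sh
# leftover_pids NAME
# Reads "PID ARGS..." lines on stdin (as printed by: ps -eo pid=,args=)
# and prints the PID of every process whose command starts with NAME.
# Hypothetical helper for this report, not a stress-ng tool.
leftover_pids() {
    awk -v name="$1" 'index($2, name) == 1 { print $1 }'
}

# Example use after the stress-ng run has returned
# (the 300 second wait mirrors the "about 5 minutes" in step 2):
#   sleep 300
#   ps -eo pid=,args= | leftover_pids stress-ng-vma
```

Matching on the start of the command, rather than grepping the whole line, keeps the grep or awk process itself out of the result.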

The vma test passes, but one of the stress-ng-vma processes is still alive afterwards.

# ./stress-ng -v -t 5 --vma 4 --vma-ops 3000 --ignite-cpu --syslog --verbose --verify --oomable
stress-ng: debug: [5392] invoked with './stress-ng -v -t 5 --vma 4 --vma-ops 3000 --ignite-cpu --syslog --verbose --verify --oomable' by user 0 'root'
stress-ng: debug: [5392] stress-ng 0.17.03 g8c39f5a2d9b1
stress-ng: debug: [5392] system: Linux akis 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64, gcc 7.5.0, glibc 2.27
stress-ng: debug: [5392] RAM total: 1.5T, RAM free: 1.5T, swap free: 0.0
stress-ng: debug: [5392] temporary file path: '/home/ubuntu/autotest/client/tmp/ubuntu_stress_smoke_test/src/stress-ng', filesystem type: ext2 (217607869 blocks available)
stress-ng: debug: [5392] CPUs have 4 idle states: C1, C1E, C6, POLL
stress-ng: debug: [5392] 96 processors online, 96 processors configured
stress-ng: info:  [5392] setting to a 5 secs run per stressor
stress-ng: debug: [5392] CPU data cache: L1: 32K, L2: 1024K, L3: 33792K
stress-ng: debug: [5392] cache allocate: shared cache buffer size: 67584K (LLC size x 2 NUMA nodes)
stress-ng: info:  [5392] dispatching hogs: 4 vma
stress-ng: debug: [5392] starting stressors
stress-ng: debug: [5392] 4 stressors started
stress-ng: debug: [5393] vma: [5393] started (instance 0 on CPU 11)
stress-ng: debug: [5394] vma: [5394] started (instance 1 on CPU 12)
stress-ng: debug: [5395] vma: [5395] started (instance 2 on CPU 13)
stress-ng: debug: [5396] vma: [5396] started (instance 3 on CPU 31)
stress-ng: debug: [5395] vma: [5395] exited (instance 2 on CPU 16)
stress-ng: debug: [5394] vma: [5394] exited (instance 1 on CPU 39)
stress-ng: debug: [5393] vma: [5393] exited (instance 0 on CPU 35)
stress-ng: debug: [5392] vma: [5393] terminated (success)
stress-ng: debug: [5392] vma: [5394] terminated (success)
stress-ng: debug: [5392] vma: [5395] terminated (success)
stress-ng: debug: [5396] vma: [5396] exited (instance 3 on CPU 31)
stress-ng: debug: [5392] vma: [5396] terminated (success)
stress-ng: debug: [5392] metrics-check: all stressor metrics validated and sane
stress-ng: info:  [5392] skipped: 0
stress-ng: info:  [5392] passed: 4: vma (4)
stress-ng: info:  [5392] failed: 0
stress-ng: info:  [5392] metrics untrustworthy: 0
stress-ng: info:  [5392] successful run completed in 0.07 secs

# ps aux | grep vma
root      5414 2005  0.0 359024  2772 pts/0    Sl   05:11  26:24 stress-ng-vma [run]
root      5711  0.0  0.0  14860  1108 pts/0    S+   05:13   0:00 grep --color=auto vma

# strace -p 5414
strace: Process 5414 attached
pause()                                 = ? ERESTARTNOHAND (To be restarted if no handler)
pause()                                 = ? ERESTARTNOHAND (To be restarted if no handler)
pause()                                 = ? ERESTARTNOHAND (To be restarted if no handler)
(repeats)

Bisecting shows that 4ed65aa is the first bad commit:
Author: Colin Ian King <colin.i.king@gmail.com>
Date: Tue Oct 17 10:42:38 2023 +0100

stress-vma: fix 32/64 bit address generation

The 64 bit address size check was incorrect, it always defaulted to
32 bits, so fix this. Also add better address generation for 32 and
64 bit addresses.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>

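In this commit, the (void)sleep(15); call before _exit(0) was changed to pause();. Since pause() only returns once a signal is delivered, this matches the pause() calls seen in the strace output above.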

Maybe this has something to do with the number of CPU cores? I can reproduce this on a 12-core Focal VM, but not on a Focal VM with only 2 cores.
