tests: fix schemes-filters cleanup and prcl sub-test #49

Merged
sjp38 merged 2 commits into damonitor:next from KunWuChan:next on May 22, 2026

Conversation

@KunWuChan

Fix two issues in the schemes-filters test that caused cascade failures
when running ./tests/run.sh.

Problem 1: stale DAMON on test failure

test_filters() starts DAMON via damo start and then checks the
workload RSS. If the check failed, it called exit 1 without
stopping DAMON or killing the workload. The next test then failed:

  FAIL record-validate "sleep 5" 5 none sysfs
  (damo-record command failed with value 254)
  could not turn DAMON on (writing on to .../state failed
  ([Errno 16] Device or resource busy))

Fix: stop DAMON and kill the workload before returning on all three
failure paths, and propagate the return value to the caller. Also
add a defensive DAMON stop at the start of run.sh.
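
The cleanup-before-return pattern described above could be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: `cleanup_subtest`, `run_subtest`, and the `STOP_DAMON` variable are hypothetical names, and the RSS check is passed in as an arbitrary command.

```shell
# Hypothetical sketch; names are illustrative, not the real test_filters() code.

# Command that stops DAMON. Overridable so the sketch can run without damo.
STOP_DAMON=${STOP_DAMON:-"damo stop"}

cleanup_subtest() {
	cleanup_pid=$1
	# Stop DAMON so the next test does not hit EBUSY when turning it on.
	$STOP_DAMON > /dev/null 2>&1 || true
	# Kill and reap the workload so it does not outlive the failed test.
	kill "$cleanup_pid" 2> /dev/null || true
	wait "$cleanup_pid" 2> /dev/null || true
}

# Usage: run_subtest <workload pid> <check command...>
# Returns the check's verdict (0 pass, 1 fail) after cleaning up either way.
run_subtest() {
	subtest_pid=$1
	shift
	if "$@"
	then
		subtest_ret=0
	else
		subtest_ret=1
	fi
	cleanup_subtest "$subtest_pid"
	return "$subtest_ret"
}
```

Because the function returns instead of calling `exit 1`, the caller decides how to report the failure, and DAMON is never left running behind a failed check.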

Problem 2: prcl sub-test uses paddr by default

The prcl sub-test used damo start -c prcl_damos.json without a
target, defaulting to paddr monitoring. The scheme targets the 0th
percentile (nr_accesses == 0). After the arm64 kernel TLB fix [1],
access detection works correctly and pages with non-zero access
counts no longer match, so the workload pages are never reclaimed:

  FAIL schemes-filters sysfs (prcl doesn't work: 1049324)

Fix: start the workload first, then target it with its PID, which
switches to vaddr monitoring and tests pageout directly.
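
A minimal sketch of that ordering, assuming the PID can simply be appended as the target of the existing `damo start -c prcl_damos.json` command. The function name, the `DAMO` variable, and the exact argument order are assumptions for illustration, not taken from the patch.

```shell
# Hypothetical sketch; the exact damo argument order is an assumption.
DAMO=${DAMO:-damo}

# Usage: start_prcl_on_workload <workload command...>
# Leaves the workload's PID in $workload_pid for later RSS checks and cleanup.
start_prcl_on_workload() {
	# Start the workload first so that its PID is known.
	"$@" > /dev/null 2>&1 &
	workload_pid=$!
	# Give DAMON the PID as the target so it monitors that process's
	# virtual address space instead of the whole physical address space.
	$DAMO start -c prcl_damos.json "$workload_pid"
}
```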

[1] https://lore.kernel.org/damon/20260520113934.xxx/

Tested on

  • arm64 (Phytium, 128 cores, 7.0.0-next-20260424+)
  • ./tests/run.sh passes all tests
  • ./damon-tests/corr/run.sh passes all tests

Co-developed-by: Wang Lian <lianux.mm@gmail.com>
Signed-off-by: Kunwu Chan <kunwu.chan@gmail.com>

Kunwu Chan added 2 commits May 22, 2026 16:52
test_filters() starts DAMON via 'damo start' and then runs the
workload.  If the subsequent RSS check failed, it called 'exit 1'
without stopping DAMON or killing the workload, leaving kdamond
running.  The next test then failed with:

  $ ./tests/run.sh
  [...]
  FAIL record-validate "sleep 5" 5 none sysfs
  (could not turn DAMON on: [Errno 16] Device or resource busy)

Fix the three failure paths in test_filters() to stop DAMON and kill
the workload before returning, and propagate the return value to the
caller.  Also add a defensive DAMON stop at the start of run.sh.
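
One way a defensive stop at the start of run.sh might look is sketched below. It assumes the DAMON sysfs interface, where each kdamond exposes a `state` file reading `on` or `off`; the function names and the `DAMON_SYSFS` and `DAMO` variables are hypothetical, and the actual patch may simply call `damo stop` unconditionally.

```shell
# Hypothetical sketch; function and variable names are illustrative.
DAMON_SYSFS=${DAMON_SYSFS:-/sys/kernel/mm/damon/admin/kdamonds}
DAMO=${DAMO:-damo}

# Succeeds if any kdamond reports the "on" state.
damon_is_running() {
	for state_file in "$DAMON_SYSFS"/*/state
	do
		[ -f "$state_file" ] || continue
		if [ "$(cat "$state_file")" = "on" ]
		then
			return 0
		fi
	done
	return 1
}

# Stop DAMON only when a previous run left it on.
stop_stale_damon() {
	if damon_is_running
	then
		echo "stale DAMON found, stopping it"
		$DAMO stop
	fi
}
```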

Co-developed-by: Wang Lian <lianux.mm@gmail.com>
Signed-off-by: Kunwu Chan <kunwu.chan@gmail.com>

The prcl sub-test used 'damo start -c prcl_damos.json' without a
target, which defaults to paddr (physical address space) monitoring.
The scheme targets regions in the 0th percentile (nr_accesses == 0).
After the arm64 TLB fix, access detection works correctly and the
workload's pages have non-zero nr_accesses, so they are no longer in
the 0th percentile and the scheme never applies to them, causing:

  $ ./tests/run.sh
  [...]
  FAIL schemes-filters sysfs (prcl doesn't work: 1049324)

Fix by starting the workload first, then targeting it explicitly with
its PID, which switches to vaddr monitoring and accurately tests
pageout functionality.

The prcl_no_anon and prcl_no_cgroup sub-tests continue to use paddr;
their purpose is to verify that DAMOS filters block pageout, and the
expected result (RSS stays high) is satisfied regardless of whether
the scheme matches.
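
The difference between the two kinds of expectation can be sketched as below. The helper names and the threshold are assumptions for illustration; 1049324 is the value from the failure message above, assumed here to be RSS in KiB.

```shell
# Hypothetical sketch; the helper names and the threshold are assumptions.

# Print the resident set size of a process in KiB.
rss_kib() {
	ps -o rss= -p "$1" | tr -d ' '
}

# Usage: rss_as_expected <rss KiB> <low|high> <threshold KiB>
# "low" means pageout is expected to have reclaimed the workload's memory;
# "high" means a filter is expected to have blocked pageout.
rss_as_expected() {
	case "$2" in
	low)	[ "$1" -lt "$3" ] ;;
	high)	[ "$1" -ge "$3" ] ;;
	*)	return 2 ;;
	esac
}
```

A "high" expectation passes whenever RSS stays above the threshold, whether a filter blocked pageout or the scheme never matched any region, which is why only the plain prcl sub-test needed the switch to vaddr.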

Co-developed-by: Wang Lian <lianux.mm@gmail.com>
Signed-off-by: Kunwu Chan <kunwu.chan@gmail.com>
sjp38 merged commit e8b83f4 into damonitor:next on May 22, 2026
@sjp38
Contributor

sjp38 commented May 22, 2026

Awesome, thank you for this great PR. Looks great to me. Merged.

@sjp38
Contributor

sjp38 commented May 22, 2026

Also, this change may even close this issue: #49

2 participants