rt-migrate test failed on multi-core server #812

Closed
lijunliang opened this issue Apr 25, 2021 · 4 comments

@lijunliang

The rt-migrate test creates N+1 threads with real-time priorities, where N is the number of CPUs. On Linux distributions, SCHED_FIFO priorities range from 1 to 99. In our test scenario (128 CPUs), rt-migrate gets stuck.

@metan-ucw
Member

Looking at the code, it indeed does not seem to work when the number of CPUs is higher than the maximum sched_priority. I suppose the test needs a redesign.

@metan-ucw metan-ucw added the bug label May 14, 2021
@Martchus
Contributor

This is indeed reproducible via ./rt-migrate 99 (./rt-migrate 98 still works). The test gives task 0 prio 2, task 1 prio 3, and so on, so task 97 gets prio 99, which is the highest possible.

Could we simply max out on 98 threads (never attempt to spawn more threads even if there are more CPUs)?

We could also cap the priority so that all tasks from task 97 onward get the same priority of 99.
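
To make the two options concrete, here is a minimal sketch of the arithmetic. It is not the test's actual code: the names BASE_PRIO, MAX_FIFO_PRIO, cap_task_count and task_prio are hypothetical, and the numbers (task 0 gets prio 2, 99 is the highest priority) are taken from the comments above.

```c
#include <assert.h>

/* Values quoted in this thread; hard-coded here only for illustration. */
#define BASE_PRIO 2       /* priority handed to task 0 */
#define MAX_FIFO_PRIO 99  /* highest SCHED_FIFO priority on Linux */

/* Option 1 (hypothetical helper): never spawn more tasks than there are
 * distinct priorities between BASE_PRIO and MAX_FIFO_PRIO. */
static int cap_task_count(int requested)
{
    int max_tasks = MAX_FIFO_PRIO - BASE_PRIO + 1; /* 98 */

    return requested > max_tasks ? max_tasks : requested;
}

/* Option 2 (hypothetical helper): keep every task, but let all tasks from
 * index 97 onward share the highest priority. */
static int task_prio(int task_id)
{
    assert(task_id >= 0);

    int prio = task_id + BASE_PRIO;

    return prio > MAX_FIFO_PRIO ? MAX_FIFO_PRIO : prio;
}
```

With 128 CPUs the test would ask for 129 tasks. Option 1 reduces that to 98 tasks, while option 2 keeps all 129 and gives tasks 97 through 128 the same priority of 99, which may weaken what the test checks for those tasks.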

By the way, this test doesn't seem to use the new test API yet. Should one port it to the new API before doing anything, or is it OK to make changes within the old structure? I've also noticed that the test definitely needs better error handling, because it gets stuck on failures like pthread_create failed: 22 (Invalid argument) and should likely just exit with a broken result instead.
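
As a rough illustration of the error handling meant here, the sketch below checks a pthread return code and reports it so the caller can bail out instead of waiting for threads that never started. The helper name report_pthread_error is hypothetical and is not part of LTP or the realtime library; the message format simply mirrors the one quoted above.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: pthread functions return the error number directly
 * instead of setting errno, so the return code is passed in as-is.
 * Returns 0 on success, -1 after printing a diagnostic on failure. */
static int report_pthread_error(const char *what, int rc)
{
    if (rc == 0)
        return 0;

    fprintf(stderr, "%s failed: %d (%s)\n", what, rc, strerror(rc));
    return -1;
}

/* Intended use (assumption, shown as a comment to keep the sketch small):
 *
 *     int rc = pthread_create(&thread, &attr, start_routine, arg);
 *     if (report_pthread_error("pthread_create", rc))
 *         exit(EXIT_FAILURE);   // or report a broken result via the test API
 */
```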

@Martchus
Contributor

I went ahead and sent two patches to fix the most important issues with that test (the prio issue and the fact that it can easily segfault when one passes a negative number). Supposedly there's still further refactoring work wanted from your side, but maybe I'd better wait for feedback before getting too invested.

Martchus added a commit to Martchus/ltp that referenced this issue Sep 14, 2023
* According to the documentation the value param->sched_priority
  must lie within the range given by sched_get_priority_min(2) and
  sched_get_priority_max(2). This change ensures that this is the
  case without completely restructuring the test yet.
* See linux-test-project#812
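
The range check described in the commit message can be sketched as follows. This is an assumption about the general shape of such a fix, not the code from the referenced commit; clamp_fifo_prio is a hypothetical name, and SCHED_FIFO is assumed to be the policy in use.

```c
#include <sched.h>

/* Hypothetical helper: force a requested priority into the range the system
 * reports for SCHED_FIFO. Returns -1 if the range cannot be queried
 * (errno is then set by sched_get_priority_min/max). */
static int clamp_fifo_prio(int prio)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (lo == -1 || hi == -1)
        return -1;
    if (prio < lo)
        return lo;
    if (prio > hi)
        return hi;
    return prio;
}
```

Querying the limits at run time avoids hard-coding 1 and 99, which are the Linux values but are not guaranteed on other systems.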
@Martchus
Contributor

Note that refactoring this test to use the new test API would be a little more difficult (even though the helpers from the new API would be very worthwhile to use). The main problem is that the test library and the realtime library define a similar set of macros that need to be undefined manually to avoid conflicts. See Martchus@725c66c for a first draft and a workaround for the mentioned conflicts.
