Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: add support for moving devices between instances #477

Merged
merged 1 commit into from Feb 8, 2023

Conversation

yarda
Copy link
Contributor

@yarda yarda commented Nov 9, 2022

Extends the runtime API by the method:

  instance_acquire_devices(devices, instance_name)

E.g. consider the following TuneD profile:

[cpus_perf]
type=cpu
devices=cpu0, cpu1
governor=performance

[cpus_idle]
type=cpu
devices=${f:cpulist2devs:2-3}
governor=ondemand

[cpus_idle2]
type=cpu
devices=${f:cpulist2devs:4-5,7}
governor=ondemand

Notice that it's possible to use internal function 'cpulist2devs' to easily
specify cpulists.

After the following API call:

  instance_acquire_devices("cpulist:2-5,7", "cpus_perf")

It will result in cpu0 - cpu5 and cpu7 having the "performance" governor.

It's also possible to specify full device names in the API instead of the
cpulist, e.g.:

  instance_acquire_devices("cpu2,cpu3,cpu4,cpu5,cpu7", "cpus_perf")

In case the comma is part of the device name, it can be escaped by ",".

In case the device name starts with the string "cpulist:", e.g. there is
by accident device named "cpulist:abcd", it's possible to use
"cpulist:cpulist:abcd" to escape it and don't recognize it as the cpulist.

The cpulist expand feature also supports bitmasks variant prefixed by the
0x and written as a string, e.g. "0xffffffff", i.e. all the standard
TuneD cpulist syntaxes are supported.

Resolves: rhbz#2113925

Signed-off-by: Jaroslav Škarvada jskarvad@redhat.com

@lgtm-com
Copy link

lgtm-com bot commented Nov 9, 2022

This pull request introduces 3 alerts when merging e131879 into 420267f - view on LGTM.com

new alerts:

  • 3 for Unused local variable

tuned/daemon/controller.py Fixed Show fixed Hide fixed
tuned/daemon/controller.py Fixed Show fixed Hide fixed
tuned/daemon/controller.py Fixed Show fixed Hide fixed
tuned/daemon/controller.py Fixed Show fixed Hide fixed
@lgtm-com
Copy link

lgtm-com bot commented Dec 6, 2022

This pull request introduces 3 alerts when merging 6116483 into f43df56 - view on LGTM.com

new alerts:

  • 3 for Unused local variable

Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. It looks like GitHub code scanning with CodeQL is already set up for this repo, so no further action is needed 🚀. For more information, please check out our post on the GitHub blog.

@jmencak
Copy link
Contributor

jmencak commented Dec 13, 2022

I did some early testing and it seems to work for the most part. Where it fails, however, is when we include a parent profile that already provides a [cpu] plug-in settings. For example, try to include=throughput-performance. You'll see:

2022-12-13 13:37:04,768 WARNING  tuned.plugins.base: instance cpus_perf: no matching devices available
2022-12-13 13:37:04,768 INFO     tuned.plugins.plugin_cpu: Latency settings from non-first CPU plugin instance 'cpus_perf' will be ignored.
2022-12-13 13:37:04,769 WARNING  tuned.plugins.base: instance cpus_idle: no matching devices available
2022-12-13 13:37:04,769 INFO     tuned.plugins.plugin_cpu: Latency settings from non-first CPU plugin instance 'cpus_idle' will be ignored.

@jmencak
Copy link
Contributor

jmencak commented Jan 13, 2023

A couple of generic comments. Would it make sense to:

  • support CPU lists such as we already do for isolated_cores, e.g.: devices=2,4-7 instead of explicitly saying devices=cpu2, cpu4, cpu5, cpu6, cpu7?
  • when a device is already in a group, automatically remove it and assign it to a new group when you call instance_add_device ?

@jmencak
Copy link
Contributor

jmencak commented Jan 17, 2023

Adding another issue for tracking. When using the following spec:

      [cpus_perf]
      type=cpu
      devices=
      governor=performance

      [cpus_idle]
      type=cpu
      devices=cpu0,cpu1,cpu2,cpu3
      governor=powersave

All devices are assigned to the performance group. I'd expect no devices assigned in the performance group.

tuned/plugins/hotplug.py Outdated Show resolved Hide resolved
or accepted by the instance.
"""
if device_name in self._assigned_devices:
log.error("device '%s' is already assigned to instance" % device_name)
Copy link
Contributor

@jmencak jmencak Jan 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to drop the severity to Warning here? Error seems too high. Also knowing which instance would help.

@yarda yarda force-pushed the api_extend_hotplug branch 2 times, most recently from 533968f to 7a77511 Compare February 2, 2023 23:47
@yarda
Copy link
Contributor Author

yarda commented Feb 2, 2023

I simplified the API, I dropped the deplug/hotplug methods which could lead to races and inconsistent internal states which is not trivial to handle and added more robust method 'instance_acquire_devices(devices, instance_name)'. See the edited comment 0 for description.

@yarda yarda requested a review from CZerta February 2, 2023 23:59
@yarda yarda changed the title API: add support for hotplug/deplug API: add support for moving devices between instances Feb 3, 2023
@jmencak
Copy link
Contributor

jmencak commented Feb 6, 2023

Thank you for the changes, @yarda . I didn't check out your code (yet), but did some testing and I'm getting the following:

   struct {
      boolean false
      string "Instance 'cpu' not found"
   }

When using your example above and running

  dbus-send \
    --system \
    --print-reply \
    --dest="com.redhat.tuned" \
    /Tuned \
    com.redhat.tuned.control.instance_acquire_devices \
    string:cpu1 string:cpus_perf

I also get similar results when using socket API from PR470. Is this expected?

@yarda
Copy link
Contributor Author

yarda commented Feb 6, 2023

Thank you for the changes, @yarda . I didn't check out your code (yet), but did some testing and I'm getting the following:

   struct {
      boolean false
      string "Instance 'cpu' not found"
   }

When using your example above and running

  dbus-send \
    --system \
    --print-reply \
    --dest="com.redhat.tuned" \
    /Tuned \
    com.redhat.tuned.control.instance_acquire_devices \
    string:cpu1 string:cpus_perf

I also get similar results when using socket API from PR470. Is this expected?

It's bug in the error message, it should be:
"Instance 'cpus_perf' not found"

I will PR fix. Do you have the 'cpus_perf' plugin instance defined? Could you provide your testing profile?

Extends the runtime API by the method:
  instance_acquire_devices(devices, instance_name)

E.g. consider the following TuneD profile:
[cpus_perf]
type=cpu
devices=cpu0, cpu1
governor=performance

[cpus_idle]
type=cpu
devices=${f:cpulist2devs:2-3}
governor=ondemand

[cpus_idle2]
type=cpu
devices=${f:cpulist2devs:4-5,7}
governor=ondemand

Notice that it's possible to use internal function 'cpulist2devs' to easily
specify cpulists.

After the following API call:
  instance_acquire_devices("cpulist:2-5,7", "cpus_perf")

It will result in cpu0 - cpu5 and cpu7 having the "performance" governor.

It's also possible to specify full device names in the API instead of the
cpulist, e.g.:
  instance_acquire_devices("cpu2,cpu3,cpu4,cpu5,cpu7", "cpus_perf")

In case the comma is part of the device name, it can be escaped by "\,".

In case the device name starts with the string "cpulist:", e.g. there is
by accident device named "cpulist:abcd", it's possible to use
"cpulist:cpulist:abcd" to escape it and don't recognize it as the cpulist.

The cpulist expand feature also supports bitmasks variant prefixed by the
0x and written as a string, e.g. "0xffffffff", i.e. all the standard
TuneD cpulist syntaxes are supported.

Resolves: rhbz#2113925

Signed-off-by: Jaroslav Škarvada <jskarvad@redhat.com>
@yarda
Copy link
Contributor Author

yarda commented Feb 7, 2023

I tried with the profile from comment 0:

$ dbus-send --system --print-reply --dest="com.redhat.tuned" /Tuned com.redhat.tuned.control.instance_acquire_devices string:cpu1 string:cpus_perf
method return time=1675729198.283921 sender=:1.30 -> destination=:1.32 serial=7 reply_serial=2
   struct {
      boolean true
      string "OK"
   }

Btw cpu1 is already handled by the instance cpus_perf thus the above API call has no effect.

@jmencak
Copy link
Contributor

jmencak commented Feb 7, 2023

Do you have the 'cpus_perf' plugin instance defined? Could you provide your testing profile?

I was using a different profile than I intended. Sorry about that. It seems to work as expected now. Will do a bit more testing.

Copy link
Contributor

@CZerta CZerta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yarda yarda merged commit bc2348b into redhat-performance:master Feb 8, 2023
14 checks passed
@yarda yarda deleted the api_extend_hotplug branch February 8, 2023 00:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants