Minimize wait after receiving block threshold event #85
Labels: enhancement, storage, virt
nirs added the enhancement, storage, and virt labels on Feb 23, 2022
nirs added a commit to nirs/vdsm that referenced this issue on Mar 20, 2022:
Minimize message latency by introducing an event mechanism. The unused host 0 mailbox is now used for sending and receiving change events. When a host sends mail to the SPM, it writes an event to the event block. The SPM monitors this block 10 times per monitor interval, so it can quickly detect that a new message is ready. The HSM mail monitor checks its inbox 10 times per monitor interval, so it can detect replies more quickly.

With both changes, the latency of sending an extend request was reduced from 2-4 seconds to 0.2-0.4 seconds, reducing the risk of pausing a VM when writing quickly to fast storage. Testing shows that we can now write 550 MiB/s without pausing a VM when the disk is extended. Before this change, we could write only 350 MiB/s without pausing.

Here are example logs from this run, showing that the total extend time is 1.16-3.14 seconds instead of 2.5-8.3 seconds.

<Clock(total=1.16, wait=0.30, extend-volume=0.47, refresh-volume=0.38)>
<Clock(total=2.43, wait=1.30, extend-volume=0.73, refresh-volume=0.40)>
<Clock(total=1.60, wait=0.81, extend-volume=0.48, refresh-volume=0.30)>
<Clock(total=2.61, wait=1.59, extend-volume=0.69, refresh-volume=0.32)>
<Clock(total=1.80, wait=0.66, extend-volume=0.76, refresh-volume=0.38)>
<Clock(total=3.14, wait=1.89, extend-volume=0.74, refresh-volume=0.51)>
<Clock(total=2.17, wait=1.09, extend-volume=0.71, refresh-volume=0.37)>
<Clock(total=1.35, wait=0.15, extend-volume=0.70, refresh-volume=0.51)>
<Clock(total=2.43, wait=1.32, extend-volume=0.76, refresh-volume=0.35)>
<Clock(total=1.76, wait=0.64, extend-volume=0.75, refresh-volume=0.36)>
<Clock(total=2.74, wait=1.61, extend-volume=0.76, refresh-volume=0.37)>
<Clock(total=2.01, wait=0.72, extend-volume=0.98, refresh-volume=0.31)>
<Clock(total=2.35, wait=1.53, extend-volume=0.53, refresh-volume=0.30)>
<Clock(total=1.88, wait=0.79, extend-volume=0.75, refresh-volume=0.34)>
<Clock(total=1.26, wait=0.10, extend-volume=0.76, refresh-volume=0.40)>
<Clock(total=1.90, wait=0.75, extend-volume=0.78, refresh-volume=0.38)>
<Clock(total=3.06, wait=1.87, extend-volume=0.77, refresh-volume=0.42)>
<Clock(total=1.84, wait=0.68, extend-volume=0.75, refresh-volume=0.41)>
<Clock(total=2.54, wait=1.74, extend-volume=0.51, refresh-volume=0.29)>
<Clock(total=2.25, wait=1.08, extend-volume=0.70, refresh-volume=0.47)>

The largest issue now is the wait time; in the worst case, we waited 1.89 seconds before sending an extend request, which is 60% of the total extend time (3.14 seconds). This issue is tracked in oVirt#85.

Fixes oVirt#102.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
nirs added a commit to nirs/vdsm that referenced this issue on Mar 23, 2022:
Minimize message latency by introducing an event mechanism. The unused host 0 mailbox is now used for sending and receiving change events. When a host sends mail to the SPM, it writes an event to the event block. The SPM monitors this block every eventInterval (0.5 seconds) between monitor intervals, so it can quickly detect that a new message is available. The HSM mail monitor checks its inbox every eventInterval (0.5 seconds) when it is waiting for replies, so it can detect replies quickly.

With both changes, the latency of sending an extend request was reduced from 2.0-4.0 seconds to 0.5-1.0 seconds, reducing the risk of pausing a VM when writing quickly to fast storage. Reducing the event interval increases vdsm CPU usage, since we use dd to read events. To improve this, we need to add a helper process for checking events and reading mailbox data.

Testing shows that we can now write 525 MiB/s (the maximum rate on my nested test environment) in the guest without pausing a VM when the disk is extended. Before this change, we could write only 350 MiB/s before the VM started to pause randomly during the test.

Here are example logs from this run, showing that the total extend time is 1.14-3.31 seconds instead of 2.5-8.3 seconds.

<Clock(total=1.65, wait=0.28, extend-volume=1.09, refresh-volume=0.28)>
<Clock(total=2.67, wait=1.80, extend-volume=0.58, refresh-volume=0.29)>
<Clock(total=3.10, wait=1.74, extend-volume=1.10, refresh-volume=0.25)>
<Clock(total=2.85, wait=1.55, extend-volume=1.08, refresh-volume=0.22)>
<Clock(total=2.02, wait=1.14, extend-volume=0.58, refresh-volume=0.30)>
<Clock(total=1.14, wait=0.33, extend-volume=0.56, refresh-volume=0.25)>
<Clock(total=2.83, wait=1.42, extend-volume=1.09, refresh-volume=0.32)>
<Clock(total=1.68, wait=0.33, extend-volume=1.10, refresh-volume=0.25)>
<Clock(total=2.45, wait=1.47, extend-volume=0.60, refresh-volume=0.38)>
<Clock(total=1.44, wait=0.11, extend-volume=1.09, refresh-volume=0.24)>
<Clock(total=2.46, wait=1.04, extend-volume=1.13, refresh-volume=0.30)>
<Clock(total=1.55, wait=0.17, extend-volume=1.07, refresh-volume=0.31)>
<Clock(total=2.02, wait=1.13, extend-volume=0.60, refresh-volume=0.28)>
<Clock(total=1.75, wait=0.39, extend-volume=1.12, refresh-volume=0.24)>
<Clock(total=2.98, wait=1.61, extend-volume=1.12, refresh-volume=0.25)>
<Clock(total=1.28, wait=0.41, extend-volume=0.61, refresh-volume=0.25)>
<Clock(total=2.46, wait=1.48, extend-volume=0.62, refresh-volume=0.36)>
<Clock(total=3.31, wait=1.97, extend-volume=1.12, refresh-volume=0.21)>
<Clock(total=1.81, wait=0.96, extend-volume=0.58, refresh-volume=0.27)>
<Clock(total=2.74, wait=1.87, extend-volume=0.58, refresh-volume=0.29)>

I also tested a shorter eventInterval (0.2 seconds). This reduces the extend time by 20%, but doubles the CPU usage of the SPM mailbox.

With this change, the next improvement is eliminating the wait time. In the worst case, we waited 1.97 seconds before sending an extend request, which is 68% of the total extend time (3.31 seconds). This issue is tracked in oVirt#85.

Fixes oVirt#102.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
nirs added a commit that referenced this issue on Mar 28, 2022:
Minimize message latency by introducing a mailbox events mechanism. The unused host 0 mailbox is now used for sending and receiving mailbox events.

To control mailbox events, a new "mailbox:events_enable" option was added. The option is disabled by default, so we can test this change before we enable it by default, or disable it in production if needed. To enable mailbox events, add a drop-in configuration file to all hosts:

    $ cat /etc/vdsm/vdsm.conf.d/99-local.conf
    [mailbox]
    events_enable = true

and restart the vdsmd service.

When the mailbox:events_enable option is enabled:
- Hosts write an event to the host 0 mailbox after sending mail to the SPM.
- The SPM monitors the host 0 mailbox every eventInterval (0.5 seconds) between monitor intervals, so it can handle new messages quickly.
- When hosts wait for a reply from the SPM, they monitor their inbox every eventInterval (0.5 seconds), so they detect the reply quickly.
- The host reports a new "mailbox_events" capability. This can be used by engine to optimize mailbox I/O when all hosts in a data center support this capability.

With this change, the extend roundtrip latency was reduced from 2.0-4.0 seconds to 0.5-1.0 seconds, reducing the risk of pausing a VM when writing quickly to fast storage.

Testing shows that we can now write 525 MiB/s (the maximum rate on my nested test environment) in the guest without pausing a VM when the disk is extended. Before this change, we could write only 350 MiB/s before the VM started to pause randomly during the test.

Here are example logs from this run, showing that the total extend time is 1.14-3.31 seconds instead of 2.5-8.3 seconds before this change.

<Clock(total=1.65, wait=0.28, extend-volume=1.09, refresh-volume=0.28)>
<Clock(total=2.67, wait=1.80, extend-volume=0.58, refresh-volume=0.29)>
<Clock(total=3.10, wait=1.74, extend-volume=1.10, refresh-volume=0.25)>
<Clock(total=2.85, wait=1.55, extend-volume=1.08, refresh-volume=0.22)>
<Clock(total=2.02, wait=1.14, extend-volume=0.58, refresh-volume=0.30)>
<Clock(total=1.14, wait=0.33, extend-volume=0.56, refresh-volume=0.25)>
<Clock(total=2.83, wait=1.42, extend-volume=1.09, refresh-volume=0.32)>
<Clock(total=1.68, wait=0.33, extend-volume=1.10, refresh-volume=0.25)>
<Clock(total=2.45, wait=1.47, extend-volume=0.60, refresh-volume=0.38)>
<Clock(total=1.44, wait=0.11, extend-volume=1.09, refresh-volume=0.24)>
<Clock(total=2.46, wait=1.04, extend-volume=1.13, refresh-volume=0.30)>
<Clock(total=1.55, wait=0.17, extend-volume=1.07, refresh-volume=0.31)>
<Clock(total=2.02, wait=1.13, extend-volume=0.60, refresh-volume=0.28)>
<Clock(total=1.75, wait=0.39, extend-volume=1.12, refresh-volume=0.24)>
<Clock(total=2.98, wait=1.61, extend-volume=1.12, refresh-volume=0.25)>
<Clock(total=1.28, wait=0.41, extend-volume=0.61, refresh-volume=0.25)>
<Clock(total=2.46, wait=1.48, extend-volume=0.62, refresh-volume=0.36)>
<Clock(total=3.31, wait=1.97, extend-volume=1.12, refresh-volume=0.21)>
<Clock(total=1.81, wait=0.96, extend-volume=0.58, refresh-volume=0.27)>
<Clock(total=2.74, wait=1.87, extend-volume=0.58, refresh-volume=0.29)>

I also tested a shorter eventInterval (0.2 seconds). This reduces the extend time by 20%, but doubles the CPU usage of the SPM mailbox.

With this change, the next improvement is eliminating the wait time. In the worst case, we waited 1.97 seconds before sending an extend request, which is 68% of the total extend time (3.31 seconds). This issue is tracked in #85.

Fixes #102.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
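For readers unfamiliar with the mailbox code, the following is a minimal sketch of the SPM-side polling pattern described above: between full inbox scans, the host 0 event block is checked at a short interval so new requests are noticed quickly. The constants and the read_event_block()/scan_inbox() callables are illustrative assumptions, not the actual vdsm API.

```python
import time

MONITOR_INTERVAL = 2.0   # seconds between full inbox scans (assumed default)
EVENT_INTERVAL = 0.5     # seconds between cheap event-block checks

def spm_mailbox_loop(read_event_block, scan_inbox):
    """Sketch: scan the inbox periodically, but also as soon as the
    host 0 event block changes, so extend requests are handled quickly."""
    last_event = read_event_block()
    while True:
        scan_inbox()  # full scan, as before this change
        deadline = time.monotonic() + MONITOR_INTERVAL
        while time.monotonic() < deadline:
            time.sleep(EVENT_INTERVAL)
            event = read_event_block()  # small read of the host 0 mailbox
            if event != last_event:
                last_event = event
                break  # a host signalled new mail; rescan immediately
```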
nirs added a commit to nirs/vdsm that referenced this issue on Apr 11, 2022:
When receiving a block threshold event or when pausing because of ENOSPC, extend the drive as soon as possible on the periodic executor.

To reuse the periodic monitor infrastructure, VolumeWatermarkMonitor now provides a dispatch() class method. It runs monitor_volumes() with urgent=True on the periodic executor, to ensure that this invocation extends drives immediately, even if the last extend started less than 2.0 seconds ago.

This change decreases the wait before sending an extend request from 0.0-2.0 seconds to 10 milliseconds, and the total time to extend to 0.66-1.30 seconds. With this we can write 50 GiB at a rate of 1320 MiB/s to a thin disk without pausing the VM. The theoretical limit is 1538 MiB/s, but my NVMe drive is not fast enough.

Extend stats with this change:

| time    | min  | avg  | max  |
|---------|------|------|------|
| total   | 0.66 | 0.97 | 1.30 |
| extend  | 0.53 | 0.79 | 1.11 |
| refresh | 0.08 | 0.18 | 0.23 |
| wait    | 0.01 | 0.01 | 0.01 |

Unfinished:
- Some tests fail because the periodic executor is not running during the tests.
- Need to add tests for the new behavior.

Fixes: oVirt#85

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
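To make the urgent=True behaviour concrete, here is a minimal sketch under stated assumptions: the class, attribute, and helper names are simplified stand-ins, not the actual vdsm implementation. A non-urgent (periodic) run is throttled, while an urgent run dispatched from an event always proceeds.

```python
import time

MIN_INTERVAL = 2.0  # assumed minimum seconds between non-urgent extend attempts


class VolumeMonitorSketch:
    def __init__(self):
        self._last_extend = 0.0

    def monitor_volumes(self, urgent=False):
        now = time.monotonic()
        if not urgent and now - self._last_extend < MIN_INTERVAL:
            return  # periodic run: too soon after the last extend attempt
        self._last_extend = now
        self._extend_drives_if_needed()

    def _extend_drives_if_needed(self):
        # Check drive watermarks and send extend requests (omitted here).
        pass
```

A dispatch() helper would then submit monitor_volumes(urgent=True) to the periodic executor when a block threshold or ENOSPC event arrives.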
nirs added a commit that referenced this issue on May 11, 2022:
Expose a periodic.dispatch() function allowing immediate dispatching of calls on the periodic executor. This is useful when you want to handle libvirt events on the periodic executor.

The first user of this facility is the thinp volume monitor. Now, when we receive a block threshold or ENOSPC event, we use the periodic dispatch to extend the relevant drive immediately. This eliminates the 0-2 second wait after receiving an event.

Here are test results from 4 runs, each writing 50 GiB to a thin disk at ~1300 MiB/s. Each run extends the disk 20 times. The VM was not paused during the test.

| time    | min  | avg  | max  |
|---------|------|------|------|
| total   | 0.77 | 1.15 | 1.39 |
| extend  | 0.55 | 0.92 | 1.14 |
| refresh | 0.16 | 0.22 | 0.31 |
| wait    | 0.01 | 0.01 | 0.03 |

Fixes: #85

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
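As a rough illustration of what a dispatch facility can look like, here is a minimal executor sketch using a plain queue and worker threads; it is an assumption for explanation only, not the vdsm periodic executor.

```python
import queue
import threading


class ExecutorSketch:
    """Workers run one-off calls submitted via dispatch() as soon as a
    worker is free, outside of any periodic schedule."""

    def __init__(self, workers=2):
        self._calls = queue.Queue()
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def dispatch(self, func):
        # Queue a one-off call; it runs on a worker thread, not on the
        # caller's (e.g. libvirt events) thread.
        self._calls.put(func)

    def _run(self):
        while True:
            func = self._calls.get()
            try:
                func()
            except Exception:
                pass  # a real executor would log the failure
```

An event handler can then call something like executor.dispatch(lambda: vm.monitor_volumes(urgent=True)) without blocking the thread that delivered the event.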
In #82 we improved the defaults to avoid pauses during volume extension, but
we have room for more improvements.
The extend flow contains several steps:
- A block threshold event signals that a drive needs extension.
- When the volume monitor finds that a drive needs extension, it sends a request to the SPM by writing to the storage mailbox.
- The SPM performs the extend using the SPM mailbox thread pool.
- When the host detects the reply, it completes the extend on the host side and resumes the VM if needed.
When we look at the logs, we see that we waited 1.93 seconds from the time the event was received until we handled it. A guest using fast storage can write 1.5 GiB during this wait and pause with ENOSPC. This was 46% of the total extend time.

The wait is caused by the monitor interval (2 seconds). If the periodic executor is blocked on slow libvirt calls, it can take much more time. When the executor runs the periodic watermark monitor, it checks all the VMs, and it can get blocked on another unresponsive VM, even if we are lucky and it runs quickly after we received the event.
What we want to do is handle the block threshold event immediately, avoiding the 0-2 second delay (or more in bad cases). But we don't want to do this on the libvirt events thread, since handling a block threshold event accesses libvirt and the storage layer, and it may block for a long time in bad cases, delaying other libvirt events.

I think the best way to handle it is to dispatch a call to VM.monitor_volumes() on the periodic executor when we receive an event.
Changes needed:
- Add a periodic.dispatch() API for running operations immediately, outside of the normal monitoring schedule.
- Use periodic.dispatch() in VolumeMonitor.on_block_threshold() to schedule a call to VM.monitor_volumes() soon.
- Add locking in VolumeMonitor.monitor_volumes() to ensure that we don't have multiple threads monitoring at the same time.
Once we have this, we can increase the monitoring interval, since it is needed only as a backup in case an extend request did not finish.
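A minimal sketch of the three changes listed above, assuming a hypothetical periodic.dispatch() callable and a simplified VM object; names and signatures are illustrative, not the actual vdsm code.

```python
import threading


class VolumeMonitor:
    def __init__(self, vm, dispatch):
        self._vm = vm
        self._dispatch = dispatch      # e.g. the proposed periodic.dispatch()
        self._lock = threading.Lock()

    def on_block_threshold(self, drive):
        # Called on the libvirt events thread: do not touch libvirt or the
        # storage layer here, just schedule the work to run immediately.
        # 'drive' identifies which drive crossed its threshold (unused here).
        self._dispatch(self.monitor_volumes)

    def monitor_volumes(self):
        # Locking ensures the dispatched run and the periodic run do not
        # monitor the same VM's volumes at the same time.
        if not self._lock.acquire(blocking=False):
            return  # another thread is already monitoring this VM
        try:
            self._vm.extend_drives_if_needed()
        finally:
            self._lock.release()
```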