DRS: Use free metrics instead of used for computation #8458

vishesh92 · 2024-01-08T07:54:34Z

Description

This PR changes the cluster imbalance computation to use the cluster's free metrics instead of used metrics. This allows DRS to run on clusters where hosts don't have the same amount of a metric (for example, memory or CPU).
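
As a rough illustration of why free metrics matter when hosts differ in size, here is a minimal Python sketch. It assumes imbalance is measured as the coefficient of variation (standard deviation divided by mean) of a per-host metric. That measure, the `imbalance` helper, and the host sizes are assumptions made for illustration; they are not code or data from this PR.

```python
from statistics import mean, pstdev


def imbalance(values):
    """Hypothetical measure: population std dev divided by mean."""
    avg = mean(values)
    if avg == 0:
        return 0.0
    return pstdev(values) / avg


# Two hosts with different total memory (GB) but identical used memory.
total_gb = [64, 32]
used_gb = [24, 24]
free_gb = [t - u for t, u in zip(total_gb, used_gb)]  # [40, 8]

print("imbalance on used:", imbalance(used_gb))
print("imbalance on free:", imbalance(free_gb))
```

With identical used memory, the used-based measure reports a perfectly balanced cluster even though the smaller host has far less headroom; the free-based measure exposes the difference.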

Types of changes

Breaking change (fix or feature that would cause existing functionality to change)
New feature (non-breaking change which adds functionality)
Bug fix (non-breaking change which fixes an issue)
Enhancement (improves an existing feature and functionality)
Cleanup (Code refactoring and cleanup, that may add test cases)
build/CI

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

Major
Minor

Bug Severity

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

vishesh92 · 2024-01-08T07:54:51Z

@blueorangutan package

blueorangutan · 2024-01-08T07:56:03Z

@vishesh92 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

codecov · 2024-01-08T08:03:22Z

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (6d916ca) 30.85% compared to head (e4bf560) 30.80%.
Report is 12 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| ...apache/cloudstack/cluster/ClusterDrsAlgorithm.java | 83.33% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #8458      +/-   ##
============================================
- Coverage     30.85%   30.80%   -0.05%     
+ Complexity    34048    33986      -62     
============================================
  Files          5341     5341              
  Lines        374861   374870       +9     
  Branches      54518    54521       +3     
============================================
- Hits         115659   115493     -166     
- Misses       243973   244117     +144     
- Partials      15229    15260      +31

| Flag | Coverage Δ | |
|---|---|---|
| simulator-marvin-tests | `24.72% <91.66%> (-0.03%)` | ⬇️ |
| uitests | `4.39% <ø> (+<0.01%)` | ⬆️ |
| unit-tests | `16.46% <91.66%> (-0.01%)` | ⬇️ |

blueorangutan · 2024-01-08T08:59:58Z

Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 8225

vishesh92 · 2024-01-08T09:43:39Z

@blueorangutan test

blueorangutan · 2024-01-08T09:44:03Z

@vishesh92 a [SL] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

blueorangutan · 2024-01-09T02:58:03Z

[SF] Trillian test result (tid-8753)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 60482 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8458-t8753-kvm-centos7.zip
Smoke tests completed. 120 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

| Test | Result | Time (s) | Test File |
|---|---|---|---|
| test_01_vpc_site2site_vpn_multiple_options | `Failure` | 397.07 | test_vpc_vpn.py |

DaanHoogland · 2024-01-09T08:35:54Z

> This PR makes changes to use cluster's free metrics instead of used while computing imbalance for the cluster. This allows DRS to run for clusters where hosts don't have the same amount of metrics.

Though it sounds reasonable, I would expect to see some kind of used/free ratio as the measure. I can also see why this would not be reasonable, though. Have you considered this, @vishesh92?

vishesh92 · 2024-01-10T09:44:17Z

> > This PR makes changes to use cluster's free metrics instead of used while computing imbalance for the cluster. This allows DRS to run for clusters where hosts don't have the same amount of metrics.
>
> Though it sounds reasonable, I would expect to see some kind of used/free ratio as the measure. I can also see why this would not be reasonable though. Have you considered this @vishesh92 ?

I didn't consider this earlier. But it might cause issues, since the used/free ratio can be very small or very large in some cases. It would also require a lot of changes at this point, and I'm not sure whether it would work as expected. We can explore this in the future.
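
To make the concern about extreme ratio values concrete, here is a minimal Python sketch. The `used_free_ratio` helper and the 64 GB host are hypothetical and chosen only for illustration; nothing here is taken from the PR's code.

```python
def used_free_ratio(used, total):
    """Hypothetical used/free ratio for a single host."""
    free = total - used
    if free == 0:
        # A completely full host has no finite ratio.
        return float("inf")
    return used / free


total_gb = 64  # assumed host size, for illustration only
for used_gb in (1, 32, 63, 64):
    print(used_gb, used_free_ratio(used_gb, total_gb))
```

On this one host the ratio ranges from below 0.02 when nearly empty to 63 when nearly full, and it is unbounded when the host is full, so a measure built on it would need extra handling for those edge cases.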

DaanHoogland

clgtm

kiranchavala

LGTM, tested manually.

Deployed 4 VMs.

Set the following cluster-level settings:

Infrastructure > Cluster > Settings > Search "drs"

drs.imbalance = 0.4 / 0.9
drs.algorithm = condensed / balanced
drs.metric = memory / cpu

drs.imbalance = 0.9 and drs.algorithm = condensed >> The VMs migrated to a single host

drs.imbalance = 0.4 and drs.algorithm = balanced >> The VMs were distributed across the hosts

weizhouapache

code lgtm

DRS: Use free metrics insteado of used for computation

e4bf560

boring-cyborg bot added the component:api label Jan 8, 2024

vishesh92 changed the title ~~DRS: Use free metrics insteado of used for computation~~ DRS: Use free metrics instead of used for computation Jan 8, 2024

vishesh92 marked this pull request as ready for review January 9, 2024 06:16

shwstppr requested review from DaanHoogland, harikrishna-patnala, sureshanaparti and weizhouapache January 10, 2024 09:47

DaanHoogland approved these changes Jan 10, 2024

kiranchavala approved these changes Jan 10, 2024

weizhouapache approved these changes Jan 10, 2024

shwstppr added this to the 4.19.0.0 milestone Jan 10, 2024

shwstppr merged commit 4f40eae into apache:main Jan 10, 2024
24 of 25 checks passed

shwstppr deleted the drs-improvements branch January 10, 2024 12:22
