Conversation

weizhouapache
Member

Description

This PR fixes three issues:

  • set the correct hostId on the HA work item for VM migration
  • skip the migration job if the VM is running on a different host
  • wait 10 seconds for all modules to be loaded
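
The first two fixes can be illustrated with a minimal sketch. It assumes the work item should record the host the VM is running on when the migration is scheduled, and that the job compares that host with the VM's current host before acting. All class and method names here (`HaMigrationSketch`, `MigrationWork`, `scheduleMigration`, `shouldSkip`) are hypothetical and are not CloudStack's actual code; the 10-second module wait is not modelled.

```java
/**
 * Simplified, hypothetical model of an HA migration work item.
 * None of these names are CloudStack's real classes or methods.
 */
public class HaMigrationSketch {

    /** Stand-in for an HA work item: which VM to migrate and from which host. */
    public static final class MigrationWork {
        public final long vmId;
        public final long hostId;

        MigrationWork(long vmId, long hostId) {
            this.vmId = vmId;
            this.hostId = hostId;
        }
    }

    /** Assumed shape of the first fix: record the host the VM runs on at scheduling time. */
    public static MigrationWork scheduleMigration(long vmId, long currentHostId) {
        return new MigrationWork(vmId, currentHostId);
    }

    /** Assumed shape of the second fix: skip when the VM now runs on another host. */
    public static boolean shouldSkip(MigrationWork work, long vmCurrentHostId) {
        return vmCurrentHostId != work.hostId;
    }

    public static void main(String[] args) {
        MigrationWork work = scheduleMigration(7L, 1L);
        System.out.println(shouldSkip(work, 1L)); // VM is still on the recorded host
        System.out.println(shouldSkip(work, 2L)); // VM has already moved elsewhere
    }
}
```

Checking the host just before acting means a work item that survived a management server restart is not executed against a VM that has already been moved.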

The issues were found when:

  • putting a host into maintenance
  • restarting the management server

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

This fixes the following error:
```
2025-03-04T07:58:13,306 WARN  [c.c.h.HighAvailabilityManagerExtImpl] (HA-Worker-2:[ctx-3dc0c480, work-256]) (logid:431cd943) Encountered unhandled exception during HA process, reschedule work java.lang.NullPointerException: Cannot invoke "org.apache.cloudstack.framework.jobs.AsyncJob.getId()" because "job" is null
```
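
The log shows `getId()` being called on a `job` reference that is null. As a generic illustration of how such a failure can be avoided (not necessarily how this PR addresses it, since the description above attributes the fix to the hostId and skip logic), a caller can check for a missing job before dereferencing it. The `Job` interface and `jobIdIfPresent` method below are hypothetical stand-ins, not CloudStack APIs.

```java
import java.util.Optional;

/** Hypothetical sketch of guarding against a missing job before reading its id. */
public class NullJobGuardSketch {

    /** Stand-in for an async job; only the id accessor matters here. */
    public interface Job {
        long getId();
    }

    /**
     * Returns the job id when a job exists, or an empty Optional so the
     * caller can skip or reschedule the work instead of throwing.
     */
    public static Optional<Long> jobIdIfPresent(Job job) {
        if (job == null) {
            return Optional.empty();
        }
        return Optional.of(job.getId());
    }

    public static void main(String[] args) {
        System.out.println(jobIdIfPresent(null).isPresent());
        System.out.println(jobIdIfPresent(() -> 42L).isPresent());
    }
}
```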

@weizhouapache changed the title from "4.19 fix ha work hostid" to "HA: set correct hostId of HA work for vm migration" on Mar 20, 2025

codecov bot commented Mar 20, 2025

Codecov Report

Attention: Patch coverage is 5.88235% with 16 lines in your changes missing coverage. Please review.

Project coverage is 15.17%. Comparing base (6c40a7b) to head (9eb7f24).
Report is 15 commits behind head on 4.19.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...java/com/cloud/ha/HighAvailabilityManagerImpl.java | 0.00% | 15 Missing ⚠️ |
| ...n/java/com/cloud/vm/VirtualMachineManagerImpl.java | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff            @@
##               4.19   #10591   +/-   ##
=========================================
  Coverage     15.16%   15.17%           
- Complexity    11328    11332    +4     
=========================================
  Files          5414     5415    +1     
  Lines        474811   474906   +95     
  Branches      57911    57923   +12     
=========================================
+ Hits          72017    72048   +31     
- Misses       394742   394804   +62     
- Partials       8052     8054    +2     

| Flag | Coverage Δ | |
|---|---|---|
| uitests | 4.28% <ø> (-0.01%) | ⬇️ |
| unittests | 15.89% <5.88%> (+<0.01%) | ⬆️ |

@weizhouapache
Member Author

@blueorangutan package

@blueorangutan

@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✖️ el9 ✔️ debian ✖️ suse15. SL-JID 12842

@weizhouapache
Member Author

@blueorangutan package

@blueorangutan

@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✖️ el9 ✔️ debian ✖️ suse15. SL-JID 12845

@weizhouapache
Member Author

@blueorangutan test

@blueorangutan

@weizhouapache a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian Build Failed (tid-12788)

@weizhouapache
Member Author

@blueorangutan package

@blueorangutan

@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12847

@weizhouapache
Member Author

@blueorangutan test

@blueorangutan

@weizhouapache a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-12792)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 49672 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10591-t12792-kvm-ol8.zip
Smoke tests completed. 133 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

| Test | Result | Time (s) | Test File |
|---|---|---|---|

@Pearl1594 added this to the 4.19.3 milestone Mar 21, 2025

@DaanHoogland left a comment
Contributor

clgtm

@Pearl1594 moved this to In Progress in ACS 4.20.1 Mar 27, 2025

@rohityadavcloud
Member

@blueorangutan package

@blueorangutan

@rohityadavcloud a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 13067

@weizhouapache
Member Author

@blueorangutan test

@blueorangutan

@weizhouapache a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@vladimirpetrov self-assigned this Apr 16, 2025

@sureshanaparti left a comment
Contributor

code lgtm

@vladimirpetrov left a comment
Contributor

LGTM based on manual testing, let's see the smoke test results though.

@blueorangutan

[SF] Trillian test result (tid-13010)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 54853 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10591-t13010-kvm-ol8.zip
Smoke tests completed. 133 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

| Test | Result | Time (s) | Test File |
|---|---|---|---|

@DaanHoogland marked this pull request as ready for review April 17, 2025 08:01
@DaanHoogland merged commit 7b68615 into apache:4.19 Apr 17, 2025
25 of 26 checks passed
@github-project-automation bot moved this from In Progress to Done in ACS 4.20.1 Apr 17, 2025
@DaanHoogland deleted the 4.19-fix-ha-work-hostid branch April 17, 2025 08:02
dhslove pushed a commit to ablecloud-team/ablestack-cloud that referenced this pull request Jun 19, 2025