alert: Send alert for ha'ed vm's #5664

Merged
nvazquez merged 2 commits into apache:main on Mar 17, 2022

Member

@ravening ravening commented Nov 4, 2021

Description

When HA is performed on VMs, send an alert so that admins
know which VMs were HA'ed; otherwise it is time-consuming
to get those details from the logs.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Contributor

@DaanHoogland DaanHoogland left a comment

clgtm, but would the subject exactly the same as the content be helpful? can we add more info to the mail content?

@ravening
Member Author

ravening commented Nov 4, 2021

clgtm, but would the subject exactly the same as the content be helpful? can we add more info to the mail content?

@DaanHoogland if you have any content in mind, let me know and I can add it.

@DaanHoogland
Contributor

clgtm, but would the subject exactly the same as the content be helpful? can we add more info to the mail content?

@DaanHoogland if you have any content in mind, let me know and I can add it.

Source and target host? Time down? Time up? Downtime duration?
Just suggestions, @ravening. What would you need to know as an operator? As said, clgtm.
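
The fields suggested above (source and target host, time down, time up, downtime duration) could be assembled into a mail body roughly as follows. This is a minimal sketch, not the merged code: `HaAlertBody`, its `build` method, and all parameter names are hypothetical, and it assumes the caller already knows both hosts and both timestamps.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of CloudStack: formats a richer HA alert body.
final class HaAlertBody {

    // All parameters are assumptions about what the caller could supply.
    static String build(String vmName, String sourceHost, String targetHost,
                        Instant down, Instant up) {
        // Downtime is the interval between the VM going down and coming back up.
        Duration downtime = Duration.between(down, up);
        return String.format(
                "HA restarted VM %s. Source host: %s, target host: %s, "
                        + "down at: %s, up at: %s, downtime: %d s",
                vmName, sourceHost, targetHost, down, up, downtime.getSeconds());
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        Instant down = Instant.parse("2021-11-04T10:00:00Z");
        Instant up = Instant.parse("2021-11-04T10:02:30Z");
        System.out.println(build("web-01", "host-a", "host-b", down, up));
    }
}
```

Keeping the subject short and putting these details in the body would address the concern that subject and content are identical.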

@rohityadavcloud
Member

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✖️ el7 ✔️ el8 ✖️ debian ✖️ suse15. SL-JID 1723

@rohityadavcloud
Member

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✖️ el7 ✖️ el8 ✖️ debian ✔️ suse15. SL-JID 1771

@rohityadavcloud
Member

@blueorangutan package

@blueorangutan

@rohityadavcloud a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@rohityadavcloud rohityadavcloud added this to the 4.17.0.0 milestone Dec 30, 2021
@blueorangutan

Packaging result: ✖️ el7 ✖️ el8 ✖️ debian ✖️ suse15. SL-JID 2039

Member

@GabrielBrascher GabrielBrascher left a comment

Thanks for the PR @ravening.
I've proposed a few changes that, in my opinion, can make this PR even more helpful :-)

@@ -608,7 +608,9 @@ protected Long restart(final HaWorkVO work) {

VMInstanceVO started = _instanceDao.findById(vm.getId());
if (started != null && started.getState() == VirtualMachine.State.Running) {
String message = String.format("HA on VM: %s", started.getHostName());
Member

I like the idea of enhancing the alerts.
What do you think of changing from "HA on VM: VM Name" to "HA Starting VM: VM Name (i-2-13477-VM)"?

Thus, making it String.format("HA on VM: %s (%s)", started.getHostName(), started.getInstanceName());.

Contributor

@ravening Do you want to add instance name to the alert message (as advised)?
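
The instance-name suggestion above, sketched in isolation. `HaAlertSubject` and `subject` are hypothetical names; in the PR the two values would come from `started.getHostName()` and `started.getInstanceName()`, and the sample values below are made up.

```java
// Hypothetical helper, not part of CloudStack: builds the alert text with
// both the display name and the internal instance name of the VM.
final class HaAlertSubject {

    static String subject(String hostName, String instanceName) {
        // Same format string as proposed in the review comment.
        return String.format("HA on VM: %s (%s)", hostName, instanceName);
    }

    public static void main(String[] args) {
        // Prints: HA on VM: web-01 (i-2-13477-VM)
        System.out.println(subject("web-01", "i-2-13477-VM"));
    }
}
```

Including the instance name lets an operator match the alert against hypervisor-side names without a lookup.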

@@ -608,7 +608,9 @@ protected Long restart(final HaWorkVO work) {

VMInstanceVO started = _instanceDao.findById(vm.getId());
if (started != null && started.getState() == VirtualMachine.State.Running) {
String message = String.format("HA on VM: %s", started.getHostName());
s_logger.info("VM is now restarted: " + vmId + " on " + started.getHostId());
Member

Taking the opportunity, what do you think of enhancing also the INFO log message?

As a suggestion, it could log something like the following:

HA is now restarting VM instance {id: "123", name: "VM Name", uuid: "1234.....abc", type="User"} on Host [{id: "1", name: "host.name", uuid: "ABC...123", type="Routing"}]

With the following code changes:

Suggested change
- s_logger.info("VM is now restarted: " + vmId + " on " + started.getHostId());
+ HostVO hostVmHasStarted = _hostDao.findById(started.getHostId());
+ s_logger.info(String.format("HA is now restarting %s on %s", started, hostVmHasStarted));
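
A note on why the suggested `String.format` call can produce the structured output shown above: `%s` calls `toString()` on each argument, so the log line is only as descriptive as those overrides. The sketch below uses hypothetical stand-in classes `Vm` and `Host` (not CloudStack's `VMInstanceVO` and `HostVO`, whose actual `toString()` output is not shown in this thread).

```java
// Hypothetical stand-ins to illustrate how %s relies on toString().
final class ToStringLogSketch {

    static final class Vm {
        final long id;
        final String name;

        Vm(long id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public String toString() {
            return String.format("VM instance {id: \"%d\", name: \"%s\"}", id, name);
        }
    }

    static final class Host {
        final long id;
        final String name;

        Host(long id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public String toString() {
            return String.format("Host {id: \"%d\", name: \"%s\"}", id, name);
        }
    }

    // Mirrors the shape of the suggested log call; a null host formats as "null"
    // instead of throwing, so a failed lookup would not break the log statement.
    static String restartMessage(Vm vm, Host host) {
        return String.format("HA is now restarting %s on %s", vm, host);
    }

    public static void main(String[] args) {
        System.out.println(restartMessage(new Vm(123, "VM Name"), new Host(1, "host.name")));
    }
}
```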

@apache apache deleted a comment from rohityadavcloud Jan 10, 2022
@apache apache deleted a comment from blueorangutan Jan 10, 2022
@apache apache deleted a comment from blueorangutan Jan 10, 2022
@apache apache deleted a comment from sureshanaparti Jan 10, 2022
@apache apache deleted a comment from blueorangutan Jan 10, 2022
@apache apache deleted a comment from blueorangutan Jan 10, 2022
@DaanHoogland
Contributor

@blueorangutan package

@blueorangutan

@DaanHoogland a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✖️ suse15. SL-JID 2157

@DaanHoogland
Contributor

@blueorangutan test

@sureshanaparti
Contributor

@blueorangutan test

@blueorangutan

@sureshanaparti a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-2845)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 31418 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5664-t2845-kvm-centos7.zip
Smoke tests completed. 92 look OK, 0 have errors
Only failed tests results shown below:

Test Result Time (s) Test File

@sureshanaparti
Contributor

Hi @ravening, can you address the outstanding comments? Thanks.

@nvazquez
Contributor

nvazquez commented Feb 6, 2022

Ping @ravening

Rakesh Venkatesh added 2 commits February 9, 2022 14:37
When ha is performed on vm's send the alert for it so that
its for admins to know which vm's got ha'ed else its time
consuming to get those details from logs
@ravening
Member Author

ravening commented Feb 9, 2022

Ping @ravening

@nvazquez done

@nvazquez
Contributor

nvazquez commented Feb 9, 2022

@blueorangutan package

@blueorangutan

@nvazquez a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✖️ el7 ✖️ el8 ✖️ debian ✔️ suse15. SL-JID 2545

Member

@GabrielBrascher GabrielBrascher left a comment

Code LGTM

@nvazquez
Contributor

@blueorangutan package

@blueorangutan

@nvazquez a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✖️ el8 ✔️ debian ✔️ suse15. SL-JID 2647

@nvazquez
Contributor

@blueorangutan package

@blueorangutan

@nvazquez a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✖️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 2895

@nvazquez
Contributor

@blueorangutan package

@blueorangutan

@nvazquez a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 2898

@nvazquez
Contributor

@blueorangutan test

@blueorangutan

@nvazquez a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-3628)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 32812 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5664-t3628-kvm-centos7.zip
Smoke tests completed. 92 look OK, 0 have errors
Only failed tests results shown below:

Test Result Time (s) Test File

@nvazquez nvazquez merged commit 6f3c18f into apache:main Mar 17, 2022