Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

only update powerstate if sure it is the latest #3743

Merged
merged 2 commits into from Jan 7, 2020

Conversation

DaanHoogland
Copy link
Contributor

@DaanHoogland DaanHoogland commented Dec 4, 2019

Description

On update powerstate check if we have the latest knowledge before doing so.
When report is missing don't try to stop the VM on the host that doesn't know the VM.

Fixes: #3741

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

How Has This Been Tested?

This is purely based on log- and code analysis. intensiove review is requested. Also advise on environmental consequences is welcomed.
HA may be impacted do I suspect not in a negative way.

@DaanHoogland DaanHoogland changed the title only update powerstate is sure it is the latest only update powerstate if sure it is the latest Dec 5, 2019
@DaanHoogland DaanHoogland marked this pull request as ready for review December 5, 2019 12:51
@DaanHoogland
Copy link
Contributor Author

any vmware users or HA users are invited to have a look at this.

@DaanHoogland
Copy link
Contributor Author

will be backporting to 4.13

@DaanHoogland DaanHoogland changed the base branch from master to 4.13 December 5, 2019 15:25
@andrijapanicsb
Copy link
Contributor

/CC @Doni7722

Copy link
Contributor

@andrijapanicsb andrijapanicsb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can only give LGTM based on the internal discussion we had - since it's impossible to test this edge case of race condition we are fixing here.
(can't comment the code, since not developer myself)

@andrijapanicsb
Copy link
Contributor

This needs testing with VMware specifically

@blueorangutan
Copy link

@andrijapanicsb unsupported parameters provided. Supported mgmt server os are: centos6, centos7, ubuntu. Supported hypervisors are: kvm-centos6, kvm-centos7, kvm-ubuntu, xenserver-71, xenserver-65sp1, xenserver-62sp1, vmware-65u2, vmware-60u2, vmware-55u3, vmware-51u1, vmware-50u1

@andrijapanicsb
Copy link
Contributor

@blueorangutan test centos7 vmware-65u2

Copy link
Contributor

@nvazquez nvazquez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM. Thanks @DaanHoogland

@DaanHoogland
Copy link
Contributor Author

revert snapshot error does not seem related but doing a re-package/re-test to be sure.
@blueorangutan package

@apache apache deleted a comment from blueorangutan Jan 3, 2020
@apache apache deleted a comment from blueorangutan Jan 3, 2020
@apache apache deleted a comment from andrijapanicsb Jan 3, 2020
@apache apache deleted a comment from andrijapanicsb Jan 3, 2020
@apache apache deleted a comment from blueorangutan Jan 3, 2020
@apache apache deleted a comment from blueorangutan Jan 3, 2020
@apache apache deleted a comment from blueorangutan Jan 3, 2020
@apache apache deleted a comment from blueorangutan Jan 3, 2020
@apache apache deleted a comment from blueorangutan Jan 3, 2020
@apache apache deleted a comment from borisstoyanov Jan 3, 2020
@apache apache deleted a comment from blueorangutan Jan 3, 2020
@apache apache deleted a comment from blueorangutan Jan 3, 2020
@apache apache deleted a comment from blueorangutan Jan 3, 2020
@apache apache deleted a comment from blueorangutan Jan 3, 2020
@andrijapanicsb
Copy link
Contributor

@blueorangutan test centos7 vmware-65u2

@apache apache deleted a comment from blueorangutan Jan 4, 2020
@DaanHoogland
Copy link
Contributor Author

@blueorangutan package

@apache apache deleted a comment from blueorangutan Jan 6, 2020
@apache apache deleted a comment from blueorangutan Jan 6, 2020
@apache apache deleted a comment from blueorangutan Jan 6, 2020
@apache apache deleted a comment from blueorangutan Jan 6, 2020
@apache apache deleted a comment from blueorangutan Jan 6, 2020
@blueorangutan
Copy link

@DaanHoogland a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@apache apache deleted a comment from blueorangutan Jan 6, 2020
@blueorangutan
Copy link

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-540

@DaanHoogland
Copy link
Contributor Author

@blueorangutan test centos7 vmware-65u2

@blueorangutan
Copy link

@DaanHoogland a Trillian-Jenkins test job (centos7 mgmt + vmware-65u2) has been kicked to run smoke tests

@blueorangutan
Copy link

Trillian test result (tid-708)
Environment: vmware-65u2 (x2), Advanced Networking with Mgmt server 7
Total time taken: 30988 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3743-t708-vmware-65u2.zip
Smoke tests completed. 77 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

@DaanHoogland DaanHoogland merged commit d44dc07 into apache:4.13 Jan 7, 2020
@DaanHoogland DaanHoogland deleted the timingPowerState branch January 7, 2020 08:12
DaanHoogland added a commit that referenced this pull request Jan 7, 2020
* 4.13:
  only update powerstate if sure it is the latest (#3743)
  ui: fix migrate host form no host popup (#3682)
  client: jetty session timeout set after server is started (#3658)
  Increase DHCP lease time to infinite (#3662)
", power state: PowerReportMissing, last state update: " + vmStateUpdateTime.getTime());
if (s_logger.isInfoEnabled()) {
s_logger.info(
String.format("Detected missing VM. host: %l, vm id: %l(%s), power state: %s, last state update: %l"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @DaanHoogland, I'm getting this error in latest master:

INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-98:ctx-ad691272) (logid:1a451217) The agent from host 1 state determined is Up
INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-98:ctx-ad691272) (logid:1a451217) Agent is determined to be up and running
WARN  [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-8:null) (logid:) Caught: 
java.util.UnknownFormatConversionException: Conversion = 'l'
	at java.util.Formatter$FormatSpecifier.conversion(Formatter.java:2691)
	at java.util.Formatter$FormatSpecifier.<init>(Formatter.java:2720)
	at java.util.Formatter.parse(Formatter.java:2560)
	at java.util.Formatter.format(Formatter.java:2501)
	at java.util.Formatter.format(Formatter.java:2455)
	at java.lang.String.format(String.java:2940)
	at com.cloud.vm.VirtualMachinePowerStateSyncImpl.processReport(VirtualMachinePowerStateSyncImpl.java:141)
	at com.cloud.vm.VirtualMachinePowerStateSyncImpl.processHostVmStatePingReport(VirtualMachinePowerStateSyncImpl.java:66)
	at com.cloud.vm.VirtualMachineManagerImpl.processCommands(VirtualMachineManagerImpl.java:3245)
	at com.cloud.agent.manager.AgentManagerImpl.handleCommands(AgentManagerImpl.java:309)
	at com.cloud.agent.manager.AgentManagerImpl$AgentHandler.processRequest(AgentManagerImpl.java:1287)
	at com.cloud.agent.manager.AgentManagerImpl$AgentHandler.doTask(AgentManagerImpl.java:1370)
	at com.cloud.agent.manager.ClusteredAgentManagerImpl$ClusteredAgentHandler.doTask(ClusteredAgentManagerImpl.java:702)
	at com.cloud.utils.nio.Task.call(Task.java:83)
	at com.cloud.utils.nio.Task.call(Task.java:29)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i know, fixed in #3806

ustcweizhou pushed a commit to ustcweizhou/cloudstack that referenced this pull request Feb 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants