Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Power Restore Policy should NOT change host state on BMC reboots #3031

Open
AlexanderAmelkin opened this issue Mar 26, 2018 · 9 comments

Comments

Projects
None yet
4 participants
@AlexanderAmelkin
Copy link
Contributor

commented Mar 26, 2018

I'm not sure whether this should be posted here or in openbmc-test-automation or both, but currently we have test case "Test Restore Policy ALWAYS_POWER_ON With Host Off" failing on Vesnin. This may be a problem in Vesnin support, but what bothers me is the logic of the test. Since I suppose that this test is passing for Weatherspoon and other machines, I also suspect that implementation of Power Restore Policy is generally incorrect.

The test sets the power restore policy to AlwaysOn, sets the host to Off state and reboots the BMC, then expects the host to be turned on.

In my opinion, backed by wording in IPMI v2.0 specification, BMC reboots must not change the state of the host, whatever Power Restore Policy is in effect. The Power Restore Policy (/xyz.openbmc_project.Control.Power.RestorePolicy.Policy) must be about, literally, power restoration, not BMC reboots. This is a quote from IPMI specification, Section 28.8:

The power restore policy determines how the system or chassis behaves when AC power returns
after an AC power loss

Please note the "AC power loss" part.

At least with ASPEED chips it is easily possible to tell a cold boot (when SYSRST# is applied to the BMC chip) from a reboot, be it a programmatic reboot or a watchdog reset. Please see ASPEED AST2400 datasheet for SCU3C register.

@geissonator

This comment has been minimized.

Copy link
Contributor

commented Mar 26, 2018

If the host is running, then the policy will not be enforced (i.e. rule #1 is always leave the host up if it's already running) so even if the policy is ALWAYS_POWER_OFF it will not be enforced if the host is detected to be running after a bmc reboot.

But yeah, I agree, it would make sense to not enforce any policy on just a BMC reboot. Patches always welcome :) Something in https://github.com/openbmc/phosphor-state-manager/blob/master/discover_system_state.cpp#L142 that checks for the reset type and just exits if it's only a BMC reboot.

@AlexanderAmelkin

This comment has been minimized.

Copy link
Contributor Author

commented Jun 14, 2018

We have an AST2400-specific patch for this. The patch adds a shell-script based phosphor-discover-system-state service that uses devmem to read the AST register and only allow for applying the policy if the boot was cold.

It works for us, but apparently it won't work for non-Aspeed (or maybe non-AST2400) BMCs.

Do you want that solution submitted to gerrit?

@geissonator

This comment has been minimized.

Copy link
Contributor

commented Jun 14, 2018

Sure, probably can't upstream that but be good to see and discuss variations that may be more general.

@amboar

This comment has been minimized.

Copy link
Member

commented Jun 28, 2018

We have an AST2400-specific patch for this.

@AlexanderAmelkin I'm interested in what other state you're using devmem for. We have a WIP driver in the kernel tree that exposes random register bits to userspace via sysfs (drivers/soc/aspeed/aspeed-bmc-misc.c), and I wonder if this meets your needs? I'm in the middle of reworking it at the moment, would you like me to Cc you when I post the patches?

@AlexanderAmelkin

This comment has been minimized.

Copy link
Contributor Author

commented Jun 28, 2018

@geissonator, I decided to simply attach it here for reference instead of submitting to gerrit for a beforehand known-to-be-rejected review. :)

phosphor-state-manager-fix-3031.tar.gz

@amboar, yes please keep me Cc'ed on your changes to aspeed drivers. Thanks.

@gtmills

This comment has been minimized.

Copy link
Member

commented Oct 12, 2018

@amboar Patches get posted?

@amboar

This comment has been minimized.

Copy link
Member

commented Nov 8, 2018

Yeah, and got NACKed. I haven't had the need/time/energy to post another series due to taking other approaches to the problems I was trying to solve. I'll revive them at some point.

@stale

This comment has been minimized.

Copy link

commented May 10, 2019

This issue has been automatically marked as stale because no activity has occurred in the last 6 months. It will be closed if no activity occurs in the next 30 days. If this issue should not be closed please add a comment. Thank you for your understanding and contributions.

@stale stale bot added the stale label May 10, 2019

@AlexanderAmelkin

This comment has been minimized.

Copy link
Contributor Author

commented May 17, 2019

Anyone? I don't remember the check for warm/cold boot being implemented in openbmc. Am I missing anything? Anyway, I don't think this issue must be auto-closed.

@stale stale bot removed the stale label May 17, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.