
Add basic support for openbmc w/ httppower #33

Merged Jan 19, 2019 (10 commits)

Conversation

@chu11 (Member) commented Jan 16, 2019

This adds very simple support for openbmc via httppower. It's configured like this:

include "/etc/powerman/openbmc.dev"

device "openbmc0" "openbmc" "httppower -u https://pnode0 -H Content-Type:application/json -c |&"
device "openbmc1" "openbmc" "httppower -u https://pnode1 -H Content-Type:application/json -c |&"
device "openbmc2" "openbmc" "httppower -u https://pnode2 -H Content-Type:application/json -c |&"
device "openbmc3" "openbmc" "httppower -u https://pnode3 -H Content-Type:application/json -c |&"
device "openbmc4" "openbmc" "httppower -u https://pnode4 -H Content-Type:application/json -c |&"

node "node0" "openbmc0" "pnode0"
node "node1" "openbmc1" "pnode1"
node "node2" "openbmc2" "pnode2"
node "node3" "openbmc3" "pnode3"
node "node4" "openbmc4" "pnode4"

While working on this, I realized that good, scalable support of openbmc will be a lot harder and more time consuming than originally thought.

  • httppower was originally written with a single "global" curl handle. Therefore httppower can't be configured to work with multiple hosts if cookies are involved (cookies are per host and
    per handle). Hence the one device line per host above.

    To solve this, httppower would need a curl handle per host, e.g. kept in a hash table keyed on the host parsed out of the URL.
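Not httppower's actual code, just a minimal Python sketch of the per-host handle idea (the class name and handle factory are invented for illustration):

```python
# Hypothetical sketch of "one curl handle per host": handles are
# cached in a dict keyed on the host parsed out of the URL, so each
# host keeps its own cookie context.
from urllib.parse import urlsplit

class PerHostHandles:
    def __init__(self, make_handle):
        self._make_handle = make_handle  # factory, e.g. pycurl.Curl
        self._handles = {}               # host -> handle

    def handle_for(self, url):
        host = urlsplit(url).netloc
        if host not in self._handles:
            self._handles[host] = self._make_handle()
        return self._handles[host]

# Demo with a stand-in "handle" factory (a plain dict per host):
pool = PerHostHandles(dict)
h0 = pool.handle_for("https://pnode0/login")
assert pool.handle_for("https://pnode0/power") is h0
assert pool.handle_for("https://pnode1/login") is not h0
```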

  • httppower uses curl's "easy" interface, which (AFAICT) is synchronous. Using curl's "multi" interface would allow for parallel network transfers.
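To illustrate the serial-vs-parallel difference (a sketch only, with a stub in place of real HTTP, not actual curl code):

```python
# Stub "transactions" overlapped with a thread pool: five queries
# complete in roughly the time of one, instead of five in a row.
import time
from concurrent.futures import ThreadPoolExecutor

def fake_power_query(host):
    time.sleep(0.05)        # stand-in for one HTTP round trip
    return (host, "on")

hosts = ["pnode%d" % i for i in range(5)]
start = time.monotonic()
with ThreadPoolExecutor(max_workers=len(hosts)) as executor:
    results = dict(executor.map(fake_power_query, hosts))
elapsed = time.monotonic() - start
assert elapsed < 5 * 0.05   # well under the serial cost
```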

  • Similar to how ipmi works, an "on" or "off" request / response is independent of the actual action. So you can get circumstances like this:

> powerman --off FOO
Command completed successfully

> powerman -q
on:     FOO
off:
unknown: 

The reason FOO is still listed as on is that the HTTP response for "off" has been received, but the "off" action has not yet completed; openbmc may perform the "off" action at some point in the future.

In ipmipower, this is solved with options like --wait-until-on and --wait-until-off, which poll until an action is done.
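A tiny Python sketch of that polling idea (the function name and timings are made up for illustration):

```python
# Poll a power-state query until it reports the desired state or a
# deadline passes -- the gist of ipmipower's --wait-until-on/off.
import time

def wait_until(query_state, want, timeout=60.0, interval=1.0):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if query_state() == want:
            return True
        time.sleep(interval)
    return False

# Stub BMC that reports "on" twice before the off action completes:
states = iter(["on", "on", "off"])
assert wait_until(lambda: next(states), "off", timeout=5, interval=0.01)
```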

  • httppower has no ability to determine whether a network interface is down. If powerman were configured for a large number of nodes, we can expect some node to be removed for servicing once in a while. That means powerman / httppower will constantly time out on pretty much any action, such as powerman --query.

    This was solved in ipmipower, which performs IPMI "pings" in the background to effectively monitor for nodes disappearing from the network.

  • It's worth noting that the last two points are more difficult than in ipmipower, b/c httppower has no concept of "power query" or "ping". Those are things defined in the .dev file for every device that uses http. Does this mean it has to be developed into powerman as a feature for all possible devices?

  • Possible option: develop a "openbmcpower" similar to "ipmipower"?

  • Possible option: redo "httppower" from scratch to do a lot more stuff. Future consideration for other rest APIs such as Redfish.

@garlick (Member) commented Jan 16, 2019

This looks nice @chu11!

I guess it's a question for @elgeoman whether he can live with the caveats:

  • process per host
  • open loop power control
  • potential for timeouts when BMCs are unreachable

If this is supposed to be used on a system like sierra, then probably not, but if it's a few dozen nodes, maybe?

One thing that seems to be missing is a test. See test/dli.c, the only other example of a test for an httppower-ish device. Yuck. Not sure what to suggest for a better strategy...

@grondo had a good thought - openbmcpower in python? There are probably some good python modules for managing async REST APIs...

@chu11 (Member, Author) commented Jan 16, 2019

One thing that seems to be missing is a test.

Yup, just wasn't sure if this was merge-able, so was going to add later.

openbmcpower in python? There are probably some good python modules for managing async REST APIs...

If writing from scratch, I had thought of python as well. But would have to architect in the separate "ping monitor" either way. (Edit: Not suggesting it's a separate program, but rather it may not integrate with a REST API and has to be programmed off to the side.)

@garlick (Member) commented Jan 16, 2019

Maybe "ping" could just be an HTTP get or similar?

@chu11 (Member, Author) commented Jan 16, 2019

Maybe "ping" could just be an HTTP get or similar?

Oh, that's a good point. A response doesn't even have to be "valid", e.g. like a 404 error is an acceptable response to a GET/"ping".
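Sketched in Python with the standard library (the function name and the reachability policy here are illustrative, not part of httppower):

```python
# Any HTTP response, even an error status like 404, proves the
# BMC's web server is alive; only a transport-level failure
# (connection refused, DNS failure, timeout) counts as down.
import urllib.request
import urllib.error

def bmc_reachable(url, timeout=3.0):
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True   # got a status line back => host is up
    except (urllib.error.URLError, OSError):
        return False  # no HTTP response at all
```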

@elgeoman commented

This is a good start for a couple dozen nodes, but the intention is to eventually go much larger than that (say 4K nodes or more). If openbmc happens to support IPMI v2.0 in the near future, then developing something like openbmcpower may not be necessary. If openbmc decides to drop IPMI in favor of RESTful interfaces for power control, then the effort to build openbmcpower may be worth it.

@garlick (Member) commented Jan 17, 2019

My vote would be to get this PR ready to merge, and open a new issue for an openbmcpower helper.

We can get the design sketched out in the new issue and come up with a rough estimate of the effort required to implement, and then have more of a basis for planning the work.

@garlick mentioned this pull request Jan 17, 2019
@chu11 (Member, Author) commented Jan 17, 2019

It's worth mentioning one other consideration: have every status / on / off operation create and destroy a new curl handle and do a full login, action, and logout. i.e. the dev file might have something like this:

	script status {
		send "post login {\"data\":[\"root\",\"0penBmc\"]}\n"
		expect "httppower> "
		send "get xyz/openbmc_project/state/chassis0/attr/CurrentPowerState\n"
		expect "xyz.openbmc_project.State.Chassis.PowerState.(On|Off)"
		setplugstate $1 on="On" off="Off"
		expect "httppower> "
		send "post logout {\"data\":[]}\n"
		expect "httppower> "
	}

i.e. what's in the login/logout parts of the dev file would go into every status / on / off / etc. script.

But due to:

httppower uses the "easy" curl interface, which (AFAICT) is synchronous communication.

we'd be doing more HTTP transactions per node and effectively doing them serially. So I didn't consider this path worthwhile (not to mention it would require a fair amount of hacking up httppower).

@chu11 (Member, Author) commented Jan 17, 2019

re-pushed with some openbmc tests

@garlick (Member) commented Jan 18, 2019

Thanks! One more thing that we should do here is put a note at the top of the openbmc.dev file about the specific hardware it was tested on, and its limitations, e.g.

  • power on/off commands are "open loop", i.e. success is returned before the operation has completed
  • this is a one-plug device, therefore it requires one httppower process per plug, which may present scalability concerns for a large cluster
  • there is no background management of up/down status for the BMCs like there is in ipmipower, so a missing or unresponsive BMC can cause powerman commands that include that plug to pause, annoyingly, for the timeout period

and maybe add a reference to issue #34

By the way, one legit thing to do in the dev script, to work around that first limitation, say for the off script, is to add a delay for the amount of time it typically takes to power off a node plus some fudge factor, then add a status query and fail the command if the node is still on. Powerman should parallelize this delay over multiple devices. It means some repetition in the script, but it's preferable to running open loop IMHO.
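For example, the tail of an off script might look something like this (a sketch only: the delay value is a guess, the elided parts depend on the real script, and only the status query is taken from the snippet earlier in this thread):

```
	script off {
		# ... login and the power-off request go here ...
		delay 30	# typical power-off time plus fudge factor
		send "get xyz/openbmc_project/state/chassis0/attr/CurrentPowerState\n"
		expect "xyz.openbmc_project.State.Chassis.PowerState.Off"
		expect "httppower> "
		# ... logout ...
	}
```

If the node is still on, the expect never matches, the script times out, and the user sees a failure instead of a false success.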

@chu11 (Member, Author) commented Jan 18, 2019

specific hardware it was tested on

@elgeoman do you have the specific name of the hardware and the firmware version? I looked around but couldn't find the exact motherboard name (everything just says power9), and the REST API doesn't seem to have a firmware version call (and I don't have a user/password for openbmctool).

@elgeoman commented

@chu11
The nodes are IBM 8335-GTW.
/etc/os-release on the BMC shows:

root@pfire670:~# cat /etc/os-release                                                
ID="openbmc-phosphor"                                                               
NAME="Phosphor OpenBMC (Phosphor OpenBMC Project Reference Distro)"                 
VERSION="ibm-v2.0"                                                                  
VERSION_ID="ibm-v2.0-0-r46-0-gbed584c"                                              
PRETTY_NAME="Phosphor OpenBMC (Phosphor OpenBMC Project Reference Distro) ibm-v2.0" 
BUILD_ID="ibm-v2.0-0-r46"                                                           

@chu11 (Member, Author) commented Jan 18, 2019

By the way, one legit thing to do in the dev script, to work around that first limitation, say for the off script, is to add a delay for the amount of time it typically takes to power off a node plus some fudge factor

Eek ... I've been seeing averages in the 25-ish second range. So adding a fudge ... 30 seconds? Too long? I would then have to increase the overall timeout from 30 seconds to something bigger as well.

then add a status query and fail the command if the node is still on.

I'm familiar with the "delay" command, but how would one send a "failure" back to the client safely? Nothing really jumps out at me.

@garlick (Member) commented Jan 18, 2019

Eek ... I've been seeing averages in the 25-ish second range. So adding a fudge ... 30 seconds? Too long? I would then have to increase the overall timeout from 30 seconds to something bigger as well.

That is a long time...is the command trying to do a clean OS shutdown or a hard power off? Is there a different command/flag that we should be using to get it done more directly?

I'd probably start with the delay + timeout you proposed and then fine tune it until it is reliable.

I'm familiar with the "delay" command, but how would one send a "failure" back to the client safely? Nothing really jumps out at me.

Just "expect" the desired result (like "off"). If something else is returned (like "on") the expect will hang until the script times out, and the user will get a failure.

@garlick (Member) commented Jan 18, 2019

On the upside, it should take the same time to power off one node or the whole cluster, since scripts run in parallel across devices and we've got just one plug per device.

@chu11 (Member, Author) commented Jan 18, 2019

That is a long time...is the command trying to do a clean OS shutdown or a hard power off? Is there a different command/flag that we should be using to get it done more directly?

It is a "soft power" off, according to openbmc documentation. "hard power" off appears to take the BMC off standby power, so it's really not an option to use.

But what's strange is the 20-25 seconds applies to both on & off, so it's unlikely to be a clean OS shutdown kind of thing.

@chu11 (Member, Author) commented Jan 18, 2019

re-pushed with extra language in the openbmc.dev file and some delays put in.

@garlick (Member) commented Jan 19, 2019

power on/off commands can return success before the operation has been completed

No longer true - we'll catch that now, so I'd drop it from the comment.

Otherwise LGTM!

@chu11 (Member, Author) commented Jan 19, 2019

No longer true - we'll catch that now, so I'd drop it from the comment.

Hehe, I was speaking to the fact that that is the "openbmc" behavior. I'll tweak the comment.

@chu11 (Member, Author) commented Jan 19, 2019

re-pushed with a minor text tweak

@garlick (Member) commented Jan 19, 2019

Great! Thanks for the work!

@garlick garlick merged commit 2324db4 into chaos:master Jan 19, 2019
@chu11 chu11 deleted the issue31 branch October 21, 2021 17:57