RaptorCS Blackbird support #3341

Merged
merged 1 commit into from Dec 17, 2019
Conversation

stewartsmith commented Dec 15, 2019

The Hostboot config is basically Witherspoon's, but with LPC port 80h turned on.
The defconfig is from Witherspoon, with the Blackbird XML setup from Raptor's tree.

NOTE: without linux v5.4.3 (or at least anything more recent than v5.3.7) my Intel SSD doesn't show up in Petitboot.

Also note that the blackbird-xml project doesn't yet exist in the open-power org, so you need to grab it from my repo (or Raptor's). Consider this a request to branch mine in :)

Signed-off-by: Stewart Smith <stewart@flamingspork.com>

stewartsmith commented Dec 15, 2019

I've sent the skiboot patch upstream: https://patchwork.ozlabs.org/patch/1208967/

sharkcz commented Dec 15, 2019

Hmm, I guess I should make the same for Talos :-)

madscientist159 commented Dec 16, 2019

Did the IPL observer stuff ever make its way in to upstream? Especially w/ Talos II it's pretty critical. ;)

stewartsmith commented Dec 16, 2019

sharkcz commented Dec 16, 2019

I started https://wiki.raptorcs.com/wiki/Firmware_Upstreaming some time ago (it got a few changes from @merklort for the hostboot part). It is probably based on the Talos GA branch (2018-04-19), not on the v2 firmware (2019-04-16).

madscientist159 commented Dec 16, 2019

I know that was one of the main blockers to upstreaming -- we don't want to have the support issues associated with OCC reads during IPL (or completely disable fan controls, for that matter). The former in particular was quite expensive, as the symptoms led to spurious mainboard / CPU RMA in some cases and escalation to higher level support in many others. The alternative, no fan controls, leads to attempted customer-initiated RMA, as no one expects a desktop to be that loud (and it also ends up being a bit of a PR black eye for POWER itself).

While I'd really like to see upstream support, we need to have the IPL observer support merged into the subcomponents first. Any chance that can happen?

sharkcz commented Dec 16, 2019

Can't we start with the machine specific "overlay" in openpower/patches/talos-patches until they are properly upstreamed? How much of the non-upstream patches is shared between Blackbird and Talos?

stewartsmith commented Dec 16, 2019

dcrowell77 left a comment

Looks fine from my perspective

dcrowell77 commented Dec 16, 2019

There were some internal network issues recently, so I'm guessing that is what caused the CI failure.

dcrowell77 commented Dec 16, 2019

retest this please

madscientist159 commented Dec 16, 2019

> Can't we start with the machine specific "overlay" in openpower/patches/talos-patches until they are properly upstreamed? How much of the non-upstream patches is shared between Blackbird and Talos?

Yes, a local set of patches combined with an RCS overlay directory would work. Most of the modifications are common between Talos II / Blackbird / Condor; please feel free to pull whatever you need from our repos, as we have retained the original Apache / GPL / BSD licenses for the modified components (newly written components are normally GPL v3 from our side; licenses are in the file headers).

I just don't want a functionally broken upstream version. As to how we get there, I'm quite flexible. :) That being said, we do rely extensively on the RFC LPC communication (both for boot status and for other tasks); the BMC firmware assumes it's available and makes assumptions about what kind of information is provided, so I don't want to modify the "API" of sorts that it's using.

The Talos II beta firmware is far closer to the Blackbird firmware; the intent was to merge all three of the RCS hardware platforms onto a mostly common codebase. Don't bother with the older codebase associated with the currently shipping Talos II production firmware; it was largely rewritten for Blackbird and the Talos II beta FW.

stewartsmith commented Dec 17, 2019

The v2 of the skiboot patch, which talks to the IPL Observer: https://patchwork.ozlabs.org/patch/1210958/

oohal commented Dec 17, 2019

retest this please

oohal merged commit 592eff7 into open-power:master on Dec 17, 2019

8 checks passed:
- Build p9dsu,palmetto,witherspoon,zz using SDK: Built p9dsu,palmetto,witherspoon,zz successfully using SDK
- DCO: DCO
- IBM OP CI HW IPL test -- p8: P8 HW systems booted successfully with built PNOR images.
- IBM OP CI HW IPL test -- p9: P9 HW systems booted successfully with built PNOR images.
- IBM OP CI op-build build RHEL7 x86-64: Images built successfully.
- IBM OP CI op-build build Ubuntu ppc64le: Build finished.
- continuous-integration/travis-ci/pr: The Travis CI build passed
- default: Build finished.

madscientist159 commented Dec 17, 2019

@stewartsmith Thanks for that!

Getting the IPL observer stuff into the SBE and hostboot would also be good -- in particular, the SBE patches (in our tree) drive the progress bar during the first part of the Blackbird boot, and have proven very useful over time to determine bad CPUs / sockets from hostboot issues at a glance.

stewartsmith commented Dec 17, 2019

madscientist159 commented Dec 17, 2019

I'd be willing to test any patches to upstream, since I do have access to the recovery system...

madscientist159 commented Dec 17, 2019

Just dug this back up:
open-power/sbe#15

sharkcz commented Dec 17, 2019

I have updated https://wiki.raptorcs.com/wiki/Firmware_Upstreaming with SBE, HCODE, OCC and Hostboot. It seems there are just a few changes in the Raptor branches for these components.

madscientist159 commented Dec 17, 2019

Most center around the IPL observer. It's a bit frustrating to have the simple PRs sit for > 8 months, TBH.

dcrowell77 commented Dec 17, 2019

Is the SBE PR the only one outstanding? I understand the frustration, I'll try to shake a tree.

sharkcz commented Dec 17, 2019

If I read the wiki page above correctly, then we need https://git.raptorcs.com/git/talos-hostboot/commit/?id=d90e6c513094231f622a427030f3dbca1eeb5ed5 for the hostboot part. @madscientist159 will know if there has been a PR already.

dcrowell77 commented Dec 17, 2019

Of course the link was added after I read it. It looks like 3 new commits. Make some PRs and I'll at least get them into the pipeline (with no promises on getting a lot of attention).

sharkcz commented Dec 17, 2019

op-build WIP with Talos support = https://github.com/sharkcz/op-build/tree/talos
skiboot WIP with Talos using LPC observer = https://github.com/sharkcz/skiboot/tree/talos-lpc

Be aware: these are not even compile tested.

madscientist159 commented Dec 19, 2019

FWIW 0xfefe is the IPL observer code for Linux online and userspace launched. Skiboot should NOT be sending it; it confuses two parts of the process and tells the BMC IPL has successfully finished when in fact it's only partially done.

sharkcz commented Dec 20, 2019

@madscientist159, I understand your position. The open question is whether there is a way to do the signalling correctly and in an upstreamable way. How do the other platforms handle this kind of communication between the host and BMC?

madscientist159 commented Dec 20, 2019

@sharkcz I've seen a bunch of different methods over the years including writes to scratch registers in the southbridge. What exactly is wrong with a small userspace tool poking the last status code out the LPC port? If it's just that the kernel doesn't have a standardized LPC API, maybe all we need is a simple powernvlpc module?

The idea is that 0xfefe is only poked out once petitboot is either about to start (we've already tested our initramfs is working, shell works, etc.) or once petitboot is actually starting. If we reach end of skiboot status codes, and don't reach 0xfefe in under 15 seconds, something's wrong and a firmware rebuild / reflash is in order.
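
To make the timeout rule above concrete, here is a minimal Python sketch of how a BMC-side observer might apply it. It is not taken from the Raptor BMC firmware: the function name, the event format, and the `last_skiboot_code` parameter are hypothetical. Only the 0xfefe completion code and the 15-second window come from the comment above.

```python
from typing import Iterable, Optional, Tuple

IPL_COMPLETE = 0xFEFE    # from the thread: Linux online, userspace launched
TIMEOUT_SECONDS = 15.0   # from the thread: allowed gap after skiboot's last code


def ipl_looks_stalled(
    events: Iterable[Tuple[float, int]],
    last_skiboot_code: int,
    now: float,
) -> bool:
    """Return True if skiboot finished but 0xfefe did not follow in time.

    events holds (timestamp_seconds, status_code) pairs in arrival order.
    last_skiboot_code is a hypothetical placeholder for whatever code marks
    the end of skiboot; the thread does not give its value.
    """
    skiboot_done_at: Optional[float] = None
    for timestamp, code in events:
        if code == last_skiboot_code and skiboot_done_at is None:
            # Start the timer at the first sighting of skiboot's final code.
            skiboot_done_at = timestamp
        elif code == IPL_COMPLETE and skiboot_done_at is not None:
            # Completion arrived; it only counts if it was inside the window.
            return timestamp - skiboot_done_at >= TIMEOUT_SECONDS
    if skiboot_done_at is None:
        return False  # skiboot has not finished yet, so no timer is running
    return now - skiboot_done_at >= TIMEOUT_SECONDS
```

A caller would feed it the codes seen so far plus the current time; a `True` result corresponds to the "firmware rebuild / reflash is in order" case. A 0xfefe that arrives before skiboot's final code is ignored in this sketch, which matches the earlier point that skiboot itself should not send it.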

sharkcz commented Dec 20, 2019

@madscientist159, nothing is wrong, it's for my education and overview :-) I'm also thinking about how it could be integrated into upstream.

madscientist159 commented Dec 20, 2019

@sharkcz we actually need an LPC access module for other reasons, something that can allow root to read / write bytes, nothing fancy. I can see why it might be desirable to disable the debug interface, but LPC access is a useful and reasonable thing to have.

I wonder if we can do a simple device node that takes basic commands like IO_LPC_SET_TARGET_ADDRESS, IO_LPC_WRITE_BYTE, IO_LPC_READ_BYTE?
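
As a rough illustration of the three-command idea, the following Python sketch models such a node in memory. No such kernel interface exists in this thread; the class, its `command` method, and the 0xFF value returned for unwritten addresses are assumptions made purely for illustration. Only the three command names come from the comment above.

```python
from enum import Enum, auto
from typing import Dict


class LpcCommand(Enum):
    # Command names as proposed in the thread; the numbering is arbitrary here.
    IO_LPC_SET_TARGET_ADDRESS = auto()
    IO_LPC_WRITE_BYTE = auto()
    IO_LPC_READ_BYTE = auto()


class FakeLpcNode:
    """In-memory stand-in for the proposed device node (hypothetical)."""

    def __init__(self) -> None:
        self._target = 0                    # address used by the next read/write
        self._io_space: Dict[int, int] = {}  # address -> last byte written

    def command(self, cmd: LpcCommand, value: int = 0) -> int:
        if cmd is LpcCommand.IO_LPC_SET_TARGET_ADDRESS:
            if value < 0:
                raise ValueError("address must be non-negative")
            self._target = value
            return 0
        if cmd is LpcCommand.IO_LPC_WRITE_BYTE:
            if not 0 <= value <= 0xFF:
                raise ValueError("only single bytes can be written")
            self._io_space[self._target] = value
            return 0
        if cmd is LpcCommand.IO_LPC_READ_BYTE:
            # Assumption: addresses never written read back as 0xFF.
            return self._io_space.get(self._target, 0xFF)
        raise ValueError(f"unknown command: {cmd!r}")
```

The point of the sketch is the shape of the interface: the address is set once, and subsequent byte reads and writes act on it, so a userspace tool would need only two calls to push a status byte out.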

stewartsmith commented Dec 20, 2019

stewartsmith commented Dec 20, 2019

madscientist159 commented Dec 20, 2019

Would there be an objection to an LPC access module in secure mode, if we were to write one?

I would be open to engaging fan control on skiboot exit, but that's a BMC change. Could we send 0x80ff at skiboot exit (IIRC 0x80 is the skiboot prefix code)?
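
For readers unfamiliar with the code layout implied here, a small Python sketch of the prefix/value split follows. It assumes a 16-bit code whose high byte is the component prefix, which is inferred from the 0x80ff example and the "IIRC" remark above rather than from any specification; the function names are hypothetical.

```python
from typing import Tuple

SKIBOOT_PREFIX = 0x80  # per the comment above, stated from memory ("IIRC")


def split_status(code: int) -> Tuple[int, int]:
    """Split a 16-bit status code into (prefix byte, value byte)."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("status code must fit in 16 bits")
    return (code >> 8) & 0xFF, code & 0xFF


def make_status(prefix: int, value: int) -> int:
    """Combine a prefix byte and a value byte into one 16-bit code."""
    if not (0 <= prefix <= 0xFF and 0 <= value <= 0xFF):
        raise ValueError("prefix and value must each fit in one byte")
    return (prefix << 8) | value
```

Under that assumption, the proposed skiboot-exit code would be `make_status(SKIBOOT_PREFIX, 0xFF)`, and a BMC could tell it apart from 0xfefe by the prefix byte alone.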

stewartsmith commented Dec 20, 2019

madscientist159 commented Dec 20, 2019

Yeah was looking at arbitrary access, see my post a few above about a possible IO_ access mechanism around /dev/lpc or similar. Sounds like we'd need a secure mode filter -- e.g. if in secure mode restrict destination address to 0x80-0x82?
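
A minimal Python sketch of what such a filter could look like is below. It assumes the 0x80-0x82 range is inclusive and that non-secure mode allows any address; both are readings of the comment above, not settled design, and the names are hypothetical.

```python
# Assumption: the allowed window is inclusive, i.e. 0x80, 0x81 and 0x82.
SECURE_MODE_ALLOWED = range(0x80, 0x83)


def lpc_address_allowed(address: int, secure_mode: bool) -> bool:
    """Decide whether a destination address may be accessed.

    In secure mode only the port-80h window is permitted; otherwise
    every address is allowed (arbitrary access, as discussed above).
    """
    if not secure_mode:
        return True
    return address in SECURE_MODE_ALLOWED
```

The check would sit in front of whatever sets the target address, so a rejected address never reaches the LPC bus.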
