SuperMicro A2SDi doesn't work #189

Open
lasalvavida opened this issue Apr 10, 2018 · 17 comments

@lasalvavida

https://www.supermicro.com/products/motherboard/atom/A2SDi-2C-HLN4F.cfm

SPS version: 4.0.4.139

Have tried soft disable, code removal, and the combination of the two.

Soft disable doesn't appear to work at all here. The current state is still reported as Operational.

Code removal causes the board to enter a state where power is off and stays off even if the button is pressed, or a power command is issued over IPMI.

Have also tried --keep-modules, and using --whitelist to attempt removing only single partitions. These exhibit various behaviors, some where the board stays off as described above, some where the board comes up but never POSTs, staying at code 0xff, and some that go through a few codes and then get stuck at 0xad. None of these boot paths successfully initialize VGA.

I don't believe that this board has Intel Boot Guard, but the inability to remove anything from the ME section makes me think that either something about how me_cleaner modifies the image is failing some kind of validation, or that SuperMicro has made some kind of configuration/code change in SEC or PEI that requires ME to be present and functional.

@corna, any ideas?

@corna
Owner

corna commented Apr 10, 2018

Great, finally I have the chance to work on SPS 4.x ;)

Note that --whitelist only adds some modules to the whitelist; if you want to remove only a single partition, you have to use --blacklist.

I'm going to analyze the SPS image (which, luckily, is available on the supermicro website), I'll keep you updated.

@lasalvavida
Author

> Note that --whitelist only adds some modules to the whitelist; if you want to remove only a single partition, you have to use --blacklist.

I was whitelisting all removed partitions and then removing them from the whitelist individually. Same idea I guess.

> I'm going to analyze the SPS image (which, luckily, is available on the supermicro website), I'll keep you updated.

Awesome!

@lasalvavida
Author

From the Positive Technologies blog post on HAP disable:

> We also checked the firmware of server and mobile versions of ME (SPS 4.x and TXE 3.x). In the server version, this flag is always set to 1; in the mobile version, it is ignored. This means that this method will not work in server and mobile versions (Apollo Lake) of ME.

So it would seem that soft disable not working is expected.

@lasalvavida
Author

Update from playing with some of the undocumented bits in the soft strap at 0x14 for disabling various components on the C3000 chipset:

image

  • Bits 9, 10, 14, 15, 16, 18, 20: Boots normally, no effect on ME status
  • Bit 11: No power
  • Bit 12: Powers on, hangs with POST code 0xff, no VGA
  • Bit 13: Boots, no USB

Also saw this IE Disable bit:

image

And decided to try bit 13 at 0x78 to match the spacing with the ME SMBus Management soft strap:

image

Board booted normally, but no change of the ME status.
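The strap experiments above come down to flipping a single bit in a 32-bit word of the flash descriptor. A minimal sketch of that operation, assuming the straps are stored as little-endian 32-bit words and that the offset is relative to whatever buffer you pass in (the thread does not say what 0x14 and 0x78 are relative to); `set_strap_bit` is a hypothetical helper name, not part of me_cleaner:

```python
import struct


def set_strap_bit(region: bytearray, offset: int, bit: int, value: bool = True) -> int:
    """Set or clear one bit of the 32-bit little-endian word at `offset`.

    `region` is a buffer holding the strap data (assumption: the caller has
    already extracted it from the flash image). Returns the new word value.
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in 0..31")
    (word,) = struct.unpack_from("<I", region, offset)
    if value:
        word |= 1 << bit
    else:
        word &= ~(1 << bit) & 0xFFFFFFFF
    struct.pack_into("<I", region, offset, word)
    return word


if __name__ == "__main__":
    straps = bytearray(0x80)          # dummy data standing in for real straps
    set_strap_bit(straps, 0x14, 13)   # the "no USB" bit from the list above
    print(hex(struct.unpack_from("<I", straps, 0x14)[0]))
```

Working on a copy of the image and changing one bit per flash, as was done above, keeps each observed behavior attributable to a single bit.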

@lasalvavida
Author

lasalvavida commented Apr 11, 2018

Looking at this Power Management Controller register:

screenshot from 2018-04-11 18 01 51

It seems to mirror the ordering of the listed soft strap bits, and lines up with the observed behavior of the undocumented bits (i.e. Bit 13 disables USB2). So if that was going to work, it would have been bit 15, which had no effect. I did try it again just to be sure.

@jjurkus

jjurkus commented Apr 18, 2018

Oh great, so these newer C3000 motherboards, which seem so great for pfSense, aren't all that great.

I have the A1SRi-2558F, which does not have Intel ME. Perhaps it also lacks that "Innovation Engine" that sounds so incredibly helpful.

@lasalvavida
Author

lasalvavida commented Apr 19, 2018

A few updates:

There is a second $FPT Partition at 0x1000 identical to the one at 0x10. It doesn't seem to be used, maybe this is for recovery mode? Either way, we probably want to trim this as well. I am already doing this locally just to be sure that it wasn't related to the following observed behavior.

Previously, I wasn't able to boot an image even with all partitions whitelisted. I have been able to boot an image now after I disabled the EFFS related changes and the auto-removing of empty modules.

One very odd thing came up that I'm not really sure what to make of yet. The partitions in this image are as follows: FTPR, FTUP, DLMP, PSVN, IVBP, MFS, ROMB, FPTB, MFSB, IVB1, IVB2, BIS, FLOG, UTOK, and OPR1.

PSVN, IVBP, and ROMB are empty. I discovered that you can boot with any one or two of them removed, but not all three.

Then, I tried removing FLOG and UTOK which are not empty, and the board booted, but removing any third module causes the board not to boot.

It would seem that I can't remove more than two modules, but I'm not really sure why yet. @corna, any insight you have would be appreciated.
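For anyone wanting to reproduce the partition list above, here is a rough sketch that dumps the entries of an $FPT. The field positions (entry count 4 bytes after the signature, 0x20-byte entries starting 0x20 bytes after the signature, offset and length at +0x08 and +0x0C of each entry) are assumptions based on the commonly described $FPT layout and are not confirmed for SPS 4 in this thread; `read_fpt` and `FptEntry` are hypothetical names:

```python
import struct
from typing import List, NamedTuple

FPT_MAGIC = b"$FPT"
COUNT_FROM_MAGIC = 0x04    # assumption: entry count follows the signature
ENTRIES_FROM_MAGIC = 0x20  # assumption: first entry 0x20 bytes after the signature
ENTRY_SIZE = 0x20          # assumption: fixed-size table entries


class FptEntry(NamedTuple):
    name: str
    offset: int
    length: int


def read_fpt(region: bytes, magic_at: int = 0x10) -> List[FptEntry]:
    """List the partition table entries of the $FPT whose signature is at `magic_at`."""
    if region[magic_at:magic_at + 4] != FPT_MAGIC:
        raise ValueError("no $FPT signature at 0x%x" % magic_at)
    (count,) = struct.unpack_from("<I", region, magic_at + COUNT_FROM_MAGIC)
    entries = []
    for i in range(count):
        pos = magic_at + ENTRIES_FROM_MAGIC + i * ENTRY_SIZE
        name = region[pos:pos + 4].rstrip(b"\x00").decode("ascii", "replace")
        offset, length = struct.unpack_from("<II", region, pos + 0x08)
        entries.append(FptEntry(name, offset, length))
    return entries
```

Calling it once with the position of the first signature (0x10) and once with the position of the second one lets you check that the two tables really are identical.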

@jjurkus

jjurkus commented Apr 19, 2018

I've read somewhere SPS firmware images have a main and recovery image together, so if you find two $FPT partitions that would make sense.

I think I read it on the win-raid forum. Check this topic for a start: https://www.win-raid.com/t596f39-Intel-Management-Engine-Drivers-Firmware-amp-System-Tools.html

@jjurkus

jjurkus commented Apr 19, 2018

And of course you have also found ME Analyzer?

Other thing: I've looked in the manual. Have you tried to set the ME to manufacturing mode?
Set JPME2 to bridge pins 2-3. (1-2 are bridged by default)
See #195

@lasalvavida
Author

> Other thing: I've looked in the manual. Have you tried to set the ME to manufacturing mode?
> Set JPME2 to bridge pins 2-3. (1-2 are bridged by default)

I think this jumper is actually mislabeled. The HECI firmware status always has bit 4 (manufacturing mode) set to 1 in both jumper positions. The jumper appears to toggle the ME between operational and recovery modes.
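A small sketch of the check described above, for comparing the two jumper positions. Only the meaning of bit 4 comes from this thread; the register offset (0x40 in the HECI device's PCI configuration space) and the sysfs path are assumptions that have to be adapted to the board, and reading configuration space beyond the first 64 bytes through sysfs normally requires root:

```python
import struct

MANUFACTURING_MODE_BIT = 4  # from the thread: bit 4 of the HECI firmware status


def read_config_dword(config_path: str, offset: int) -> int:
    """Read a 32-bit little-endian value from a PCI config space file."""
    with open(config_path, "rb") as f:
        f.seek(offset)
        data = f.read(4)
    if len(data) != 4:
        raise IOError("short read at offset 0x%x" % offset)
    return struct.unpack("<I", data)[0]


def manufacturing_mode(status: int) -> bool:
    """True if the manufacturing mode bit is set in the status word."""
    return bool((status >> MANUFACTURING_MODE_BIT) & 1)


if __name__ == "__main__":
    # Hypothetical device path; look up the real HECI device with lspci first.
    path = "/sys/bus/pci/devices/0000:00:18.0/config"
    status = read_config_dword(path, 0x40)  # assumed status register offset
    print("status = 0x%08x, manufacturing mode = %s" % (status, manufacturing_mode(status)))
```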

@corna
Owner

corna commented Apr 20, 2018

I'm quite busy these days, but I haven't forgotten about this issue, don't worry. ;)

I should have some spare time this weekend, I'll work on it.

@corna corna self-assigned this Apr 20, 2018
@lasalvavida
Author

🔔 Ding, dong, the witch is dead! (I think) 🔔

I was able to work around not being able to remove more than two modules by changing the offset and length of a module to zero to remove it instead of removing the table entry entirely.

Three partitions must be left in place: MFS (otherwise the no-power behavior I described earlier occurs), FPTB, which points to the recovery $FPT (you need to make the same changes there as well, or the ME will drop to recovery and continue functioning), and BIS (otherwise the board will not POST).

Board reports firmware version: 0.0.0.0 and recovery mode, which I understand is usually a good sign that this worked. The firmware heartbeat bits of HECI1_GS_SHDW1 are no longer incrementing.

Booted ArchLinux, board has been up for 30 minutes, so no issues with the watchdog timer either.

Happy to contribute my code for nulling out table entries instead of removing them if you think it's useful.
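Since the code was never posted, here is a hedged sketch of what "nulling out" table entries could look like. It is not the author's implementation. It assumes the same $FPT layout as commonly described (entry count 4 bytes after the signature, 0x20-byte entries starting 0x20 bytes after it, offset and length at +0x08 and +0x0C), it does not wipe the partition data itself, and it does not touch any header checksum. `null_partitions` is a hypothetical name:

```python
import struct
from typing import Iterable, List

ENTRIES_FROM_MAGIC = 0x20  # assumption: first entry 0x20 bytes after "$FPT"
ENTRY_SIZE = 0x20          # assumption: fixed-size table entries


def null_partitions(image: bytearray, magic_at: int, keep: Iterable[str]) -> List[str]:
    """Zero the offset and length of every entry whose name is not in `keep`.

    The entry itself stays in the table, so the entry count is unchanged.
    Returns the names of the entries that were nulled.
    """
    if image[magic_at:magic_at + 4] != b"$FPT":
        raise ValueError("no $FPT signature at 0x%x" % magic_at)
    keep_set = set(keep)
    (count,) = struct.unpack_from("<I", image, magic_at + 4)
    nulled = []
    for i in range(count):
        pos = magic_at + ENTRIES_FROM_MAGIC + i * ENTRY_SIZE
        name = bytes(image[pos:pos + 4]).rstrip(b"\x00").decode("ascii", "replace")
        if name in keep_set:
            continue
        struct.pack_into("<II", image, pos + 0x08, 0, 0)  # offset = 0, length = 0
        nulled.append(name)
    return nulled
```

The thread only states that MFS, FPTB and BIS have to stay; the full set of partitions that was kept is not given, so the `keep` argument is left to the caller. The same call has to be repeated for the second $FPT, as described above.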

@corna
Owner

corna commented Apr 23, 2018

Good job!

I've looked into the SPS firmware, here you can find the raw content; as you can see the interesting partitions are FTPR, FTUP and OPR1 (plus MFS).

  • FTPR seems to be the standard "main" partition, however it contains a bup_rcv module, which reminds me of something like "Bring UP, recovery". Another (smaller) bup module is present in FTUP. FTPR contains only six modules and four of them (bup_rcv, rbe, kernel and syslib) are the four main modules needed to bring up the platform in ME.
  • FTUP ("FTPR update"?) contains a more complete list of modules. It also contains rbe, kernel and syslib, which are identical to the ones in FTPR (I checked only rbe, but they're Huffman compressed and they're exactly of the same size, so I suppose also the others are identical). The remaining modules are very similar to the ME ones (maestro, peci, ...).
  • MFS is the equivalent of EFFS for ME11 (or SPS4), a read-write partition used by ME to store stuff between reboots. "Someone" (thank you if you're reading!) told me that the removal of MFS shouldn't be possible (ME should refuse to start if MFS is missing), however the MFS check is bugged, so it's just a lucky chance that we're able to remove MFS. Maybe this "bug" is not present in SPS, so we must keep MFS this time (but it's not a big deal anyways).
  • UTOK is just garbage, let's ignore it
  • OPR1: points to exactly the same data as FTUP.

So my hypothesis is (no way to verify it, so I may be completely wrong):

  • FTPR is the "disaster recovery" partition: this partition is read-only and never gets updated during the SPS update process, in case something goes wrong. The bup in this partition is bigger, as it also initializes stuff which is normally contained in other modules (which, in the "disaster" situation, are not available anymore). I suppose this partition contains the bare minimum needed to turn on the server and flash back a valid SPS image.
  • FTUP is the upgradable read-write partition: it contains a full set of modules and can be updated by the updater.
  • OPR1 is just a pointer to the boot partition to be used

This scheme allows a good redundancy, however the FPT is still a single point of failure, so they added a second one (FPTB) at a fixed address (0x1000, so that you have one FPT at 0x0-0x1000 and another one at 0x1000-0x2000).

Note that, at least in ME 8, there was probably a backup FPT in the ROM (as we were able to completely wipe it without any effect). According to Youness, the ROM size has been reduced in Skylake, so it makes sense that they've moved the backup FPT out of it to save some space.

The ROMB (ROM Bypass) partition is used only in pre-production images, so it makes sense that it's empty.


Now, let's move to the part "what can we do?". Which partitions have you removed? Which ones are still there?

@skochinsky

some additional comments:

OPR is a reference to Operational Region (term used in some Intel docs on ME). IIRC older SPS firmwares had OPR1 and OPR2 regions. The SPS firmware is distributed to OEMs as two binaries: spsRecovery.bin and spsOperational.bin. spsRecovery.bin contains an FPT while spsOperational.bin starts directly with a $CPD header without $FPT.

UTOK seems to be "Unlock Token", used to enable debug/diagnostic functionality on production firmware (see Inside Intel Management Engine for more info)
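The distinction between the two OEM binaries can be checked from the first few bytes. A tiny sketch: the `$CPD` check at offset 0 follows the description above, while looking for `$FPT` at offset 0x10 is an assumption borrowed from where the signature sits in the full image discussed earlier in this thread; `classify_sps_binary` is a hypothetical name:

```python
def classify_sps_binary(data: bytes) -> str:
    """Guess whether a blob looks like spsOperational.bin or spsRecovery.bin."""
    if data[:4] == b"$CPD":
        return "operational"  # starts directly with a $CPD header, no $FPT
    if data[0x10:0x14] == b"$FPT":
        return "recovery"     # assumption: $FPT signature at 0x10
    return "unknown"
```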

@lasalvavida
Author

lasalvavida commented Apr 30, 2018

Just wanted to say that I didn't forget about this; things have just been a little crazy on my end. I will try to make time to put together a pull request sometime this week.

edit: Unfortunately, this got away from me a little bit and I just haven't had the time. I will do my best to put something together before the end of July.

edit 2: No longer actively working on this, but hopefully my comments here help someone in the future.

@felixsinger

@lasalvavida
I am interested in this mainboard and I would like to know if Boot Guard is enabled. If not, I will buy one myself and port coreboot to it.

Could you please check this? Would be very appreciated :)

Just do the following steps:

  1. git clone https://review.coreboot.org/coreboot
  2. cd coreboot/util/intelmetool && make
  3. Enable the msr kernel module, as it is needed for reading the specific registers: sudo modprobe msr
  4. sudo ./intelmetool -b

Also, please attach a dump of lspci -nnk.

@lasalvavida
Author

Hi @felixsinger. Unfortunately, I no longer have access to this mainboard.

I can tell you that it does not have bootguard since the CPU is an Intel® Atom™ Processor C3338 which does not have bootguard.
