ledmon[2374]: Unsupported AMD interface #65
Comments
Attached is an strace capture: |
Hello, it looks like "PowerEdge R7425" is not expected by the AMD code that checks the platform name (_get_amd_led_interface). @nfont, could you please take a look at this issue? I am afraid that with the current implementation this problem will return with every new OEM platform. Regards, |
It appears this regression was introduced when I added support for IPMI LED control. Before that update, the check for EM enablement on AMD was the _amd_sgpio_em_enabled() routine, which validates enclosure management capabilities on AMD systems. With the addition of IPMI support, a new amd_em_enabled() routine was added that looks up the AMD platform, uses that to determine whether SGPIO or IPMI is to be used, and then calls the appropriate routine to verify that SGPIO or IPMI is enabled. It seems this approach was not entirely correct, as it breaks systems that do support SGPIO but whose platforms are not specifically checked for. I think the fix is to check the platform name to see if it can use IPMI and, if so, check whether IPMI EM is enabled. The platform check is still needed, since we need to know the channel and slave address for the platform to use IPMI. If the platform name is not listed as supporting IPMI, we should just call _amd_sgpio_em_enabled() to see if it's possible to use SGPIO. I can put together a patch for this and get it submitted. |
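The selection order proposed in the comment above can be sketched roughly as follows. This is an illustration only, not the code from the pull request: the enum, the placeholder platform names, and the helpers platform_uses_ipmi() and select_em_interface() are all hypothetical, and the two integer flags stand in for whatever the real IPMI and SGPIO probes (such as _amd_sgpio_em_enabled()) would report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interface identifiers; not ledmon's actual definitions. */
enum em_interface { EM_NONE, EM_SGPIO, EM_IPMI };

/* Hypothetical allow-list of platforms that drive LEDs over IPMI.
 * The names are placeholders, not real product names. */
static const char *const ipmi_platforms[] = {
    "EXAMPLE-IPMI-BOARD-A",
    "EXAMPLE-IPMI-BOARD-B",
};

/* Return 1 if the platform name is on the IPMI allow-list, else 0. */
static int platform_uses_ipmi(const char *name)
{
    size_t i;

    if (name == NULL)
        return 0;
    for (i = 0; i < sizeof(ipmi_platforms) / sizeof(ipmi_platforms[0]); i++) {
        if (strcmp(name, ipmi_platforms[i]) == 0)
            return 1;
    }
    return 0;
}

/* Pick an interface: listed platforms must pass the IPMI check;
 * every other platform falls back to the SGPIO check instead of
 * being rejected as unsupported. */
static enum em_interface select_em_interface(const char *platform,
                                             int ipmi_enabled,
                                             int sgpio_enabled)
{
    if (platform_uses_ipmi(platform))
        return ipmi_enabled ? EM_IPMI : EM_NONE;
    return sgpio_enabled ? EM_SGPIO : EM_NONE;
}
```

With this ordering, an unlisted platform such as "PowerEdge R7425" is no longer rejected on its name alone; it is accepted whenever the SGPIO check succeeds.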
Thank you. When is this likely to get merged and then become available via Ubuntu update repos? |
I have opened a pull request with the fix for this issue. |
Has the pull request been approved, has progress been made? |
Hi, Thanks, |
I am experiencing some hardware-related issues, so as soon as I get my system back online I will give it a try, thanks. Does it need to be compiled, or are pre-compiled binaries available? |
It needs to be compiled. |
I have my system back online now. I am trying to compile but can't get past ./configure: checking for a BSD-compatible install... /usr/bin/install -c Suggestions? |
Hi, The following packages are required for building and compiling:
So, I think you will need to install the last 3. Regards, |
Thanks, I was looking for that list. Where can I find it? |
This table will be added to README soon. |
Thanks, that is exactly the information I was looking for. Where was it hidden previously? |
Ran make install ... still broken sudo ledctl From /var/log/syslog External storage not currently connected. |
Please ensure that you don't have two copies of ledmon and ledctl on the system. The simplest way is to uninstall the ledmon package that came from the repository. Sometimes "make install" puts binaries in a different location than the package does. You can also run ledctl directly without installation. Look in the src folder after "make"; your binaries will be there. Just run ./src/ledmon or ./src/ledctl Mariusz |
I uninstalled the packages but am still seeing the same or similar errors:
sudo ./ledctl
/dev/shm/ledmon.conf: does not exist, using global config file
ledctl: SGPIO EM not supported for /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2
ledctl: controller discovery: /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2 - enclosure management not supported.
ledctl: missing operand(s)... run ledctl --help for details.
ledctl: main(): _ibpi_parse() failed (status=STATUS_IBPI_DETERMINE_ERROR).
May 20 23:29:21 host ledmon[20163]: monitor service has been started...
May 20 23:29:21 host ledmon[20163]: SGPIO EM not supported for /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2
May 20 23:29:21 host ledmon[20163]: controller discovery: /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2 - enclosure management not supported.
May 20 23:29:26 host ledmon[20163]: SGPIO EM not supported for /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2
May 20 23:29:26 host ledmon[20163]: controller discovery: /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2 - enclosure management not supported.
May 20 23:29:31 host ledmon[20163]: SGPIO EM not supported for /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2
May 20 23:29:31 host ledmon[20163]: controller discovery: /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2 - enclosure management not supported.
May 20 23:29:36 host ledmon[20163]: SGPIO EM not supported for /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2
May 20 23:29:36 host ledmon[20163]: controller discovery: /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2 - enclosure management not supported.
May 20 23:29:42 host ledmon[20163]: SGPIO EM not supported for /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2
May 20 23:29:42 host ledmon[20163]: controller discovery: /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2 - enclosure management not supported.
May 20 23:29:42 host ledmon[20174]: daemon is running...
May 20 23:29:47 host ledmon[20163]: SGPIO EM not supported for /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2
May 20 23:29:47 host ledmon[20163]: controller discovery: /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2 - enclosure management not supported.
May 20 23:29:52 host ledmon[20163]: SGPIO EM not supported for /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2
May 20 23:29:52 host ledmon[20163]: controller discovery: /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2 - enclosure management not supported.
May 20 23:29:57 host ledmon[20163]: SGPIO EM not supported for /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2
May 20 23:29:57 host ledmon[20163]: controller discovery: /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2 - enclosure management not supported.
May 20 23:30:02 host ledmon[20163]: SGPIO EM not supported for /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2
May 20 23:30:02 host ledmon[20163]: controller discovery: /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2 - enclosure management not supported.
May 20 23:30:07 host ledmon[20163]: SGPIO EM not supported for /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2
May 20 23:30:07 host ledmon[20163]: controller discovery: /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2 - enclosure management not supported.
May 20 23:30:12 host ledmon[20163]: SGPIO EM not supported for /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2
May 20 23:30:12 host ledmon[20163]: controller discovery: /sys/devices/pci0000:00/0000:00:08.1/0000:05:00.2 - enclosure management not supported.
|
Looks like the issue is still there. @nfont, could you look into it again? |
@mtkaczyk, I think there is a bug in the option parsing for ledctl. The recent update to allow non-root users (ff49cce) adds a second call to getopt_long() without resetting the getopt internal variables. This results in not parsing any options in _cmdline_parse() in ledctl.c. The following update corrects this.
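The patch itself is not reproduced in this thread. As a generic illustration of the getopt behaviour described above (not the actual ledctl fix), the sketch below shows why a second getopt_long() pass over the same argv finds nothing unless optind is reset first. The option table and the count_options() helper are hypothetical; on glibc, setting optind to 0 instead of 1 forces a full reinitialization, and on BSD systems optreset serves a similar purpose.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical option table for illustration; not ledctl's real one. */
static const struct option demo_longopts[] = {
    { "all",  no_argument, NULL, 'a' },
    { "list", no_argument, NULL, 'L' },
    { NULL,   0,           NULL, 0   },
};

/* Run one complete getopt_long() pass and count recognised options.
 * When reset_first is non-zero, scanning restarts at argv[1];
 * otherwise the pass continues from wherever the previous one ended. */
static int count_options(int argc, char *const argv[], int reset_first)
{
    int opt;
    int count = 0;

    if (reset_first)
        optind = 1;   /* rewind getopt's scan position */
    opterr = 0;       /* keep getopt from printing its own diagnostics */

    while ((opt = getopt_long(argc, argv, "aL", demo_longopts, NULL)) != -1) {
        if (opt != '?')
            count++;
    }
    return count;
}
```

Calling count_options() twice on the same arguments without the reset makes the second call return 0, which matches the symptom described for _cmdline_parse(): every option appears to be missing.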
@minorsatellite, A couple of questions, since I cannot re-create the failure you're seeing. Can you confirm that this same system configuration was working with the previous version of ledmon? This is reporting that SGPIO EM is not enabled. On my systems I have to enable enclosure management in the BIOS; can you verify that you have enclosure management enabled? Can you run the following command and provide the output? This will help me track down what may be happening. You will likely need the patch above to correct option parsing in ledctl. #> sudo ./ledctl --all -L |
This is a new install on new hardware, so I cannot confirm. This is my first sortie with Dell AMD architecture, so it's all quite new to me. In the Dell BIOS itself I see no option to enable EM; however, there is such an option under Device Settings for the PERC card. The Dell-branded, certified HBA card (which I care most about, as it will manage my external storage, which is not currently connected) unfortunately has no such option. Requested output from command below: |
This output indicates that enclosure management is not supported for this device. This could be because it is not enabled in BIOS on your system. What the enablement check is looking at is the |
Any idea where in the Dell BIOS I would enable it? Is there a global setting outside of the adapter cards themselves? |
No, I don't know where it is in Dell BIOS. I would hope that Dell has published a manual or spec that includes how to enable SGPIO enclosure management for blinking LEDs. |
Hi @nfont, Thanks, |
I am checking with Dell on whether or not the SFF-8485 standard is supported on their Dell/LSI-branded card. The card appears to be a rebranded 9300-8e flashed with Dell firmware. According to the Broadcom Product Brief page linked below, SFF-8485 is not supported on the 9300-8e. Does that sound right? How is that even possible? |
I am trying to confirm whether this one particular HBA is the culprit, or if it is some other PCI device. I am a little confused because my original suspicion was the Dell/LSI controller, but looking at the output from
Below is the location provided respectively, by the
Grepping through the output of
The question is, how likely is it that an onboard SATA controller would cause |
Any further input on this issue? |
Hi, I'm not able to help you with AMD hardware issues due to lack of hardware. I believe that @nfont is the AMD interface maintainer. |
I'll post the response above to the PR you referenced. Thanks. |
#66 has been verified. |
New install of Ubuntu 20.04 LTS, clean upgrade from Ubuntu 18.04 LTS where ledmon was previously working.
Hardware: Dell R7425 (AMD Epyc Architecture)
In syslog I am seeing the following entries:
Apr 25 00:51:36 host ledmon[2374]: ledmon[2374]: Unsupported AMD interface
Apr 25 00:51:36 host ledmon[2374]: Unsupported AMD interface
Apr 25 00:51:46 host ledmon[2374]: ledmon[2374]: Unsupported AMD interface
Apr 25 00:51:46 host ledmon[2374]: Unsupported AMD interface
Apr 25 00:51:56 host ledmon[2374]: ledmon[2374]: Unsupported AMD interface
Apr 25 00:51:56 host ledmon[2374]: Unsupported AMD interface
admin@host:~$ sudo service ledmon status
● ledmon.service - Enclosure LED Utilities
     Loaded: loaded (/lib/systemd/system/ledmon.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2020-04-25 00:42:23 UTC; 17min ago
   Main PID: 2374 (ledmon)
      Tasks: 1 (limit: 115335)
     Memory: 2.0M
     CGroup: /system.slice/ledmon.service
             └─2374 /usr/sbin/ledmon --foreground
Apr 25 00:59:37 host ledmon[2374]: ledmon[2374]: Unsupported AMD interface
Apr 25 00:59:37 host ledmon[2374]: Unsupported AMD interface
Apr 25 00:59:47 host ledmon[2374]: ledmon[2374]: Unsupported AMD interface
Apr 25 00:59:47 host ledmon[2374]: Unsupported AMD interface
Apr 25 00:59:57 host ledmon[2374]: ledmon[2374]: Unsupported AMD interface
Apr 25 00:59:57 host ledmon[2374]: Unsupported AMD interface
Apr 25 01:00:07 host ledmon[2374]: ledmon[2374]: Unsupported AMD interface
Apr 25 01:00:07 host ledmon[2374]: Unsupported AMD interface
Apr 25 01:00:17 host ledmon[2374]: ledmon[2374]: Unsupported AMD interface
Apr 25 01:00:17 host ledmon[2374]: Unsupported AMD interface