Issue: unable to read pci information from sysfs #91

Closed
altimore opened this issue Sep 23, 2022 · 10 comments

@altimore

Hello,

I'm using Manjaro Linux with an external GeForce card.
egpu-switcher worked before, but after upgrading the system it no longer does.

Here is the error message:

$ sudo egpu-switcher config
panic: unable to read pci information from sysfs

goroutine 1 [running]:
github.com/hertg/egpu-switcher/internal/pci.ReadGPUs()
	/home/runner/work/egpu-switcher/egpu-switcher/internal/pci/pci.go:96 +0x17e
github.com/hertg/egpu-switcher/cmd.glob..func1(0xac5be0?, {0x7d8df7?, 0x0?, 0x0?})
	/home/runner/work/egpu-switcher/egpu-switcher/cmd/config.go:26 +0x3b
github.com/spf13/cobra.(*Command).execute(0xac5be0, {0xafffe0, 0x0, 0x0})
	/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.5.0/command.go:872 +0x694
github.com/spf13/cobra.(*Command).ExecuteC(0xac56e0)
	/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.5.0/command.go:990 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
	/home/runner/go/pkg/mod/github.com/spf13/cobra@v1.5.0/command.go:918
github.com/hertg/egpu-switcher/cmd.Execute()
	/home/runner/work/egpu-switcher/egpu-switcher/cmd/root.go:79 +0x25
main.main()
	/home/runner/work/egpu-switcher/egpu-switcher/main.go:8 +0x17

It seems that the Go pci library cannot read the information it needs and ends the process.
I think I'm missing a library that the pci Go package needs to read the list of video cards. Maybe someone can tell me which one.

Extra info

  • Did you install egpu-switcher via ppa or via git + make
    I upgraded the system with yay -Syu, and the package I installed from the AUR is aur/egpu-switcher-bin 0.18.0-1 (+0 0.00).
  • What Linux distribution (+ version) are you using
$ uname -a
Linux alti-laptop 5.17.15-1-MANJARO #1 SMP PREEMPT Wed Jun 15 07:09:31 UTC 2022 x86_64 GNU/Linux
  • What brand / model is your laptop
$ sudo dmidecode | grep -A3 '^System Information' 
System Information
	Manufacturer: ASUSTeK COMPUTER INC.
	Product Name: ZenBook UX325EA_UX325EA
	Version: 1.0
  • What brand / model is your GPU (+ enclosure)
    AORUS RTX 2070 Gaming Box
  • What drivers (+ version) are you using
    extra/nvidia-dkms 515.65.01-3
    extra/xf86-video-intel 1:2.99.917+916+g31486f40-2
  • What Desktop-Environment do you use (+ Display-Manager)
    community/cinnamon-desktop 5.4.2-1
  • If you are not using a Desktop-Environment, what Window-Manager do you use?
    ...
@hertg
Owner

hertg commented Sep 23, 2022

Hey @altimore thanks for the report.
I'm sorry the most recent update caused issues for you.

Of course I forgot to print the actual error message at the exact line the script failed on your system 🤦‍♂️
I just pushed a commit that will print more information about the error, can you go ahead and switch to the egpu-switcher-git version and report back the output you get? Thanks
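
For readers wondering what "printing more information about the error" can look like in Go, here is a minimal sketch of wrapping a low-level error with context at each layer and printing the whole chain. The function names `scanDevice` and `readGPUs`, the device name, and the message texts are assumptions made for illustration; this is not the actual commit.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errParse is a hypothetical low-level error used only in this sketch.
var errParse = errors.New("unable to parse as pci address")

// scanDevice is a hypothetical stand-in that always fails, so that the
// wrapping behaviour is visible.
func scanDevice(name string) error {
	return fmt.Errorf("got error while scanning device '%s': %w", name, errParse)
}

// readGPUs adds its own context and returns the error instead of
// panicking with a generic message.
func readGPUs() error {
	if err := scanDevice("example-device"); err != nil {
		return fmt.Errorf("unable to read pci information from sysfs: %w", err)
	}
	return nil
}

func main() {
	if err := readGPUs(); err != nil {
		// The printed line contains every layer of context.
		fmt.Fprintf(os.Stderr, "[error] %v\n", err)
	}
}
```

Because each layer uses `%w`, the original cause stays reachable through `errors.Is` and `errors.As` while the printed message still names the failing step.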

@altimore
Author

Hello,

Thanks for taking the time to help.

Here is the new output; there is now a more detailed error on the first line.

sudo egpu-switcher config                                            
[error] unable to read pci information from sysfs: got error while scanning device '10000:e0:1d.0': unable to parse '10000:e0:1d.0' as pci address
panic: unable to read pci information from sysfs

goroutine 1 [running]:
github.com/hertg/egpu-switcher/internal/pci.ReadGPUs()
	/home/alti/egpu-switcher-git/src/egpu-switcher-git/internal/pci/pci.go:98 +0x1be
github.com/hertg/egpu-switcher/cmd.glob..func1(0xac2da0?, {0x7d8df7?, 0x0?, 0x0?})
	/home/alti/egpu-switcher-git/src/egpu-switcher-git/cmd/config.go:26 +0x3b
github.com/spf13/cobra.(*Command).execute(0xac2da0, {0xafd1a0, 0x0, 0x0})
	/home/alti/go/pkg/mod/github.com/spf13/cobra@v1.5.0/command.go:872 +0x694
github.com/spf13/cobra.(*Command).ExecuteC(0xac28a0)
	/home/alti/go/pkg/mod/github.com/spf13/cobra@v1.5.0/command.go:990 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
	/home/alti/go/pkg/mod/github.com/spf13/cobra@v1.5.0/command.go:918
github.com/hertg/egpu-switcher/cmd.Execute()
	/home/alti/egpu-switcher-git/src/egpu-switcher-git/cmd/root.go:79 +0x25
main.main()
	/home/alti/egpu-switcher-git/src/egpu-switcher-git/main.go:8 +0x17

@altimore
Author

For reference, here is the output of lspci:

$ lspci
0000:00:00.0 Host bridge: Intel Corporation 11th Gen Core Processor Host Bridge/DRAM Registers (rev 01)
0000:00:02.0 VGA compatible controller: Intel Corporation TigerLake-LP GT2 [Iris Xe Graphics] (rev 01)
0000:00:04.0 Signal processing controller: Intel Corporation TigerLake-LP Dynamic Tuning Processor Participant (rev 01)
0000:00:07.0 PCI bridge: Intel Corporation Tiger Lake-LP Thunderbolt 4 PCI Express Root Port #0 (rev 01)
0000:00:07.2 PCI bridge: Intel Corporation Tiger Lake-LP Thunderbolt 4 PCI Express Root Port #2 (rev 01)
0000:00:08.0 System peripheral: Intel Corporation GNA Scoring Accelerator module (rev 01)
0000:00:0d.0 USB controller: Intel Corporation Tiger Lake-LP Thunderbolt 4 USB Controller (rev 01)
0000:00:0d.2 USB controller: Intel Corporation Tiger Lake-LP Thunderbolt 4 NHI #0 (rev 01)
0000:00:0d.3 USB controller: Intel Corporation Tiger Lake-LP Thunderbolt 4 NHI #1 (rev 01)
0000:00:0e.0 RAID bus controller: Intel Corporation Volume Management Device NVMe RAID Controller
0000:00:14.0 USB controller: Intel Corporation Tiger Lake-LP USB 3.2 Gen 2x1 xHCI Host Controller (rev 20)
0000:00:14.2 RAM memory: Intel Corporation Tiger Lake-LP Shared SRAM (rev 20)
0000:00:14.3 Network controller: Intel Corporation Wi-Fi 6 AX201 (rev 20)
0000:00:15.0 Serial bus controller: Intel Corporation Tiger Lake-LP Serial IO I2C Controller #0 (rev 20)
0000:00:15.1 Serial bus controller: Intel Corporation Tiger Lake-LP Serial IO I2C Controller #1 (rev 20)
0000:00:16.0 Communication controller: Intel Corporation Tiger Lake-LP Management Engine Interface (rev 20)
0000:00:1c.0 PCI bridge: Intel Corporation Device a0bc (rev 20)
0000:00:1d.0 System peripheral: Intel Corporation RST VMD Managed Controller
0000:00:1f.0 ISA bridge: Intel Corporation Tiger Lake-LP LPC Controller (rev 20)
0000:00:1f.3 Multimedia audio controller: Intel Corporation Tiger Lake-LP Smart Sound Technology Audio Controller (rev 20)
0000:00:1f.4 SMBus: Intel Corporation Tiger Lake-LP SMBus Controller (rev 20)
0000:00:1f.5 Serial bus controller: Intel Corporation Tiger Lake-LP SPI Controller (rev 20)
0000:57:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS522A PCI Express Card Reader (rev 01)
10000:e0:1d.0 PCI bridge: Intel Corporation Tiger Lake-LP PCI Express Root Port #9 (rev 20)
10000:e1:00.0 Non-Volatile memory controller: Intel Corporation SSD 660P Series (rev 03)

The eGPU was not connected when I ran this.
The GPU is the second entry, and the parsing error is on the second-to-last one.

@hertg
Owner

hertg commented Sep 24, 2022

Thanks for the feedback. Now I understand why it's failing.
For some reason, the PCI domain on some of your devices exceeds 16 bits. I was expecting the domain to always be a 16-bit number, so the largest compatible value would be 0xffff, but yours is 0x10000. Frankly, I didn't know that such PCI domain numbers existed, so I'll dig into that. Are you running egpu-switcher in a virtual machine?

I found some related bug reports in other software:
https://bugs.launchpad.net/nova/+bug/1897528
https://lore.kernel.org/all/20170622094107.7e80bb0f@xeon-e3/T/
https://cgit.freedesktop.org/xorg/lib/libpciaccess/commit/?id=a167bd6

I am also tracking the issue in hertg/gopci#1
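
To illustrate the failure described above, here is a minimal sketch of a PCI address parser that accepts domains wider than 16 bits. The `Address` type and `ParseAddress` function are hypothetical names chosen for this example and are not the gopci API; the sketch assumes the `domain:bus:device.function` hexadecimal form that lspci prints.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Address is a hypothetical type for this sketch, not the gopci API.
type Address struct {
	Domain   uint32 // wider than 16 bits so that 0x10000 fits
	Bus      uint8  // 8 bits
	Device   uint8  // 5 bits, 0x00-0x1f
	Function uint8  // 3 bits, 0-7
}

// ParseAddress parses "domain:bus:device.function", all fields in hex.
func ParseAddress(s string) (Address, error) {
	fail := fmt.Errorf("unable to parse '%s' as pci address", s)
	parts := strings.Split(s, ":")
	if len(parts) != 3 {
		return Address{}, fail
	}
	dev, fn, ok := strings.Cut(parts[2], ".")
	if !ok {
		return Address{}, fail
	}
	// Parsing the domain with bitSize 16 would reject "10000".
	domain, err := strconv.ParseUint(parts[0], 16, 32)
	if err != nil {
		return Address{}, fail
	}
	bus, err := strconv.ParseUint(parts[1], 16, 8)
	if err != nil {
		return Address{}, fail
	}
	device, err := strconv.ParseUint(dev, 16, 8)
	if err != nil || device > 0x1f {
		return Address{}, fail
	}
	function, err := strconv.ParseUint(fn, 16, 8)
	if err != nil || function > 0x7 {
		return Address{}, fail
	}
	return Address{uint32(domain), uint8(bus), uint8(device), uint8(function)}, nil
}

// String formats the address the way lspci does; %04x is a minimum
// width, so a five-digit domain is printed in full.
func (a Address) String() string {
	return fmt.Sprintf("%04x:%02x:%02x.%x", a.Domain, a.Bus, a.Device, a.Function)
}

func main() {
	for _, s := range []string{"0000:00:02.0", "10000:e0:1d.0"} {
		a, err := ParseAddress(s)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Println(a)
	}
}
```

Choosing `uint32` for the domain is a judgment call in this sketch: it is wide enough for the 0x10000 value seen here, while the bus, device, and function fields keep their usual widths.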

hertg added the bug label on Sep 24, 2022
@altimore
Author

No, I run Manjaro as the base system on my laptop.

After some searching on Google, I found multiple occurrences of this 0x10000 PCI domain.
Most of them are extra PCI Express root ports. I don't know whether these are virtual devices.

According to Wikipedia you seem to be right that the first field is 16 bits, but that might be specific to some systems.
https://en.wikipedia.org/wiki/PCI_configuration_space

@hertg
Owner

hertg commented Sep 24, 2022

Well, I have to admit that I used a fairly old reference during development, so maybe the info there is outdated.
https://www.oreilly.com/library/view/linux-device-drivers/0596005903/

All three lists of devices are sorted in the same order, since lspci uses the /proc files as its source of information. Taking the VGA video controller as an example, 0x00a0 means 0000:00:14.0 when split into domain (16 bits), bus (8 bits), device (5 bits) and function (3 bits).
Chapter 12. PCI Drivers

Out of curiosity, what do the numbers look like when you run:

cat /proc/bus/pci/devices

The numbers in the first column are all 16 bits (4 hex digits) on my computer. My guess is you have some numbers there that are larger?
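
The split described in the book quote above can be checked with a few lines of Go. This is a minimal sketch: `decodeBusDevFn` is a hypothetical helper name, and it assumes the first column of /proc/bus/pci/devices holds only the bus byte followed by the combined device/function byte, with no domain.

```go
package main

import "fmt"

// decodeBusDevFn splits a 16-bit value into bus (upper 8 bits),
// device (next 5 bits) and function (lowest 3 bits).
func decodeBusDevFn(v uint16) (bus, device, function uint8) {
	bus = uint8(v >> 8)
	device = uint8(v>>3) & 0x1f
	function = uint8(v) & 0x07
	return
}

func main() {
	bus, dev, fn := decodeBusDevFn(0x00a0)
	fmt.Printf("%02x:%02x.%x\n", bus, dev, fn) // prints 00:14.0
}
```

Under that assumption, this column cannot reveal a domain above 0xffff, since the domain is not part of the value at all.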

@hertg
Owner

hertg commented Sep 24, 2022

@altimore I just pushed another commit that might resolve the issue, can you try it out and report back? Thanks

@altimore
Author

Hello hertg, the new version solves the error.

$ sudo egpu-switcher config     

Found 1 possible GPU(s)...

1: 	Intel Corporation TigerLake-LP GT2 [Iris Xe Graphics] (i915)

Which one is your external GPU? [1-1]: ^C

I'll be at my place with the eGPU tomorrow and will try it then. I'm pretty sure it will work, since it worked before.

I don't know if you still need this, but here is the output of the command you asked for:

$ cat /proc/bus/pci/devices
0000	80869a14	0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	
0010	80869a49	b0	      603e000004	               0	      400000000c	               0	            3001	               0	           c0002	         1000000	               0	        10000000	               0	              40	               0	           20000	i915
0020	80869a03	10	      603f280004	               0	               0	               0	               0	               0	               0	           20000	               0	               0	               0	               0	               0	               0	proc_thermal
0038	80869a23	7c	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	pcieport
003a	80869a27	7d	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	pcieport
0040	80869a11	ff	      603f2d3004	               0	               0	               0	               0	               0	               0	            1000	               0	               0	               0	               0	               0	               0	
0068	80869a13	93	      603f2b0004	               0	               0	               0	               0	               0	               0	           10000	               0	               0	               0	               0	               0	               0	xhci_hcd
006a	80869a1b	10	      603f240004	               0	      603f2d2004	               0	               0	               0	               0	           40000	               0	            1000	               0	               0	               0	               0	thunderbolt
006b	80869a1d	10	      603f200004	               0	      603f2d1004	               0	               0	               0	               0	           40000	               0	            1000	               0	               0	               0	               0	thunderbolt
0070	80869a0b	0	      603c000004	               0	        50000000	               0	      603f100004	               0	               0	         2000000	               0	         2000000	               0	          100000	               0	               0	vmd
00a0	8086a0ed	96	      603f2a0004	               0	               0	               0	               0	               0	               0	           10000	               0	               0	               0	               0	               0	               0	xhci_hcd
00a2	8086a0ef	0	      603f2c8004	               0	      603f2d0004	               0	               0	               0	               0	            4000	               0	            1000	               0	               0	               0	               0	
00a3	8086a0f0	13	      603f2c4004	               0	               0	               0	               0	               0	               0	            4000	               0	               0	               0	               0	               0	               0	iwlwifi
00a8	8086a0e8	1b	      4017000004	               0	               0	               0	               0	               0	               0	            1000	               0	               0	               0	               0	               0	               0	intel-lpss
00a9	8086a0e9	28	      4017001004	               0	               0	               0	               0	               0	               0	            1000	               0	               0	               0	               0	               0	               0	intel-lpss
00b0	8086a0e0	95	      603f2cd004	               0	               0	               0	               0	               0	               0	            1000	               0	               0	               0	               0	               0	               0	mei_me
00e0	8086a0bc	7e	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	pcieport
00e8	808609ab	0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	
00f8	8086a082	0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	
00fb	8086a0c8	cb	      603f2c0004	               0	               0	               0	      603f000004	               0	               0	            4000	               0	               0	               0	          100000	               0	               0	sof-audio-pci-intel-tgl
00fc	8086a0a3	10	      603f2cc004	               0	               0	               0	            efa1	               0	               0	             100	               0	               0	               0	              20	               0	               0	i801_smbus
00fd	8086a0a4	0	        4f800000	               0	               0	               0	               0	               0	               0	            1000	               0	               0	               0	               0	               0	               0	intel-spi
5700	10ec522a	8c	        6c300000	               0	               0	               0	               0	               0	               0	            1000	               0	               0	               0	               0	               0	               0	rtsx_pci
e0e8	8086a0b0	94	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	               0	pcieport
e100	8086f1a8	80000000	        50000004	               0	               0	               0	               0	               0	               0	            4000	               0	               0	               0	               0	               0	               0	nvme

Thanks again for your help. I'm not confident I could have solved it alone.

Have an excellent weekend.

@hertg
Owner

hertg commented Sep 24, 2022

Great to hear. I went ahead and created release 0.18.1, so you can go back to the egpu-switcher-bin package if you prefer.

@altimore
Author

Just arrived and connected the egpu, everything works as expected.
