Initial setup of M.2 Accelerator with Dual Edge TPU fails #491
Comments
Can you please share the output of the snippet below?
|
When I run that command my output looks the same, except I am on Python 3.9.7 and GCC 11.2.0.
|
@mogorman Which machine/hardware are you working with? |
@manoj7410 Currently trying this on a Librem Mini v1 with a Coral M.2 Accelerator with Dual Edge TPU. |
@mogorman Please disable Secure Boot on your machine and then try to run the demo again. |
It doesn't have Secure Boot enabled. It's using stock SeaBIOS. |
Can you please paste the output of the command below:
|
|
can you please check the permissions of |
It's already 660.
|
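For anyone following along, the permission check discussed above can be sketched in Python. This is only a minimal sketch: it assumes the device node is /dev/apex_0 (the name that appears elsewhere in this thread), and the helper name mode_octal is made up for illustration.

```python
import os
import stat


def mode_octal(path):
    """Return the permission bits of `path` as an octal string, e.g. '660'."""
    return format(stat.S_IMODE(os.stat(path).st_mode), "o")


if __name__ == "__main__":
    # Assumption: the Edge TPU node is /dev/apex_0; adjust if yours differs.
    device = "/dev/apex_0"
    if os.path.exists(device):
        print(device, mode_octal(device))
    else:
        print(device, "not found")
```

A result of 660 means read/write for the owner and group only, so the user running the demo would also need to be the owner or in that group.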
Can you please try the demo with the lines below added here and share the logs?
|
I didn't go further as I couldn't talk to it. |
I am not sure how to fix the device open error. It seems to be an issue with MSI-X support (lspci -vvv|grep -i MSI-X). Your host machine might not support the M.2 Dual Edge TPU.
|
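The MSI-X check suggested above can also be scripted. The following is a rough sketch, assuming lspci prints capability lines in its usual "MSI-X: Enable+ Count=N Masked-" form; the function name parse_msix is hypothetical, and the script reports every MSI-X capable device on the bus, not just the Coral.

```python
import re
import subprocess

# Matches lspci capability lines of the usual form
# "MSI-X: Enable+ Count=128 Masked-" (format assumed, not taken from this thread).
MSIX_RE = re.compile(r"MSI-X:\s*Enable([+-])\s+Count=(\d+)")


def parse_msix(lspci_text):
    """Return a list of (enabled, vector_count), one per MSI-X capability line."""
    return [(sign == "+", int(count)) for sign, count in MSIX_RE.findall(lspci_text)]


if __name__ == "__main__":
    # Full capability details usually require running as root.
    try:
        out = subprocess.run(
            ["lspci", "-vvv"], capture_output=True, text=True, check=False
        ).stdout
    except FileNotFoundError:
        out = ""  # lspci is not installed on this machine
    for enabled, count in parse_msix(out):
        print("MSI-X enabled" if enabled else "MSI-X disabled", "vectors:", count)
```

An empty result only means no MSI-X lines were found in the output; it does not by itself show why the driver fails to open the device.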
See above where I posted the output of lspci for info. It seems it should support it? |
Tried on 2 other machines; it only worked in one of them. Pretty frustrating. Happy to test anything on my other machine. |
@mogorman Does the machine on which the PCIe device works have the same configuration as the machine on which it does not? |
They are different types of machines. Currently working on a Seeed Odyssey board via an NVMe to mini PCIe adapter. The others were just straight into the mini PCIe slot. |
Do you see any useful difference in the output of lspci -vvv between the two machines? |
Any updates on this? I hit basically the same issue when running the M.2 B+M key TPU in either M.2 slot of an Intel 7700K on a STRIX Z270i motherboard with Ubuntu 20. I'd like to add that sometimes when I reboot the machine, the Coral Edge TPU is gone entirely and not visible until the next reboot. @hjonnala Please let me know if anything I can share would be useful. |
@tedzhouhk can you please share the following details:
|
Sure, here's the output (it's the same as yours except for the Python version).
|
@tedzhouhk please add these two lines to the demo and share the output in txt file.
|
Here's the result. There's around 10 seconds before the "failed to open device" error showed up after the "opening /dev/apex_0. read_only=0" message.
"dmesg | grep apex" after the execution:
|
Can you try the "Workaround to disable Apex and Gasket" section from this page: https://coral.ai/docs/m2/get-started/#troubleshooting-on-linux |
Do you mean disable apex and gasket when installing the driver? There is no apex or gasket before I install the driver. |
@tedzhouhk Yes, please try the Workaround to disable Apex and Gasket section. If still not working, if possible please try with Linux container or Ubuntu 18.04. |
@hjonnala Happy new year and sorry for the late reply, I finally have time to try this out on Ubuntu 18.04. Unfortunately, I got the same error. Any chance that it's a hardware problem? |
Same error here still, even though I disabled power saving via the kernel parameter |
What helped me solve the issue was this reddit thread |
fwiw - I had the |
Do you mean to add it to the Proxmox machine itself? |
I don't do Proxmox / virtualization. I have it on bare metal. |
I actually got it working by using this recommended PCIe to M.2 adapter. My motherboard is an Asus Z270i with an i7-7700K. It has two M.2 slots. My OS is installed on one M.2 drive, so previously I swapped the locations of the Edge TPU and the SSD, but neither worked. Then I tried the PCIe to M.2 adapter in the only PCIe 3.0 x16 slot, set to x4 mode. Everything seems to work well. |
This worked for me in Ubuntu 22.04.3 LTS |
As per my previous comment, it worked initially for about 24 hours; after that the issue appeared again. I have to remove power from the machine and start it again so that the TPU is detected. |
Since my last comment, I've downgraded from Ubuntu 22.04 to Debian 10 (4.19.0-25-amd64). It has been working without any issue so far: no crashes and no instances of the TPU going missing. |
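Since several reports here involve the TPU vanishing after a day or so, a small check for the device nodes can help pin down when it happens. This is a sketch only, assuming the driver exposes nodes named apex_N under /dev (the thread mentions /dev/apex_0); list_apex_nodes is an illustrative name, not part of any Coral tooling.

```python
import glob
import os


def list_apex_nodes(dev_dir="/dev"):
    """Return the sorted apex_* node names found in dev_dir."""
    pattern = os.path.join(dev_dir, "apex_*")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


if __name__ == "__main__":
    nodes = list_apex_nodes()
    # An empty list suggests the driver did not create any device node.
    print(nodes if nodes else "no apex_* nodes found")
```

Run periodically (for example from cron) with the output appended to a log, this would show roughly when the node disappears.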
Extremely odd. We have two servers, identical specs except for the CPU (3900X on one, 5900X on the other). Running latest proxmox with a VM containing Coral's drivers.
|
Same issue here:
Every 24h the tpu:
|
I tried to get the dual Edge TPU card working in my PC and can't seem to get it to do any work. I am running a fresh Ubuntu 21.04 and followed the instructions from https://coral.ai/docs/m2/get-started/ . I tried the one troubleshooting suggestion,
pcie_aspm=off
and it seemed to have no effect.
Also, shouldn't I see two boards? I am only seeing the one apex_0 input.
I see my M.2 is only single-laned, my bad.
Any advice or things to try would be very appreciated.
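A kernel parameter such as pcie_aspm=off only takes effect once it is in the boot configuration and the machine has been rebooted, so it can be worth confirming that the running kernel actually received it. A minimal sketch, reading /proc/cmdline on Linux; the helper name has_kernel_param is made up for illustration.

```python
def has_kernel_param(cmdline, name, value=None):
    """True if `name` (or `name=value` when value is given) is a token in cmdline."""
    for token in cmdline.split():
        key, _, val = token.partition("=")
        if key == name and (value is None or val == value):
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            cmdline = f.read()
    except OSError:
        cmdline = ""  # not on Linux, or /proc is unavailable
    print("pcie_aspm=off active:", has_kernel_param(cmdline, "pcie_aspm", "off"))
```

If this prints False after a reboot, the parameter never reached the kernel, which would explain it seeming to have no effect.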
Relevant dmesg lines
Not all boots have the Couldn't initialize interrupts error.
lspci -vvv|grep -i MSI-X
lspci -vv
python3 examples/classify_image.py --model test_data/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite --labels test_data/inat_bird_labels.txt --input test_data/parrot.jpg
dmesg after run
good lspci -vv