
GPU not found for Radeon HD 3450 #6945

Closed
kohrar opened this issue Aug 29, 2013 · 13 comments
Labels: Bug (broken, incorrect, or confusing behavior)

@kohrar commented Aug 29, 2013

salt-call grains.items is returning no GPUs on a system with a Radeon HD 3450 installed. I am currently on salt 0.16.3 running Scientific Linux 6.4 x86_64.

# lspci -vmm
Slot:   01:00.0
Class:  VGA compatible controller
Vendor: Advanced Micro Devices [AMD] nee ATI
Device: RV620 LE [Radeon HD 3450]
SVendor:        Dell
SDevice:        OptiPlex 980
# salt-call grains.items
...
gpus:                               <-- I get an empty list.
...
num_gpus:
    0

Expected output should be:

gpus:
    ----------
    - model:
        RV620 LE [Radeon HD 3450]
    - vendor:
        ati
...
num_gpus:
    1

I took a quick glance at the code, but didn't see anything obvious that would prevent it from parsing the output properly...

@terminalmage (Contributor)

Can you post any lines from the output of lspci that contain information about your video card?

ghost assigned terminalmage Aug 29, 2013
@kohrar (Author) commented Aug 30, 2013

I had included the output of lspci for the video card in my original post, using the command that the salt grains use for parsing (lspci -vmm).

lspci|grep -i vga shows:

01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI RV620 LE [Radeon HD 3450]

@terminalmage (Contributor)

D'oh! I overlooked that, sorry. Thanks, I'll take a look at this.

@kohrar (Author) commented Aug 30, 2013

Excellent. Thanks :D Let me know if you need anything else.

@terminalmage (Contributor)

@kohrar I was not able to reproduce this. To test, I copied the block of code that generates the gpu grains into its own script, and replaced the lspci_out variable with the shell output from lspci -vmm on my own machine. It worked as expected. I then replaced the entry for my video adapter with the one for yours, and again it worked as expected.

The odd thing is that it worked when I tested against 0.16.3 as well as against git develop.
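
For reference, here is a minimal sketch of that kind of standalone test. gpu_grains() below is a simplified stand-in for the real logic in _linux_gpu_data() in salt/grains/core.py, not a copy of it, with kohrar's adapter entry pasted in as the input:

def gpu_grains(lspci_out):
    # Parse `lspci -vmm` output: device records are blank-line-delimited
    # blocks of "Key: value" pairs.
    known_vendors = ['nvidia', 'amd', 'ati', 'intel']
    gpus = []
    record = {}
    for line in lspci_out.splitlines():
        if line.strip():
            key, val = line.split(':', 1)
            record[key.strip().lower()] = val.strip()
        elif record:
            if record.get('class', '').startswith('VGA'):
                vendor = 'unknown'
                for known in known_vendors:
                    if known in record.get('vendor', '').lower():
                        vendor = known
                gpus.append({'model': record.get('device'),
                             'vendor': vendor})
            record = {}
    return {'gpus': gpus, 'num_gpus': len(gpus)}

# Captured output pasted in place of shelling out to `lspci -vmm`.
# Note the trailing blank line; whether one is present turns out to
# matter (see zmarano's comment further down).
lspci_out = """Slot: 01:00.0
Class: VGA compatible controller
Vendor: Advanced Micro Devices [AMD] nee ATI
Device: RV620 LE [Radeon HD 3450]
SVendor: Dell
SDevice: OptiPlex 980

"""
print(gpu_grains(lspci_out))
# {'gpus': [{'model': 'RV620 LE [Radeon HD 3450]', 'vendor': 'ati'}], 'num_gpus': 1}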

@kohrar (Author) commented Sep 4, 2013

That's the weirdest thing. I did exactly what you did: I pulled the code from _linux_gpu_data() in grains/core.py into a standalone script, and it returned the correct result, just as you reproduced:

root@zone05ea:~# python test.py
{'gpus': [{'model': 'RV620 LE [Radeon HD 3450]', 'vendor': 'ati'}], 'num_gpus': 1}

But salt-call doesn't agree:

root@zone05ea:~# salt-call grains.items |grep -A 5 gpu
gpus:
groups:
    - workstation
host:
    zone05ea
id:
--
num_gpus:
    0
os:
    Scientific
os_family:
    RedHat

@terminalmage (Contributor)

Yeah. I'll give it another look in the morning and see if something might be overwriting that value later on.

@terminalmage (Contributor)

@kohrar So, looking more at the grains code: the empty gpus value you are seeing is the default set for all platforms, and it is then populated by the platform-specific function. Can you check the minion log and see if there are any tracebacks when the minion starts?
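
To illustrate the shape of what terminalmage describes (a rough sketch, not Salt's actual code): the defaults are set unconditionally, and only a successful run of the platform-specific function replaces them, which is why a traceback at minion start, or a parse that silently finds nothing, leaves the empty defaults in place:

import platform

def _linux_gpu_data():
    # stand-in for the real parser in salt/grains/core.py
    return {'gpus': [{'model': 'RV620 LE [Radeon HD 3450]',
                      'vendor': 'ati'}],
            'num_gpus': 1}

grains = {'gpus': [], 'num_gpus': 0}   # defaults for every platform
if platform.system() == 'Linux':
    # if this call failed, or parsed zero devices, the empty defaults
    # above are exactly what salt-call would report
    grains.update(_linux_gpu_data())
print(grains)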

@terminalmage (Contributor)

@kohrar Were you able to find anything in the minion log?

@kohrar (Author) commented Sep 16, 2013

Sorry, there are no tracebacks in the logs. Perhaps you can see something more in the full log here: http://pages.cpsc.ucalgary.ca/~leo/zone05ea.salt-call.log

@zmarano (Contributor) commented Oct 24, 2013

I am having the same problem, except with an Nvidia GPU and saltstack 0.17.1. (I have another box with an Intel GPU that works just fine, however.) But I think I have found the problem.

I added a bunch of extra logging to core.py to see what is going on, and found that it happens when the GPU is the last item in the lspci output. Basically, the problem is how the lspci -vmm output is split into lines: the code looks for an empty line to delimit separate devices, but splitlines() does not add a trailing blank line to the list it produces, so the last entry is always stripped off. Therefore, if your GPU is the last PCI device in the lspci output, it will be ignored. I am looking for a sane way to solve this right now.
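
A quick demonstration of the splitlines() behavior zmarano describes, plus one possible fix (a sketch only; the actual change is whatever landed in the commit referenced at the end of this thread):

# splitlines() never yields a trailing empty string, even when the
# input ends with a newline:
text = "Class: VGA compatible controller\nDevice: RV620 LE [Radeon HD 3450]\n"
print(text.splitlines())
# ['Class: VGA compatible controller', 'Device: RV620 LE [Radeon HD 3450]']

# So a parser that only flushes the current device record when it sees
# an empty line will silently drop the final device. One simple fix is
# to append a sentinel blank line before the parsing loop runs:
lines = text.splitlines()
lines.append('')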

@cachedout (Contributor)

I don't think it's related to this issue, but it's important to note that the new default for masters is not to gather GPU data unless enable_gpu_grains is set to True in the master config. The intention is to avoid the lag at master startup that was happening on some servers with oddball video cards or on distros with buggy implementations of lspci.

This code is brand new as of this week. If this is affecting the minions or not behaving as intended on the master, please let us know. Thanks!
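
For anyone landing here later, opting back in under that new default is a one-line change to the master config (YAML; the standard /etc/salt/master path is assumed):

# /etc/salt/master
enable_gpu_grains: True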

@basepi (Contributor) commented Oct 25, 2013

@zmarano This is good to know. Hopefully you'll find an easy fix.

cachedout added a commit that referenced this issue Oct 25, 2013
Fix for Linux gpus grain, issue #6945