GPU Device Order issue #1451
If Linux, then you just rerun ./installer.sh. At the very end of the script, it'll redetect. This is how I do it, but may not be the correct way. I'll let the man confirm this, or smack me around... |
I just looked through install.ps1 and it does seem to be in there. Let me try running the install again. |
Nope, that didn't really work, but I think I figured out where the issue is. The bus IDs are 04, 07, 0a, and 0b. When the application processes that information, it puts 0a and 0b before 04 and 07. Overdrive N labels them as 4, 7, 10, and 11. RBMiner doesn't send bus IDs to select which cards to use, and it is arranging the list as a, b, 4, 7. So when it sends a command to use devices 1 and 2, it thinks it is addressing bus a and bus b, but it is really addressing bus 4 and bus 7. I hope this makes sense... @RainbowMiner Can you take a look at this? |
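The ordering problem described above can be sketched as follows. This is a minimal illustration in Python, not RainbowMiner's actual PowerShell code; the helper name `bus_id_to_int` is hypothetical. It parses the hexadecimal bus IDs from this thread as numbers before sorting, so 0a and 0b land after 04 and 07 and match the decimal labels 4, 7, 10, 11.

```python
# Bus IDs as reported in this thread (hexadecimal strings), in the wrong order.
bus_ids = ["0a", "0b", "04", "07"]


def bus_id_to_int(bus_id: str) -> int:
    """Parse a hexadecimal PCI bus ID such as '0a' into an integer (10)."""
    return int(bus_id, 16)


# Sort by numeric value rather than by whatever order the IDs arrived in.
ordered = sorted(bus_ids, key=bus_id_to_int)

# Position of each bus ID in bus order (hypothetical zero-based device index).
index_by_bus = {bus_id: index for index, bus_id in enumerate(ordered)}

# Decimal labels, as a tool that prints bus numbers in decimal would show them.
decimal_labels = [bus_id_to_int(bus_id) for bus_id in ordered]
```

With the IDs ordered this way, "device 0" and "device 1" refer to bus 04 and bus 07 in every list that is built from the same sort key.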
These 2 were both running at the same time, both trying to use the same cards from the way i am seeing it... Issues also not from the amd devices list [ |
That makes perfect sense! Thank you, I will fix this asap. |
OK, that little fix should do the job. I strongly suspect that this was the root cause of issue #1452 as well. |
- more future compatible fix for bus alignment (issue #1451)
Now it's done. This should be the most secure fix. |
I will pull it and check when I get time later today |
Still seeing the issue. Is it rebuilding the GlobalDeviceCache on every run? or importing it from somewhere? I am asking because technically nothing has changed on the configuration, so it may not trigger a rebuild |
Completely wiped and set it back up, still the same issue so it isn't because of something "lingering" Just noticed an opencl error pop up on startup so wondering if that could be related. |
Oh yes, an OpenCL error is never good. Would you please run GpUtest.bat and upload the result? |
When I start RBMiner I now get this error: WARNING: OpenCL platform detection failed: Cannot bind argument to parameter 'ReferenceObject' because it is null. Since my last post, I fixed the order by moving the physical connections so that the 580s would be detected as the two lowest IDs. This seems to have let me work around the problem and run in a mode other than legacy until it is figured out. |
One additional error, likely related: Line | |
FYI: I was able to clear up the WARNING: OpenCL platform detection failed: Cannot bind argument to parameter 'ReferenceObject' because it is null. error by opening it in Windows PowerShell. The openclplatorms.json only contained [] until I opened it in Windows PowerShell. For some reason it is not generated properly when running under PowerShell Core. After running under Windows PowerShell, I went back to Core and the error has not returned. |
Good finding! Thank you very much. I will implement additional checks, regarding that [] |
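A check of the kind mentioned here could look roughly like the sketch below. It is written in Python for illustration only and is not the project's code; the function name `load_platform_cache` is hypothetical. The idea is to treat a cache file that is missing, unreadable, or contains only `[]` as "no cache", so that the caller can trigger a fresh detection instead of passing an empty value on.

```python
import json


def load_platform_cache(path):
    """Return the cached platform list, or None if the cache must be rebuilt.

    A file that is missing, is not valid JSON, is not a list, or is an
    empty list ([]) is treated the same way: as no usable cache.
    """
    try:
        with open(path, "r", encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):
        # OSError: file missing or unreadable; ValueError: invalid JSON.
        return None
    if not isinstance(data, list) or len(data) == 0:
        return None
    return data
```

A caller would then run detection again whenever the function returns `None`, rather than comparing against an empty result.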
I am running through a few ideas/scenarios and will let you know what i come up with. |
Super! Thank you for trying and reporting. |
For the AMD device information, I was able to fix all of my stats by changing the references from Where-Object Type_Vendor_Index -eq ... to Where-Object BusId_Mineable_Index -eq. I know that isn't 100% correct, but it works for me. The reason it isn't 100% correct is the additional use cases. It would probably be best to find a good way of matching on bus ID or something similar. The only other case I can think of is when you have both AMD and NVIDIA cards in the same machine: you would need another index for the vendor bus ID, or calculate it based on Type_Vendor_Index by getting a list of bus IDs, sorting it, and then referencing by position. I do believe that all of the utilities you are using for AMD stats sort primarily by bus ID, though. Now, on to figuring out whether some miners actually use bus ID order vs. OpenCL order vs. vendor order, etc. |
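The "list the bus IDs, sort, and reference by position" idea for mixed AMD/NVIDIA machines can be sketched like this. It is a Python illustration under stated assumptions: the device records and the field name `BusId_Type_Vendor_Index` are hypothetical, and this is not how RainbowMiner necessarily implements it.

```python
from collections import defaultdict

# Hypothetical device records; only the bus IDs 04, 07, 0a, 0b come from
# this thread. The NVIDIA card on bus 05 is invented for the mixed case.
devices = [
    {"Name": "amd-c", "Type": "Gpu", "Vendor": "AMD", "BusId": "0a"},
    {"Name": "amd-d", "Type": "Gpu", "Vendor": "AMD", "BusId": "0b"},
    {"Name": "amd-a", "Type": "Gpu", "Vendor": "AMD", "BusId": "04"},
    {"Name": "nv-a", "Type": "Gpu", "Vendor": "NVIDIA", "BusId": "05"},
    {"Name": "amd-b", "Type": "Gpu", "Vendor": "AMD", "BusId": "07"},
]


def add_bus_indices(device_list):
    """Give each device a per-(Type, Vendor) index that follows bus order."""
    counters = defaultdict(int)
    # Walk the devices in numeric bus order; the input list is left in place.
    for device in sorted(device_list, key=lambda d: int(d["BusId"], 16)):
        key = (device["Type"], device["Vendor"])
        device["BusId_Type_Vendor_Index"] = counters[key]
        counters[key] += 1
    return device_list


add_bus_indices(devices)
index_by_name = {d["Name"]: d["BusId_Type_Vendor_Index"] for d in devices}
```

Because the counter is kept per vendor, the NVIDIA card on bus 05 does not shift the AMD indices, which is the case the comment above is worried about.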
I am also working on either adding logic for missing powerdraw or adding logic to use multiple methods to retrieve card stats/data. I will just post a link to the commit on my repo when i am done with all of it. That will probably be easier. |
lolminer, phoenix, wildrig, and ethminer all seem to use busid order in my testing. still checking into others. |
- add BusId indices for Type/Type_Vendor/Type_Mineable/Vendor to solve AMD GPU addressing (issue #1451)
The above doesn't really parse properly in comments on GitHub... |
Yes, doesn't look too readable. Maybe just upload the function somewhere as psm1 |
- let OverdriveN use Type_Vendor_Index (vs. BusId_.., issue #1451)
Oh! It really looks like, I have to retire OverdriveN.exe - it just skips the RX GPUs. That is bad, except if we can parse the |
Overdrive N seems to work...I think it is in your vendor order though and not bus order. |
oh, i see what you are saying now. let me take a look |
Maybe it needs to be updated. Here is the output from the other version i have on my machine: (0.2.9 i think) AMD Radeon RX 5600 XT|GPU_P0=800;688|GPU_P1=1250;751|GPU_P2=1700;950|Fan_Acoustic=1750|Power_Target=0|Fan_P0=30;50|Fan_P1=50;50|Fan_P2=63;50|Fan_P3=76;50|Fan_P4=85;50|Fan_ZeroRPM=0|Mem_TimingLevel=0|GPU_Min=800|GPU_Max=1700|Mem_Max=1830 |
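For reference, a line in the format shown above can be split into a name and key/value settings fairly easily. The following is a rough Python sketch, assuming only what is visible in the sample: fields are separated by `|`, each setting is `key=value`, and some values are `;`-separated pairs of integers. The function name `parse_profile_line` is hypothetical, and the sketch does not interpret what each pair means.

```python
# Sample line copied from the output posted in this thread.
sample = (
    "AMD Radeon RX 5600 XT|GPU_P0=800;688|GPU_P1=1250;751|GPU_P2=1700;950|"
    "Fan_Acoustic=1750|Power_Target=0|Fan_P0=30;50|Fan_P1=50;50|Fan_P2=63;50|"
    "Fan_P3=76;50|Fan_P4=85;50|Fan_ZeroRPM=0|Mem_TimingLevel=0|GPU_Min=800|"
    "GPU_Max=1700|Mem_Max=1830"
)


def parse_profile_line(line):
    """Split a 'name|key=value|...' line into (name, settings dict).

    Single values become ints; ';'-separated values become tuples of ints.
    """
    name, *fields = line.strip().split("|")
    settings = {}
    for field in fields:
        if not field:
            continue  # tolerate a trailing '|'
        key, _, value = field.partition("=")
        parts = tuple(int(part) for part in value.split(";"))
        settings[key] = parts[0] if len(parts) == 1 else parts
    return name, settings


name, settings = parse_profile_line(sample)
```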
Ah, that's the OverdriveNTool.exe - the OverdriveN.exe is quite antique. |
ah, ok. you can run overdriventool.exe with -getcurrent to get that output. |
Are these the current overclocking settings, or is this the current clock, like in Afterburner? |
For which values? It looks like the value in OverdriveN is the instance path... you could probably make system calls with that to get other information, but it could be ugly. And it looks like it only supports the 580s. If you are talking about the OverdriveNTool values, they look to be the current settings, not live values like Afterburner provides and not default settings like odvii provides. And the output looks to be ordered by bus ID. |
Ok. Regarding this old tool, could you try the following?
|
Caption : AMD Radeon RX 5600 XT Caption : AMD Radeon RX 5600 XT Caption : Radeon RX 580 Series Caption : Radeon RX 580 Series Caption : Intel(R) UHD Graphics 630 |
- retire old OverdriveN.exe and odvii.exe - check live data Afterburner -> odvii_x64.exe - issue #1451
OK. That should now be pretty neat, since both tools deliver PCI bus IDs that we can then easily match to the PCI bus IDs in the OpenCL record. |
Maybe that will work. I updated Include.psm1 and I am seeing static default values for clock/mem, so either Afterburner isn't working properly or odvii_x64 is overwriting its values. |
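Matching by PCI bus ID, as described here, might look like the following. This is a Python sketch with hypothetical record shapes and field names (`BusId`, `Stats`, `Clock`); the real tool output and the real OpenCL record in RainbowMiner may be structured differently.

```python
def normalize_bus_id(value):
    """Turn a hexadecimal bus ID string such as '0A' or '0a' into an int."""
    return int(str(value), 16)


def attach_stats(opencl_devices, tool_records):
    """Attach each tool record to the OpenCL device on the same PCI bus.

    Devices the tool did not report get Stats = None instead of the
    stats of whichever card happens to share their list position.
    """
    by_bus = {normalize_bus_id(r["BusId"]): r for r in tool_records}
    for device in opencl_devices:
        device["Stats"] = by_bus.get(normalize_bus_id(device["BusId"]))
    return opencl_devices


# Hypothetical data: the tool reports in a different order than OpenCL,
# and does not report the card on bus 0b at all.
opencl_devices = [{"BusId": "0a"}, {"BusId": "0b"}, {"BusId": "04"}]
tool_records = [{"BusId": "04", "Clock": 1340}, {"BusId": "0A", "Clock": 1700}]
attach_stats(opencl_devices, tool_records)
```

The point of keying on the bus ID is that list order no longer matters, which is what went wrong earlier in this issue.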
- prio clocks for Afterburner, all other for odvii_x64.exe (issue #1451)
OK, I took a look. The way it is working now, odvii needs to go first and then Afterburner, because it is going to run both and the live values are preferred if available. So process the default values (odvii) first, then Afterburner (live values). IMO it would probably be cleaner to initialize the data object before running either one: run Afterburner first, store its values in the object, run odvii to fill in anything that is still missing, then write it back out to the device. |
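The "initialize the object, prefer live values, fill the gaps with defaults" approach suggested above can be sketched as follows. This is Python for illustration only; the field names and the function `merge_device_data` are hypothetical, and the two input dictionaries stand in for whatever Afterburner and odvii actually return.

```python
# Hypothetical set of fields tracked per device.
FIELDS = ("Clock", "ClockMem", "FanSpeed", "Temperature", "PowerDraw")


def merge_device_data(live, defaults):
    """Build one data object: live values first, defaults only for gaps."""
    merged = {field: None for field in FIELDS}  # initialize before any tool runs
    for source in (live, defaults):  # order encodes priority
        for field in FIELDS:
            if merged[field] is None and source.get(field) is not None:
                merged[field] = source[field]
    return merged


# Hypothetical readings: the live source lacks memory clock and fan speed.
live_values = {"Clock": 1650, "ClockMem": None, "Temperature": 61}
default_values = {"Clock": 1700, "ClockMem": 1750, "FanSpeed": 30}
merged = merge_device_data(live_values, default_values)
```

Since only the merged object is written back to the device, the displayed values cannot alternate between the two sources from one update to the next.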
just saw you updated again. checking it |
Yeah, they look to be bouncing back and forth. I stand behind my previous statement. I think the only way to get rid of the "bouncing" is to only present one set of data. Create object |
With this change, it should now be working. |
I think it looks good now...I will let it run some and let you know for sure. |
That would be superb! Next step, and if you have time for that, I could try to implement the overclocking, using OverdriveNTool.exe, so that we could use ocprofiles. |
I can definitely be on-board with that. Side note---would you like to carry these conversations over to discord or some other format? |
Sure. We can use Discord, if you like. Just PM me over there. |
Here is the debug file for all of the AMD cards being duplicated after I added an NVIDIA card to the machine. I am about to pull that card, as I only added it as a test, but looking at the debug files I see what look like issues with OpenCL, although I don't recall seeing them in the console. |
On one of my devices, I moved some connections around at some point. The device order used to be something like:
"card type 1, card type 2, card type 1, card type 2"
it is now
"card type 1, card type 1, card type 2, card type 2"
If I use them in legacy mode, it doesn't matter for the most part. If I change the mode, or if I use an algorithm that isn't compatible with all cards it can crash or not use all of them.
The question I have is how do I force RBMiner to rescan/rebuild its GPU config for this machine?