
Run ROCm without PCIe atomics? #157

Closed
gezine opened this issue Jul 8, 2017 · 40 comments


@gezine commented Jul 8, 2017

Hi, I am having trouble using ROCm 1.6.
My system is an E5-2670 v1 with two RX 480s.
The CPU supports PCIe 3.0, but not PCIe atomics.

I can see both RX 480s if I use "rocm-smi -a",
but if I run vector_copy I get "Getting a gpu agent failed."
This is the output of "dmesg | grep kfd":
kfd kfd: skipped device 1002:67df, PCI rejects atomics

Is there any way to run ROCm without PCIe atomics?

Thank you

@gstoner (Contributor) commented Jul 8, 2017

@gstoner closed this Jul 8, 2017

@gezine (Author) commented Jul 8, 2017

Oh, thank you.
It seems like I need to change CPU.

Is Xeon E5 v2 supported?
https://software.intel.com/en-us/articles/intel-xeon-processor-e5-2600-v2-product-family-technical-overview#pcie

The document says it supports PCIe atomics, but does ROCm support it?

@gstoner (Contributor) commented Jul 8, 2017

Ivy Bridge was the first Intel processor to support PCIe atomics, with PCIe Gen3. We never acquired an Ivy Bridge system: Haswell was already out, and the OEM vendors we were working with had stopped selling Ivy Bridge systems. It should work since it meets the requirement, but you need to make sure you are using a PCIe Gen3 slot attached to the CPU complex, not one off the southbridge. You also have to be careful that the motherboard is PCIe Gen3 compliant.

When we started our project, Haswell was the processor we did ROCm development on; the primary versions were the Xeon E5 v3 2640, 2660, and 2690. Now that EPYC is out, it works great as well.
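
If you do test on Ivy Bridge, something like the following can confirm the slot wiring (the bus addresses below are examples; substitute your own):

  $ lspci -t                    # tree view: the GPU should hang directly off a CPU root port,
                                # not off the PCH/southbridge
  $ lspci | grep VGA            # find the GPU's bus address, e.g. 01:00.0
  $ sudo lspci -s 00:01.0 -vv | grep LnkSta   # root port link status: want Speed 8GT/s (Gen3)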

@gezine (Author) commented Jul 8, 2017

Just tested ROCm with an Ivy Bridge Intel Core i7-3770.

It doesn't work; I get the same errors.

Is this a software limitation?

@gstoner (Contributor) commented Jul 8, 2017

@P-Tillmann commented Jul 23, 2017

I've also tried to get ROCm running on an Ivy Bridge system (Xeon 1230 v2).
Unfortunately the vector_copy sample fails with
"Getting a gpu agent failed."
and "dmesg | grep kfd" outputs
kfd kfd: skipped device 1002:67df, PCI rejects atomics

Under Windows, GPU-Z claims I have a PCIe v3 connection.

Is there hope that ROCm support will be provided for Ivy Bridge generation CPUs?

@gstoner (Contributor) commented Jul 23, 2017

If that is a Xeon E3 Ivy Bridge, it will be missing PCIe Atomics.

The full list of HSA system architecture platform requirements is here: http://www.hsafoundation.com/html/HSA_Library.htm#SysArch/Topics/01_Overview/list_of_requirements.htm

Platform atomics are used in ROCm in the following ways:

  • Update the HSA queue’s read_dispatch_id: a 64-bit atomic add used by the command processor on the GPU agent to update the packet ID it has processed.
  • Update the HSA queue’s write_dispatch_id: a 64-bit atomic add used by the CPU and GPU agents to support multi-writer queue insertions.
  • Update HSA signals: 64-bit atomic ops are used for CPU and GPU synchronization.
  • The PCIe platform atomics are CAS, FetchADD, and SWAP.

Here is a document on PCIe Atomics: https://pcisig.com/sites/default/files/specification_documents/ECN_Atomic_Ops_080417.pdf

In ROCm we also take advantage of PCIe ID-based ordering technology for P2P, when the GPU originates two writes to two different targets:

  • a write to another GPU’s memory,
  • then a write to system memory to indicate the transfer is complete.
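
You can check whether a given system advertises these capabilities with lspci; both the GPU endpoint and the root port above it matter (the device addresses below are examples):

  $ sudo lspci -s 01:00.0 -vv | grep -i atomicops   # the GPU endpoint
          AtomicOpsCap: 32bit+ 64bit+ 128bitCAS-
          AtomicOpsCtl: ReqEn-
  $ sudo lspci -s 00:01.0 -vv | grep -i atomicops   # the root port the GPU sits under

The 32bit+/64bit+ flags cover the FetchADD, SWAP, and CAS widths ROCm uses; AtomicOpsCtl shows whether a driver has actually enabled the device as an AtomicOp requester.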

@P-Tillmann commented Jul 26, 2017

Ah, a pity.

I didn't know that Ivy Bridge doesn't support the full PCIe v3 feature set.

Thanks for sharing this information.

@iavael commented Aug 25, 2017

Well, according to my lspci output, Xeon v2 supports some PCIe atomics (AtomicOpsCap: 32bit+ 64bit+), but maybe that's not sufficient for HSA's needs.

Full lspci output:

# lspci -s 01:00.0 -vv
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/580] [1002:67df] (rev e7) (prog-if 00 [VGA controller])
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device 3418
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 38
        Region 0: Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Region 2: Memory at f0000000 (64-bit, prefetchable) [size=2M]
        Region 4: I/O ports at e000 [size=256]
        Region 5: Memory at f7b00000 (32-bit, non-prefetchable) [size=256K]
        Expansion ROM at f7b40000 [disabled] [size=128K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L1, Exit Latency L0s <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR+, OBFF Not Supported
                AtomicOpsCap: 32bit+ 64bit+ 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                AtomicOpsCtl: ReqEn-
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
                         EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
        Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee00418  Data: 0000
        Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP+ BadDLLP+ Rollover- Timeout+ NonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [200 v1] #15
        Capabilities: [270 v1] #19
        Capabilities: [2b0 v1] Address Translation Service (ATS)
                ATSCap: Invalidate Queue Depth: 00
                ATSCtl: Enable-, Smallest Translation Unit: 00
        Capabilities: [2c0 v1] Page Request Interface (PRI)
                PRICtl: Enable- Reset-
                PRISta: RF- UPRGI- Stopped+
                Page Request Capacity: 00000020, Page Request Allocation: 00000000
        Capabilities: [2d0 v1] Process Address Space ID (PASID)
                PASIDCap: Exec+ Priv+, Max PASID Width: 10
                PASIDCtl: Enable- Exec- Priv-
        Capabilities: [320 v1] Latency Tolerance Reporting
                Max snoop latency: 0ns
                Max no snoop latency: 0ns
        Capabilities: [328 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 1
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [370 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
                          PortCommonModeRestoreTime=0us PortTPowerOnTime=170us
                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
                           T_CommonMode=0us LTR1.2_Threshold=0ns
                L1SubCtl2: T_PwrOn=10us
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu

CPU: Xeon E3-1275 V2
MoBo: ASUS P8C WS
GPU: Radeon RX 580 (in PCIe 3.0 slot)

@andreychernyshev commented Feb 9, 2018

Hi,
I have a working one with:
DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR+, OBFF Not Supported
AtomicOpsCap: 32bit+ 64bit+ 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
AtomicOpsCtl: ReqEn+
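
For comparison, iavael's output above shows AtomicOpsCtl: ReqEn-, i.e. the kernel never enabled that GPU as an AtomicOp requester. A one-line check on any system (the GPU address is an example):

  $ sudo lspci -s 01:00.0 -vv | grep AtomicOpsCtl
          AtomicOpsCtl: ReqEn+    # ReqEn+ means the driver enabled atomics; ReqEn- means it did not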

@luncht1me commented Feb 24, 2018

I'd love a way to have OpenCL detect my Vega 56s that are plugged into PCIe 2.0 slots...
Although rocm-smi detects them, OpenCL won't register them as devices on Ubuntu 16.04.

i.e., rocm-smi and lspci | grep 'VGA' show 3 devices, but clinfo only shows the one connected via a PCIe 3.0 slot. The other two are on PCIe risers connected to PCIe 2.0.
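
For reference, the commands that show the mismatch (the kfd line is the same rejection message quoted earlier in this thread):

  $ lspci | grep VGA | wc -l         # all 3 GPUs enumerate on the bus
  $ rocm-smi                         # kernel driver reports all 3
  $ clinfo | grep -i "device name"   # only the PCIe 3.0 card shows up
  $ dmesg | grep kfd                 # expect "skipped device ... PCI rejects atomics"
                                     # for the two cards on the PCIe 2.0 risers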

@Spudz76 commented Mar 11, 2018

I find it interesting that most of these stacks all assume you're moving lots of data to and from the card.

Mining literally moves a couple of bytes; it mostly chews the GPU, definitely doesn't NEED atomics or even DMA really, and definitely works fine on a single lane (when the driver doesn't try to enforce "OMG GAMING BRO" performance). Similar to how the driver assumes you'll be using a GPU for display and not just brainwork, and how it's nearly impossible to ignore the display ports and audio junk (don't init what you aren't going to use; best practice?).

@rhlug commented Mar 11, 2018

Several reasons why mining with Vega on Linux is still not worth the hassle:

  1. PCIe atomics are required for OpenCL, restricting usable slots on legacy hardware.
  2. Modifying clocks and/or voltages is difficult at best.
  3. Cryptonight performance is not on par with Windows.

If you are mining ethash and only have 2 or 3 Vegas, you might be able to get away with the basic hardware we've all used in the past. If you want to splurge on an x399 board, then 4 Vegas is doable.

I set up one Vega rig (it's currently running Win10) because of the issues mentioned above, and it will be the only one unless something changes in usability. It's too bad they lock down the BIOS and provide no software on Linux to make it usable for mining. I may be moving on to Volta if it ever gets released.

@gstoner (Contributor) commented Mar 11, 2018

When we started developing ROCm, the use cases were HPC applications and deep learning. Both run on servers with Xeon v3 or newer, which we had to be compliant with, and both need bandwidth, hence the x16 optimization. What I hear the issue is here: you want to use the cheapest, oldest possible hardware. Volta will not be that.

We are looking into the Cryptonight issue. It is a firmware issue that the internal firmware team needs to solve.

Soon you will not need atomics on Vega with ROCm. This removes the PCIe Gen3 restriction.

Mining is a new use case they are looking at. AMDGPU-PRO is going to use OpenCL on PAL, which will give you what you're asking for, in version 18.10. It also goes back to the older compiler used under Windows.

@dfad44 commented Mar 11, 2018

@rhlug I run quad Vega FEs on x399. I mine Cryptonight, and my issue is the inconsistent performance. The hashrate declared by xmr-stak is inaccurate: typically the miner declares between 6,500 and 7,000 H/s, while poolside it reports between 5,000 and 14,000 H/s. I have to confess I got the best performance using the ROCm kfd kernel (somewhat stable hashrates). There has been some kind of regression since it was moved to DKMS, or maybe it's the 4.13 kernel. I sincerely hope it can be rectified. I've decided to wait for kernel 4.17 to know if I made the right choice going with ROCm.

@luncht1me commented Mar 11, 2018

@gstoner (Contributor) commented Mar 11, 2018

@luncht1me It really is not exciting to lose PCIe Atomics/AtomicOp completer support; we lose a lot of performance optimization for GPU computing. But I understand that for the miners, who just care about their own use case, this is exciting.

@luncht1me commented Mar 11, 2018

@gstoner (Contributor) commented Mar 11, 2018

@luncht1me With the AMDGPU-PRO stack you will not have to use the ROCm foundation at all. You will get OpenCL without PCIe atomics support in 18.10; it does not leverage the ROCm driver foundation. You get everything you want, but it is not open source. Note that the issue rhlug is seeing could still be there, since the Windows driver uses a different kernel driver foundation than AMDGPU does.

@briansp2020 commented Mar 11, 2018

@gstoner
Will ROCm stop using atomics for all configurations? It seems counterproductive if that results in a performance hit for those who do have supported hardware.

@luncht1me commented Mar 12, 2018

@gstoner (Contributor) commented Mar 12, 2018

@rhlug commented Mar 12, 2018

@gstoner Thanks for the update. That's great news. Glad to hear there are some eyes on Cryptonight optimization. It's not about being cheap per se, but more about being able to ROI the gear before it EOLs or the rise in hashing difficulty makes it obsolete.

If I could buy server-class gear with Xeons and PCIe atomics support across 12 GPUs for the same price as two 6-GPU desktop-class rigs, I would definitely do that. But there is about a $6k gap, and that makes ROI a lot harder.

@dfad44 commented Mar 12, 2018

@rhlug I understand how you feel. However, if you believe that this endeavor is worth it and this market is here to stay, it will be worthwhile in the end. After watching https://www.youtube.com/watch?v=k3aGaxcYCxw I was completely sold on Mr. Stoner's passion for the prospects of HPC, and believe me when I say I knew nothing of it. I've spent my savings and I don't regret it. If anything, I gained knowledge learning this stuff.

@gstoner (Contributor) commented Mar 12, 2018

@rhlug I get the ROI part of mining. I used to run a major VFX house's rendering farm; you have CapEx and OpEx to watch, and in the long run OpEx dominates your costs. We are finally getting more resources on the team. One thing on the CapEx side: you will start seeing more cloud services with ROCm.

@dfad44 commented Mar 13, 2018

@gstoner With some algorithm optimization, the prosumer segment can be alive and well. There should be enough room to exist side by side with enterprise. Collectively, we are also enterprise. I hope your team can give us something before summer.

@Spudz76 commented Mar 13, 2018

You see though, the "cheapest, oldest possible hardware" is generally what works best on Linux, since developers pretty much get the software working right just about at EOL... the latest hardware on Linux is shooting yourself in the foot.

Hawaii-core cards are still ripping 30 MH/s here; I can't get a Fiji or Ellesmere anywhere close (because of the drivers, obviously).

@gstoner (Contributor) commented Mar 13, 2018

@gstoner (Contributor) commented Mar 13, 2018

@Spudz76 commented Mar 13, 2018

Generally I'm using H81-based boards, which run very low-TDP processors. Many low-TDP Intels do not have extended features (atomics, AES-NI, etc.) even if they are brand new, to save power, transistor count, and complexity.

I don't recall anything refusing to boot without TCQ on SATA. It's an extension and should be treated as such: don't panic, just deal with it being missing for whatever reason ("old" hardware, or just turned off because I wanted to).

I turn off all KVM/VT-d/virtualization as well, since I'm not using it. If you had depended on those just because it was easier, it would also break 6+ GPU installs or PCIe bridges behind bridges behind bridges (PCIe expansion/mux boards).

I have never gotten ROCm to work on anything, including Hawaii. Segfaults at best, if it finds the platform at all. The same software runs against Intel/nVidia OpenCL, slow as tortoises, but at least it works.

Also, what you say about ROCm vs AMDGPU-PRO does not match what the AMDGPU-PRO people seem to claim (the target is merging the two and going completely open source / AMDGPU-PRO will be retired). This is the only reason I care about ROCm at all: if it will replace AMDGPU-PRO, then we have to ensure it works with mining now. Requiring atomics will break current use cases. We also use x1 and no more. That is all.

@gstoner (Contributor) commented Mar 13, 2018

@Spudz76

I am sorry you are having a hard time getting anything working for your use case. ROCm started with a very specific set of use cases. We always target Xeon-based processors first, including the pseudo-Xeon i7 Extreme Edition, plus Core i3, i5, and i7. We also support large-BAR P2P and P2P via RDMA, which is not your use case.

The core Linux driver is common between AMDGPU, AMDGPU-PRO, and the ROCm-enabled AMDGPU driver, so some of the issues you are seeing are in the base driver. The ROCm stack adds the KFD and the Thunk to enable an HSA-based stack; you can see the requirements at http://www.hsafoundation.com/html_spec111/HSA_Library.htm. This lets us support more than OpenCL.

Our main testing (we test with Fiji Nano, S9150 Hawaii, MI25) is on:

Intel Xeon E5 v3 or newer CPU

  • SuperMicro SYS-1019
  • SuperMicro SYS-1029
  • SuperMicro SYS-1028GQ-TRT
  • SuperMicro SYS-4028GR-TRT
  • SuperMicro SYS-4028GR-TRT2

EPYC
  • Inventec P47, EPYC based

Intel i7 Extreme Edition, ASUS X99 based
Core i5 & i7 with Z97-based motherboards

Threadripper

Ryzen

  • Note: MSI has an SBIOS issue, which is why you need a newer version of Linux.

I get that you care about the mining use case only: a PCIe x1 lane on the H81 2013-era chipset, using only PCIe Gen2, and sourcing the lowest-cost PCIe switches possible.

What you are really asking for is a stripped-down stack that just runs OpenCL. The Windows OpenCL driver runs on one of two foundations: pre-GFX9 it is Orca, which has been around since the Catalyst drivers; GFX9 runs on PAL, the Platform Abstraction Layer.

For AMDGPU-PRO 18.10 and newer, OpenCL will be running on PAL. PAL is the same user-mode driver foundation used by the Vulkan driver. It will also move back to using the same compiler as the Windows stack: LLVM to HSAIL to the SC compiler, aka the shader compiler. Note that the Linux and Windows drivers will not have the same base kernel driver architecture.

@gstoner (Contributor) commented Mar 13, 2018

@Spudz76 We will also see if we can source H81 motherboards. If you give us your exact Linux kernel config, we can add it to our testing profile. We also need to know the exact GPUs and AIB vendor you are using so we can build the test case. Please include the exact application you are running and care about, instructions on how you run it, and any special configs like voltage or clock changes, since we need to understand whether you are underdriving the GPU, which can also drive instability.

@RadeonOpenCompute deleted a comment from boberfly Mar 20, 2018

@Spudz76 commented Mar 20, 2018

I have reverted to Ubuntu 14 and kernel 3.x with fglrx 13, which ironically is mining on an HD 5570 right now, a Redwood card that would not work with any other stack at all. I tried Windows 7 and six different driver revisions, which never showed OpenCL because apparently you have to have a display (or dummy plug) actually connected to any GPU you want to calculate on (so dumb as to be nearly infuriating). I tried various other Linux 4.x kernel distros as a basis and never got results either: similar failure points, or unavailable old source files.

Basically, it seems like the best practice for mining specifically is to obtain the oldest working version of the entire OS, as if it were a time warp and it's 2005 (or whenever the hardware target was made) again, and then everything works perfectly, other than being susceptible to thousands of exploits (hope the LAN is secure enough...). And then shift that OS compiler stack up to GCC 5, since some mining software requires it to compile; side-install CMake manually, since nothing works with v2 anymore and v3 is not available on antique OS releases; etc., etc., etc.

Also, these OS components and driver revisions are getting VERY difficult to find. It would be very cool if AMD/ATI could ensure all old versions stay available. For example, you can't find the old Stream SDK 2.1/2.3 versions anywhere ever since the rename to APP SDK, yet you must have the old Stream SDK to be able to do things on old cores. Thanks to AMD's stranglehold on its IP, nobody mirrors their files, only links back to the same old URL; whatever is on the main web site is all that exists, and APP SDK 2.9 doesn't work for what I was trying...

@Spudz76 commented Mar 20, 2018

Hawaii still works on Ubuntu 16, but now I'm wondering if it would be more optimal on this time-warp stack. Unfortunately, I think I'd have to roll back even more (Xorg is too new in the final Ubuntu 14) to make that work properly. I do have one Hawaii that almost works but chokes on one ring/IB test, and I sort of suspect the old driver/fglrx might work around that (IMO minor) error and use the card OK via the other 9 lanes that DO still work. Every other radeon/amdgpu/amdgpu-pro stack just finds the bad lane and quits. I had hoped Windows might also work around a single bad IB, but I don't have any dummy plugs, nor the gumption to tear down one of my working LCDs and drag it down into the basement/crypt just to find out it probably doesn't. Or some other bad coder decided feature Z shouldn't work unless you have headphones hooked up, for who knows why, apparently just trolling.

This is like an Intel CPU refusing to boot unless the iGPU inside it is hooked up to something. Sorry, but graphical output and even Xorg should not be necessary to compute on a GPU, much less to control a GPU (clocks, fans, and temps tend to be inaccessible without firing up Xorg, on both GPU vendors, which is insane).

@Spudz76 commented Mar 20, 2018

The H81 stack is just Ubuntu 16 server from the latest install ISO.
Then manually install the older 4.10 kernel (linux-image-extra-4.10....) and headers.
Then install either 17.40 or 17.50 (either one works basically the same for Hawaiis; the sequence is sketched below). If you use the rocm option with 17.50 it won't work / hits the atomics error / needs 'legacy' OpenCL. Also, 'headless' doesn't actually work; you need most of the libraries it skips, even when going "as headless as possible given the bad assumption that everyone has a head and runs Xorg"... or maybe that's just to obtain all the -dev packages so as to compile OpenCL stuff.

You can run 17.50 on the stock 4.13 kernel or whatever, but 17.40 doesn't build on anything newer than 4.10, thus I just run 4.10 for both.

The main client software is xmr-stak from the dev branch of the fireice-uk GitHub repo. I have not developed a local proxy for the Cryptonight coins yet, so I prefer to have the CPU plus all the various GPUs running in as few instances of xmr-stak as possible, so it can divide up the nonce ranges appropriately for devices of varied speed (better than having 100 things connected out to the Internet). This is why I am trying to run, say, Redwood garbage along with GCN stuff along with an old GTX 550 Ti and such, all in one box (regardless of CPU/atomics capabilities). If a card eats 40 W but gives me the output of a couple of CPUs (58 H/s), then I'm making overall progress. No reason to toss the Redwood in the trash where it otherwise belongs. Kind of 'green mining', instead of robbing the gamers of all the new GPUs.
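
A minimal sketch of that install sequence (the exact 4.10 package version was elided above, so the version string here is a placeholder; --opencl=legacy selects the 'legacy' OpenCL mentioned):

  # Ubuntu 16.04 server, then pin the older 4.10 kernel (placeholder version string)
  sudo apt-get install linux-image-extra-4.10.<version>-generic linux-headers-4.10.<version>-generic

  # unpack the amdgpu-pro 17.40/17.50 tarball, then install with legacy OpenCL,
  # since the rocm option hits the atomics error on these boards
  ./amdgpu-pro-install --opencl=legacy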

@wrt54gl commented Mar 22, 2018

@gstoner

"Soon you will not need atomics on Vega with ROCm. This removes the PCIe Gen3 restriction."

That is good news for me. I have a rig with a Prime Z370-A and an i3-8100 CPU. Right now I can run 2 Vega 56 on it with ROCm and am very impressed with the performance. I would like to run 9. Let me know if I can help with testing.

@Sumenia commented Mar 22, 2018

@wrt54gl I wonder what hashrate you get? 🤔

@wrt54gl commented Mar 22, 2018

@Sumenia
39.76 MH/s per card. All I did was put a 19% overclock on the memory. Seems very stable. Using MSI Air Boost cards, which don't overclock as well as some.

@nguha commented Apr 19, 2018

"Soon you will not need atomics on Vega with ROCm. This removes the PCIe Gen3 restriction."
"You will get OpenCL without PCIe atomics support in 18.10."

Both great news, especially the support in the open-source ROCm stack.

@gstoner Can you give us an ETA for either of these?

@boberfly commented Apr 24, 2018

Just to post: on my E5-2680 v2, using kernel 4.17 on Ubuntu 18.04, I get this from dmesg | grep kfd:
[ 2.789921] kfd kfd: Initialized module

Currently I can't get ROCm/OpenCL to work, but I think it's something else; still investigating...

@gstoner you might be able to change the minimum spec for Xeon CPUs to E5-2600 v2 on the main page, provided that kfd report is an accurate indicator.
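
For anyone else landing here, the kfd line in dmesg is the quick pass/fail indicator seen throughout this thread:

  $ dmesg | grep kfd
  # pass: the module initialized and the GPU agent should be visible to ROCm
  kfd kfd: Initialized module
  # fail: the platform rejected PCIe atomics, so the device is skipped
  kfd kfd: skipped device 1002:67df, PCI rejects atomics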
