This repository has been archived by the owner on Apr 24, 2022. It is now read-only.

Reduced hashrate over time on K80 Revisited issue 472 #2310

Open
axfwora opened this issue May 22, 2021 · 10 comments

Comments

axfwora commented May 22, 2021

Hi devs, I found that the issue reported in #472 is still present on K-series Tesla GPUs. I see that ticket is closed and would like someone to look into it.

On the K40 the issue was resolved by passing "-M 100". Tested on both Linux (Manjaro 15.xx) and Windows 10 with NVIDIA drivers 462.31, using ethminer-0.19.0-cuda10.0-windows-amd64 on Windows and ethminer 0.19.0 (CUDA 9.0) on Linux.

Linux start command used: "./ethminer -P stratum1+tcp://wallet@eu1.ethermine.org:4444 -U -R -M 100"

Windows 10 start command used: "ethminer.exe -P stratum1+tcp://wallet@eu1.ethermine.org:4444 -U -R -M 100"

Original test without the -M 100 option: 1.66 MH/s

After adding -M 100, stock clocks: 12.66 MH/s

After adding -M 100, overclocked (memory 3535, core 1150): 17.22 MH/s

start.bat file created for ethminer on Windows:

setx GPU_FORCE_64BIT_PTR 0

setx GPU_MAX_HEAP_SIZE 100

setx GPU_USE_SYNC_OBJECTS 1

setx GPU_MAX_ALLOC_PERCENT 100

setx GPU_SINGLE_ALLOC_PERCENT 100

setx CUDA_DEVICE_ORDER PCI_BUS_ID

ethminer.exe -P stratum1+tcp://wallet@eu1.ethermine.org:4444 -U -R -M 100

Start script used for ethminer on Linux:

export GPU_FORCE_64BIT_PTR=0
export GPU_MAX_HEAP_SIZE=100
export GPU_USE_SYNC_OBJECTS=1
export GPU_MAX_ALLOC_PERCENT=100
export GPU_SINGLE_ALLOC_PERCENT=100

./ethminer -P stratum1+tcp://0x23E27A613570A4fDBB00286f1928be4d3B303899.Bolle@eu1.ethermine.org:4444 -U -R -M 100

No idea why the issue is present; I suspect a bug or a memory-addressing issue on the Kepler Tesla K series.

- Enabling or disabling ECC memory in the NVIDIA control panel makes little to no difference to the hash rate.

In my testing, the additional environment variables GPU_FORCE_64BIT_PTR 0, GPU_MAX_HEAP_SIZE 100, GPU_USE_SYNC_OBJECTS 1, GPU_MAX_ALLOC_PERCENT 100, and CUDA_DEVICE_ORDER PCI_BUS_ID make little to no difference to the hash rate.

I can confirm that the Tesla K-series GPUs K40 (compute 3.5) and K80 (compute 3.7) can be used for ethminer on both Linux and Windows 10 with the newest NVIDIA drivers. I'm sure the K20 (compute 3.5) would also work, but I don't have one to test.

- I haven't tried any consumer Kepler card yet, but I have a GTX 770 2 GB that I'll be testing next. I'm also having a K10 shipped to me to test, so in a bit I should at least be able to compare results.

Please, if anyone is able to optimize K-series GPUs for ethminer, reach out!

Also, if the dev team wants to at least say "Hi, we're alive", it would be much appreciated!

Thanks thomasjungblut for the workaround, you're the shit!

peon501 commented May 28, 2021

yep, need to reduce the hash rate on more GPUs, so miners switch to GPUs designed for mining.

Cavia commented Jun 6, 2021

Hello,

the option "-M 100" will run ethminer in simulation mode. Thus, no connection is established with the mining pool because you are basically testing the mining on block 100.
Whenever I launch ethminer correctly to mine from the current block, I cannot get more than 2.4 MH/s per GPU core on my K80 :(
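
For context on why a run at block 100 looks so different from live mining: in the Ethash specification the DAG size depends only on the epoch (block number divided by 30000), starting just under 1 GiB and growing by about 8 MiB per epoch. Below is a minimal Python sketch of that size calculation. The function names are illustrative, and the block height of 12,500,000 is an assumption standing in for roughly mid-2021 Ethereum mainnet, not a figure from this thread.

```python
# Ethash DAG (full dataset) size as a function of block number.
# Constants follow the Ethash specification.
DATASET_BYTES_INIT = 2**30    # dataset size at genesis, before rounding
DATASET_BYTES_GROWTH = 2**23  # dataset growth per epoch
EPOCH_LENGTH = 30000          # blocks per epoch
MIX_BYTES = 128               # width of one DAG page


def is_prime(n: int) -> bool:
    """Trial-division primality test; fast enough for numbers of this size."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def dag_size_bytes(block_number: int) -> int:
    """Return the DAG size in bytes for the epoch containing block_number."""
    epoch = block_number // EPOCH_LENGTH
    size = DATASET_BYTES_INIT + DATASET_BYTES_GROWTH * epoch - MIX_BYTES
    # The spec shrinks the size until the number of 128-byte pages is prime.
    while not is_prime(size // MIX_BYTES):
        size -= 2 * MIX_BYTES
    return size


if __name__ == "__main__":
    # 12_500_000 is an assumed mid-2021 block height (not from the thread).
    for block in (100, 12_500_000):
        epoch = block // EPOCH_LENGTH
        gib = dag_size_bytes(block) / 2**30
        print(f"block {block}: epoch {epoch}, DAG {gib:.2f} GiB")
```

With these constants, block 100 falls in epoch 0 with a DAG just under 1 GiB, while a block around 12.5 million falls in epoch 416 with a DAG above 4 GiB, so "-M 100" exercises a much smaller dataset than live Ethereum mining did at the time.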

RhynarAI commented Jun 9, 2021

That's interesting. Something similar is happening with Maxwell cards: as the DAG size increases, the hashrate drops significantly. I tried this on a Titan X and a Tesla M40 (pretty much the same card) and got around 2 MH/s, depending on clock rates and the miner used (I tried different Linux and Windows implementations; it is pretty much the same across different miners).

The odd thing is this: I fired up almost every benchmark I could find (CUDA, Vulkan, OpenCL, gaming, compute, etc.) and compared the Maxwell 2.0 cards against newer generations. With over 3000 cores and 12 GB of VRAM, they easily compete with something like a 1070 in almost every scenario, and in some they even outperform the midrange Pascal cards. Running inference engines, it is no problem to get the speed of a 1060 or 1070.

But still, the performance in ETH mining drops to around 1:10 of Pascal or older GCN cards.
Since a K80 has two GPUs on board and, as far as I remember, quite a lot of CUDA cores (Maxwell reduced the core count per shader), the K80 should actually also get more than 2 MH/s, judging from the raw compute power.

axfwora commented Jun 9, 2021

I found that coins using Ethash with a smaller DAG size, such as DBIX, PGC, and UBQ, mine as expected on the K40 on both Windows and Linux: 13.00 MH/s at stock speeds, 17.83 MH/s overclocked to 1125 core. Some other Ethash coins, like MOAC, ETP, CLO, and PRKL, perform worse, with a hash rate of 6 - 8 MH/s, and overclocking makes little to no difference. All altcoins performed better than the 2 - 3 MH/s I get mining Ethereum. I started using PhoenixMiner due to the lack of support and development of ethminer. I just ordered 2 NVIDIA Tesla M40s to test and see if the results can be replicated. Again, the dev team is welcome to join the conversation.

Failed attempt: flashing the K40 to a Quadro K6000 made no difference to the hash rate, but after the flash the card works on PCIe risers, PCIe x1 slots, and mining motherboards (likely due to software limitations of the K40 drivers with VM).

Failed attempt: tried PhoenixMiner, old versions of ethminer, and compiling from source. No change.

Failed attempt: used CUDA 6.5, compiling from cpp-ethereum along with an old version of the NVIDIA drivers (350). No change, and it is hard to replicate because of endless issues compiling with CMake.

It might be a software limit from NVIDIA, an issue with the DAG as its size increases, or a bug in the software or the algorithm. I'm stuck at a crossroads as to why and where the issue is.

RhynarAI commented

Thanks for sharing, and cool that you tried compiling with older CUDA versions. Too bad this didn't work out. Did you manually set CUDA to 3_x for Kepler?

estagugo commented Jul 3, 2021

Hello to all.
I am starting a new project to unlock the Tesla hashrates. I once saw a K80 doing more than 20 MH/s.

So I got 3 K80s and have already tried overclocking and different drivers with no luck.
I am running Windows Server 2019 Datacenter on a TB360-BTC D+ with 16 GB RAM, an i5 9400 CPU, and a 2.4 kW PSU.

I got a steady temperature by using only the second GPU, with the memory clock overclocked to 3000, for the same 2 MH/s or so.

I think I have used up the hardware tweak options, so I am moving on to software.

I installed VS Enterprise and the CUDA toolkit to work on new miner code for Kepler GPUs.

Cavia commented Jul 4, 2021

Hello,

there is an 'urban legend' of people getting excellent hash rates by tweaking and re-flashing the BIOS of the K80.
According to some comments (on YouTube), some guy even ordered a large batch of K80s because of this BIOS technique.
The following long thread contains some useful info about tweaking the BIOS of a Tesla card, and a guy even posted a set of tweaked BIOS files for the K80.
https://blenderartists.org/t/tesla-k80-24gb-x3-rendering/1255786
Though, this is more related to rendering power than mining power.
I've attached the tweaked BIOS files (k80bios.zip) posted by the guy in the above thread. They are the exact same files you can obtain from www.techpowerup.com (as mentioned in the thread), but you need to search quite a while for the right pointer.

The two files are supposed to populate the two GPUs in the K80.

If anyone gets promising results with the BIOS technique, please share your knowledge here :)

RhynarAI commented Jul 4, 2021

Thanks, that's interesting.

Of course, tweaking the Kepler or Maxwell Teslas to a higher frequency or voltage can squeeze out a little more power, and in situations where the card doesn't switch to the C0 mode, setting it manually also helps. But all of this changes speeds by a few tens of percent at best.

The thing with the Ethereum hashrates is a totally different magnitude: even with all tweaks enabled, overclocked, water cooled, manually set to the highest frequency, etc., 3 MH/s is about the maximum possible. Compare that to smaller Pascal cards, which easily hit 10 MH/s and above without any special treatment.

It's often mentioned that with smaller DAG sizes, the speed of Maxwell and Kepler rises significantly. That's also where the rumors come from: some mining calculators still have numbers stored in their databases that are so old that people see 20 or 30 MH/s for a Titan X (Maxwell), etc. But these numbers date from when the DAG sizes were much smaller. Simulating smaller DAG sizes gives much higher numbers.

So there must be something going on, related either to the architectures themselves, the memory controller, or something else.

I've read about almost every possible benchmark, memory micro-benchmarks, etc., and nowhere is a discrepancy like that found. The biggest difference from Maxwell to Pascal when accessing a lot of memory was about a 3x speed change, not 10x like we encounter here.
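
One way to put numbers on that gap is the usual bandwidth-bound estimate for Ethash: per the Ethash specification each hash performs 64 DAG accesses of 128 bytes, so roughly 8 KiB of memory traffic per hash. The sketch below turns a memory bandwidth into an upper bound on hashrate. The 240 GB/s figure is an assumption (NVIDIA's published 480 GB/s aggregate for the K80, halved per GPU), not a measurement from this thread, and the bound ignores latency, caching, and address-translation effects.

```python
# Rough upper bound on Ethash hashrate from memory bandwidth alone.
ACCESSES = 64      # DAG accesses per hash (Ethash specification)
MIX_BYTES = 128    # bytes fetched per access (Ethash specification)
BYTES_PER_HASH = ACCESSES * MIX_BYTES  # 8192 bytes of DAG traffic per hash


def max_hashrate_mhs(bandwidth_gb_s: float) -> float:
    """Bandwidth-bound hashrate in MH/s for a bandwidth given in GB/s."""
    hashes_per_second = bandwidth_gb_s * 1e9 / BYTES_PER_HASH
    return hashes_per_second / 1e6


def bandwidth_efficiency(observed_mhs: float, bandwidth_gb_s: float) -> float:
    """Fraction of the bandwidth-bound hashrate that is actually achieved."""
    return observed_mhs / max_hashrate_mhs(bandwidth_gb_s)


if __name__ == "__main__":
    k80_per_gpu_gb_s = 240.0  # assumed spec-sheet value, see lead-in
    observed = 2.4            # MH/s per K80 GPU, as reported earlier in the thread
    bound = max_hashrate_mhs(k80_per_gpu_gb_s)
    share = bandwidth_efficiency(observed, k80_per_gpu_gb_s)
    print(f"bound: {bound:.1f} MH/s, observed share: {share:.1%}")
```

Under that assumption the bound is about 29 MH/s per K80 GPU, so the 2.4 MH/s reported earlier in the thread would be under 10% of what the memory bandwidth alone would allow. That is consistent with the observation that raw bandwidth benchmarks do not explain the drop.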

estagugo commented Jul 7, 2021

@Cavia Thank you,

Actually, that forum is quite good and I based my tweaks on those files. For mining, I can say:

  1. Using both GPUs, so far, ends in trouble (heating, drops, and never the same hashrate on both at the same time).
  2. With only the memory overclocked to 3000 you get the same results.
  3. No driver, OS, or overclock change gives a better hashrate.

Thank you for your reply.

estagugo commented Jul 7, 2021

@RhynarAI Thank you,

I can back up everything you said with my latest tests. I mined ETC, which has a smaller DAG, and yes, the hashrate doubled. So maybe I missed the time frame of the videos I watched, and I did not know any of this at the time, hahaha.

My only concern is this:

As I said in my last post, if I use only one GPU everything goes better; for example, DAG loading improves.

It seems clear that the architecture plays a part in that improvement. I tried to virtualize the K80 and mine simultaneously in several VMs, but it just splits the resources into equal shares for each VM.

So if it is an architectural problem, maybe changing the basic routines in the miner code could make use of all the computational power in there. Machine learning or rendering code, for example, could be a place to look for ideas.

Anyway, I will start with machine learning algorithms/procedures to see if it is viable.

Regards to all.
