GTX 1070 - Version 1.3.8 alpha neoscrypt not work #78

Closed
Kayfolom opened this issue Nov 24, 2017 · 54 comments

@Kayfolom

[17:11:24][0x00001518][info] Log started
[17:11:25][0x00001518][info] core | Found CUDA device: GeForce GTX 1070
[17:11:25][0x00001518][info] core | Found CUDA device: GeForce GTX 1070
[17:11:25][0x00001518][info] core | Found CUDA device: GeForce GTX 1070
[17:11:25][0x00001518][info] core | Found CUDA device: GeForce GTX 1070
[17:11:25][0x00001518][info] core | Found CUDA device: GeForce GTX 1070
[17:11:25][0x00001518][info] api | Listening on 0.0.0.0:33333
[17:11:25][0x00001518][info] http | Listening on 0.0.0.0:38080
[17:11:25][0x00001518][info] core | Initialized!
[17:11:26][0x00002300][info] net | Connecting to 169.50.175.203:3341 (neoscrypt.eu.nicehash.com)
[17:11:26][0x00002300][info] net | Connected!
[17:11:26][0x00002300][info] net | Authorized as 19M9g3vvfyDfUhdekwokAVanKdiTtSCmF2.node07
[17:11:26][0x00002300][info] algo-neoscrypt | New job_0 '000000096d78dbc0', diff=0.0078125
[17:11:28][0x00000774][info] wrkr1-1 | Algorithm: CUDA-neoscrypt parameters: B=1410
[17:11:29][0x00002b8c][info] wrkr3-3 | Algorithm: CUDA-neoscrypt parameters: B=1410
[17:11:29][0x00000d58][info] wrkr2-2 | Algorithm: CUDA-neoscrypt parameters: B=1410
[17:11:29][0x00002008][info] wrkr0-0 | Algorithm: CUDA-neoscrypt parameters: B=1410
[17:11:29][0x00002814][fatal] wrkr4-4 | CUDA error 'out of memory' in func 'cuda_neoscrypt::init' line 1258
[17:11:29][0x00001518][info] Shutting down
[17:11:30][0x00001518][info] api | Closing
[17:11:30][0x00001518][info] http | Closing

Overclocking disabled :
{"id":1,"method":"device.set.tdp","params": ["0","100"]},
{"id":1,"method":"device.set.core_delta","params": ["0","0"]},
{"id":1,"method":"device.set.memory_delta","params":["0","0"]},

@Kayfolom
Author

On a GTX 1080 the system crashes.
I won't try it on a GTX 1080 Ti for now - is it not enough that ...

@t4nja
Contributor

t4nja commented Nov 24, 2017

Are you running anything else on your cards besides excavator? That would explain the 'out of memory' error. You can try decreasing the B parameter; the amount of memory needed to run will decrease as well. Let me know how it goes.

Also, could you please post your config file here? Thanks.

@Kayfolom
Author

Only excavator is running; no other miners or applications.
[
  {"time":1,"commands":[
    {"id":1,"method":"algorithm.add","params":["neoscrypt","neoscrypt.eu.nicehash.com:3341","19M9g3vvfyDfUhdekwokAVanKdiTtSCmF2.node07"]}
  ]},
  {"time":2,"commands":[
    {"id":1,"method":"worker.add","params":["0","0"]},
    {"id":1,"method":"worker.add","params":["0","1"]},
    {"id":1,"method":"worker.add","params":["0","2"]},
    {"id":1,"method":"worker.add","params":["0","3"]},
    {"id":1,"method":"worker.add","params":["0","4"]}
  ]},
  {"time":5,"commands":[
    {"id":1,"method":"device.set.tdp","params": ["0","100"]},
    {"id":1,"method":"device.set.core_delta","params": ["0","0"]},
    {"id":1,"method":"device.set.memory_delta","params":["0","0"]},
    {"id":1,"method":"device.set.tdp","params": ["1","100"]},
    {"id":1,"method":"device.set.core_delta","params": ["1","0"]},
    {"id":1,"method":"device.set.memory_delta","params":["1","0"]},
    {"id":1,"method":"device.set.tdp","params": ["2","100"]},
    {"id":1,"method":"device.set.core_delta","params": ["2","0"]},
    {"id":1,"method":"device.set.memory_delta","params":["2","0"]},
    {"id":1,"method":"device.set.tdp","params": ["3","100"]},
    {"id":1,"method":"device.set.core_delta","params": ["3","0"]},
    {"id":1,"method":"device.set.memory_delta","params":["3","0"]},
    {"id":1,"method":"device.set.tdp","params": ["4","100"]},
    {"id":1,"method":"device.set.core_delta","params": ["4","0"]},
    {"id":1,"method":"device.set.memory_delta","params":["4","0"]}
  ]},
  {"time":10,"commands":[
    {"id":1,"method":"worker.reset","params":["0"]},
    {"id":1,"method":"worker.reset","params":["1"]},
    {"id":1,"method":"worker.reset","params":["2"]},
    {"id":1,"method":"worker.reset","params":["3"]},
    {"id":1,"method":"worker.reset","params":["4"]},
    {"id":1,"method":"worker.reset","params":["5"]},
    {"id":1,"method":"worker.reset","params":["6"]},
    {"id":1,"method":"worker.reset","params":["7"]},
    {"id":1,"method":"worker.reset","params":["8"]},
    {"id":1,"method":"worker.reset","params":["9"]}
  ]},
  {"time":15,"loop":20,"commands":[
    {"id":1,"method":"worker.print.speed","params":["0"]},
    {"id":1,"method":"worker.print.speed","params":["1"]},
    {"id":1,"method":"worker.print.speed","params":["2"]},
    {"id":1,"method":"worker.print.speed","params":["3"]},
    {"id":1,"method":"worker.print.speed","params":["4"]},
    {"id":1,"method":"worker.print.speed","params":["5"]},
    {"id":1,"method":"worker.print.speed","params":["6"]},
    {"id":1,"method":"worker.print.speed","params":["7"]},
    {"id":1,"method":"worker.print.speed","params":["8"]},
    {"id":1,"method":"worker.print.speed","params":["9"]},
    {"id":1,"method":"algorithm.print.speeds","params":["0"]}
  ]},
  {"event":"on_quit","commands":[
    {"id":1,"method":"device.set.tdp","params": ["0","100"]},
    {"id":1,"method":"device.set.core_delta","params": ["0","0"]},
    {"id":1,"method":"device.set.memory_delta","params":["0","0"]},
    {"id":1,"method":"device.set.tdp","params": ["1","100"]},
    {"id":1,"method":"device.set.core_delta","params": ["1","0"]},
    {"id":1,"method":"device.set.memory_delta","params":["1","0"]},
    {"id":1,"method":"device.set.tdp","params": ["2","100"]},
    {"id":1,"method":"device.set.core_delta","params": ["2","0"]},
    {"id":1,"method":"device.set.memory_delta","params":["2","0"]},
    {"id":1,"method":"device.set.tdp","params": ["3","100"]},
    {"id":1,"method":"device.set.core_delta","params": ["3","0"]},
    {"id":1,"method":"device.set.memory_delta","params":["3","0"]},
    {"id":1,"method":"device.set.tdp","params": ["4","100"]},
    {"id":1,"method":"device.set.core_delta","params": ["4","0"]},
    {"id":1,"method":"device.set.memory_delta","params":["4","0"]}
  ]}
]

It only ran stably with the parameters "worker.add","params":["0","0","B=1230"]

There is also very high system RAM usage (not GPU RAM): 500 megabytes when the miner starts. After startup, the memory used drops by about 1 megabyte per second. After several minutes, the memory used was 216 megabytes.

@t4nja
Contributor

t4nja commented Nov 24, 2017

This doesn't seem ok. Which version of drivers are you using? I will try to reproduce the problem on my machine.

FYI: you're resetting workers that don't even exist. You have 5 workers (not 10).
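For anyone who edits these command files by hand, a minimal sketch of generating one so that the worker.add, worker.reset and worker.print.speed entries always agree. It only reuses the method names and parameter order visible in the config posted above; the helper name build_config, the placeholder user string, and the assumption that worker IDs are handed out in creation order (0, 1, 2, ...) are illustrative, not taken from the excavator documentation.

```python
import json


def build_config(device_ids, algorithm, pool, user, block_param=None):
    """Build an excavator-style command list with one worker per device."""
    devices = [str(d) for d in device_ids]
    # Assumption: worker IDs follow creation order, so N devices -> workers 0..N-1.
    workers = [str(i) for i in range(len(devices))]

    def cmd(method, *params):
        return {"id": 1, "method": method, "params": list(params)}

    def add_worker(device):
        params = ["0", device]  # algorithm 0, then the device ID, as in the posted config
        if block_param is not None:
            params.append(f"B={block_param}")  # optional, e.g. B=1230
        return cmd("worker.add", *params)

    return [
        {"time": 1, "commands": [cmd("algorithm.add", algorithm, pool, user)]},
        {"time": 2, "commands": [add_worker(d) for d in devices]},
        {"time": 10, "commands": [cmd("worker.reset", w) for w in workers]},
        {"time": 15, "loop": 20, "commands":
            [cmd("worker.print.speed", w) for w in workers]
            + [cmd("algorithm.print.speeds", "0")]},
    ]


config = build_config(
    range(5),
    "neoscrypt",
    "neoscrypt.eu.nicehash.com:3341",
    "YOUR_BTC_ADDRESS.worker",  # placeholder, replace with your own
    block_param=1230,
)
print(json.dumps(config, indent=2))
```

Because every list is derived from the same device list, changing the number of cards cannot leave stale reset or print entries behind.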

@Kayfolom
Author

Kayfolom commented Nov 24, 2017

Win 10 Pro 64bit, nvidia driver 382

FYI: you're resetting workers that don't even exist. You have 5 workers (not 10).

Yes, I forgot ... I copied the config from another file, fixed the worker.add entries, and forgot the worker.reset ones ... But that could not have affected anything - excavator is smart enough to cope with resets of non-existent workers ;-)

@Kayfolom
Author

Tried a GTX 1080 Ti - all fine:

21:01:37][0x000006a4][info] core | Algorithm 'neoscrypt' total speed: 1.111407 MH/s
21:01:43][0x00000870][info] net | Share #10 accepted
21:01:45][0x00000870][info] algo-neoscrypt | New job_0 '0000000c8bebcf44', diff=0.00390625
21:01:46][0x000006a4][info] net | Share #11 accepted
21:01:57][0x000006a4][info] core | Device #0-0 speed: 1.101545 MH/s
21:01:57][0x000006a4][info] core | Algorithm 'neoscrypt' total speed: 1.106978 MH/s
21:02:17][0x00000870][info] core | Device #0-0 speed: 1.101532 MH/s
21:02:17][0x00000870][info] core | Algorithm 'neoscrypt' total speed: 1.104563 MH/s
21:02:34][0x00000870][info] net | Share #12 accepted
21:02:37][0x000006a4][info] core | Device #0-0 speed: 1.100956 MH/s
21:02:37][0x000006a4][info] core | Algorithm 'neoscrypt' total speed: 1.103118 MH/s
21:02:42][0x000006a4][info] algo-neoscrypt | New job_0 '0000000c8bebe7bf', diff=0.00390625

@UselessGuru

Tried a GTX 1080 Ti - all fine.
But unfortunately nowhere near as fast as Ccminer-KlausT

@t4nja
Contributor

t4nja commented Nov 26, 2017

@UselessGuru We're still working on optimisations, especially for 1080 and 1080 ti cards. We'll release an update soon.

@kgiedrius

Works fine for me on 1070 cards with the newest NVIDIA drivers.

@ebbbang

ebbbang commented Dec 3, 2017

Not working for me either, on both my rigs ...
Same issue as described in the first post ...

@t4nja t4nja added the bug label Dec 3, 2017
@t4nja
Contributor

t4nja commented Dec 3, 2017

@ebbbang driver version? Which cards do you have?

@obit8

obit8 commented Dec 3, 2017

Good evening guys, I sometimes have the same problem.

This is my configuration:
ASUS B250 MINING EXPERT Intel B250 LGA 1151 (Socket H4) ATX motherboard - motherboards (DDR4-SDRAM, DIMM, 2133,2400 MHz, Dual, 32 GB, Intel)

18 x - GTX 1060 3gb MSI gamingx
1 x - CPU intel Pentium G4400 LGA1151
3 x - Power supply Cooler Master V1000
1 x - 16gb DDR4 2400mhz Hyper x fury
1 x - 120gb ssd (kingston adata)
1 x - custom rack mining

I'm running only 11 GPUs and 2 power supplies, because the motherboard driver doesn't allow 18 cards of the same kind for now.

I've attached an image of the error I sometimes get, roughly once every 340-400 accepted shares (that is the only measure I have to give you an idea of how often the error occurs).

Is it OK to get this error sometimes, or not?
And is my RAM enough, or do I need more?
[Screenshot: 2017-12-03 at 20:49:23]

@t4nja
Contributor

t4nja commented Dec 4, 2017

@obit8 It's definitely not ok to get this error. We're working on it. Thanks for the info.

@obit8

obit8 commented Dec 4, 2017

@dropky my card is a 1060 msi nvidia.

Sometimes this error shows up.

@t4nja
Contributor

t4nja commented Dec 4, 2017

@obit8 driver version?

@kgiedrius

Couldn't the problem be that the algorithms sometimes use more than 5GB of GPU memory and your card has only 3GB?

@kgiedrius

I personally use excavator on Ubuntu with GTX 1070 cards and have never seen similar issues.

@t4nja
Contributor

t4nja commented Dec 4, 2017

@kgiedrius Neoscrypt requires quite a lot of memory: on 1070 cards it's 5GB, on other cards it's less than that (it usually allocates around 70% of the GPU's memory). The CUDA 'out of memory' error happens because at that moment there isn't enough free memory on the GPU; the system might be using it (primary cards) or something hasn't been freed completely yet (nhm2 switching). We're working on a workaround: if there isn't enough free memory, the block parameter will be modified (the intensity will be decreased a bit).

Note for everyone:
Make sure you're running only one worker per card!
In some cases this problem was solved by updating the Windows drivers.
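Until that workaround lands, the idea can be approximated by hand. A rough sketch, not excavator's actual logic: it assumes memory use grows linearly with B and calibrates that on the two figures from this thread (B=1410 in the first log, about 5GB on a 1070, taken here as 5120 MiB). The free-memory readings are assumed to come from `nvidia-smi --query-gpu=index,memory.free --format=csv,noheader,nounits`; the function names and the sample readings are made up for illustration.

```python
def parse_free_mib(csv_text):
    """Parse 'index, memory.free' rows into {gpu_index: free_mib}."""
    free = {}
    for row in csv_text.strip().splitlines():
        index, mib = (field.strip() for field in row.split(","))
        free[int(index)] = int(mib)
    return free


# Hypothetical linear model: B=1410 is taken to need about 5120 MiB.
MIB_PER_BLOCK = 5 * 1024 / 1410


def fit_block_param(free_mib, default_b=1410, mib_per_block=MIB_PER_BLOCK):
    """Return the largest B <= default_b whose estimated footprint fits."""
    max_b = int(free_mib // mib_per_block)
    if max_b < 1:
        raise MemoryError(f"only {free_mib} MiB free, cannot fit any block")
    return min(default_b, max_b)


sample = "0, 4500\n1, 8000\n"  # made-up readings, not from the thread
for gpu, free_mib in parse_free_mib(sample).items():
    print(f"GPU {gpu}: {free_mib} MiB free -> B={fit_block_param(free_mib)}")
```

The resulting value would go into the third worker.add parameter (for example "B=1230") for the card that is short on memory, typically the primary one.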

@obit8

obit8 commented Dec 4, 2017

@dropky Driver version 388.4

However, I now see that excavator restarts frequently, even without any CUDA memory error!

Is there something I can do to solve this?

I'm using NiceHash Miner v2.0.1.5 BETA

@obit8

obit8 commented Dec 5, 2017

@Kayfolom I've disabled neoscrypt in the config file!

It's just annoying that you need to edit every single GPU file! But in the end excavator started working without errors, mining "Lyra2Rev2"...

Today it worked well for me for almost 10 hours with CryptoNight and Equihash... then I saw 0 workers online, and when I checked I saw that it kept giving me the CUDA memory error while trying to mine neoscrypt!
[Screenshot: 2017-12-05 at 17:43:43]

@OlArMiners

OlArMiners commented Dec 5, 2017

Same problem....
1070 nicehash
8x1070 Palit

8 Gb RAM
32Gb Swap
Driver 385.41

When switching to "neoscrypt" it hits the 'out of memory' error and restarts endlessly.

So I rebooted... and it is working fine with the 'equihash' algorithm... for now...

@Pititul

Pititul commented Dec 7, 2017

Got the same 'out of memory'.

@obit8

obit8 commented Dec 7, 2017 via email

@hapklaar

Is this issue still on the radar now that NiceHash is down? I have the same memory errors on my dual GTX 1070 setup.

@t4nja
Contributor

t4nja commented Dec 21, 2017

@hapklaar Yes :)

@nlaciii

nlaciii commented Dec 21, 2017

I assume this is not a common scenario yet, but I have a rig with 13x GTX 1070 and 4x P106-100 Nvidia GPUs. Before Dec 5 it was running only the 13 1070s, and with that config NHM2 and excavator were running perfectly stable. Now with the P106-100s (6GB) I'm getting this same CUDA 'out of memory' error, plus some other crashes without any noticeable error message.
On my other rigs with 4x 1070 and 4x 1060 (or P106-100) 6GB, everything is running fine.

Win10 1709, Nvidia driver 388.43. In all fairness, all ccminer incarnations also have the same problem with neoscrypt and lyra2rev2.
EWBF, DSTM, and Claymore's eth miners are perfectly stable running on all the GPUs.
I can provide you logs, or even access to the rig for testing if needed.

@diwu1989

One of my miners just hit this issue too; I had to restart NiceHash in the morning and missed out on 6 hours of mining.

Please do one of the following:

  • investigate and fix this memory bug
  • take neoscrypt off Nvidia cards for now until it's stable
  • add code to detect the memory failure and auto-restart in that case, instead of staying stuck in the failed state
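For the third option, detection can be done from outside the miner by watching its log. A minimal sketch, assuming the log line layout shown in the first post ([time][thread][level] followed by the message); the function names are hypothetical, and what to do after a hit (restarting the miner, lowering B, disabling neoscrypt on that card) is deliberately left to the caller.

```python
import re

# Layout assumed from the excavator log in the first post:
# [HH:MM:SS][0xTHREAD][level] component | message
LOG_LINE = re.compile(
    r"\[(?P<time>\d{2}:\d{2}:\d{2})\]"
    r"\[(?P<thread>0x[0-9a-fA-F]+)\]"
    r"\[(?P<level>\w+)\]\s*(?P<message>.*)"
)


def is_cuda_oom(line):
    """True if the line is a fatal entry reporting CUDA 'out of memory'."""
    match = LOG_LINE.match(line.strip())
    return (match is not None
            and match.group("level") == "fatal"
            and "out of memory" in match.group("message"))


def first_oom(lines):
    """Return the first out-of-memory line, or None if there is none."""
    for line in lines:
        if is_cuda_oom(line):
            return line.strip()
    return None


log = [
    "[17:11:29][0x00002008][info] wrkr0-0 | Algorithm: CUDA-neoscrypt parameters: B=1410",
    "[17:11:29][0x00002814][fatal] wrkr4-4 | CUDA error 'out of memory' in func 'cuda_neoscrypt::init' line 1258",
    "[17:11:29][0x00001518][info] Shutting down",
]
print(first_oom(log))
```

A wrapper script could feed the miner's output through first_oom and trigger whatever recovery step suits the rig.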

@jlavyan

jlavyan commented Dec 26, 2017

I have the same error.
RAM 8GB, Virtual memory 20GB - 25GB, Windows 10
Nicehash 2.0.1.5

neoscrypt 'out of memory'
9 X Nvidia 1070

@bviktor

bviktor commented Dec 29, 2017

To me it happens on a 1080 too. Even if I run nothing else, and simply try to start a single excavator on a single GPU. And it fails right away. Seems like a problem with parameters or extremely screwed up memory allocation, if I had to guess.

@Tottom

Tottom commented Dec 30, 2017

Hi,
Same issue for me on 6x GeForce GTX 1070 and 2x 1070 Ti, CUDA 9.1.85, Nvidia 388.71
CUDA error 'out of memory' in func 'cuda_neoscrypt::init' line 1258

@SergeyKG

SergeyKG commented Jan 1, 2018

Hi all and Happy New Year!!!
I have a problem running NiceHash Miner 2.0.1.5 with 11x 1080 Ti at 65% +50 +100.
Only lyra2rev2 runs stably; any other algo runs for 15-30 seconds and then excavator restarts. I turned off all algos except lyra2rev2 to keep NiceHash Miner stable. I tried 65% +0 +0 - same issue. If I run the same algos with ccminer there are no crashes - it works fine.

@Tottom

Tottom commented Jan 1, 2018

Hi.
How much physical memory and virtual (pagefile) memory do you have if you are running Windows?

Also refer to this thread.
Nicehash v2.0.1.3 out of memory Issue still "nicehash/NiceHashMiner2"
sorry, not sure how to reference it...

Seems there might also be a fix on the way.

@SergeyKG

SergeyKG commented Jan 1, 2018

Hi Tottom!

I have 8GB RAM and 40GB virtual memory.

@SergeyKG

SergeyKG commented Jan 1, 2018

I just checked the RAM load under Lyra2rev2 and Equihash and saw an interesting effect: Lyra2rev2 loads the RAM up to 48% (3.8GB) and works OK. Equihash runs OK for several seconds until the RAM load reaches 50%+ (4GB), and then Excavator restarts. I raised virtual memory to 80GB, but unfortunately Excavator continues to restart.

@M1r00

M1r00 commented Jan 1, 2018

Hi, I am experiencing the same issue of running out of memory, especially with the neoscrypt algorithm, on one of my rigs.
I have two rigs with 6 * GTX 1070 each.
1st rig: 1 Aorus + 5 * G1
MSI H270 A pro with latest bios
12GB ram
CPU G3950 (7th gen)
2* cooler master V750
primary graphic card is on the intel HD 630
Windows 10 x64 pro ver 1709
nvidia driver 384.76

The second rig is the exact same except for
Strix gpu instead of G1
CPU i5 6400

OC settings 70% / +100/ +700

The first rig works perfectly without any issue, even with neoscrypt (except that if I run all 6 GPUs on neoscrypt it crashes after a few minutes). The second rig (the Strix one) stops working on neoscrypt with NiceHash 2.0.1.5 with only two GPUs mining neoscrypt.

I will try to clone the G1 rig's OS and run the Strix rig with it, to see whether the problem is software related, because it seems weird that the two rigs aren't behaving the same way.

@Tottom

Tottom commented Jan 1, 2018

So the 11x 1080 Ti cards have 11GB of memory each?

My 8x 1070 have 8GB each. With 64GB of virtual memory (pagefile) it was still giving issues.
After changing it to 96GB it has been stable for almost 32 hours now.
Less might have been enough, but it was easy to set and I don't have to worry about the failure again.

I have now also restricted mining to Lyra2rev2, Equihash and Blake2s only.
Those 3 give me the best profit.

@SergeyKG

SergeyKG commented Jan 1, 2018

Understood your point! Thanks a lot!
Unfortunately my SSD has only 120GB. To map 11x11GB I need at least 121GB and it means that I need to change SSD.
You are right 11x 1080ti has 11GB each.
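The sizing rule being used in this exchange (virtual memory at least equal to the combined GPU memory) is a rule of thumb from this thread, not something confirmed by the excavator team, and the 64GB case above shows it is not always sufficient. The arithmetic itself, as a small sketch with a hypothetical helper:

```python
def total_gpu_memory_gb(cards):
    """Sum GPU memory for a rig given (count, memory_gb_per_card) pairs."""
    return sum(count * memory_gb for count, memory_gb in cards)


rig_1080ti = [(11, 11)]  # 11x 1080 Ti with 11GB each
rig_1070 = [(8, 8)]      # 8x 1070 with 8GB each

print(total_gpu_memory_gb(rig_1080ti))  # 121
print(total_gpu_memory_gb(rig_1070))    # 64
```

For mixed rigs, pass one pair per card type, for example [(13, 8), (4, 6)].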

@Tottom

Tottom commented Jan 2, 2018

How long before excavator restarts? Maybe you can keep the 80GB of virtual memory and enable 6 cards, then 7, and then more, to see how long it takes before it becomes unstable? Last night I started seeing excavator restart, so it seems the virtual memory is not being freed efficiently. When I restarted the rig, excavator started and has been running since without the same issue, so I guess I have to restart the rig every 24 to 30 hours...

This will help to determine if you really need an additional SSD for just the Pagefile.

@Tottom

Tottom commented Jan 2, 2018

M1r00, how much virtual memory have you set for each of your rigs, since you have 2 rigs with 6x 1070? I have 8x 1070 (2 of them Ti); I tried 64GB of virtual memory and it still gave me 'out of memory'... So I changed it to 80GB and it was stable for just over 32 hours. If you are on 16GB or 32GB of virtual memory, make it 48GB or 64GB and see how it runs then.

@obit8

obit8 commented Jan 2, 2018

In my case, if I disable all the algos except one, the miner is stable; if I enable 2 or more, the miner keeps restarting!

I have 13x 1060...

Now I'm only mining Lyra2REv2, and when I see it on WhatToMine I switch to Equihash!!!

@Tottom

Tottom commented Jan 2, 2018

Here is my current config. I'm running Lyra2rev2, NeoScrypt, EquiHash and Blake2b, depending on the profitability, as you mentioned, on WhatToMine. Maybe it is just my rig, but this works for me. If I add 2 more cards, I might run into the same issue again, or maybe not.

[Screenshots: miner1, miner2, miner3]

@SergeyKG

SergeyKG commented Jan 2, 2018

Before I got to 11x 1080 Ti I ran 7 of them, and Excavator was stable (several days without issues) on any algo, even with 2-3 algos at the same time.
With 11 cards it restarts within 30-60 seconds (it is stable only on Lyra2rev2).
Later today I will try to expand virtual memory to 256GB and see what happens.

@Tottom

Tottom commented Jan 2, 2018

I don't think you should set it to 256GB right now.
Try 128GB, then go up in 32GB steps until it is stable.
Good luck!

@M1r00

M1r00 commented Jan 2, 2018

I have set 20GB for each. My SSD is 120GB and still has 30GB free, so I assume I can go to 40GB. But my G1 rig is rock stable with 20GB, unlike the Strix one.

@Tottom

Tottom commented Jan 2, 2018

It might be some app config, perhaps. What version of excavator are you running?
And which Nvidia driver? Try 32GB on the rig that has the problem, monitor it and report back?

@M1r00

M1r00 commented Jan 2, 2018

Using nicehash 2.0.1.5...tried multiple drivers 384.76 then 388.31....the G1 is running stable with the 384.76 ....will post screenshots when @ home

@SergeyKG

SergeyKG commented Jan 2, 2018

I tried 128GB and then went up in +32GB steps to 244GB. Unfortunately only Lyra2rev2 stays stable; on Equihash, Excavator is still restarting.

@Tottom

Tottom commented Jan 2, 2018

That is unfortunate.
Not sure then why it is failing still... if not memory then power / driver / excavator?

I am running nVidia driver: 388.71

Excavator_Server 1.38 - https://github.com/nicehash/excavator/releases
I downloaded it and pasted it in my own NH2 folder.

Apart from the above, i am at a loss why you are experiencing the issue :(

@SergeyKG

SergeyKG commented Jan 3, 2018

I'm running the same driver and Excavator versions - will keep on digging :)
Thanks for support!

@reproteq

reproteq commented Jan 10, 2018

I had the same problem when I was mining neoscrypt and solved it like this:
Format, then reinstall all the NVIDIA drivers, the Gigabyte OC software, and the latest NiceHash version.
In the Gigabyte OC software, set all GPUs to -20% power.
Open NiceHash and run a normal benchmark.
Now everything works perfectly; I can mine neoscrypt without CUDA errors.

Solved:
CUDA error 'out of memory' in func 'cuda_neoscrypt::init' line 1258

Do not try to cheat the benchmark by running the test with one configuration and then changing it for mining.

@t4nja
Contributor

t4nja commented Jan 11, 2018

Increasing virtual memory size should solve the problem. #104

@t4nja
Contributor

t4nja commented Jan 15, 2018

So I can finally reproduce this issue. When running standalone excavator everything works fine, but if NHM is running on the system (it doesn't have to be mining, it's enough that it's open), then the 'out of memory' crash occurs on the primary GPU. To avoid such issues, I suggest you disable NeoScrypt on your primary card (only the primary GPU is problematic, at least in my case) until this bug is fixed by the NHM team. Of course, the crash can still happen when using standalone Excavator, since GPU-intensive third-party applications can crash the primary card as well; in that case just do the same - don't run neoscrypt on your primary card.

If this issue persists in the future, we'll work on a different implementation of NeoScrypt (one that won't need as much memory as the current one).

I'm closing this (will reopen if needed), thank you all for your feedback! Keep on mining :)

Btw, we released new version of excavator today - https://github.com/nicehash/excavator/releases/tag/v1.3.9a

@MossP

MossP commented Mar 2, 2018

@dropky I have started seeing this error on v2.0.1.11 even though my primary card is my separate onboard graphics card. Has something changed in the latest build?

Edit: I've not yet tried the virtual memory increase. I'm about to try it now, but thought I would mention the issue.

@MossP

MossP commented Mar 2, 2018

@dropky Apologies, it appears that my primary card has switched to one of my mining cards. I'll apply the fixes suggested and try again. Thanks for the info.
