This repository has been archived by the owner on Apr 24, 2022. It is now read-only.

the pool is regularly disconnected #1539

Closed
Zaqsnaider opened this issue Sep 4, 2018 · 22 comments

Comments

@Zaqsnaider

Hello.
I'm using ethminer 0.16.0rc1 on Windows 10, and I have a problem with regular disconnects and rejects when mining ETP on dodopool or metaverse.farm. I run the following bat file:

@echo off
timeout /t 45
:bg
ethminer -U -P stratum://WALLET.NAME:x@nl.metaverse.farm:3002 --exit --tstop 72 --tstart 45 --noeval
timeout /t 10
goto :bg

[Screenshot: default]

Thanks.

@AndreaLanfranchi
Collaborator

Actually this problem has nothing to do with ethminer.
You see "Connection remotely closed by" ... ?

This can be caused by any of the following:

  1. Poor internet connection
  2. No static public IP address on your internet connection (your IP changes)
  3. Aggressive behavior of your router/firewall, which prematurely drops connections it considers idle
  4. A problem on the pool's side that abruptly closes connections

@Zaqsnaider
Author

Thank you for your prompt reply. I have been mining for a long time. I like ethminer very much and would like to keep using it. I would not ask otherwise, but another miner gets only about 4 rejects per 2000 solutions, and the hashrate reported by the pool matches. Ethminer, meanwhile, gets regular disconnects and dozens of rejects: about 20 per 2000 solutions.
Regarding your points:

  1. The internet connection was not lost.
  2. I have a static IP address, so that is fine too.
  3. The behavior of the other miner rules this out.
  4. I tried three pools (dodo, sand and farm) and saw the same behavior everywhere, while on ETC and ETH this did not happen. I started mining ETP only recently.

Later I will send a screenshot of another miner running on the same machine, where everything is fine.
What do you advise to narrow down the cause of the problem? I can produce a debug log if you explain how.
Thanks.

@AndreaLanfranchi
Collaborator

The cause of the rejects has to be investigated.
I see you have set the CLI argument --noeval, which prevents the CPU from re-validating solutions found by the GPU.
I'd suggest removing it temporarily and running a batch of 4~6 hours.
If you see several messages about a GPU giving incorrect results, then it's likely you're overclocking too much. Please note that OC values are not universal: what is fine for another miner might be too much for ethminer or vice versa (all miners implement very different ways to invoke GPU work).
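
To illustrate what the evaluation step skipped by --noeval does conceptually, here is a minimal Python sketch. It is not ethminer's code: SHA-256 stands in for Ethash, and the names toy_hash, cpu_revalidate and submit_if_valid are hypothetical, chosen only for this example.

```python
import hashlib


def toy_hash(header: bytes, nonce: int) -> int:
    """Stand-in for the real proof-of-work hash (assumption: SHA-256, NOT Ethash)."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def cpu_revalidate(header: bytes, nonce: int, target: int) -> bool:
    """Recompute the hash on the CPU and compare it against the target."""
    return toy_hash(header, nonce) <= target


def submit_if_valid(header: bytes, nonce: int, target: int, evaluate: bool = True) -> str:
    """Return 'submitted' or 'discarded'.

    evaluate=False mimics skipping the CPU check: a wrong result coming from
    an unstable GPU would then go to the pool and come back as a reject.
    """
    if evaluate and not cpu_revalidate(header, nonce, target):
        return "discarded"
    return "submitted"
```

With the check enabled, a bad nonce is caught locally and reported; with it disabled, the same nonce is only noticed when the pool rejects it.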

@Zaqsnaider
Author

Here is a screenshot of another miner; for a few hours everything is fine.
[Screenshot: phoenix]

Then I immediately launched ethminer, and the problems are visible right away. Overclocking and --noeval were removed completely.
[Screenshots: ethmine_1, ethmine_2, ethmine_3, ethmine_4]

Is it possible that the pool treats ethminer, or something it does, as an attack and disconnects?

@AndreaLanfranchi
Collaborator

> Is it possible that the pool treats ethminer, or something it does, as an attack and disconnects?

Highly unlikely.
I will try to run a batch now.

@AndreaLanfranchi
Collaborator

I've run a batch of 30 minutes and got several disconnections too.
Specifically, every time a stale solution gets submitted, the remote end (the pool) drops the connection.
On top of that, the submission time varies greatly, from 60 ms to 800 ms.

I'd suggest getting in touch with the pool devs to inspect the problem.
Ethminer works smoothly on other pools.

@Zaqsnaider
Author

This is the first such strange case I have seen in a long time. I am very grateful for your help. I'll contact pool support.

@Zaqsnaider
Author

Another farm (4588+1578) shows record stability.
ethminer 0.16.0rc1 and win10
ethminer -G -P stratum://WALLET.Name:x@nl.metaverse.farm:3004 --exit --tstop 72 --tstart 40
[Screenshot: extra]

@Zaqsnaider
Author

Sorry - 4rx588+1rx578

@Zaqsnaider
Author

I should add that the other farms with this problem have only Nvidia cards.

@ddobreff
Collaborator

ddobreff commented Sep 5, 2018

This happens when the pool doesn't send jobs often enough; usually small pools cause these disconnects. I switched to a smaller pool on purpose and experienced the same behaviour. What Andrea Lanfranchi mentioned about "--response-timeout" did the trick: depending on how often the pool sends jobs, you need to increase the value from 2 s up to 30 s, or even more.

[2018-09-05 11:10:10][info][miner]:  X 11:10:08 stratum  No response received in 10 seconds.
[2018-09-05 11:10:10][info][miner]:  i 11:10:08 main     Disconnected from eu-eth.hiveon.net [3.120.72.4:4444]
[2018-09-05 11:10:10][info][miner]:  i 11:10:09 main     Suspend mining due connection change...
[2018-09-05 11:10:10][info][miner]:  i 11:10:09 main     No more connections to try. Exiting...
[2018-09-05 11:10:10][info][miner]:  i 11:10:09 main     Shutting down miners...
[2018-09-05 11:10:10][info][miner]:  i 11:10:09 cuda-1   No work. Pause for 3 s.
[2018-09-05 11:10:10][info][miner]:  i 11:10:09 cuda-7   No work. Pause for 3 s.
[2018-09-05 11:10:10][info][miner]:  i 11:10:09 cuda-5   No work. Pause for 3 s.
[2018-09-05 11:10:10][info][miner]:  i 11:10:09 cuda-4   No work. Pause for 3 s.
[2018-09-05 11:10:10][info][miner]:  i 11:10:09 cuda-3   No work. Pause for 3 s.
[2018-09-05 11:10:10][info][miner]:  i 11:10:09 cuda-6   No work. Pause for 3 s.
[2018-09-05 11:10:10][info][miner]:  i 11:10:09 cuda-2   No work. Pause for 3 s.
[2018-09-05 11:10:13][info][miner]:  i 11:10:12 cuda-7   No work. Pause for 3 s.
[2018-09-05 11:10:13][info][miner]:  i 11:10:12 cuda-5   No work. Pause for 3 s.
[2018-09-05 11:10:13][info][miner]:  i 11:10:12 cuda-4   No work. Pause for 3 s.
[2018-09-05 11:10:13][info][miner]:  i 11:10:12 cuda-3   No work. Pause for 3 s.
[2018-09-05 11:10:13][info][miner]:  i 11:10:12 cuda-6   No work. Pause for 3 s.
[2018-09-05 11:10:13][info][miner]:  i 11:10:12 cuda-2   No work. Pause for 3 s.
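
A quick way to check whether a log contains local response timeouts like the one above is to scan it for that message. The sketch below is a hypothetical helper, not part of ethminer: the regular expression is modelled only on the log lines quoted in this comment, so other builds or log wrappers may need a different pattern, and find_response_timeouts is a made-up name.

```python
import re

# Assumption: the message format matches the log excerpt quoted in this thread.
TIMEOUT_RE = re.compile(
    r"(\d{2}:\d{2}:\d{2}) stratum\s+No response received in (\d+) seconds"
)


def find_response_timeouts(lines):
    """Return a (time, seconds) tuple for each local response-timeout message."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            hits.append((match.group(1), int(match.group(2))))
    return hits


sample = [
    "[2018-09-05 11:10:10][info][miner]:  X 11:10:08 stratum  No response received in 10 seconds.",
    "[2018-09-05 11:10:10][info][miner]:  i 11:10:08 main     Disconnected from eu-eth.hiveon.net [3.120.72.4:4444]",
]
print(find_response_timeouts(sample))  # [('11:10:08', 10)]
```

If this returns hits right before each disconnect, the client-side timer is the trigger and raising --response-timeout is worth trying; if it returns nothing, the disconnect has another origin.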

@Zaqsnaider
Author

With this, the disconnects from the pool are gone.
ethminer -U -P stratum://Wallet.Name@nl.metaverse.farm:3002 --noeval --report-hashrate --response-timeout 60 --work-timeout 300

@chfast
Contributor

chfast commented Sep 6, 2018

Two comments:

  1. If we drop the connection because of inactivity, maybe the log message should be other than "Connection remotely closed".
  2. Can we increase the default value for --work-timeout?

@jean-m-cyr
Contributor

jean-m-cyr commented Sep 7, 2018

Why would changing --response-timeout or --work-timeout cause a "connection remotely closed" to disappear? I think that error message comes from the stack, so I don't see how client-side timeouts would be the cause!

Or does boost::asio::error::eof not necessarily mean "connection remotely closed"? It seems it does: "An error code of boost::asio::error::eof indicates that the connection was closed by the peer."

Just checked: either of the --response-timeout or --work-timeout timers would have issued a specific error log on expiring.

@chfast chfast added the bug label Sep 7, 2018
@chfast
Contributor

chfast commented Sep 7, 2018

@jean-m-cyr The issue has been resolved by increasing some timeouts, see #1539 (comment).

I'm guessing that when one of the timeouts is hit, ethminer disconnects from the pool. If I'm right, then the log message "connection remotely closed" is incorrect and users don't have a clue how to solve the problem. It would be much easier if the log said "disconnected from pool because of inactivity (response timeout)".

I also wonder whether increasing the default values has any impact on the overall performance. If not, we could increase them to make ethminer work with default values in this case.

@AndreaLanfranchi
Collaborator

I do not agree.
Actually I see in the thread a couple of misunderstandings.

  1. Connection remotely closed by ... has nothing to do with any timeout value we can set. It is caught by a very specific condition, if (ec == boost::asio::error::eof), which clearly states that the remote party has sent an "End of transmission".

  2. Every time we hit a timeout we output the proper log message, like No response in ...

I strongly believe the problem described in this thread is strictly related to a weakness of the pool, which (I am only guessing) has suboptimal load balancing or a faulty pool implementation.

Tests on A-rated pools have never shown such a situation.

@AndreaLanfranchi
Collaborator

The increase of timeouts, IMHO, has produced some effect only by coincidence.

@AndreaLanfranchi
Collaborator

As @jean-m-cyr correctly pointed out, if the disconnection had been on our side, boost would have returned the "Operation Aborted" error code, which is also trapped and produces a different output message.
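
The distinction drawn here, between a connection closed by the peer and a timer expiring on the client side, can be reproduced with plain sockets. The following Python sketch is only an analogy to the boost::asio behaviour discussed in this thread, not ethminer's implementation; classify_read and demo are hypothetical names.

```python
import socket


def classify_read(sock: socket.socket, timeout_s: float) -> str:
    """Wait for data on sock and report why the wait ended."""
    sock.settimeout(timeout_s)
    try:
        data = sock.recv(4096)
    except socket.timeout:
        # Our own timer expired: comparable to "No response received in N seconds".
        return "local timeout"
    if data == b"":
        # An empty read means the peer closed the stream: comparable to
        # boost::asio::error::eof, i.e. "Connection remotely closed".
        return "closed by peer"
    return "data"


def demo():
    """Run both cases on a local socket pair and return the two outcomes."""
    results = []
    client, pool = socket.socketpair()
    # Case 1: the "pool" end stays silent, so only the local timer can fire.
    results.append(classify_read(client, 0.2))
    # Case 2: the "pool" end closes the connection.
    pool.close()
    results.append(classify_read(client, 0.2))
    client.close()
    return results


print(demo())
```

The two outcomes arrive through different code paths, which is why a longer client-side timeout cannot by itself turn a peer-initiated close into something else.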

@AndreaLanfranchi
Collaborator

> I also wonder whether increasing the default values has any impact on the overall performance. If not, we could increase them to make ethminer work with default values in this case.

No, it does not affect ethminer performance in any way. The only downside is that you may stay connected longer to a non-responsive pool.

@urpils

urpils commented Sep 22, 2018

If you look at the timestamps, you will see the disconnects happen exactly 60 seconds after the last job was sent AND no solution was found. So it is the pool's decision to disconnect "inactive" clients. The diff on port 3004 is too high for the miner; try port 3002.
[Screenshot: bildschirmfoto 2018-09-22 um 16 23 04]

@AndreaLanfranchi
Collaborator

@urpils and others.

The opening post of this thread shows a connection on port 3002, and that is the one causing problems.

That said, if this statement of yours is true

> the disconnects happen exactly 60 seconds after the last job was sent AND no solution was found. So it is the pool's decision to disconnect "inactive" clients

the pool is behaving pretty badly. As ETP has a block time of 24 seconds (avg), in 60 seconds the pool should send at least 2~3 jobs. If it doesn't, and it counts the missing jobs as idle time ... well, blame the pool maintainers, not ethminer.
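
The arithmetic behind the 2~3 jobs figure can be written out with the 24-second average block time and the 60-second window quoted above. The second function goes one step further and rests on an assumption that is not from this thread: that blocks arrive as a Poisson process, which is a common simplification for proof-of-work chains. Both function names are hypothetical.

```python
import math


def expected_jobs(window_s: float, block_time_s: float) -> float:
    """Average number of new-block jobs expected within the window."""
    return window_s / block_time_s


def prob_no_new_block(window_s: float, block_time_s: float) -> float:
    """Chance that no block arrives in the window.

    Assumption: Poisson block arrivals (exponential inter-block times).
    """
    return math.exp(-window_s / block_time_s)


print(expected_jobs(60, 24))                # 2.5
print(round(prob_no_new_block(60, 24), 3))  # 0.082
```

Under that assumption a 60-second gap without a new block is unusual but not impossible, so a pool that treats such a gap as client idleness would disconnect healthy miners from time to time.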

@AndreaLanfranchi
Collaborator

Closing
