
select difficulty adjustment algorithm #147

Closed
nathan-at-least opened this issue Mar 26, 2015 · 105 comments
Labels
A-consensus Area: Consensus rules A-consensus-genesis Area: Starting a Zcash chain from scratch. A-pow Area: Proof-of-Work and mining C-research Category: Engineering notes in support of design choices D-economics Design issue: Economics I-performance Problems and improvements with respect to performance I-SECURITY Problems and improvements related to security. in 1.0 question special to Daira special to Nathan

Comments

@nathan-at-least
Contributor

Are there improvements to availability if difficulty is adjusted each block? Is there any larger vulnerability (e.g. DoSing a node's peer connections to starve it of blocks, lowering its local belief about difficulty)?

Are there any known advantages or risks which clearly make this worthwhile to spend time on or avoid?

We suspect this may be important in the long term, because PoW mining may eventually lead to cascading exit of miners when their margins go <= 0 (Brad Templeton Death Spiral, #95), or during sudden-withdrawal infanticide, or during sudden withdrawal due to side-chain knock-on effects (see #146). This may help with availability during those kinds of events (although it won't mitigate those events directly).

Is the implementation complexity low? (It would seem like changing a single constant K to 1 in "calculate difficulty every K blocks".)

@nathan-at-least nathan-at-least added question C-research Category: Engineering notes in support of design choices mining I-performance Problems and improvements with respect to performance D-economics Design issue: Economics A-consensus Area: Consensus rules A-pow Area: Proof-of-Work and mining labels Mar 26, 2015
@zookoatleastauthoritycom

I'd like to hear @amiller's opinion of this.

@zookoatleastauthoritycom

I would also be interested in an evaluation of this from any of the 7S.

@nathan-at-least
Contributor Author

Hey @amiller: Are you already aware of any research on this issue? If you aren't and don't intend to look into this, please let us know soon.

@nathan-at-least nathan-at-least added this to the Zcoin 1.0 milestone Aug 20, 2015
@daira daira added the I-SECURITY Problems and improvements related to security. label Feb 3, 2016
@daira daira changed the title Evaluate security/incentives of adjusting difficulty per block. Evaluate security/incentives of adjusting difficulty more quickly than Bitcoin Mar 8, 2016
@zookozcash

A related change is that we should use median instead of mean for these sorts of calculations, so that a few outliers can't affect the result.

@zawy12

zawy12 commented Sep 5, 2016

It looks like you're multiplying by a sum of the past N difficulties based on the previous N ≈ 10 to 24, not the current N = 2 when it is triggered. So the peaks are 5 to 10x higher than they should be. Is that program usable by someone who can't program in C? Please try v0.12

@str4d
Contributor

str4d commented Sep 6, 2016

@zawy12 the code above is JavaScript, and it definitely is using the correct N (this.nAveragingInterval in the code): see how it first checks the last 2 blocks and sets this.nAveragingInterval = 2 if they are outside the bounds, and after that it continues back up to this.nAveragingInterval blocks to sum past difficulties and get nActualTimespan.

@str4d
Contributor

str4d commented Sep 6, 2016

Put another way, if the limits are triggered, then this.nAveragingInterval gets set to 2, and so the second for loop will do nothing and exit immediately, leaving sumDifficulties and nActualTimespan being calculated for N = 2.
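For readers following along, here is a minimal Python sketch of the control flow described above. The actual simulation code is JavaScript; the bound values and function shape below are assumptions, and only the shrink-to-2 behaviour of this.nAveragingInterval is taken from these comments.

```python
def averaging_inputs(blocks, averaging_interval=17, lower_bound=38, upper_bound=600):
    """blocks: list of (solve_time_seconds, difficulty) tuples, newest last.
    Bound values and window size here are illustrative assumptions."""
    # If either of the last 2 block times falls outside the allowed bounds,
    # shrink the averaging window to 2 blocks.
    if any(t < lower_bound or t > upper_bound for t, _ in blocks[-2:]):
        averaging_interval = 2
    # Then walk back over averaging_interval blocks to sum the past difficulties
    # and the actual timespan they took.
    window = blocks[-averaging_interval:]
    sum_difficulties = sum(d for _, d in window)
    n_actual_timespan = sum(t for t, _ in window)
    return averaging_interval, sum_difficulties, n_actual_timespan
```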

@zawy12

zawy12 commented Sep 6, 2016

[edit: OK, I'm looking into it more to see why we're getting something different ]

@str4d
Contributor

str4d commented Sep 6, 2016

@zawy12, in my simulations, red is the difficulty, green is the block interval averaged over the short interval (10 blocks), and blue is the block interval averaged over the long interval (100 blocks). The total hash power is stable, apart from a single entity controlling 35% of total hash power, which turns on and off every 500 blocks (starting on at block 0).
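For illustration, a rough Python re-creation of the simulation setup described above: a steady base hashrate plus one entity with 35% of the total that toggles every 500 blocks, with exponential (Poisson-process) solve times. The retarget rule here is a plain window-average placeholder, not str4d's actual algorithm.

```python
import random

TARGET = 150.0       # target block interval, seconds
BASE_HASH = 65.0     # always-on hashrate (arbitrary units)
BIG_MINER = 35.0     # toggling entity: 35% of the 100-unit total

def simulate(n_blocks=2000, window=17):
    difficulty = 100.0
    times, diffs = [], []
    for height in range(n_blocks):
        hashrate = BASE_HASH + (BIG_MINER if (height // 500) % 2 == 0 else 0.0)
        # Mean solve time scales with difficulty and inversely with hashrate.
        solve_time = random.expovariate(hashrate / (TARGET * difficulty))
        times.append(solve_time)
        diffs.append(difficulty)
        # Placeholder retarget: scale the window-average difficulty by how far the
        # window-average interval drifted from the target.
        w = min(window, len(times))
        difficulty = (sum(diffs[-w:]) / w) * TARGET / (sum(times[-w:]) / w)
    return times, diffs
```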

@zawy12

zawy12 commented Sep 6, 2016

OK, yes, your chart agrees with my math for a Poisson process with a stable hash rate and my settings. There should be about 5 jumps up and 5 jumps down per 100 blocks. It may be that, by judging it by observation (using my charts) instead of theory, my algo is useless. But it needs to be investigated why this is so (why z9 is not a Poisson process). In particular, there have been no block times longer than 330 seconds since block 1500 (so Actual/Target > 2.5 never occurred for me unless it was post-attack). And the minimum is not a Poisson process either, since only an attack can make it < 0.18, when it should also have been occurring for me at 5%.

It may be that I have to abandon hope of using N = 2 or 3 minimums, as a result of not being able to tweak on an active network. Even if my charts are correct for z9, any future change to the solve routine that makes it more or less like a Poisson process would require re-tweaking my difficulty settings, which isn't feasible. It needs to be more robust with regard to solve routines.

Concerning your previous comment "real block times aren't themselves sufficient for describing the effect of changes, since changing the algorithm would completely alter those times": I have been saying that my solve time would have been (actual solve time) x (my difficulty) / (actual difficulty), which, as far as I can see from theory, should be perfectly accurate, at least on average.

@daira
Contributor

daira commented Sep 6, 2016

@zawy12: the reason for the block times not being greater than 330 seconds is the testnet-specific minimum difficulty rule. It will be removed in beta 1, since it's obviously causing confusion and is an obstacle to understanding what the behaviour will be on mainnet.

@zawy12

zawy12 commented Sep 6, 2016

@daira: str4d said the same as my thinking in the related issue: someone is turning a big miner on when they see > 300 seconds (5 minutes), which is why the "1" is solved quickly. My big confusion with "1" was the result of it never actually being set to "1" for what the miners had to solve. str4d has said that this is the case. It shows "1", but as far as the difficulty being solved, it's set at the previous difficulty. This made the Zcash difficulty algorithm function differently from Digishield v3, because when it was "1" the 16% decrease was being blocked, so it stayed high for long periods. I posted a comparison in the forum. The accident is arguably an improvement on Digishield v3, as both have terrible oscillations.

In z8, D=1 averaged a 200-second solve time instead of 30 to 60 seconds. And in z9, here is a typical sequence of block solve times before the ongoing "attack" started. Block 1364 below is when it was set to "1" instead of "200" like all the other blocks around it.

Block solve times should be shifted up 1 in this list.

Block  Solve time (s)  Difficulty
1359   210
1360   270
1361   55              D = 240
1362   107
1363   151
1364   318
1365   306             D = 1
1366   143             D = 1
1367   57              D = 240
1368   60              D = 240
1369   111
1370   136
1371   241

@daira
Contributor

daira commented Sep 6, 2016

It's not necessary for there to be a big miner switching on in that case; the expected outcome is that whichever miner next finds an Equihash solution after the 5 minutes, will grab the block. It's not surprising at all that this would take less than 30 seconds.

Honestly, focusing on the minimum difficulty behaviour is not useful, because it's testnet-specific and is going away.

@zawy12

zawy12 commented Sep 6, 2016

If a big miner is not turning on, how can it suddenly violate a Poisson distribution with a rigid limit? Both things, for me, are about trying to understand what the network is doing under testnet conditions so I can know what to expect from mainnet when the programming differences are removed. D=1 took 40% longer than D=80 on z8. D=1 has generally been the same as D=150. But suddenly in z9 this changed after block 1500.

If D is really "1" like it says, and if big miners have not been turning on above 300 s on testnet, then mainnet will not be a Poisson distribution for a given hash rate. But by knowing D is not actually being set to "1", I am able to see that there is a miner and that the Poisson distribution is not violated.

It also enables me to see that testnet has not tested Digishield v3 as intended.

@daira
Contributor

daira commented Sep 6, 2016

I'm not sure how much if anything can be inferred from behaviour when D = 1, but just to clarify the current behaviour on testnet: at the time of the difficulty switch 5 minutes after the previous block, all miners will already be some part of the way through a solution run. Whichever one gets there first will obtain the block. So the expected time to obtain a block after 5 minutes is essentially a "next bus arrival problem" (for multiple buses with different interarrival times depending on the performance of each miner) and in practice will be pretty quick whenever there are many miners.
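A quick numerical check of the "next bus arrival" intuition, under the simplifying assumption that each miner's remaining solve time is memoryless (exponential); the miner counts and per-miner mean below are made up for illustration.

```python
import random

def expected_wait(n_miners, per_miner_mean=600.0, trials=50_000):
    """Average wait until the first of n_miners independent exponential solvers finishes."""
    total = 0.0
    for _ in range(trials):
        total += min(random.expovariate(1.0 / per_miner_mean) for _ in range(n_miners))
    return total / trials

for n in (1, 10, 50):
    print(n, round(expected_wait(n), 1))   # roughly 600 s, 60 s, 12 s: quick with many miners
```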

@daira
Contributor

daira commented Sep 6, 2016

What may be confusing you @zawy12 is that although the difficulty for the current block is set to 1 after 5 minutes, the difficulty used for input to the adjustment algorithm for the next block is unchanged. And just to repeat, this behaviour is going away in beta 1.

@daira
Contributor

daira commented Sep 6, 2016

Also, the oscillation problem seems very likely to be fixed completely by @str4d's averaging change, as far as we can tell from simulations.

@daira
Contributor

daira commented Sep 6, 2016

Incidentally, the way we're graphing block times does tend to exaggerate the variability. When we have time we might try to find a visualisation that properly takes into account the expected variability of the solution process and teases apart the effect of the difficulty adjustment on that. Maybe we should hire a statistician? (Serious suggestion.)

@mcelrath

mcelrath commented Sep 6, 2016

I'll reiterate a previous comment: All behavior of the difficulty adjustment algorithm can be understood in terms of a damped harmonic oscillator: https://en.wikipedia.org/wiki/Harmonic_oscillator#Damped_harmonic_oscillator

Median time is the correct thing to do as str4d points out, and all other parameters can be understood as fooling with the damping factor and averaging window. But! There's a well-known optimal solution for the damping factor ("critical damping"). In this case you know the period of the driving force (it's the block time -- hashrate that changes faster than this is unobservable), so you know the critical damping factor too.

I know you guys didn't want to mess with this close to release (and I don't blame you), but the conversation I see above seems to be re-creating a bad approximation to a critically damped harmonic oscillator... You're simply not going to do any better than that solution, and at best will recreate it using a bunch of other opaque parameters.

The fundamental parameters are:

  1. Damping timescale (can be removed by choosing "critical damping" -- setting it equal to the block time)
  2. averaging window for measuring (hashrate), d(hashrate)/dt, and d^2(hashrate)/dt^2

With these in hand, you can write a simple 2nd order differential equation for critical damping with only one parameter (the block time). The necessity of averaging introduces a second timescale.

I think reducing this problem to have one parameter (the averaging window, at fixed block time) significantly simplifies and clarifies the problem.
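As a concrete but non-authoritative illustration of this controller framing, here is a small Python sketch of a proportional-derivative update on log-difficulty. The gains kp and kd are placeholders one would tune toward critical damping; nothing here is an algorithm proposed in this thread.

```python
import math

TARGET = 150.0   # target block interval, seconds

def next_difficulty(difficulty, avg_interval, prev_avg_interval, kp=0.3, kd=0.15):
    """avg_interval / prev_avg_interval: block interval averaged over the measurement
    window for this block and the previous one (the second timescale mentioned above)."""
    e = math.log(avg_interval / TARGET)            # displacement from target, log scale
    de = e - math.log(prev_avg_interval / TARGET)  # its change since the previous block
    # Raise difficulty when blocks come too fast (e < 0), lower it when too slow (e > 0);
    # the derivative term damps the oscillation a purely proportional rule produces.
    return difficulty * math.exp(-(kp * e + kd * de))
```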

@zawy12

zawy12 commented Sep 6, 2016

OK, thanks to daira's comment I understand it (finally). I had no idea it was possible to change the difficulty in the middle of a block that was being solved. This greatly affects the testing of difficulty algorithms. The bus analogy should be: 1st bus is on a 2.5 hour Poisson distribution schedule, and another is on a 15 minute Poisson schedule, but the 2nd bus is dispatched to replace the 1st bus if the 1st bus did not make it in 5 hours.

Because of this, the current difficulty algorithm has not been tested, because testnet is not a Poisson process and the algorithm itself was modified away from Digishield v3.

Since you were not seeing oscillations in the theoretical implementation, I suspect no simple averaging method would have oscillations in that implementation. I suspect the oscillations are the result of either the hard upper limit or Digishield v3 itself, if they are not a Poisson effect.

@mcelrath To an extent you're correct, and it's what I'm trying to do, but there are problems with saying it is a straightforward controller problem. 1) It's not a system with a stable forcing function. Both its frequency and magnitude are changed after you've designed your controller, in order to cheat the controller's design. Big miners will make it go into oscillation if you underdamp; they'll get blocks on the down side. If you overdamp, they will mine quickly to pick up 20 blocks and make everyone else wait an hour to get a few more. This is what the adjustment method is going to do. Digishield v3 actually uses positive feedback in a way that causes BOTH to occur. I suspect it's not obvious to everyone on a real network because miners competing to cheat may have reduced the visibility of the effect, actually helping everyone to avoid the problems. 2) You can't easily program the controller to change itself in response to a new forcing function, because you can only look at past blocks to determine the new function, and the Poisson process prevents you from being able to see it quickly and clearly. Your inability to measure and respond is where a cheater's profit is. It's hard to prevent wide variation when someone is secretly trying to force wide variation.

The problem is so simple in its characterization that there should be a standard solution. The median problem is the only thing that I can see would modify it. The variables for input to the solution are average blocks/time setpoint and what max time to solution is allowable X% of the time. The solution should pop out without any theoretical improvements possible. It will also tell you how many blocks per time can be cheated based on your selection of X. If you want minimal coins lost at all cost, then I know the ideal solution: next difficulty = (last difficulty) x (set point) / (last solve time). That's it. It's perfect. It keeps the average. There's no way to cheat it for more than 1 block in a sequence. Miners faced with this might choose a constant hash rate, and simply allow cheaters to get every other block free. But what if there are two cheaters? Three? 400? Would it not balance out based on market forces and perfectly fair mining? Suppose a pool cycles through pools trying to get 1 block at a time. Would that be profitable?

Only partially joking, that's my new recommendation for beta:
(next difficulty) = (last difficulty) x (150 seconds) / (last solve time)

Consequences: for about 10% of blocks, the next D is set roughly 10x too high, so the following block takes about 10x longer than 2.5 minutes. But 25 minutes, 57 times a day, is not workable. It appears the simple ratio method needs adjusting because the Poisson distribution is skewed on the low versus the high side (asymmetrical). If there are inherent oscillations in the averaging methods, maybe that is the (fixable) cause. So it's not this bad, but something like 25 minutes 4 times a day.
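A quick check of the arithmetic behind that first consequence, assuming an ideal exponential (Poisson-process) solve-time model; this only verifies the roughly-10% figure, not testnet behaviour.

```python
import random

TARGET = 150.0

def jump_fraction(trials=100_000):
    # Fraction of blocks whose solve time is under TARGET/10 = 15 s, i.e. blocks after
    # which next_D = last_D * 150 / last_solve_time rises by 10x or more.
    fast = sum(1 for _ in range(trials)
               if random.expovariate(1.0 / TARGET) < TARGET / 10.0)
    return fast / trials

print(jump_fraction())   # about 0.095; at ~576 blocks per day that is ~55 such jumps per day
```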

@zawy12

zawy12 commented Sep 7, 2016

I'm getting surprisingly good results with the following. It responds to attacks with the correct difficulty and returns to normal in 3 blocks. The drawback is that it has maybe twenty 10-minute intervals per day. But as I recall, 2.5 minutes is less of a requirement than an even distribution of mining. There's no error introduced by a median.

(next difficulty) = (last difficulty) x SQRT [ (150 seconds) / (last solve time) ]

edit: I made a Poisson tester like str4d's, and it doesn't work at all on a Poisson distribution; it just works on the testnet data. But are the accidental 1.5-second solves a Poisson generator gives even a remote possibility on a solver that normally takes 150 seconds? These never occur on testnet.
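Purely for illustration, the square-root variant above in code: taking the square root halves the response in log space, so a block 100x faster than target raises the difficulty 10x rather than 100x.

```python
TARGET = 150.0

def next_difficulty(last_difficulty, last_solve_time):
    # Square-root damping of the plain ratio rule.
    return last_difficulty * (TARGET / last_solve_time) ** 0.5
```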

@zawy12

zawy12 commented Sep 7, 2016

A Poisson generator will give 1.5 second solves 1% of the time for a 150 second solver. This is what throws off my low N methods (1, 2, or 3 instead of 17) that work on testnet. Is there some way to prove a lower limit on the solve time in order to make a Poisson generator for this testing more realistic?
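Verifying that 1% figure: an exponential (Poisson-process) solve-time generator with a 150 s mean produces sub-1.5 s solves with probability 1 - exp(-1.5/150), about 0.995%.

```python
import random

def fraction_below(mean=150.0, cutoff=1.5, trials=1_000_000):
    return sum(random.expovariate(1.0 / mean) < cutoff for _ in range(trials)) / trials

print(fraction_below())   # ~0.01
```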

zkbot pushed a commit that referenced this issue Sep 8, 2016
…, r=ebfull

Tweaks to difficulty adjustment algorithm

This PR changes the difficulty algorithm to adjust from the average difficulty over the
block window instead of from the last difficulty. It also removes the special rules for the
testnet, which are incompatible with difficulty averaging.

Closes #147 again.
@zawy12

zawy12 commented Sep 8, 2016

The only good solution I've found is to not let the difficulty decrease. That seems to solve the difficulty/attack problems. If miners find it unprofitable to find blocks because of a decrease in coin value, then mining and coin issuance should stop, to balance supply and demand, and switch to fees until demand returns through increased use in the marketplace, or until computing power increases. [edit: It needs a real timestamp]

@daira
Contributor

daira commented Aug 2, 2017

Please note that the difficulty algorithm we actually arrived at is described in the protocol spec, section 6.4. It does not use PID control or any square-root operations.
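For orientation only, a hedged Python sketch of the general shape of that algorithm as discussed in this thread (average over a window, with a dampened and clamped timespan). The window size, damping factor and clamp percentages below are illustrative stand-ins, not quotations from section 6.4 of the spec.

```python
AVERAGING_WINDOW = 17           # blocks (assumed)
TARGET_SPACING = 150            # seconds
DAMPING = 4                     # only 1/4 of the measured deviation is applied (assumed)
MAX_UP, MAX_DOWN = 0.16, 0.32   # clamp on the per-retarget change (assumed)

def next_target(mean_target, actual_timespan):
    """mean_target: average target over the window; actual_timespan: measured time the
    window took (in the spec, derived from median-time-past values)."""
    expected = AVERAGING_WINDOW * TARGET_SPACING
    damped = expected + (actual_timespan - expected) / DAMPING
    damped = max(expected * (1 - MAX_UP), min(expected * (1 + MAX_DOWN), damped))
    # A longer-than-expected timespan gives a larger (easier) target, and vice versa.
    return mean_target * damped / expected
```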

@zawy12

zawy12 commented Jan 14, 2018

I did a PID-controller difficulty algorithm based on the EMA algorithm and compared it to the best algorithms available. It wins based on my metrics (lowest standard deviation for a given speed and fewest delays) and should be adjustable to win on any other metrics chosen... if simplicity is not one of the metrics.

zawy12/difficulty-algorithms#20
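For context, one commonly cited simplified EMA form, shown only to illustrate the kind of baseline a PID variant builds on; it is not necessarily the exact formula in the linked issue.

```python
TARGET = 150.0
N = 50   # smoothing constant in blocks (assumed)

def ema_next_target(prev_target, solve_time):
    # The target (inverse of difficulty) moves by roughly 1/N of the relative deviation of
    # the last solve time from TARGET: slower blocks raise the target (lower the difficulty).
    return prev_target * (1.0 + solve_time / (TARGET * N) - 1.0 / N)
```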
