Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASIC Resistance, TurtleCoin Hashing Algorithm Change #74

Open
RocksteadyTC opened this Issue Mar 5, 2018 · 23 comments

Comments

Projects
None yet
@RocksteadyTC
Copy link
Contributor

RocksteadyTC commented Mar 5, 2018

Credit to @Zedpea for bringing this to my attention. It has come to our attention that there are credible claims of an ASIC miner approaching the market soon. More information can be found in this post.
https://monerobase.com/wiki/DevMeeting_2018-03-04

In the past this topic has been brought up in various chat media in the context of defeating rented hash power and botnets. Other Cryptonote projects have modified their hashing algorithms successfully in the past, and with enough testing, so can we. This thread should serve as a place to discuss two things:

  • Known hashing algorithms that are likely candidates for this change (cryptonight-light, wild-keccak, cuckoo cycle)
  • New ideas for a modified Cryptonote hashing algorithm (GPU mined, proof of work, ASIC resistant)

EDIT: Fri, Mar 23 - Update

Two stage solution

The algorithm switch will occur in two stages, with stage one getting us immediately out of the path of ASIC, and stage two getting us away from rented hash power and lazy botnet owners.

Stage 1:

CN-Lite V7 - We could have chosen Monero's V7 algo, but chose instead to increase the hashpower of our miners due to our fast block speed. This should decrease orphans somewhat, and keep us out of the path of ASIC, while still maintaining main stream compatibility with mining and pool software.

Sat, Mar 24 - deployed on testnet

Stage 2:

Stage 1 affords us a bit of time to make a well tested, informed decision about what our Stage 2 algo should be, and what it should do. Ideally we have as much time as we want for this, but the reality is, these ASIC mfr's are likely to follow any changes in the mainstream XMR/AEON space, and the sooner we make a well planned move to another algo, whether it be cuckoo or argon2 based, the better off we'll be in the constant scramble to get away from unfair mining advantages and centralization.

Timeline:

Stage 1 - Obviously the time is of the essence for Stage 1, and given that it is not the biggest of efforts to port the new hashing algorithm over, it does take time to implement and test correctly. That being said, we're already working on it, and aim to have services and software ported around late April. The reality is, if China has ASIC, they're likely already using it, and it's unlikely to be used on TRTL, though if it were, we'd at least like to get switched by May 1st when the first units begin to arrive in USA. As these things go, there are always delays in these ASIC shipments historically, and we're somewhat banking on that, although not reliant on it.

Stage 2 - It's wise to think that we'd have about 90-120 days after the launch date of the Stage 1 algo switch. This affords us plenty of time to tinker and test, with the biggest concern being porting software and services over with an adequate amount of notice.

References:
https://github.com/tromp/cuckoo/blob/master/doc/cuckoo.pdf
http://boolberry.com/files/Block_Chain_Based_Proof_of_Work.pdf
aeonix/aeon@0474a51

@zpalmtree

This comment has been minimized.

Copy link

zpalmtree commented Mar 5, 2018

One thing I think we should consider is that if we don't use an off the shelf solution, we have to consider the possiblity that people will improve our miner implementation in private and gain a hashrate advantage in comparison to their hardware.

@RocksteadyTC

This comment has been minimized.

Copy link
Contributor Author

RocksteadyTC commented Mar 5, 2018

To a certain extent this is true, but I think if we use a sufficiently modified cryptonight algorithm, we don't open ourselves to that issue more than what we already are, with regard to our current algorithm.

@thinkpol2

This comment has been minimized.

Copy link
Contributor

thinkpol2 commented Mar 5, 2018

Would it make sense to just follow Monero on the PoW? Maintaining compatibility could be beneficial for both you the developers and the users (miners). Monero plans to change PoW as frequently as necessary.

@RocksteadyTC

This comment has been minimized.

Copy link
Contributor Author

RocksteadyTC commented Mar 5, 2018

To a certain extent, yes, there is incentive for us to not deviate much from other projects, but as we have a shorter target time, it also makes sense for us to look at changes which increase the number of rotations we're making in our hashing algorithm, similar to what Aeon did in the link referenced above. This change allows us to decrease the rate of orphaned blocks we're seeing now as a result of low difficulty from the short target time, without altering the target time itself. The positive result of this is eliminating rented hashpower, and Monero specific ASIC hardware, and most Monero webminers and botnets. The only negative is porting that hash algorithm to XMR-Stak unified miner, which is equivalent in effort as it was to add Aeon to the miner. I'd be willing to submit the PR to XMR Stak so that we are not placing undue burden on the developers or our community.

@krruzic

This comment has been minimized.

Copy link

krruzic commented Mar 8, 2018

I like cuckoo-cycle. Grincoin is working on cool stuff with it so there is a knowledgeable community on it. I don't know the timeline for this change but maybe we could follow behind on grincoin R&D? The DAG size used for hashing can be tuned easily, pushing even GPUs to 1H/s. I really don't know that much about it beyond this, but I like the name and the idea behind it.

Modifying existing cryptonight ends up being "let's remove this section and add this section". The changes are trivial and only ward off ASICs for a few months. I say go big or go home!

@RocksteadyTC

This comment has been minimized.

Copy link
Contributor Author

RocksteadyTC commented Mar 9, 2018

I really don't know that much about it beyond this, but I like the name and the idea behind it.

@krruzic Well shit that seals the deal for most of our user-base.

A few users have brought up a cuckoo cycle as a possible solution, and we've been discussing the possibility of what a change to cuckoo might look like on our network; it does have some compelling arguments going for it, as far as preserving GPU minability and few detractors. My main concern is getting it implemented quick enough in a way that is correct and respectful to our users, rather than rushing something out.

We do have a time constraint, which complicates things. I'm not ruling out that we may end up using a modified cryptonight hashing algorithm in the next fork upgrade cycle to get us out from under the crosshairs of ASIC and NiceHash long enough to put together a solid cuckoo implementation, but that is an indeterminate amount of time to implement and review it so we may as well make our modified algo as logical and solid as possible before we switch to it. As a rule, anything that weakens the integrity of the transaction safety is a no-go, so I hope I'm not giving the impression that "let's just tweak a few numbers".

Regardless of our path to a new algo, I have no doubts that we can accomplish this upgrade smoothly as we've already endured one such algorithm change (albeit a difficulty algo change) without major incident. With enough planning and testing this will be a great step forward for us.

@nighthawk18

This comment has been minimized.

Copy link

nighthawk18 commented Mar 9, 2018

A POW change definitely needs to be carefully planned. The old POW nodes can easily surpass the new POW nodes and become the dominant chain leading to required checkpoints and lost blocks, etc.

Even though I work on another coin, we share the majority of the same code base so I'm interested to see the path TurtleCoin follows. I feel the one unique algo per coin is probably not the right approach and having some level of algo sharing makes sense across the broader community.

@nighthawk18

This comment has been minimized.

Copy link

nighthawk18 commented Mar 9, 2018

And yes, Cuckoo looks pretty good at 220 lines of code -- https://github.com/tromp/cuckoo

@zpalmtree

This comment has been minimized.

Copy link

zpalmtree commented Mar 11, 2018

@jdoi13

This comment has been minimized.

Copy link

jdoi13 commented Mar 15, 2018

And another one : http://shop.bitmain.com/productDetail.htm?pid=000201803132107063379CD35Gxy064F
This one comes in May, but has much more hashrate.
+1 on the initiative. Is there a milestone/timeline/deadline on this?

@iamamyth

This comment has been minimized.

Copy link

iamamyth commented Mar 17, 2018

I think the recent developments regarding CryptoNight ASICs illustrate the importance of a sound methodology for reasoning, systematically, about desirable ASIC resistance properties and how to achieve them. In particular, the original CryptoNote whitepaper included the assumption that "A megabyte of internal memory is almost an unacceptable size for the modern ASIC pipeline" ([1], p. 11). While that assumption may have been reasonable in 2012, it no longer holds, not only due to the existence of CryptoNight ASICs, but in the broader sense of what target one should set, in 2018, based solely on existing CPU and GPU specifications and what they imply about possible circuit designs. Put differently, as our knowledge of physics and circuit design improves, the necessary standard for ASIC resistance changes; a 128k scratchpad would have been a very ambitious ASIC target in 1994, but, over time, the assumptions underlying such a target would have been invalidated and in need of refinement, just as the 2MB target set in 2012 no longer holds. Therefore, it seems the most crucial element is not simply which algorithm to use now, but how to decide its effectiveness, and, relatedly, how to determine when to change the algorithm's parameters or the algorithm itself.

For a theoretical framework, as well as a variety of actual algorithm suggestions, I strongly suggest [2], which provides a hardware-cost based evaluation mechanism for an algorithm's relative ASIC resistance, and [3], which proposes concrete limits for today's hardware and a simple mechanism of setting approximate upper bounds on hardware performance. I highlight a few important findings from these works:

  1. The choice of oracle function matters because, though computation may be much cheaper than memory access, it's not free. Therefore, an ideal oracle has a very efficient (near-optimal) implementation in existing hardware. AES might seem like a good fit here, due to AES-NI, but, because AES does not require its entire input to be present in memory at once, the memory behavior and bandwidth hardness can be difficult to model.
  2. Memory latency does not suffice as a measure of ASIC resistance, because "if speed were the sole metric, we could just deploy a million CPUs in parallel to catch up on speed" (Ren and Devadas, p. 6).
  3. A bandwidth-hard function with an efficient CPU/GPU oracle provides a likely path to ASIC resistance.

Inasmuch as predictive value can be seen as a hallmark of good science, I also note how [2] and [3] could have been used to predict the insufficiency of CryptoNight:

  1. CryptoNight targets capacity hardness, not bandwidth hardness.
  2. The CryptoNight target capacity of 2MB falls well below predictions of theoretical capacity for SRAM-only chips (approximately 628MB, see [3]). Therefore, CryptoNight's defense against ASICs relies on very specific assumptions about circuit layout and pipeline efficiency, which necessitate precise calculations and could easily prove invalid over time.
  3. The ~6.7 expected efficiency increase of an AES ASIC, versus CPU AES-NI instructions, means that ASICs would inevitably have a significant advantage in operating costs. The massive improvements in Intel's own AES-NI latency over the past years (from 8 cycles to 4) compound that potential advantage, relative to GPUs and 1+ year old CPUs.

I see Monero's lack of a rigorous methodology in this area as a strong motivation to consider independently selecting a new PoW algorithm (though I do not mean to discourge interim adoption of Monero's minor changes to CryptoNight). Monero's decision to change the PoW seems to have been based not on the feasibility of building a high-performance ASIC (in which case, it probably would have happened much sooner), but instead on the goal of decreasing the expected value of an ASIC by reducing its likely useful life. Though one can certainly model the likelihood of ASICs as an economic problem, estimating an ASIC's expected monetary value seems impossible without models of hardware complexity and its effect on cost and production time. To my knowledge, Monero has no such model, resulting in arbitrary decisions (changing PoW at every fork, while perhaps convenient as a short-term solution, might not sufficiently tip the scales; or, maybe once every year would suffice, so the increased frequency introduces unnecessary risk). A construction which renders the precise expected value less relevant by creating insurmountable physical barriers, coupled with a mechanism to update the algorithm to reflect these assumed physical limits, seems beneficial. And, while frequent changes to the algorithm may offer one such option, a well-picked, bandwidth-hard scheme seems much less costly.

Regarding possible algorithms, [2], [3], and [5] provide additional options:

  1. Scrypt with a large scratchpad [2]: Verium uses this scheme with a 128MB scratchpad (which may be too small) but, if I recall correctly, modern Intel CPUs, e.g. i7s, clock in around 2k H/min, which proves problematic for verification.
  2. MTP-Argon2: Zcoin proposes using this algorithm as their PoW, and has funded bounties to that end, uncovering certain vulnerabilities. Solutions for the bounty vulnerabilities have been proposed (and in some cases implemented), as have additional improvements to the algorithm [3]. Apart from its lack of battle testing, the main drawback to this algorithm would be the proof size of approximately 200KB [4].
  3. Itsuku [5]: A successor to MTP-Argon2 which implements a variety of refinements, including the proposals in [3], and reduces proof sizes to roughly 12.5K [5]. I understand, based on [6], Boolberry planned to adopt this PoW, but now it will be launched as a separate project, Doubloon [10]. Compared to most of the other options, as well as to CryptoNight itself, it has a much more rigorous academic background as a proof of work function.

The project's goals seem highly relevant to the choice of PoW. If the goals necessitate competitive CPU and GPU mining, Cuckoo Cycle does not appear viable, as it offers a considerable theoretical advantage to GPU miners (approximately 10 times the hash rate, see [11]). At the same time, Cuckoo Cycle lacks a particularly efficient GPU implementation [12], and the existing implementation limits the scratchpad size in order to improve the efficiency of GPU mining [12], raising questions about the long-term viability of this algorithm against ASIC development, insofar as one cannot freely alter parameters to counter newer generations of hardware without sacrificing GPU mining. The existing memory latency-bandwidth tradeoff [9] compounds this problem, as a 1GB scratchpad requires only 93MB of memory (well within the theoretical limits proposed in [3]), yet the latency-bound algorithm only loses to the bandwidth-bound one by a factor of four on a CPU, a gap that an ASIC might close. Furthermore, Cuckoo Cycle has seen limited peer review, and past work revealed considerable optimizations [8], so another algorithmic optimization which further alters its properties in a manner that makes it even less desirable, either by increasing the CPU-GPU gap or promoting ASIC development, seems plausible.

Much like CryptoNight itself, Wild Keccak suffers from a lack of peer review and unnecessary alterations to known cryptographic primitives [3], making its properties very much uncertain. Though Wild Keccak has the advantage of a concrete implementation in Boolberry, that hardly seems a sufficient reason to adopt it, given the alternatives of waiting for Doubloon's C++ Itsuku implementation, or porting the existing Python implementation of the algorithm to C++.

I think the best path forward would be to buy time (probably six or 12 months) with minor changes to CryptoNight and implement Itsuku. At least one other project with similar goals and history will use it, it has a solid theoretical foundation, including a concrete model of hardware efficiency, and, at the very least, due to its novelty, it would take quite a while for someone to produce an ASIC. Put differently, Itsuku hedges the risks of adopting a new algorithm with its track record of peer review, while offering the inherent benefit if a new, ASIC-resistant egalitarian proof of work, namely, time to market for an ASIC, which, in CryptoNight's case, took many years.

References

[1] Saberhagen, N. (2012). CryptoNight v 1.0. https://cryptonote.org/whitepaper_v1.pdf.

[2] Ren, L. and Devadas, S. (2017). Bandwidth Hard Functions for ASIC Resistance. https://www.docdroid.net/GYEk2YB/eval-mtp-argon2d-final.pdf.

[3] Coelho, F. (2017). Evaulation of the MTP-Argon2 PoW Scheme. https://www.docdroid.net/GYEk2YB/eval-mtp-argon2d-final.pdf.

[4] Bevand, M. (2017). Attacks on Merkle Tree Proof. http://blog.zorinaq.com/attacks-on-mtp/.

[5] Coelho, F., Larroche, A., and Colin B. (2017). Itsuku: a Memory-Hardened Proof-of-Work Scheme. https://eprint.iacr.org/2017/1168.pdf.

[6] Bitcoin Forum. (2017). Retrieved March 16, 2018 from https://bitcointalk.org/index.php?topic=577267.msg25519288#msg25519288.

[7] Biryukov, A. and Khovratovich, D. (2016). Egalitarian Computing. https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_biryukov.pdf.

[8] Andersen, D. (2014). Exploiting Time-Memory Tradeoffs in Cuckoo Cycle. https://www.cs.cmu.edu/~dga/crypto/cuckoo/analysis.pdf.

[9] GitHub - tromp/cuckoo: a memory-bound graph-theoretic proof-of-work system. Retrieved March 16, 2017 from https://github.com/tromp/cuckoo.

[10] Bitcoin Forum. (2017). Retrieved March 17, 2018 from https://bitcointalk.org/index.php?topic=2539743.180.

[11] Q&A at Reddit with John Tromp, inventor of the Cuckoo Cycle mining algorithm. (2017). Retrieved March 17, 2018 from
https://blog.aeternity.com/q-a-at-reddit-with-john-tromp-inventor-of-the-cuckoo-cycle-mining-algorithm-c316119c07e9.

[12] Progress Report on the Cuckoo Cycle GPU Solver. (2018). Retrieved March 17, 2018 from https://github.com/tromp/cuckoo/blob/master/GPU.md.

@pypper

This comment has been minimized.

Copy link

pypper commented Mar 18, 2018

Excellent proposal @iamamyth . The piece of the Monero PoW change I think you're not giving enough weight to is the huge number of pet rocks being sold right now for today's algorithm..

While someone can go back to the drawing board with the new Monero changes and probably come up with another ASIC -- they know that as soon as they do, Monero will pivot away. The massive investment required to produce these ASIC miners wouldn't be worth it. As soon as they get to market they're slapped with a PoW change that makes more Bitmain DoorstopX3s, they're dissuaded from even putting the effort in. I don't think it's about choosing a PoW algorithm that is un-ASIC-able, but making the environment so hostile to the developers of this tech that they'll stop bothering with it.

@xkasl

This comment has been minimized.

Copy link

xkasl commented Mar 18, 2018

@iamamyth thanks for the very informative post. I agree with @pypper we're not going to stop asic development but rather make them unprofitable to develop. I like the idea of cuckoo cycle.

Here's a little info on cuckoo cycle.

https://blog.aeternity.com/cuckoo-cycle-memory-bandwidth-bound-pow-mining-9aa5695a4c94

@4Reallive

This comment has been minimized.

Copy link

4Reallive commented Apr 3, 2018

I think a huge drawcard with cuckoo is it provides a PoW without burning up as much Power as running a CPU a full load. It certainly would encourage more people to mine, with more profit. Which is needed when switching PoW so that the hasrate doesn't suffer for an extended window.

I have not tested the power requirements for Cuckoo, just assumptions that it would have less power draw. I will consider testing in the near future as I think it would be a great selling point to miners.

@brandonlehmann

This comment has been minimized.

Copy link
Collaborator

brandonlehmann commented Aug 19, 2018

This change was completed. Lol. This one can close.

@SoreGums

This comment has been minimized.

Copy link
Collaborator

SoreGums commented Aug 19, 2018

what about Stage 2??? :)

@brandonlehmann

This comment has been minimized.

Copy link
Collaborator

brandonlehmann commented Aug 20, 2018

@brandonlehmann

This comment has been minimized.

Copy link
Collaborator

brandonlehmann commented Dec 9, 2018

Incoming news...

@brandonlehmann

This comment has been minimized.

@RocksteadyTC

This comment has been minimized.

Copy link
Contributor Author

RocksteadyTC commented Dec 18, 2018

@thriftyMinnow I noticed you posting in another thread[1] about being a good C developer, and I'm curious if you wanted to jump in on this hashing discussion in regards to our upcoming PoW fork. We could use some helpers if there are aspects of the project you think you could pitch in on. If you have a competence in C, I'd hazard a guess you might also be interested in a few of our core suite ports that we're working on as well, currently we're porting the core suite to C# and Golang.

Our Discord is where most of the nerds hang out, and if youd like me to show you around and make the introductions (if you're not already in here) I'd be happy to do that.

[1] - aeonix/aeon#89

@brandonlehmann

This comment has been minimized.

Copy link
Collaborator

brandonlehmann commented Jan 23, 2019

Fork upcoming at block 1.2M (https://github.com/turtlecoin/turtlecoin/releases/tag/v0.12.0)

Working on Chukwa

@thinkpol2

This comment has been minimized.

Copy link
Contributor

thinkpol2 commented Mar 11, 2019

@brandonlehmann

This comment has been minimized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.