ASIC Resistance #9
I must disagree with these estimates on the ASIC efficiency gains and say they seem too low.
For one thing, isn't floating point a huge part of GPUs? If I understand the history correctly, integer math was emulated by floating point units for quite some time!
But let's ignore floats. There's much more to chip design than shrinking die area, which is actually a very modest cost. Far more important is the power per hash. GPUs are designed to optimize framerate and don't care about power, as long as they can dissipate the heat. Even if an ASIC had the exact same logic requirements and die area as a GPU, significant efficiency can be gained during the physical design process by choosing low power over speed. An ASIC might be slower per silicon area, but it will be better hash-per-watt. The extra capital cost for silicon area is small compared to the operational power efficiency advantage.
It's similar to ARM vs Intel in the CPU world. If you want the absolute fastest chips, you buy Intel, but all cell phones run ARMs. GPUs are Intel and ASICs are ARM.
To be fair, I think ProgPoW is a good try at device-binding, but ultimately GPUs are ill-suited for mining, and the more you utilize the GPU, the greater the gap with ASICs. There's really nothing software writers can do about that, except to minimize the usage of GPU cores by saturating the bandwidth to commodity DRAMs.
Fuller writeups here:
Thanks for listening. We can just insta-close the issue if you want. Just thought I should record these comments somewhere in the project.
I have been following ProgPoW fairly closely and have seen similar comments before. To save some time rehashing the same conversations, I have posted below a conversation between David Vorick and Mr. Def of IfDefElse that took place in a public Telegram channel on September 20th, and a summary of a segment of the Ethereum Foundation Core Dev Meeting 48 from October 12th: https://www.reddit.com/r/ethereum/comments/9jq75n/progpow_algorithm_change_covered_in_todays_eth/
Ethereum Foundation Meeting:
Alexey Akhunov "I think that we need a bit more exposition about why...we kind of believe that this...from the description of the algorithm that it's supposed to be doing what it's doing, making it harder to implement ASICs and I get the general idea. But I do believe that if people do really know what the reasons (are) that they can actually explain it in some simple way. Maybe not in a very simple way but at the moment I feel like a lot of people including me --either I'm really kind of dumb or-- I don't really know what (it's) doing. I'm just trusting that someone else who is cleverer than me understands this and I don't. I also got (the idea) that some people are talking to each other or (having conversations) that they cannot disclose and it just doesn't really have a good feeling. So maybe somebody can write down...some exposition about why exactly the current technology of ASICs will not be able to do ProgPOW efficiently in a more detailed way so that people can apply critical thinking rather than just trusting that somebody...says that they have experience and (saying) 'okay that will be fine.'"
Mr. Def "Hey Alexey, I think that's totally fair. It would be helpful to get specific questions on areas where you want more information, and I think we have been very bad about handling the Ethereum Magicians discussion, so we will improve that in the future and be a little bit more responsive.
And so that's the perspective that we started with and so in optimizing for a specific type of hardware the goal is to maximally utilize all the functions of that hardware---a large register space (that's expensive) and of course not to forget the starting point of why Ethash is strong, which is it's still memory bound. So the algorithm starts from a place where it's memory bound and it's still going to be predominantly memory bound. In addition, it also has to use the additional register space that GPUs are able to provide and that is needed for additional math calculations. And, on top of that, it adds the programmability aspect or the programmatic aspect (where) the exact series of math operations that you're running is changing in every epoch or, actually, as proposed with the stratum implementations, would change every 25 or 50 blocks or something like that --to change even faster.
Now, when you do something like that the problem with implementing an ASIC for something like that or a different ASIC or a more custom ASIC is you would have to design the ASIC to either be flexible enough to capture all the possible variations or evolutions of the algorithm or you'd actually have an ASIC that pre-designs for every variation or every math ordering in the evolving algorithm. So, if you pre-design for every possible variation, well, your ASIC just explodes. You're just burning silicon area that's mostly unused. If you try to design for the programmability and the register file size that you would need then you basically have something that is a very big ASIC that is also applicable to many other general math problems. Which is fine because if you're gonna design a general math processor, I think that's the goal of this project. I think having more general math processors in the world is a good thing and having these more flexible computation units is a good thing at least until we have PoS. So leveraging off the existing install base of more general math units was the goal of the project.
So we're basically trying to force a custom design to be not-that-custom because you have to be flexible to varying and changing math at a very rapid pace and you have enough variation that you can't pre-design for all of it and you have to pay additional silicon to be able to even execute the math.
If you have specific implementation questions in terms of why ASICs can't keep up with it or can't design for these math variations, we can certainly do a deep dive on this and I think for our responses it would be best to put it on some public forum like Ethereum Magicians so that once you ask a question everyone can see the response and we can just point people to that forum if other people have similar questions"
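The "changing math" argument in the quote above can be made concrete with a small sketch. This is not ProgPoW's actual program generator: the op set, register count, program length, and the use of Python's `random` module are all assumptions chosen for illustration. Only the idea of regenerating the math sequence every fixed number of blocks (the quote mentions 25 or 50) comes from the discussion.

```python
import random

MASK32 = 0xFFFFFFFF

# Hypothetical op set; the real algorithm defines its own operations.
OPS = {
    "add": lambda a, b: (a + b) & MASK32,
    "mul": lambda a, b: (a * b) & MASK32,
    "xor": lambda a, b: a ^ b,
    "and": lambda a, b: a & b,
    "rotl": lambda a, b: ((a << (b % 32)) | (a >> (32 - b % 32))) & MASK32,
}

NUM_REGS = 8        # assumed register count
PROGRAM_LEN = 16    # assumed number of math instructions per program
PERIOD_BLOCKS = 50  # the quote mentions "every 25 or 50 blocks"


def generate_program(block_number):
    """Derive a pseudo-random instruction list from the block's period."""
    period = block_number // PERIOD_BLOCKS
    rng = random.Random(period)  # same period -> same program
    names = sorted(OPS)
    return [
        (rng.choice(names), rng.randrange(NUM_REGS),
         rng.randrange(NUM_REGS), rng.randrange(NUM_REGS))
        for _ in range(PROGRAM_LEN)
    ]


def run_program(program, regs):
    """Execute each (op, dst, src_a, src_b) tuple on a copy of the registers."""
    regs = list(regs)
    for op, dst, a, b in program:
        regs[dst] = OPS[op](regs[a], regs[b])
    return regs


if __name__ == "__main__":
    start = list(range(1, NUM_REGS + 1))
    for block in (0, 49, 50):
        print(block, run_program(generate_program(block), start))
```

Fixed-function hardware would need a datapath for every instruction ordering such a generator can emit, or a programmable unit with a register file, which is the trade-off the quote describes.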
Economics and can you build an ASIC for ProgPOW
Mr. Def "Right. To be clear on another point, we tried to make the algorithm as optimized as we could for the GPU, but it is true that the GPU is not the most optimized piece of hardware for it, simply because GPUs have floating point paths that are not really appropriate for cryptography, but that's only a small part of the silicon that's unused. There are other parts of the silicon, including display outputs and things like that, that of course are also unused.
In working and having the GPU-makers assess and review this algorithm the conclusion was that it's roughly 20% of the (GPU) area that would be unused (with ProgPOW) and it would not be a 20% power penalty but simply a 20% area penalty. Or, basically, an area savings that you could have if you stripped out all of the unnecessary bits of the GPU.
And then we also asked them to do an economic analysis of what that savings would be in terms of having an ASIC be more economically efficient (by) saving that silicon area. Online, you can look at die-area estimates and how much it would cost and if you look at GPUs that are most popular in the mining world today --I guess that's the 480/580 and the 106-- then it's roughly $50-$60 for a piece of silicon and you save roughly 20% of that, which is ~$10, and (then) the total manufacturing cost of the board, that's roughly $200, (so) you're really saving an insignificant amount of the total cost of the board.
So, yes, you can have a more custom hardware design for ProgPOW than GPUs and save some silicon-area but economically speaking it's not a significant impact to the economics where it would cause someone to go do a custom design especially given the amount of volume that GPU manufacturers have access to versus someone who would be doing custom design. The economic structure of doing an ASIC just would not be worth it.
There have also been other comments that we've seen where GPUs are moving further away from doing simple math and that might be true but at least in this generation, until we get to PoS, I think (ProgPOW) is a reasonable interim (solution) until PoS comes in."
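The area-savings arithmetic in the quote above can be checked directly. The figures (a $50-$60 die, 20% unused area, a roughly $200 board) are the ones stated in the quote; the function name is made up for this sketch.

```python
def asic_area_savings(die_cost, unused_fraction, board_cost):
    """Return (dollars saved, share of total board cost) from dropping unused area."""
    saved = die_cost * unused_fraction
    return saved, saved / board_cost


if __name__ == "__main__":
    for die_cost in (50.0, 60.0):  # die cost range quoted in the meeting
        saved, share = asic_area_savings(die_cost, 0.20, 200.0)
        print(f"die ${die_cost:.0f}: saves ${saved:.0f}, {share:.0%} of board cost")
```

With the quoted numbers the saving is about $10 to $12 per board, or 5 to 6 percent of the manufacturing cost. Note that this covers capital cost only; it says nothing about power, which is the point the original poster raised.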
Providing proof and benchmarks
Mr. Def "We reached out to some connections that we had. I don't think this information is public information however they advised that there are some very good reverse engineering analyses--already existing technical analyses--of this generation of silicon. Let me go and try to dig that up and see if I can point those out. I think, in general, I would expect that GPU manufacturers would not be that excited about doing detailed area analyses because they have competitive concerns about doing exact breakdowns which is why we ended up with a hand-wavey rough estimate."
Alexey Akhunov "What I would suggest, if it's possible: I've done some GPU programming myself years ago, and I know when you run some algorithms you can actually profile them and it shows you how much of the bandwidth you've consumed and how much of the register file you've consumed and how much of these operations and those operations--it would be nice if you could run that (so that we can) have some data to demonstrate that this algorithm is actually utilizing these resources in a GPU. Like let's say 'it's utilizing 90% of bandwidth'. Is it possible?"
Mr. Def "Yes. It's possible. I think that's a wonderful suggestion. Let me get on that and we'll have someone put that together."
Conversations with GPU manufacturers and confirming Mr. Def's assertions
Hudson Jameson "Yes. So right now we're keeping these conversations private because we want to respect the privacy of the manufacturers we're talking to but yes."
Lane Rettig "I was just wondering...if this has been part of that conversation already, but just getting some confirmation on the ideas that Mr. Def has shared with us here would be helpful."
Hudson Jameson "Absolutely. That's exactly why we're talking with them, so that we can come on one of the next calls and say 'we've confirmed what they're saying with the manufacturers'."
Thanks for the paste. I know I'm late to the game. Was keeping quiet about ASIC miners until very recently.
I don't agree with all that. We've been able to do clever things with other PoWs like RandomJS that purport to require general processing.
I apologize for shooting my mouth off prematurely. Someone on the Monero team asked me about ProgPoW in the context of RandomJS and I didn't study this hard enough before giving an opinion. I deleted both articles until I can discuss a few ideas with 7400 who is an elite chip designer.
Hello @timolson, thank you for your opinion about ProgPoW here.
I also disagree that ProgPoW will be a complete solution that makes ASICs obsolete forever.
Ethash can be modified a bit to make ASICs obsolete for the time being, but it can't be forever, because there will be FPGAs and then ASICs for any specific mining algorithm.
May I ask why you deleted your Medium posts about ProgPoW? I would like to read them but can't access the link.
I think @timolson is missing the point here ... Ethereum will be going to PoS, probably by 2020. Ethereum used Dagger-Hashimoto rather than another algo like scrypt or SHA-256 in order to KEEP ASICs off the network.
Right now there are lots of ASICs on the network. If ProgPoW is implemented it will ABSOLUTELY brick all existing ASICs, and even if it's not 100 percent or even 80 percent ASIC-resistant as you surmise, by the time new ASICs are 'possibly' developed (in quotes because I am still quite skeptical of your research on the matter)
it will be time for PoS anyway, which will most likely deter LARGE scale development from the Bitmains and Innosilicons.
That's why the focus should be on the implementation of ProgPoW, as it is already proven to work with Ethash on BCI coin.
All this FUD on the relative ASIC resistance or not of ProgPoW is for another, long-term discussion that is not even in the scope of this project, IMO.
I've finally made the FNV changes in the MIX parts.
I've analysed FNV, and the variant currently used in Ethash is deprecated.
Changing to a completely new PoW algorithm at this point would need cryptanalysis and a long verification period. It is not suitable for preventing ASICs because it would arrive too late!
TEthashV1 (Trust Ethash Version 1), by contrast, is mostly based on Ethash, which has been verified over several years. Only the FNV in the MIX part uses a deprecated implementation, so changing it gives a stronger PoW algorithm that is still based on Ethash.
It clearly makes the current ASIC miners obsolete too.
I have also been researching FPGA mining. Recent synthesis tools (e.g. from Xilinx or Altera) have tremendous optimization options and great synthesis abilities, so it is not possible to prevent ASICs entirely.
For the above reasons, regularly changing small parts of the algorithm may be a better way to resist centralized mining and hold off ASICs.
An ASIC needs at least 6 months for redesign and delivery of consumer products.
Another opinion of mine on Ethash is to reduce the 64 rounds of MIX to 32 rounds.
Then cheaper GPUs, or GPUs from some years ago, would also hash effectively, and many more people could participate in mining.
TEthashV1 above is now applied on
The go-ethash wrapper and the cpp ethash library have also been patched to the new TEthashV1 algorithm.
Please check it and send us your comments.
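For readers unfamiliar with the FNV point raised in this comment, here is a minimal sketch of the two orderings usually meant. The first function follows the `fnv` mixing step as defined in the Ethash specification (multiply, then XOR, on 32-bit words with no offset basis). The second is the FNV-1a ordering (XOR, then multiply). That TEthashV1 makes exactly this swap is an assumption; the comment does not spell out the change.

```python
FNV_PRIME = 0x01000193
MASK32 = 0xFFFFFFFF


def fnv_ethash(v1, v2):
    """Ethash-style mix: multiply first, then XOR (FNV-1 ordering)."""
    return ((v1 * FNV_PRIME) ^ v2) & MASK32


def fnv1a(h, d):
    """FNV-1a ordering: XOR first, then multiply."""
    return ((h ^ d) * FNV_PRIME) & MASK32


if __name__ == "__main__":
    # With a zero accumulator the first form passes the data word through
    # unchanged, while the second still diffuses it through the multiply.
    print(hex(fnv_ethash(0, 5)), hex(fnv1a(0, 5)))
```

In the multiply-then-XOR form the second operand never passes through the multiplication in that step, which is the usual reason the XOR-first ordering is considered the better mixer.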
@olalawal Hard forking of course breaks ASICs but that is not the point here. You could hard fork to any other PoW, but the question is: does ProgPoW do better than Ethash at making ASIC development economically nonviable? Otherwise you just end up with hard-fork hell like Monero.
@naikmyeong My original posts were misinformed and published without my usual standard of due diligence. I retract them immediately and completely as a matter of honor, with apologies to the ProgPoW team. The conclusions are not necessarily wrong, however...
ProgPoW is clearly expertly written, and it's a really good try at GPU-binding. But just because GPUs are fully utilized doesn't mean there couldn't be a more efficient chip.
I've consulted with my business partner 7400 who's a proper expert, and we do have some interesting ideas for improving on the power claims made by the ProgPoW team in the Readme and quoted chat. However, it's one of those things that we'd have to do quite a bit of implementation to really know the multiples we could get. Maybe ProgPoW is more ASIC-resistant than Ethash, maybe not.
Bottom line: we have some interesting ideas/tricks that may not have been considered, but we can't say for sure if they'd work well without putting a lot of effort into the project. So our position will have to remain undecided on the ASIC-resistance of ProgPoW. The research we'd need to do is not worth our effort unless ProgPoW is going to be part of a coin with big mining revenue.
You may use skeptical "scare quotes" after you do a synthesis of our CryptoNight ASIC design that's 5x better than the Bitmain X3. Use your favorite foundry libraries, and let me know the results. After doing that, you will have a right to slam our credentials if you still don't think we know what we're doing.
We appreciate the healthy skepticism. A lot of groups have made a lot of promises in this area that turned out to be unfounded. For those that haven't seen it please read our post here:
To address a point from your initial post, I'm not sure why you claim GPUs don't care about power. The entire computer chip industry has been power limited since the Pentium 4 and GeForce FX days. Recent systems like AMD's RX Vega M and Nvidia's Max-Q are all about maximizing GPU performance within limited power budgets. Going back a bit, Nvidia's Maxwell managed to be both faster and lower power than Kepler while using the same fab process.
We do completely agree with this post: https://medium.com/@timolsoncrypto/cryptonight-is-poison-ab598bfe2d2c
ProgPoW is designed to be as straightforward as possible with a near-optimal solution provided from day 1. The basic requirements of the ProgPoW algorithm are:
This directly maps as something that looks a lot like a GPU:
Looking at the profiler data from our post you can see that ProgPoW matches the throughput provided by current GPUs for these key portions. A ProgPoW ASIC would require a similar register file capacity and math/memory throughputs as those in a current GPU. There shouldn't be any fundamental difference in performance or power between an ASIC or a GPU reading data from a register file/memory, or executing in a programmable vector math unit.
You're right that the floating point logic within the GPU is not used, but I don't expect that to be a huge % of the GPU's area. In most modern chips logic that's not actively in use burns minimal power.
You're also right that an ASIC that only focused on ProgPoW could implement a handful of optimizations over a commodity GPU. Fixed function (ASIC) Keccak and Kiss99 implementations would reduce the power of those, but they're <7% of instructions executed. The merge() ops could be implemented as a single CISC instruction instead of the 2 RISC instructions it takes on a GPU, again a marginal power savings.
Using those types of optimizations our expectation is a ProgPoW ASIC could be ~1.2x better perf/watt. If you compare this to Ethash you can get 1.66x better perf/watt today using FPGA offloading. A proper Ethash ASIC should be >2x better perf/watt.
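To put those perf/watt multiples on a common footing, the short sketch below converts each one into the power saved at equal hashrate, using saving = 1 - 1/gain. The three multiples are the ones stated above (2.0 stands in for ">2x" as a lower bound); the function name and labels are made up for this example.

```python
def power_saving_at_equal_hashrate(perf_per_watt_gain):
    """Fraction of power saved when the hashrate is held constant."""
    if perf_per_watt_gain <= 0:
        raise ValueError("gain must be positive")
    return 1.0 - 1.0 / perf_per_watt_gain


CLAIMS = {
    "ProgPoW ASIC vs GPU (expected)": 1.2,
    "Ethash with FPGA offloading": 1.66,
    "Ethash ASIC (lower bound)": 2.0,
}

if __name__ == "__main__":
    for label, gain in CLAIMS.items():
        saving = power_saving_at_equal_hashrate(gain)
        print(f"{label}: {gain}x perf/watt -> {saving:.1%} less power")
```

On these figures a ProgPoW ASIC would draw roughly 17% less power than a GPU for the same hashrate, against roughly 40% for Ethash with FPGA offloading and at least 50% for an Ethash ASIC.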