possible article of interest #18
Related article mentioned there: https://www.computer.org/csdl/proceedings/ispass/2017/3890/00/07975285.pdf
Using a well-tested and optimized benchmark suite to design a PoW certainly has its merits. However, if I were peer-reviewing this paper, I'd ask the authors to include more details about the runtimes of their 'widgets', especially in relation to PoW verification time. They mention a "running time on the order of a few seconds or less", which is certainly too high for practical use. Symmetric PoW algorithms must have run times on the order of milliseconds.
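To make the verification-time point concrete, here is a rough, hypothetical sketch (plain Python with SHA-256 as a stand-in, not anything from the paper) of why a symmetric PoW whose evaluation takes seconds is impractical: in a symmetric scheme the verifier re-runs the exact same computation, so every node pays the full cost for every proof it checks.

```python
import hashlib
import time

def symmetric_pow_verify(header: bytes, nonce: int, difficulty_bits: int, work_fn) -> bool:
    """Verify by re-running the same work the prover did (symmetric PoW)."""
    digest = work_fn(header + nonce.to_bytes(8, "little"))
    # Valid if the 256-bit digest has `difficulty_bits` leading zero bits.
    return int.from_bytes(digest, "big") >> (256 - difficulty_bits) == 0

def fast_work(data: bytes) -> bytes:
    # Millisecond-scale work: what a verifier can comfortably afford.
    return hashlib.sha256(data).digest()

def slow_work(data: bytes) -> bytes:
    # Stand-in for a widget with "running time on the order of a few seconds":
    # iterate the hash to stretch the runtime.
    for _ in range(2_000_000):
        data = hashlib.sha256(data).digest()
    return data

for name, fn in [("fast", fast_work), ("slow", slow_work)]:
    start = time.perf_counter()
    symmetric_pow_verify(b"block header", nonce=12345, difficulty_bits=8, work_fn=fn)
    print(f"{name} verification: {time.perf_counter() - start:.3f} s")
```

A chain of blocks multiplies that per-proof cost, which is why multi-second symmetric verification doesn't scale for nodes syncing from scratch.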
Reading the blurb and the beginning of the article, and skimming the rest, this is what I get. Optimize how? ASIC-like circuitry for specific cases? General optimizations for real-world workloads? Something in the microcode? Are the CPU manufacturers building their CPUs for real-life workloads, or are they cheating on a benchmark? (There is precedent with Nvidia tuning drivers for benchmarks: https://www.geek.com/games/futuremark-confirms-nvidia-is-cheating-in-benchmark-553361/.) Sure, it makes sense to have good benchmark scores, and there is a history of cheating on prevalent benchmarks, but the article does not show that this is happening.

Since SPEC CPU 2017 is a general benchmark designed to test different common workloads, one can imagine that a CPU will be capable of them in proportion to its computing power. A specific end-user workload, or a representation of it, can get its own ASIC circuit and can be massively parallelized. A CPU might have something that makes it 5% faster than the competition because of a benchmark (software) quirk somewhere, but it won't be prepared to handle a billion instances of it. (I am not familiar with the nature of the 43 benchmarks.)

The premise of the paper makes for an "interesting" story, but I think they start at too high a level of abstraction by looking at benchmarks. ASIC-resistant PoW, why? Custom ASICs (or even common Bitcoin ones) are restricted to a few manufacturers, and because there is money to be made there is little incentive to sell them to the public (hello, dusty ASICs being sold at exorbitant prices just as a new generation is coming out, in secret, that will make the last generation obsolete).

A CPU does something. You know what you can make it do, and you can measure how long it took and what resources it used. What are the smallest things you can make a CPU do? How much of the CPU does that use? What can you make the CPU do so that it uses as many resources as possible? Let's say that for a specific sequence of instructions there is dedicated circuitry or other magic on a CPU to boost some edge case in a benchmark or application (dubious). To get the benefit of those hidden optimizations you must use SPEC CPU 2017-like workloads, or else the CPU is simply good at the workload because of its prevalence. Is that a good PoW? Does that prevent ASIC optimizations to the highest degree?

This paper raises an interesting question (CPU and benchmark optimizations/cheating), doesn't answer it, makes a PoW out of it (you can make a PoW out of anything), and leaves it at that. Is it optimized? Is it ASIC-resistant? The special question is: is there magic sauce in CPUs that was put there to make them especially fast at certain tasks? If so, how flexible is that special sauce (useful leverage for an ASIC-resistant PoW?), or is it just for cheating on a benchmark? You'd start by testing a suspect benchmark on newer- and older-generation CPUs and seeing whether there is an uptick not explained by general IPC or frequency gains (a rough sketch of that check is below). If you find one, you deconstruct the program and figure out what it might be, then test mutations of it on other architectures.

A real-world workload is good for estimating real-world performance (bottlenecks in RAM, GPU, IO, under-utilization of hardware). A synthetic workload is good for measuring the performance of parts, or combinations of parts, of the CPU. An ASIC-resistant PoW will be a purpose-made workload, and synthetic in nature compared with all other tasks out there. It will have certain hardware parameters in mind and should be designed to use as much of the CPU as possible, and to be fine-grained about it. A real-world benchmark will remain in the real world. If a CPU can introduce a magic speedup for a piece of code, or pieces of code that match a pattern, couldn't an ASIC do the same?

Regarding speculative execution, branch prediction and the like (correct me if I'm wrong): these are useful when the CPU is under-utilized. A CPU can spend 30% of its resources now, while they aren't otherwise being used, to have 5% of the work done instantly, instead of spending 5% of resources and 5% of time later when they could be used for something else. If a PoW is using all or most of the CPU, or the circuits that would be doing the applicable prediction/speculation work, none of it will matter?
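Purely as an illustration of the "uptick not explained by IPC or frequency" check mentioned above, here's a tiny sketch; the function name and all the figures are hypothetical, not measurements of any real CPU or benchmark.

```python
# Hypothetical numbers, purely illustrative: screen for a benchmark-specific
# speedup that general frequency/IPC gains don't explain.

def residual_speedup(old_score: float, new_score: float,
                     freq_ratio: float, generic_ipc_ratio: float) -> float:
    """Speedup left over after accounting for clock and general IPC gains."""
    measured = new_score / old_score
    expected = freq_ratio * generic_ipc_ratio
    return measured / expected

# Example: the suspect benchmark runs 1.60x faster on the new CPU, but clocks
# are only 1.10x higher and a broad mix of other workloads shows 1.15x IPC gains.
r = residual_speedup(old_score=100, new_score=160,
                     freq_ratio=1.10, generic_ipc_ratio=1.15)
print(f"unexplained speedup factor: {r:.2f}")  # ~1.26 -> worth deconstructing
```

Anything meaningfully above 1.0 would be the signal to start deconstructing the program and testing mutations of it on other architectures.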
A question I've been pondering but can't begin to wrap my head around in any meaningful way. Is there an algorithm that:
Since it is a cat-and-mouse game, none of it is an immediate concern; there is lower-hanging fruit. Maybe it will become a whack-a-mole game that Monero can manage to play.
Funnily enough, they include randomX in their references. At least they've done some useful homework. And yes, CPU optimizations tend to be tailored to the SPEC benchmark, and a SPEC workload might represent a best case for a CPU workload. But the optimizations aren't always in the hardware - frequently they're just tricks built into the compiler. I already investigated this path when I first came up with randomJS. Also, their approach is necessarily tied to a specific ISA. I had this same realization before, and considered bundling gcc into the PoW to avoid being tied to a single ISA. It's really not that practical.
Now that I've read to the end of the paper... They actually include gcc in their PoW. This opens up a potential reproducibility hole across gcc versions and target platforms.
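To illustrate that reproducibility hazard, here's a small sketch (it assumes a local gcc on the PATH and is not taken from the paper) that hashes the object code gcc emits for a trivial function. The hash will generally differ across gcc versions, flag sets, and target platforms, so reaching consensus on such a value would require pinning the toolchain bit-for-bit.

```python
import hashlib
import os
import subprocess
import tempfile

# Minimal sketch of the reproducibility hazard: hash the machine code gcc emits
# for a trivial program. Different gcc versions, optimization flags, or target
# platforms will generally produce different bytes, so any PoW that consumes
# compiler output must pin the toolchain exactly.
src = "int f(int x) { return x * 41 + 7; }\n"

with tempfile.TemporaryDirectory() as tmp:
    c_path = os.path.join(tmp, "f.c")
    o_path = os.path.join(tmp, "f.o")
    with open(c_path, "w") as fh:
        fh.write(src)
    subprocess.run(["gcc", "-O2", "-c", c_path, "-o", o_path], check=True)
    with open(o_path, "rb") as fh:
        print(hashlib.sha256(fh.read()).hexdigest())
```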
https://arxiv.org/abs/1902.00112
"HashCore: Proof-of-Work Functions for General Purpose Processors"
posted by sarang in #monero-pow