-
Notifications
You must be signed in to change notification settings - Fork 19.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pow: pure go ethash #3750
pow: pure go ethash #3750
Conversation
9f8ff72
to
7b0e06f
Compare
crypto/crypto.go
Outdated
return result | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is easy to implement on top of hash.Hash
, there is no need to add specialized variants to crypto/sha3.
import "hash"
func NewHasher(h hash.Hash) func ([]byte) []byte {
return func (data []byte) []byte {
h.Write(data)
result := h.Sum(nil)
h.Reset()
return result
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe even
import "hash"
func NewHasher(h hash.Hash) func ([]byte) []byte {
result := make([]byte, h.Size())
return func (data []byte) []byte {
h.Write(data)
h.Sum(result[:0])
h.Reset()
return result
}
}
// You should have received a copy of the GNU Lesser General Public License | ||
// along with the go-ethereum library. If not, see <http://www.gnu.org/licenses/>. | ||
|
||
package pow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move ethash to pow/ethash. We can remove package pow if you want or keep it around for the interface (and FakePow). I'm suggesting this because having ethash here adds a ton of dependencies to the pow package, including "unsafe".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already moved it in my PoA PR, all of it into a consensus
package, and ethash
and rinkeby
into their own packages. If it's ok, let's leave it to that PR to avoid a merge nightmare for me :D
pow/ethash.go
Outdated
logger.Info("Loading ethash DAG from disk") | ||
start := time.Now() | ||
d.dataset = prepare(dsize, bufio.NewReader(dump)) | ||
logger.Info("Loaded ethash DAG from disk", "elapsed", common.PrettyDuration(time.Since(start))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to mmap DAGs and caches. mmap'ing means that the memory used by loaded DAGs doesn't count as used and is not tracked by GC.
Changing to mmap might mean that we need to store DAGs with native endianness. Your code translates everything from LE when loading. Doing this would destroy any advantages of mmap because the whole DAG would be paged in when touching it.
Another advantage would be that memory errors can be caught better. The libethash Go wrapper panics if mmap fails, leading to the infamous reports that you have marked as fixed in the description. These issues happen if the OS doesn't allow mmap of the file. If you mmap the file from Go, errors can be reported and overallocations (e.g. on 32bit) won't crash the process.
@fjl PTAL. Full mmap support done. |
PS: Please don't squash when merging. I'd like to retain the commits mostly as is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only minor issues remain.
While trying this out I noticed that logging is super noisy:
DEBUG[03-07|23:36:53] Evicted ethash cache epoch=1 used=2017-03-07T23:34:21+0100
DEBUG[03-07|23:36:53] Using pre-generated cache epoch=3
DEBUG[03-07|23:36:53] Requiring new future ethash cache epoch=4
DEBUG[03-07|23:36:53] Failed to load old ethash cache seed=0xb903bd7696740696b2b18bd1096a2873bb8ad0c2e7f25b00a0431014edb3f539 err="open /Users/fjl/Library/Ethereum/testnet/geth/ethash/cache-R23-b903bd7696740696b2b18bd1096a2873bb8ad0c2e7f25b00a0431014edb3f539.le: no such file or directory"
DEBUG[03-07|23:36:53] Generating ethash verification cache seed=0xb903bd7696740696b2b18bd1096a2873bb8ad0c2e7f25b00a0431014edb3f539
INFO [03-07|23:36:55] Generated ethash verification cache seed=0xb903bd7696740696b2b18bd1096a2873bb8ad0c2e7f25b00a0431014edb3f539 elapsed=1.646s
IMHO there shouldn't be an INFO-level message for every new cache. Please move it to DEBUG.
It would be nice if the "Generated..." message had the epoch instead of the seed hash (it's much easier to see what's going on).
The "Generating..." message can be removed.
"Using pre-generated...", "Requiring new future..." should be at TRACE level.
type ChainManager interface { | ||
GetBlockByNumber(uint64) *types.Block | ||
CurrentBlock() *types.Block | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can delete this interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already down in my PoA PR which cleans up the package as a whole (also dropping Blocks). Will rather postpone to that.
pow/ethash.go
Outdated
if err != nil { | ||
logger.Error("Failed to memory map ethash cache", "err", err) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unsafe if multiple processes try to generate the same DAG. You can work around it by creating the file with a temporary name first (e.g. .full-R23-hash.<pid>.tmp
) then moving it into place when the content is valid. AFAIK there is no need to remap the file after the move (not sure about Windows though).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for caches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A follow-up question would be how those .tmp
would get deleted after a process crash. Maybe leave a TODO comment about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test succeeds with nprocs=1 but fails with nprocs=3:
func TestDiskCacheConcurrent(t *testing.T) {
tmpdir, err := ioutil.TempDir("", "ethash-test")
if err != nil {
t.Fatal(err)
}
block := types.NewBlockWithHeader(&types.Header{
Number: big.NewInt(3311058), // Ethereum main net
ParentHash: common.HexToHash("0xd783efa4d392943503f28438ad5830b2d5964696ffc285f338585e9fe0a37a05"),
UncleHash: common.HexToHash("0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347"),
Coinbase: common.HexToAddress("0xc0ea08a2d404d3172d2add29a45be56da40e2949"),
Root: common.HexToHash("0x77d14e10470b5850332524f8cd6f69ad21f070ce92dca33ab2858300242ef2f1"),
TxHash: common.HexToHash("0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421"),
ReceiptHash: common.HexToHash("0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421"),
Difficulty: big.NewInt(167925187834220),
GasLimit: big.NewInt(4015682),
GasUsed: big.NewInt(0),
Time: big.NewInt(1488928920),
Extra: []byte("www.bw.com"),
MixDigest: common.HexToHash("0x3e140b0784516af5e5ec6730f2fb20cca22f32be399b9e4ad77d32541f798cd0"),
Nonce: types.EncodeNonce(0xf400cd0006070c49),
})
var wg sync.WaitGroup
nprocs := 3
wg.Add(nprocs)
for i := 0; i < nprocs; i++ {
go func() {
defer wg.Done()
ethash := NewFullEthash(tmpdir, 0, 1, "", 0, 0)
if err := ethash.Verify(block); err != nil {
t.Error(err)
}
}()
}
wg.Wait()
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the test. Albeit I figured caches won't conflict since they go into different folders and I don't expect people to mine with multi processes (especially with dag rotation). The problem nonetheless stands. Will try to fix.
A side effect is that if geth crashes (terminates) while generating a DAG/cache, currently that will corrupt it and we'll never know. So this is actually a better solution.
@fjl PTAL I've implemented the generate-rename-open mechanism. This passes your test that generate the same things concurrently, at it should also solve issues if a generation is aborted midway through (i.e. no junk will be left there in the main file to be loaded the next time). I also modified the log levels a bit. They now report epochs instead of seeds. Dropped the "generating...", and modified the "generated" so that it's only displayed if it took more than 3 seconds. The rationale is that embedded devices which might take 1+ minutes to generate a mainned dag should not just hang with no output. Btw, a more elegant solution to logging the epochs inside generateCache and generateDataset would be to pass in a logger instance that already has the epoch, instead of passing the epoch just to make a logger out of it. Unfortunately this screws up Go's optimizer/inliner/something, as the performance drops by 50%. Making the logger inside is fast. |
pow/ethash.go
Outdated
|
||
os.Rename(temp, path) | ||
|
||
d.dump, d.mmap, d.dataset, err = memoryMap(path, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the mapping stay established? It should work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move the memory mapping logic into its own function, e.g.
generateMappedFile(dir, name, size, func (dataset []uint32) { generateDataset(dataset, d.epoch, cache) })
This way it can be reused for both cache and DAG.
@fjl Done, PTAL |
pow/ethash.go
Outdated
if write { | ||
file, err = os.OpenFile(path, os.O_RDWR, os.ModePerm) | ||
} else { | ||
file, err = os.OpenFile(path, os.O_RDONLY, os.ModePerm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
os.ModePerm
is supposed to be used as a mask. Its value is 0777
, all files will be created with read/write/execute permission for all users. It would be better to use a sane mode like 0644
.
pow/ethash.go
Outdated
if err != nil { | ||
file.Close() | ||
return nil, nil, nil, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic is more complicated than it needs to be. You can just do:
flag, mapflag := os.O_RDONLY, mmap.RDONLY
if write {
flag, mapflag = os.O_RDWR, mmap.RDWR
}
file, err := os.OpenFile(path, flag, 0644)
if err != nil {
return nil, nil, nil, err
}
mem, err := mmap.Map(file, mapflag, 0)
if err != nil {
file.Close()
return nil, nil, nil, err
}
@fjl Done (squashed the last few commits), PTAL |
pow/ethash.go
Outdated
return nil, nil, nil, err | ||
} | ||
generator(buffer) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add an error-checked call to mem.Flush()
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. PTAL
pow/ethash.go
Outdated
if err = dump.Truncate(int64(size)); err != nil { | ||
return nil, nil, nil, err | ||
} | ||
dump.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I look at this again, it seems weird that you're closing the file and reopen it just below. You can reuse the file that's already open.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. PTAL
pow/ethash.go
Outdated
if err != nil { | ||
logger.Error("Failed generate mapped ethash dataset", "err", err) | ||
|
||
d.dataset = make([]uint32, dsize/2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I like how errors are handled here. If mmap fails because of an out of memory condition, this allocation will crash the process. We could avoid that by always using mmap. If no file is needed (path is "") the mmap call can use the ANON option to allocate a non-file-backed buffer. If that fails, Search/Verify simply can't do any work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
People will not use in memory DAGs anyway, this is only useful for testing. And for that we don't really care.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that it will always try generate it in memory if mmap fails. This means that OOM crashes linked in the PR description are not fixed at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I'm trying to say: Go handles allocation failures by crashing. This is usually OK, but hurts if the program deals with large buffers like we do here. Using mmap means we can handle allocation errors gracefully, without crashing, regardless of whether the memory is file-backed or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would require extra code to track that generation failed and handle that in Search/Verify. Given that there's no meaningful way to get out of this problem, we might as well crash :P
Will merge when CI is done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The file names look compatible now. DAG for epoch 0 is identical to the one generated by libethash.
Acknowledgements
This PR replaces the C++ ethash implementation from https://github.com/ethereum/ethash with a custom one fully written in Go. The reasons for doing this are numerous.
Issues with C++ ethash
Features enabled by Go
Performance
850ms800ms. However with the disk cache extension loading it form there (which happens always apart from the very first startup)takes 100msis instantaneous due to memory mapping.4.6s0. The cache for block 2.6M takes 1m7s to generate butonly 12s to loadstill memory maps instantly.1.6ms1.4ms. The comparison is not fully fair because I didn't write a full benchmark for C++.ethminer
to have a hashrate of 200-250KH/s. Measuring the hashrate of the Go implementation (via gometrics) resulted in 240-250KH/s.Fixes
ethash.dasondisk
flag.ethash.dagdir
flag.Extras
The PR introduces the following CLI flags (and eth.Config varialbes):
Notes
cachedir == "ethash"
,cachesinmem == 2
andcachesondisk == 3
on mobile. This should be a sane default. We can surface these as parameters later, but i'd not go there for now.