New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hash Cache #8259

Closed
wants to merge 2 commits into
base: master
from

Conversation

Projects
None yet
7 participants
@NicolasDorier
Member

NicolasDorier commented Jun 24, 2016

Some notes about the implementation:

  • It calculates the three midstate hashes as soon as a CheckSig segwit operation happens, whathever the SigHash,
  • It is possible that two different threads calculate the midstate hashes of a transaction twice,

Befinits are:

  • hashcashes map access is limited so we don't have too much lock contention
  • Fewer conditional branches in consensus code
  • Simple to review

This commit is only for having a cache that is simple to review and understand. It is probably possible to fix the two first points above, but the code overhead is not worth it when our goal is only to fix the O(n²) issue.

(rebased version of sipa#70)

@NicolasDorier NicolasDorier referenced this pull request Jun 24, 2016

Closed

Cached Hashes #101

@NicolasDorier NicolasDorier force-pushed the NicolasDorier:cachedhashes branch from 95f8516 to c2ea4dd Jun 24, 2016

@jl2012

This comment has been minimized.

Member

jl2012 commented Jun 25, 2016

This one is hopefully merged for versions with segwit defined on mainnet

@NicolasDorier NicolasDorier changed the title from Cache hashes to Hash Cache Jun 25, 2016

bool TrySet(uint256 txId, const CachedHashes& hashes)
{
LOCK(cs);
if(map.count(txId))

This comment has been minimized.

@dcousens

dcousens Jun 27, 2016

Contributor

You could avoid two look ups (IIRC) by comparing the size of the container before and after instead.

Aka:

auto sizeBefore = map.size();
map.insert(txId, hashes);
return map.size() != sizeBefore;

This comment has been minimized.

@NicolasDorier

NicolasDorier Jun 27, 2016

Member

Actually, I don't think I really need TrySet to return a bool.

if(!map.count(txId))
return false;
*hashes = map[txId];
return true;

This comment has been minimized.

@dcousens

dcousens Jun 27, 2016

Contributor

Why not:

auto iter = map.find(txId);
if (iter == map.end()) return false;
*hashes = iter->second;
return true;

edit: fixed early-exit

This comment has been minimized.

@NicolasDorier

NicolasDorier Jun 27, 2016

Member

What do you mean by "fixed early-exit" ?
Yes, I think your proposition is better as it makes only one lookup.

This comment has been minimized.

@dcousens

dcousens Jun 28, 2016

Contributor

What do you mean by "fixed early-exit" ?

I accidentally inverted the early exit logic, I've since fixed it, but left the edit in case you read it in an email and followed up.

@dcousens

This comment has been minimized.

Contributor

dcousens commented Jun 27, 2016

hashcashes

hash caches?, had me confused for a second haha

utACK c2ea4dd

class CachedHashes
{
public:
uint256 hashPrevouts,hashSequence,hashOutputs;

This comment has been minimized.

@dcousens

dcousens Jun 27, 2016

Contributor

trivial: maybe spacing between names?

@@ -426,6 +429,7 @@ class CScriptCheck
std::swap(nFlags, check.nFlags);
std::swap(cacheStore, check.cacheStore);
std::swap(error, check.error);
std::swap(cachedHashesMap, check.cachedHashesMap);

This comment has been minimized.

@dcousens

dcousens Jun 27, 2016

Contributor

Would this need to enforce the lock on check?

This comment has been minimized.

@NicolasDorier

NicolasDorier Jun 27, 2016

Member

I don't understand, you mean the internal lock of cachedHashesMap ? no, the lock is only meant to protect the internal map, which is not accessed during the swap.

This comment has been minimized.

@dcousens

dcousens Jun 28, 2016

Contributor

My logic was based on the presence of the lock implies multi-threaded access, and this function implies the entire memory is being swapped when swap() is called.
To me, that would imply that it should be LOCK'd to avoid another thread reading it during this swap.

Perhaps it isn't needed for this case, but I just figured I'd ask.

This comment has been minimized.

@NicolasDorier

NicolasDorier Jun 28, 2016

Member

Ah I see. Well, I don't think we need it. The swap is always called before the closure is executed (which is only when we start having concurrent access)

@laanwj laanwj added the Validation label Jun 27, 2016

@NicolasDorier

This comment has been minimized.

Member

NicolasDorier commented Jun 28, 2016

@dcousens addressed your nits in dc188d8.

@dcousens

This comment has been minimized.

Contributor

dcousens commented Jun 29, 2016

utACK dc188d8

@@ -1110,35 +1110,46 @@ class CTransactionSignatureSerializer {
} // anon namespace
uint256 SignatureHash(const CScript& scriptCode, const CTransaction& txTo, unsigned int nIn, int nHashType, const CAmount& amount, SigVersion sigversion)
uint256 SignatureHash(const CScript& scriptCode, const CTransaction& txTo, unsigned int nIn, int nHashType, const CAmount& amount, SigVersion sigversion, CachedHashes* cache)

This comment has been minimized.

@jtimon

jtimon Jul 28, 2016

Member

Perhaps here we can do the same trick we did with the script/sigcache to keep script/interpreter simpler and more reusable.
ping @sipa

This comment has been minimized.

@jtimon

jtimon Jul 28, 2016

Member

Or maybe move SignatureHash to BaseSignatureChecker. Just brainstorming.

@sipa

This comment has been minimized.

Member

sipa commented Jul 28, 2016

Assume a transaction has many signatures. One is SIGHASH_SINGLE, all the others are SIGHASH_ALL | SIGHASH_ANYONECANPAY.

The first computes and stores hashPrevouts. All the others will compute hashOutputs. However, after the first call, all TrySets will not do anything, as there is already a result in the cache, so it gets computed over and over again.

I think CachedHashes needs a Merge method like this:

void Merge(const CachedHashes& hashes) {
    if (hashPrevouts.IsNull()) hashPrevouts = hashes.hashPrevout;
    if (hashSequence.IsNull()) hashSequence = hashes.hashSequence;
    if (hashOutputs.IsNull()) hashOutputs = hashes.hashOutputs;
}

which can then be called from TrySet (instead of insert, use map[txid].Merge(hashes)).

@NicolasDorier

This comment has been minimized.

Member

NicolasDorier commented Jul 28, 2016

so it gets computed over and over again

That's not true, I would be calculating the three hashes at the same time during the SIGHASH_SINGLE.
Take a look at the SignatureHash method, I changed it to calculate the three hashes aggressively.

I prefer not doing a merge. Without a merge my lock only have to protect the internal map as CachedHashes instances are read only. If we calculate the mid states lazily, I also need to be careful about locking at the CachedHashes instance level.

@sipa

This comment has been minimized.

Member

sipa commented Jul 28, 2016

@NicolasDorier Oh, I see. I agree that the current code is fine in that case.

I don't understand the argument about the lock. The Merge function would also grab the lock, and be the only code that touches the map.

@NicolasDorier

This comment has been minimized.

Member

NicolasDorier commented Jul 28, 2016

@sipa The Merge function as you did here can't grab the lock, because the lock is at the cache map level, not at the CachedHashes instance level.

But basically, if doing that way, I would need to grab the lock of the map around the Merge, as well as around any hash read of the HashedCaches instance. (so one can't read and call merge on the HashedCaches at the same time)
This would make lots of contention on a single lock. An alternative is to have one lock per HashedCaches... not sure if it is worth the complexity though, knowing that the only type of transaction which does not need the 3 midstate hashes are transaction which have no SIGHASH_ALL... this is very marginal case.

@NicolasDorier NicolasDorier referenced this pull request Jul 29, 2016

Closed

Cache hashes #8422

@NicolasDorier

This comment has been minimized.

Member

NicolasDorier commented Jul 29, 2016

Closing this one in favor of #8422 which calculate hashes lazily.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment