-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Epoch boundary fix #1707
Epoch boundary fix #1707
Conversation
Map.fromListWith (+) (map (\(_, TxOut a c) -> (a, c)) $ Map.toList u) | ||
-- Map.fromListWith (+) (map (\(_, TxOut a c) -> (a, c)) $ Map.toList u) | ||
Map.foldr accum Map.empty u | ||
where accum (TxOut a c) ans = Map.insertWith (+) a c ans |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you're no longer using aggregateOuts
at all, so the comment below may no longer be relevant:
When looking at the profiteur output (attached) of cardano-node
doing a full sync with MC1, you can drill down on the top left corner until you get to aggregateOuts
(38.6% of the total time), when you go deeper, you'll see that 24.9% of the total time is spent in deserialiseAddr
. That's because pattern matching on a TxOut
, which is a pattern synonym for a TxOutCompact
, deserialises the address (from a ShortByteString
):
pattern TxOut ::
Crypto crypto =>
Addr crypto ->
Coin ->
TxOut crypto
pattern TxOut addr coin <-
(viewCompactTxOut -> (addr, coin))
where
TxOut addr (Coin coin) =
TxOutCompact (BSS.toShort $ serialiseAddr addr) (fromIntegral coin)
viewCompactTxOut :: forall crypto. Crypto crypto => TxOut crypto -> (Addr crypto, Coin)
viewCompactTxOut (TxOutCompact bs c) = (addr, coin)
where
addr = case deserialiseAddr (BSS.fromShort bs) of
Nothing -> panic "viewCompactTxOut: impossible"
Just (a :: Addr crypto) -> a
coin = word64ToCoin c
Idea: wouldn't it be more efficient to keep using the ShortByteString
s as keys in the map while doing Map.insertWith (+)
and afterwards converting the keys in the final map from ShortByteString
s to Addr
s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an excellent observation. But the problem really has two components. The observation is that
(TxOut addr (Coin coin)) is a pattern and the pattern has to deserializes the whole address,
to get the part of the address that holds the credential. It turns out there is a second problem that
that makes this really slow. The 'addr' is stored as a ShortByteString, and that has to be converted
to a ByteString before it can be deserialized. The solution is to deserialize directly from the ShortByteString, and only look
at the Credential part of the address. We implemented this, and created some benchmarks, The
new code is 10x faster than the old code.
case Map.lookup p ptrs of | ||
Just cred -> (cred,c):ans | ||
Nothing -> ans | ||
accum _other ans = ans |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here we skip the AddrBootstrap
case:
data Addr crypto
= Addr !Network !(PaymentCredential crypto) !(StakeReference crypto)
| AddrBootstrap !(BootstrapAddress crypto)
All translated Byron addresses are AddrBootstrap
's, so (at least in the beginning) the majority of all Addr
s will be AddrBootstrap
s. In other words, we're wasting most of our time deserialising (see my other comment about deserialiseAddr
) Byron addresses that we'll ignore anyway.
Idea: in accum
:
- Pattern match on the
TxOutCompact
constructor to avoid callingdeserialiseAddr
unconditionally for each address (ShortByteString
). - Write a function
deserialiseStakeReference
(or a better name) with as type:ByteString -> Maybe (StakeReference crypto)
which returnsNothing
forAddrBootstrap
(just test a bit to see whether it's Byron and then skip it) and only deserialises theStakeReference
in case it's a ShelleyAddr
(you could deserialise theNetwork
andPaymentCredential
as well if needed elsewhere). Maybe throw in anEither
there, because we're conflating "deserialisation failure" with "not a Shelley Addr". - Pattern match on the result of calling
deserialiseStakeReference
on theShortByteString
argument toTxOutCompact
to get the same three cases you currently have inaccum
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So all the above comments have been explored. I wrote a dozen different algorithms to try and make aggregateOuts (and its successors) faster. All of them were benchmarked and the fasted were wriiten into the latest version. We ran all the tests and all the benchmarks, every thig is working.
I think something went wrong rebasing this branch (was master merged in this branch?). It makes it hard to look at the actual changes of this PR. |
…places aggregateOuts (EpochBoundary.hs) and pulls the computation of a constant out of the inner loop of a list comprehension (Rewards.hs) (Really from Tim Sheard)
0718eb4
to
21e9781
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@mrBliss I'm sorry if I merged this before you had a chance to review after I cleaned up the PR. |
@@ -880,7 +903,7 @@ data Query k v where | |||
-- ====================================================================================== | |||
|
|||
smart :: Bool | |||
smart = True -- for debugging purposes, this can be set to False, in which case no rewrites occurr. | |||
smart = False -- True -- for debugging purposes, this can be set to False, in which case no rewrites occurr. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentional? This change is still there in master.
mRewards = | ||
Map.fromList | ||
[ ( hk, | ||
memberRew poolR pool (StakeShare (fromIntegral c % tot)) (StakeShare sigma) | ||
) | ||
| (hk, Coin c) <- Map.toList stake, | ||
hk `Set.notMember` (KeyHashObj `Set.map` _poolOwners pool) | ||
hk `Set.notMember` poolHashes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we avoid this O(n) operation entirely:
poolHashes = (KeyHashObj `Set.map` _poolOwners pool)
We just need to check if kh
is in poolOwners pool
, something like
notPoolOwner (KeyHashObj hk) = hk `Set.notMember` _poolOwners pool
notPoolOwner (ScriptHashObj hk) = False
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, notPoolOwner
is all we need
Tries to fix a bottle neck at epoch boundaries. Two small changes. Replaces aggregateOuts (EpochBoundary.hs) and
pulls the computation of a constant out of the inner loop of a list comprehension (Rewards.hs)