Skip to content

Conversation

kderme
Copy link
Contributor

@kderme kderme commented Jul 9, 2020

@kderme kderme marked this pull request as draft July 9, 2020 14:15
@kderme kderme force-pushed the kderme/generalize-db-converter branch from 7fbcc60 to 97a6fda Compare July 9, 2020 14:22
@kderme kderme force-pushed the kderme/generalize-db-converter branch from 97a6fda to b7ed0ad Compare July 9, 2020 21:17
@kderme
Copy link
Contributor Author

kderme commented Jul 13, 2020

I tested this on a small part of mainnet and it seems to work ok. However I noticed the validation lasted way longer when the cardano flags where enabled (instead of simple Byron). Outputs with time(1):
Byron:

DB tip: At (Block {blockPointSlot = SlotNo {unSlotNo = 66130}, blockPointHash = ByronHash {unByronHash = 736e8429a6c547ff7a11d837c8f0695a271367a6538bee528faffdec3ff6cb90}})
16.44user 0.37system 0:16.81elapsed 100%CPU (0avgtext+0avgdata 162060maxresident)k
0inputs+5520outputs (0major+64089minor)pagefaults 0swaps

Cardano:

DB tip: At (Block {blockPointSlot = SlotNo {unSlotNo = 66130}, blockPointHash = "sn\132)\166\197G\255z\DC1\216\&7\200\240iZ'\DC3g\166S\139\238R\143\175\253\236?\246\203\144"})
206.96user 1.29system 3:28.27elapsed 99%CPU (0avgtext+0avgdata 182224maxresident)k
0inputs+5528outputs (0major+130727minor)pagefaults 0swaps

Notice 0:16.81 vs 3:28.27 !

@kderme kderme marked this pull request as ready for review July 13, 2020 06:39
@mrBliss
Copy link
Contributor

mrBliss commented Jul 13, 2020

I tested this on a small part of mainnet and it seems to work ok. However I noticed the validation lasted way longer when the cardano flags where enabled (instead of simple Byron). Outputs with time(1):
Byron:

DB tip: At (Block {blockPointSlot = SlotNo {unSlotNo = 66130}, blockPointHash = ByronHash {unByronHash = 736e8429a6c547ff7a11d837c8f0695a271367a6538bee528faffdec3ff6cb90}})
16.44user 0.37system 0:16.81elapsed 100%CPU (0avgtext+0avgdata 162060maxresident)k
0inputs+5520outputs (0major+64089minor)pagefaults 0swaps

Cardano:

DB tip: At (Block {blockPointSlot = SlotNo {unSlotNo = 66130}, blockPointHash = "sn\132)\166\197G\255z\DC1\216\&7\200\240iZ'\DC3g\166S\139\238R\143\175\253\236?\246\203\144"})
206.96user 1.29system 3:28.27elapsed 99%CPU (0avgtext+0avgdata 182224maxresident)k
0inputs+5528outputs (0major+130727minor)pagefaults 0swaps

Notice 0:16.81 vs 3:28.27 !

This is worrying! This is reproducible, right? Not just a fluke?

Can you try replacing the last Nothing argument to protocolInfoCardano with Just 190 and see whether that has an effect?

The next step would be to start profiling.

mrBliss added a commit that referenced this pull request Jul 13, 2020
In #2375, we discovered that validating a Byron chain took 10x as long using
`CardanoBlock` than using `ByronBlock`.

Looking at the profiling report, we see it being dominated by:

```
exp' src/Shelley/Spec/NonIntegral.hs:(171,1)-(175,43) 52.1 56.7
ln'  src/Shelley/Spec/NonIntegral.hs:(159,1)-(162,27) 22.7 26.1
```

Which is strange, as we're validating a Byron-only chain! It turns out that
these two functions are called by `mkActiveSlotCoeff`, which is called as part
of `mkShelleyLedgerConfig`, which we are calling in `completeLedgerConfig`. This
happens for *each* block.

To fix this, we replace the partial ledger config of Shelley with a non-partial
one containing a dummy `EpochInfo Identity`. In `completeLedgerConfig` we
replace the dummy `EpochInfo Identity` with the correct one. This means we only
call `mkShelleyLedgerConfig` once and cache its result.
@mrBliss
Copy link
Contributor

mrBliss commented Jul 13, 2020

I tested this on a small part of mainnet and it seems to work ok. However I noticed the validation lasted way longer when the cardano flags where enabled (instead of simple Byron). Outputs with time(1):
Byron:

DB tip: At (Block {blockPointSlot = SlotNo {unSlotNo = 66130}, blockPointHash = ByronHash {unByronHash = 736e8429a6c547ff7a11d837c8f0695a271367a6538bee528faffdec3ff6cb90}})
16.44user 0.37system 0:16.81elapsed 100%CPU (0avgtext+0avgdata 162060maxresident)k
0inputs+5520outputs (0major+64089minor)pagefaults 0swaps

Cardano:

DB tip: At (Block {blockPointSlot = SlotNo {unSlotNo = 66130}, blockPointHash = "sn\132)\166\197G\255z\DC1\216\&7\200\240iZ'\DC3g\166S\139\238R\143\175\253\236?\246\203\144"})
206.96user 1.29system 3:28.27elapsed 99%CPU (0avgtext+0avgdata 182224maxresident)k
0inputs+5528outputs (0major+130727minor)pagefaults 0swaps

Notice 0:16.81 vs 3:28.27 !

This is worrying! This is reproducible, right? Not just a fluke?

Can you try replacing the last Nothing argument to protocolInfoCardano with Just 190 and see whether that has an effect?

The next step would be to start profiling.

We found the cause. See #2390 for a fix.

mrBliss added a commit that referenced this pull request Jul 13, 2020
In #2375, we discovered that validating a Byron chain took 10x as long using
`CardanoBlock` than using `ByronBlock`.

Looking at the profiling report, we see it being dominated by:

```
exp' src/Shelley/Spec/NonIntegral.hs:(171,1)-(175,43) 52.1 56.7
ln'  src/Shelley/Spec/NonIntegral.hs:(159,1)-(162,27) 22.7 26.1
```

Which is strange, as we're validating a Byron-only chain! It turns out that
these two functions are called by `mkActiveSlotCoeff`, which is called as part
of `mkShelleyLedgerConfig`, which we are calling in `completeLedgerConfig`. This
happens for *each* block.

To fix this, we replace the partial ledger config of Shelley with a non-partial
one containing a dummy `EpochInfo Identity`. In `completeLedgerConfig` we
replace the dummy `EpochInfo Identity` with the correct one. This means we only
call `mkShelleyLedgerConfig` once and cache its result.
iohk-bors bot added a commit that referenced this pull request Jul 13, 2020
2390: Avoid recomputing the ShelleyLedgerConfig r=mrBliss a=mrBliss

In #2375, we discovered that validating a Byron chain took 10x as long using
`CardanoBlock` than using `ByronBlock`.

Looking at the profiling report, we see it being dominated by:

```
exp' src/Shelley/Spec/NonIntegral.hs:(171,1)-(175,43) 52.1 56.7
ln'  src/Shelley/Spec/NonIntegral.hs:(159,1)-(162,27) 22.7 26.1
```

Which is strange, as we're validating a Byron-only chain! It turns out that
these two functions are called by `mkActiveSlotCoeff`, which is called as part
of `mkShelleyLedgerConfig`, which we are calling in `completeLedgerConfig`. This
happens for *each* block.

To fix this, we replace the partial ledger config of Shelley with a non-partial
one containing a dummy `EpochInfo Identity`. In `completeLedgerConfig` we
replace the dummy `EpochInfo Identity` with the correct one. This means we only
call `mkShelleyLedgerConfig` once and cache its result.

Co-authored-by: Thomas Winant <thomas@well-typed.com>
@kderme kderme force-pushed the kderme/generalize-db-converter branch from 10be7d5 to 0a8f3e2 Compare July 13, 2020 18:14
@kderme kderme force-pushed the kderme/generalize-db-converter branch from 0a8f3e2 to 4b79937 Compare July 14, 2020 11:49
@mrBliss
Copy link
Contributor

mrBliss commented Jul 14, 2020

bors merge

@mrBliss
Copy link
Contributor

mrBliss commented Jul 14, 2020

bors cancel

@mrBliss
Copy link
Contributor

mrBliss commented Jul 14, 2020

I just realised this will break https://github.com/input-output-hk/ouroboros-network/blob/master/nix/validate-mainnet.nix

We should update that too.

@iohk-bors
Copy link
Contributor

iohk-bors bot commented Jul 14, 2020

@iohk-bors iohk-bors bot merged commit d016025 into master Jul 14, 2020
@iohk-bors iohk-bors bot deleted the kderme/generalize-db-converter branch July 14, 2020 12:19
@kderme kderme mentioned this pull request Jul 14, 2020
iohk-bors bot added a commit that referenced this pull request Jul 14, 2020
2400: Fix validate mainnet job r=kderme a=kderme

#2375 broke a nightly job that should be fixed.

Co-authored-by: kderme <k.dermenz@gmail.com>
@mrBliss mrBliss mentioned this pull request Jul 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants