Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate anti-diff prototype in consensus packages #4036

Merged
merged 12 commits into from
Oct 14, 2022

Conversation

jorisdral
Copy link
Contributor

@jorisdral jorisdral commented Sep 22, 2022

Description

Resolves #4010. This is a follow-up PR to #3997.

This PR integrates the anti-diff diff sequences in consensus. In addition to integration, this PR includes two major changes:

  • Anti-diff invariants are now enforced through non-empty sequences.
  • The DiffSeq datatype now checks more slot-related invariants.

The need for these changes only became apparent during integration. The invariants are now more heavily enforced, and should prevent misuse of the datatypes.

My advice is to look at the changes on a per-commit basis, which should make the diffs easier to parse.

Checklist

  • Branch
    • Commit sequence broadly makes sense
    • Commits have useful messages
    • New tests are added if needed and existing tests are updated
    • If this branch changes Consensus and has any consequences for downstream repositories or end users, said changes must be documented in interface-CHANGELOG.md
    • If this branch changes Network and has any consequences for downstream repositories or end users, said changes must be documented in interface-CHANGELOG.md
    • If serialization changes, user-facing consequences (e.g. replay from genesis) are confirmed to be intentional.
  • Pull Request
    • Self-reviewed the diff
    • Useful pull request description at least containing the following information:
      • What does this PR change?
      • Why these changes were needed?
      • How does this affect downstream repositories and/or end-users?
      • Which ticket does this PR close (if any)? If it does, is it linked?
    • Reviewer requested

Parts of the various `consensus` packages need to access the internals
of the `Diff` datatype directly, so these packages would rely on the
`Data.Map.Diff.Strict.Internal` module. However, relying on the internal
module could break the invariant that diff histories should be
non-empty, which is essential if we want `Group` type class laws to hold
for the `Diff` datatype. This commit refactors diff histories such that
they are now non-empty by construction. That is, a diff history is now a
thin wrapper around a non-empty sequence.
The internal module has become obsolete, so we move all its contents
into its parent module.
An important invariant for the `DiffSeq` datatype is that the slot
numbers of consecutive elements are strictly increasing. This invariant
was not being enforced before, but should be harder to break after these
changes because the datatype now keeps track of more slot information,
and performs more checks to ensure the invariant.

Note: The `UnsafeDiffSeq` constructor is exported, so in theory the
invariant can be broken anyway because client code can manipulate the
underlying `StrictFingerTree`. However, the constructor name warns
client code that its use could be unsafe.
@jorisdral jorisdral requested a review from jasagredo September 22, 2022 11:25
@jorisdral jorisdral self-assigned this Sep 22, 2022
@jorisdral jorisdral added consensus issues related to ouroboros-consensus optimisation Performance optimisation UTxO-HD 📒💽 labels Sep 22, 2022
@jorisdral jorisdral linked an issue Sep 22, 2022 that may be closed by this pull request
@jorisdral jorisdral requested review from dnadales and removed request for dnadales and jasagredo September 22, 2022 11:34
Diff calculation did not check if values changed when a key is in
both maps that are being diffed. This means that it was detecting an
update, while in fact the value had not changed.
Scrutinous applying of diffs entails that we perform sanity checks in
the diff application function. However, values can be associated with
different eras inside consensus (e.g., a Shelley UTxO vs. an Allegra
UTxO). This causes errors, because some of this era information is lost
in translation steps, causing sanity checks to fail while they logically
should not. For now, we disable the sanity checks, and leave enabling
the sanity checks as a TODO/FIXME.
@jorisdral jorisdral marked this pull request as ready for review September 30, 2022 12:47
@jasagredo
Copy link
Contributor

Linked #3943 as WithoutLedgerTables becomes obsolete once this is merged

Copy link
Contributor

@jasagredo jasagredo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in general concerned that we are loosing the previous implementation. We might as well just delete the HD module right?

Would it be possible to unify the interface and define a CPP flag that switches implementations? Is it even worth? Ping @dnadales

@@ -375,14 +376,14 @@ class ( ShowLedgerState (LedgerTables l)

pureLedgerTables ::
(forall k v.
Ord k
(Ord k, Eq v)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure why are these Eq constraints propagated if we are not comparing the values in the end?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I will remove them again.

This might be a bit of tangent, but I was wondering if it is necessary to have this Rank-2 type here, and why we could not have k and v mentioned in the header of TableStuff. It's somewhat cumbersome that we have to add new constraints on k and v in pureLedgerTables every time we want to apply pureLedgerTables to some new function that places these additional constraints on k and v. Would that be an alternative, or would other parts of the code break?

Copy link
Contributor Author

@jorisdral jorisdral Oct 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I will remove them again.

I was wrong. The constraints are there because there are other anti-diff functions that require the equality constraint, such as diff and the Semigroup, Monoid and Group instance functions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can remove some of the Eq constraints. For instance, does the patch below make sense? (ouroboros-consensus compiles with it, but I haven't tried the rest of the code, however, the question is rather about the purpose of the Eq constraint).

diff --git a/anti-diff/src/Data/Map/Diff/Strict.hs b/anti-diff/src/Data/Map/Diff/Strict.hs
index 0ea9cf480..e63cf065b 100644
--- a/anti-diff/src/Data/Map/Diff/Strict.hs
+++ b/anti-diff/src/Data/Map/Diff/Strict.hs
@@ -88,7 +88,7 @@ import           NoThunks.Class (NoThunks (..), noThunksInValues)
 -- @'DiffHistory'@ is non-empty. This prevents the @'Map'@ from getting bloated with
 -- empty diff histories.
 newtype Diff k v = Diff (Map k (NEDiffHistory v))
-  deriving stock (Generic, Show, Eq, Functor)
+  deriving stock (Generic, Show, Functor)
   deriving anyclass (NoThunks)
 
 -- | A history of changes to a value in a key-value store.
@@ -101,20 +101,20 @@ newtype UnsafeDiffHistory t v = UnsafeDiffHistory (t (DiffEntry v))
 
 deriving stock instance Show v => Show (UnsafeDiffHistory Seq v)
 deriving stock instance Show v => Show (UnsafeDiffHistory NESeq v)
-deriving stock instance Eq v => Eq (UnsafeDiffHistory Seq v)
-deriving stock instance Eq v => Eq (UnsafeDiffHistory NESeq v)
+-- deriving stock instance Eq v => Eq (UnsafeDiffHistory Seq v)
+-- deriving stock instance Eq v => Eq (UnsafeDiffHistory NESeq v)
 
 {-# COMPLETE DiffHistory #-}
 
 newtype DiffHistory v = MkDiffHistory (UnsafeDiffHistory Seq v)
-  deriving stock (Generic, Show, Eq, Functor)
+  deriving stock (Generic, Show, Functor)
   deriving newtype (NoThunks, Foldable)
 
 {-# COMPLETE NEDiffHistory #-}
 
 -- | A non-empty diff history.
 newtype NEDiffHistory v = MkNEDiffHistory (UnsafeDiffHistory NESeq v)
-  deriving stock (Generic, Show, Eq, Functor)
+  deriving stock (Generic, Show, Functor)
   deriving newtype (NoThunks, Foldable)
 
 pattern DiffHistory :: Seq (DiffEntry v) -> DiffHistory v
@@ -142,7 +142,7 @@ data DiffEntry v =
     | Delete !v
     | UnsafeAntiInsert !v
     | UnsafeAntiDelete !v
-  deriving stock (Generic, Show, Eq, Functor, Foldable)
+  deriving stock (Generic, Show, Functor, Foldable)
   deriving anyclass (NoThunks)
 
 {------------------------------------------------------------------------------
@@ -290,8 +290,17 @@ invertDiffEntry = \case
 --
 -- For simplicity, we simply compare the inverse of the first argument to the
 -- second argument. That is, inversion should be invertible.
+--
+-- TODO: we only use this in teststing code it seems. I wonder if we can make
+-- this def local.
 areInverses :: Eq v => DiffEntry v -> DiffEntry v -> Bool
-areInverses e1 e2 = invertDiffEntry e1 == e2
+areInverses (Insert x) (UnsafeAntiInsert y) = x == y
+  -- Delete x           -> UnsafeAntiDelete x
+  -- UnsafeAntiInsert x -> Insert x
+  -- UnsafeAntiDelete x -> Delete x
+
+
+-- e1 e2 = invertDiffEntry e1 == e2
 
 {------------------------------------------------------------------------------
   Values and keys
@@ -399,7 +408,7 @@ data ApplyDiffError k v =
   | BadDelMatchingKey k v v (NEDiffHistory v)
   | BadDelInsMatchingKey k v v v (NEDiffHistory v)
   | InsDelMatchingKey k v (NEDiffHistory v)
-  deriving (Show, Eq)
+  deriving (Show)
 
 -- | Applies a diff to values, performs sanity checks.
 --
diff --git a/ouroboros-consensus/src/Ouroboros/Consensus/Ledger/Basics.hs b/ouroboros-consensus/src/Ouroboros/Consensus/Ledger/Basics.hs
index a635f81f3..ec7e4296c 100644
--- a/ouroboros-consensus/src/Ouroboros/Consensus/Ledger/Basics.hs
+++ b/ouroboros-consensus/src/Ouroboros/Consensus/Ledger/Basics.hs
@@ -821,16 +821,16 @@ mapValuesAppliedMK f = \case
   ApplyQueryAllMK      -> ApplyQueryAllMK
   ApplyQuerySomeMK vs  -> ApplyQuerySomeMK (fmap f vs)
 
-instance (Ord k, Eq v) => Eq (ApplyMapKind' mk k v) where
-  ApplyEmptyMK          == _                     = True
-  ApplyKeysMK   l       == ApplyKeysMK   r       = l == r
-  ApplyValuesMK l       == ApplyValuesMK r       = l == r
-  ApplyTrackingMK l1 l2 == ApplyTrackingMK r1 r2 = l1 == r1 && l2 == r2
-  ApplyDiffMK l         == ApplyDiffMK r         = l == r
-  ApplySeqDiffMK l      == ApplySeqDiffMK r      = l == r
-  ApplyQueryAllMK       == ApplyQueryAllMK       = True
-  ApplyQuerySomeMK l    == ApplyQuerySomeMK r    = l == r
-  _                     == _                     = False
+-- instance (Ord k, Eq v) => Eq (ApplyMapKind' mk k v) where
+--   ApplyEmptyMK          == _                     = True
+--   ApplyKeysMK   l       == ApplyKeysMK   r       = l == r
+--   ApplyValuesMK l       == ApplyValuesMK r       = l == r
+--   ApplyTrackingMK l1 l2 == ApplyTrackingMK r1 r2 = l1 == r1 && l2 == r2
+--   ApplyDiffMK l         == ApplyDiffMK r         = l == r
+--   ApplySeqDiffMK l      == ApplySeqDiffMK r      = l == r
+--   ApplyQueryAllMK       == ApplyQueryAllMK       = True
+--   ApplyQuerySomeMK l    == ApplyQuerySomeMK r    = l == r
+--   _                     == _                     = False
 
 instance (Ord k, NoThunks k, NoThunks v) => NoThunks (ApplyMapKind' mk k v) where
   wNoThunks ctxt   = NoThunks.allNoThunks . \case
diff --git a/ouroboros-consensus/src/Ouroboros/Consensus/Storage/LedgerDB/HD/DiffSeq.hs b/ouroboros-consensus/src/Ouroboros/Consensus/Storage/LedgerDB/HD/DiffSeq.hs
index 82e502a85..559f88865 100644
--- a/ouroboros-consensus/src/Ouroboros/Consensus/Storage/LedgerDB/HD/DiffSeq.hs
+++ b/ouroboros-consensus/src/Ouroboros/Consensus/Storage/LedgerDB/HD/DiffSeq.hs
@@ -138,7 +138,7 @@ newtype DiffSeq k v =
       (InternalMeasure k v)
       (Element k v)
     )
-  deriving stock (Generic, Show, Eq)
+  deriving stock (Generic, Show)
   deriving anyclass (NoThunks)
 
 -- The @'SlotNo'@ is not included in the root measure, since it is
@@ -149,7 +149,7 @@ data RootMeasure k v = RootMeasure {
     -- | Cumulative diff
   , rmDiff   :: !(Diff k v)
   }
-  deriving stock (Generic, Show, Eq, Functor)
+  deriving stock (Generic, Show, Functor)
   deriving anyclass (NoThunks)
 
 data InternalMeasure k v = InternalMeasure {
@@ -173,7 +173,7 @@ data Element k v = Element {
     elSlotNo :: {-# UNPACK #-} !Slot.SlotNo
   , elDiff   ::                !(Diff k v)
   }
-  deriving stock (Generic, Show, Eq, Functor)
+  deriving stock (Generic, Show, Functor)
   deriving anyclass (NoThunks)
 
 -- | Length of a sequence of differences.

Copy link
Member

@dnadales dnadales Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test suite will fail to compile with the patch above, but maybe the question is how to test the root measure semigroup laws without requiring a Eq constraint in the package that does not use them (only the test do).

test-storage/Test/Ouroboros/Storage/LedgerDB/HD/DiffSeq.hs:23:9: error:
    • No instance for (Eq (RootMeasure Key Val))
        arising from a use of ‘testSemigroupLaws’
    • In the expression: testSemigroupLaws
      In the second argument of ‘testGroupWithProxy’, namely
        ‘[testSemigroupLaws, testMonoidLaws, testGroupLaws]’
      In the expression:
        testGroupWithProxy
          (Proxy @(RootMeasure Key Val))
          [testSemigroupLaws, testMonoidLaws, testGroupLaws]
   |
23 |         testSemigroupLaws
   |         ^^^^^^^^^^^^^^^^^

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch I proposed will also break this:


src/Ouroboros/Consensus/Shelley/Ledger/Ledger.hs:269:10: error:
    • Could not deduce (Eq
                          (ApplyMapKind'
                             'Ouroboros.Consensus.Ledger.Basics.EmptyMK'
                             (SL.TxIn (Core.Crypto era))
                             (Core.TxOut era)))
        arising from the superclasses of an instance declaration
      from the context: ShelleyBasedEra era
        bound by the instance declaration
        at src/Ouroboros/Consensus/Shelley/Ledger/Ledger.hs:269:10-81
    • In the instance declaration for
        ‘TableStuff (LedgerState (ShelleyBlock proto era))’
    |
269 | instance ShelleyBasedEra era => TableStuff (LedgerState (ShelleyBlock proto era)) where

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I don't get is why the Eq constraints that I removed are not needed in ouroboros-consensus, where the core logic of the LedgerDB is implemented, but they are needed by the clients.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope you guys don't mind that I mark this as unresolved for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the extra noise. One important aspect that I overlooked is that at the moment you need the equality constraints to be able to invert values, eg:

areInverses :: Eq v => DiffEntry v -> DiffEntry v -> Bool
areInverses (Insert x) (UnsafeAntiInsert y) = x == y

So this means that we can only use anti-diffs with values that have an Eq instance, right? In that case that is an argument for keeping the old implementation around.

Coming back to the original question of this thread, I wonder if it'd be possible to re-engineer this in such a way that:

  • define table stuff independently of the Eq constraint.
  • if client code wants to use the anti-diff implementation they will need to provide these Eq instances.

@jasagredo
Copy link
Contributor

The groupoid laws seem to fail on Hydra? https://hydra.iohk.io/build/19760053/nixlog/8

@jorisdral
Copy link
Contributor Author

The groupoid laws seem to fail on Hydra? https://hydra.iohk.io/build/19760053/nixlog/8

You're right, and it should fail as far as I can tell with the current Arbitrary instances. I'll have to look into whether this is a bug or a feature

@jorisdral
Copy link
Contributor Author

I am in general concerned that we are loosing the previous implementation. We might as well just delete the HD module right?

Would it be possible to unify the interface and define a CPP flag that switches implementations? Is it even worth? Ping @dnadales

If you consider it a risk to remove the previous implementation, we could keep it around for a while on the feature branch at least

@jorisdral jorisdral force-pushed the jdral/4010-integrate-anti-diffs branch from 316b5dd to 16c93d5 Compare October 11, 2022 11:12
@dnadales
Copy link
Member

I am in general concerned that we are loosing the previous implementation. We might as well just delete the HD module right?
Would it be possible to unify the interface and define a CPP flag that switches implementations? Is it even worth? Ping @dnadales

If you consider it a risk to remove the previous implementation, we could keep it around for a while on the feature branch at least

I'd say we should keep HD around definitely during the prototyping phase. @jorisdral can you remind me if anti-diffs are guaranteed to exist for any choice of k v? If so I do not see a reason why we should keep the old implementation around afterwards.

@jorisdral
Copy link
Contributor Author

I am in general concerned that we are loosing the previous implementation. We might as well just delete the HD module right?
Would it be possible to unify the interface and define a CPP flag that switches implementations? Is it even worth? Ping @dnadales

If you consider it a risk to remove the previous implementation, we could keep it around for a while on the feature branch at least

I'd say we should keep HD around definitely during the prototyping phase. @jorisdral can you remind me if anti-diffs are guaranteed to exist for any choice of k v? If so I do not see a reason why we should keep the old implementation around afterwards.

Anti-diffs are guaranteed to exist for any choice of k v (modulo some constraints like Ord)

@jorisdral jorisdral force-pushed the jdral/4010-integrate-anti-diffs branch from 16c93d5 to e20de2b Compare October 13, 2022 09:47
@jorisdral jorisdral merged commit 121843c into feature/utxo-hd Oct 14, 2022
@iohk-bors iohk-bors bot deleted the jdral/4010-integrate-anti-diffs branch October 14, 2022 14:28
jasagredo pushed a commit that referenced this pull request Nov 17, 2022
…ti-diffs

Integrate anti-diff prototype in `consensus` packages
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
consensus issues related to ouroboros-consensus optimisation Performance optimisation UTxO-HD 📒💽
Projects
None yet
4 participants