ImmutableDB: allow to lookup blocks by slot, leverage in db-{analyser,truncater} #1143

Merged (4 commits) on Jun 17, 2024

@amesgen (Member) commented Jun 13, 2024

This PR adds a new function to the internal ImmutableDB API:

-- | Get the hash of the block in the given slot. If the slot contains both
-- an EBB and a non-EBB, return the hash of the non-EBB.
getHashForSlot :: SlotNo -> m (Maybe (HeaderHash blk))

It is then used in two ways:

  • In db-analyser: Many analyses do not actually need a ledger state to perform their work. This PR lets them skip that redundant and slow work on startup. However, the existing ImmutableDB streaming API needs a point (previously read from the ledger state), not just a slot, to start streaming. This is where getHashForSlot comes in.

  • In db-truncater: Previously, truncating after a slot took linear time (by iterating over the entire database). With getHashForSlot, it is now easy to make it take constant time. Note that the behavior changes slightly; see the commit message for details.
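
To make the EBB rule and the slot-to-point step concrete, here is a minimal sketch over a toy in-memory model. Only the name getHashForSlot comes from this PR; `ToyImmutableDB`, `SlotEntry`, `Point`, `pointForSlot` and the string hashes are hypothetical stand-ins. Returning the EBB hash when a slot holds only an EBB is an assumption read off the doc comment, not confirmed behavior of the real implementation.

```haskell
import           Control.Applicative ((<|>))
import qualified Data.Map.Strict as Map

-- Toy stand-ins for the real consensus types (hypothetical).
newtype SlotNo = SlotNo Word
  deriving (Eq, Ord, Show)

type Hash = String

-- | A slot may hold an EBB, a regular (non-EBB) block, or both.
data SlotEntry = SlotEntry
  { ebbHash     :: Maybe Hash
  , regularHash :: Maybe Hash
  }

type ToyImmutableDB = Map.Map SlotNo SlotEntry

-- | Model of 'getHashForSlot': prefer the non-EBB when both are present.
-- ASSUMPTION: fall back to the EBB hash if the slot only holds an EBB.
getHashForSlot :: ToyImmutableDB -> SlotNo -> Maybe Hash
getHashForSlot db slot = do
  entry <- Map.lookup slot db
  regularHash entry <|> ebbHash entry

-- | A point pairs a slot with a hash; streaming needs this, not just a slot.
data Point = BlockPoint SlotNo Hash
  deriving (Eq, Show)

-- | Turn a bare slot into a point without consulting any ledger state.
pointForSlot :: ToyImmutableDB -> SlotNo -> Maybe Point
pointForSlot db slot = BlockPoint slot <$> getHashForSlot db slot

exampleDB :: ToyImmutableDB
exampleDB = Map.fromList
  [ (SlotNo 0,  SlotEntry (Just "ebb-0") (Just "blk-0"))
  , (SlotNo 20, SlotEntry Nothing        (Just "blk-20"))
  ]

main :: IO ()
main = do
  print (getHashForSlot exampleDB (SlotNo 0))  -- slot with EBB and non-EBB
  print (pointForSlot exampleDB (SlotNo 20))   -- slot with a regular block
  print (pointForSlot exampleDB (SlotNo 7))    -- empty slot
```

The toy map says nothing about on-disk cost; it only illustrates which hash is chosen and how a slot becomes a point.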

@amesgen force-pushed the amesgen/db-immutaliser branch 2 times, most recently from e01abc7 to 9939044 June 13, 2024 13:53
@amesgen marked this pull request as ready for review June 14, 2024 12:56
@amesgen requested a review from a team as a code owner June 14, 2024 12:57
Base automatically changed from amesgen/db-immutaliser to main June 17, 2024 14:32
See later commits for how this is useful in db-{analyser,truncater}.
This also slightly changes the semantics of --truncate-after-slot: Previously, it
would truncate to the block with the largest slot less than or equal to the slot
argument. Now, it truncates only to the (non-EBB) block in exactly that slot, and
fails otherwise.

In fact, the new behavior more closely corresponds to the CLI description:

  --truncate-after-slot SLOT_NUMBER
                           The slot number of the intended new tip of the chain
                           after truncation
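
As an illustration of the exact-slot semantics described in this commit message, the following is a toy sketch, not the db-truncater code. The names `ToyDB` and `truncateToExactSlot` are hypothetical, and the database is modelled as an in-memory map of non-EBB blocks only.

```haskell
import qualified Data.Map.Strict as Map

type Slot = Word
type Hash = String

-- | Toy model: one non-EBB block hash per occupied slot (hypothetical).
type ToyDB = Map.Map Slot Hash

-- | Keep everything up to and including the block in exactly the given slot.
-- Fails if that slot is empty, mirroring the "fail otherwise" behavior.
truncateToExactSlot :: Slot -> ToyDB -> Either String ToyDB
truncateToExactSlot slot db
  | Map.member slot db = Right (Map.takeWhileAntitone (<= slot) db)
  | otherwise          = Left ("no block in slot " ++ show slot)

exampleDB :: ToyDB
exampleDB = Map.fromList [(10, "a"), (25, "b"), (40, "c")]

main :: IO ()
main = do
  print (Map.keys <$> truncateToExactSlot 25 exampleDB)  -- occupied slot
  print (Map.keys <$> truncateToExactSlot 30 exampleDB)  -- empty slot
```

Under the previous semantics, the second call would instead have truncated to the block in slot 25, the largest occupied slot not exceeding 30.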
@amesgen added this pull request to the merge queue Jun 17, 2024
Merged via the queue into main with commit 4e3ff22 Jun 17, 2024
16 checks passed
@amesgen deleted the amesgen/immdb-slots branch June 17, 2024 22:42
facundominguez pushed a commit that referenced this pull request Jun 25, 2024
…,truncater} (#1143)

facundominguez pushed a commit that referenced this pull request Jun 28, 2024
…,truncater} (#1143)

github-merge-queue bot pushed a commit that referenced this pull request Aug 2, 2024
Closes #1202

This PR reverts the behavioral change of #1143, specifically
5747d3c. Concretely,
`--truncate-after-slot slotNo` will now remove all blocks with a slot
number higher than `slotNo` in the ImmutableDB, but does not require
that a block with exactly that slot number exists. This is convenient, e.g.,
for truncating all blocks after an epoch without having to find out the
exact slot of the last block in the epoch just before.

At the same time, the run time is still much faster than before #1143:
We check slot numbers one by one, descending from the given one, and
truncate to the first point that is in the ImmutableDB. As realistic
ImmutableDBs are only somewhat sparse (active slot coefficient is `f =
1/20`), this should be very fast (i.e., still constant time in the length
of the chain if we consider the slot distance between any two adjacent
blocks to be bounded). In addition, we explicitly check whether the
given argument is beyond the tip of the ImmutableDB, and immediately
exit (successfully) in that case.
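
The descending search described above can be sketched on a toy model as follows. This is a hedged illustration, not the actual db-truncater code: `ToyDB`, `Outcome` and `findTruncationTarget` are hypothetical names, and truncating to genesis when no block is found at or below the given slot is an assumption.

```haskell
import qualified Data.Map.Strict as Map

type Slot = Word
type Hash = String

-- | Toy model: one block hash per occupied slot (hypothetical).
type ToyDB = Map.Map Slot Hash

data Outcome
  = NothingToDo        -- ^ argument is beyond the tip, or the DB is empty
  | TruncateToSlot Slot
  | TruncateToGenesis  -- ^ ASSUMPTION: no block at or below the given slot
  deriving (Eq, Show)

-- | Walk down from the requested slot until an occupied slot is found.
findTruncationTarget :: Slot -> ToyDB -> Outcome
findTruncationTarget slot db =
  case Map.lookupMax db of
    Nothing                          -> NothingToDo
    Just (tipSlot, _) | slot > tipSlot -> NothingToDo
    _                                -> go slot
  where
    go s
      | Map.member s db = TruncateToSlot s
      | s == 0          = TruncateToGenesis  -- guard against Word underflow
      | otherwise       = go (s - 1)

exampleDB :: ToyDB
exampleDB = Map.fromList [(10, "a"), (25, "b"), (40, "c")]

main :: IO ()
main = do
  print (findTruncationTarget 30 exampleDB)  -- empty slot between blocks
  print (findTruncationTarget 99 exampleDB)  -- beyond the tip
  print (findTruncationTarget 5  exampleDB)  -- before the first block
```

The number of steps `go` takes is bounded by the gap between the requested slot and the nearest occupied slot below it, which is why the run time does not depend on the length of the chain when that gap is bounded.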