Skip to content
This repository has been archived by the owner on Jun 11, 2022. It is now read-only.

Upstreaming OBFT stuff #693

Closed
wants to merge 137 commits into from
Closed

Conversation

vsubhuman
Copy link
Contributor

@vsubhuman vsubhuman commented May 17, 2019

Upstreaming all the changes we have done in the fork. Includes:

  1. OBFT processing (does not include OBFT signature validation)
  2. Rollback handling
  3. Special new index for epoch-lengths and loose blocks
  4. Changes in epoch pack files to include epoch length and flags for whether it's an EBB epoch or not
  5. Changes in storage to support block-by-height and status endpoints in the bridge ("status" returns information about currently known local and network tips)

…lso had to add changes to rust-cardano itself, in order to support block location by height
…umber as ordinal so that loaded from disk lookups can be sorted
…mented new `BlockLocation::Height`, and extended `read_block_at` to process this new location type
…in `lib::read_block_at` into reusable manner
…h binary offset from epoch pack. Updated `lib::block_location_by_height` to return new block location type - offset, so it gets processed directly into reading raw block from pack
…ose_index` where references to all loose blocks with their order and height are stored.
… loose index with new blocks or with packed epochs
vsubhuman and others added 11 commits April 24, 2019 21:45
HeaderHash and BlockHash are the same hash but just Blake2b256 vs raw byte slice of the same hash.
This caused header_hash() to re-hash the hash, giving incorrect hashes
and thus breaking the rollback code that it was written for.

This change was tested with the new rollback code and with a test
assert.
Epoch boundary blocks don't contribute to the difficulty of the chain,
but some asserts in Storage written for the rollback feature did not
take this into account, which caused panics during rollback.

These changes were tested under rollback (and existing unit tests).
Fixed formatting across all our fork changes
// TODO: should cleanup EBB state here?
//self.last_boundary_block = None;
//self.last_boundary_block_epoch = None;
//self.slot_leaders = vec![];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: gotta decide whether we need to drop "Last EBB" state here after switch to OBFT. Cuz right now it just hangs there forever.

ℹ️ Gotta remember that "Last EBB" state is used to determine whether current epoch is an EBB or not, when storing epoch packs with meta-flags.

// First next-epoch block supposed to be EBB,
// unless we are in OBFT era
if !date.is_boundary() {
// TODO: validate we are in OBFT?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: gotta decide whether we need to add additional checks for proper OBFT switch (e.g. block version, or some other weirdo parameters from blocks). Right now we just allow new epochs to have no EBB, and that's it.

// against the genesis block.
if slot_leader != &address::StakeholderId::new(&blk.header.consensus.leader_key) {
return Err(Error::WrongSlotLeader);
// TODO: validate OBFT leader
Copy link
Contributor Author

@vsubhuman vsubhuman May 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: No OBFT leader validation present. We don't have the documentation yet on how it's done.

let pack_filepath = config.get_epoch_pack_filepath(epochid);
tmpfile::atomic_write_simple(&pack_filepath, hex::encode(packref).as_bytes()).unwrap();
epoch_write_pack(config, packref, &None, epochid, index.offsets, flags).unwrap();
// TODO: need to put new entry to storage, but storage is not present here =(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Idk what to do with this function, cuz by function definition we don't have all the context we need in here, but this function is never actually called anywhere, so we gotta either just delete it, or maybe add required context to function arguments, so if it's ever called - caller will be aware of it.

@vsubhuman
Copy link
Contributor Author

@disassembler

exe-common/src/config.rs Outdated Show resolved Hide resolved
@vsubhuman
Copy link
Contributor Author

vsubhuman commented May 17, 2019

Notes to self:

  • Remove obft-testnet (along with the weird genesis hash hack)
  • Check and fix if needed loose blocks deleting from file-system at rollback

@vsubhuman
Copy link
Contributor Author

Discovered that we, after all, need to fix loose blocks deleting at rollbacks, and also need to close some rollback scenarios that we made possible in the code, but which are impossible in Cardano. @rooooooooob is working on it atm. Will activate PR after this.

@vincenthz
Copy link
Member

@vantuz-subhuman there's no reason to explicitly delete loose blocks that are not used anymore, unless you really really want to delete them now. the storage design is that eventually this is taken care by packing/gc (think of git)

@vsubhuman
Copy link
Contributor Author

the storage design is that eventually this is taken care by packing/gc (think of git)

🤔 Interesting. I thought loose blocks only deleted when epoch becomes stable and gets packed, but then we go from tip back in history to link the epoch blocks, so blocks that were forked out would get lost.

Seems I might be missing something there then. Will check it out, thank you! But then we are still waiting for some small fixes in rollback logic, to not allow it to go ways it shouldn't.

@vincenthz
Copy link
Member

All the code to do that is not necessary in place (can't remember the details right now), but it's not a problem to have unreferenced blocks on disk, just like a commit in git that is not used anymore, it is eventually gc

@vincenthz
Copy link
Member

vincenthz commented May 21, 2019

I started looking at the PR, but this likely suffer the same too-many-things-at-once problem of the first one.

Please cut the obvious improvements, refactoring and adjustment into multiples PRs, leaving the more sizeable parts as logical improvement, and this would help massively get this push through

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants