Skip to content

tortoise: faster and memory-efficient data structure for storing votes #3569

Closed
@dshulyak

Description

@dshulyak

supersedes: #3007, #3005

overview

local opinion

type id [20]byte

type vote uint8

// ~30bytes of metadata + N*blockInfo (N is expected to be 1)
type layerInfo struct {
         // empty weight is used to count weight if there are no blocks in the layer
  	 empty weight
         blocks []blockInfo
  
  	 // used by verifying tortoise
  	 verifying struct {
              // good weight is a weight of good ballots in this layer
              good weight
              // abstained weight is the weight of ballots that are counted by
              // verifying tortoise that abstained on this layer
              abstained weight
              referenceHeight uint64 
     }
}

// around ~50bytes
type blockInfo struct {
        id id
  	layer uint32
        height uint64
    
  	// hare may support only one block
  	// if hare is not termianted - all blocks will have neutral
  	hare vote
        // validity is either loaded from database or computed by comparing margin with threshold
  	validity vote // support, against, neutral
  	margin weight
}

ballot opinions (global opinion)

type layerVote struct {
    layer *layerInfo
    vote vote // valid values are abstain or against
    votes []*blockVote
}

type blockVote struct {
    block *blockInfo
    vote vote
}

type ballotInfo struct {
      id id
      height uint64
      weight weight
      goodness goodness
      // beacon is copied from reference ballot
      // to check if beacon is consistent with local beacon compare values
      beacon [20]byte 
      
      // copied from base ballot, updated with a difference
      // note that this is a naive version, and likely can't be used in practice
      votes []*layerVote
}

full state

type state struct {
  // localOpinion
  localOpinion []layerInfo
  
  // evicted starts with effective genesis - 1
  evicted uint32
  // votes are stored only for layers in the sliding window 
  votes 	   [][]ballotInfo
  
  // ballots reference is used to get base and reference ballot
  ballots map[id]*ballotInfo
  // blocks references are used t 
  blocks  map[id]*blockInfo
}

state modifications

on block

  • add block to the correct layer in the local opinion
  • if there are ballots that can vote on this block, add vote on this block to every ballotInfo.votes

on hare output

  • expected to be called after block was added to the state
  • find a block in the local opinion, set hare validity to support for this block, set other blocks in same layer to against

on ballot

  • expected to be called after referenced blocks, base and ref ballots are added to the state
  • copy pointers to votes from base ballot
    • with 2000 sliding window naive implementation copies ~ 16KB per ballot (2000 pointers), to keep all votes in memory we will need ~1.6GB
  • iterate over support, against, abstain arrays - modify ballotInfo.votes based on those differences

efficient votes data structure

supported operations:

  • iterate over all votes in any direction (either in order or in reverse, no need to support both directions)
  • efficiently copy base ballot votes, also without huge memory waste

options

bit per vote

bit per vote encoding. for sliding window 2000 every ballot will store ~250bytes (2000 / 8). this is also assumes that there is atmost one block per layer. if not - encoding becomes much more complex

persistent list with votes

copy base ballot last vote from list where opinion was the same and update with new opinions.
example:

  • A: 1 <- 2 <- 3 <- 4
  • B doesn't overwrite any votes of A 4 <- 5
    • this is a perfect case. every new ballot will add only 2 pointers (16 bytes) to store votes. 800 bytes per 50 ballots. with 2 000 layers in the window - ~16mb. it doesn't take into account actual data (e.g ballot id, weight, beacon) but that is negligible in comparison with votes.
    • we can assume that majority of the nodes will try to use this pattern, with the exception of bugs and attacks
  • C overwrites some previous votes: 2 <- 3' <- 4' <- 5
    • synthetic worst case is if every ballot voting pattern votes differently than the base starting from effective genesis. every ballot adds 2*2000 pointers (32kb per ballot), 3.2GB per sliding window.

other questions

how to reset weight for verifying tortoise?

iterate over all layers that needs to be reset and set verifying.weight and verifying.abstained to zero

what happens with votes if verified layer is below layer that supposed to be evicted?

the memory will grow above expected value. it could happen if partition existed for longer then sliding window. but there is no better option

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

Status

✅ Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions