Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Voting as separate table #368

Closed
5 of 6 tasks
rhsimplex opened this issue Jun 8, 2016 · 6 comments · Fixed by #379
Closed
5 of 6 tasks

Voting as separate table #368

rhsimplex opened this issue Jun 8, 2016 · 6 comments · Fixed by #379
Assignees

Comments

@rhsimplex
Copy link
Contributor

rhsimplex commented Jun 8, 2016

Problem

Per the white paper § 4.7.3, votes are appended directly to the blocks. While the votes are signed, and can thus be validated, there is no notion of when a vote is written or if its value has been changed. This makes it impossible to revert Byzantine changes to the block table as long as changes are handled by reacting to a changefeed -- a changed vote could be changed back, but that would trigger a change and so on. This was recognized while trying to resolve #195 and #194.

Solution

@ttmc proposed the following solution: write votes to a table separate from blocks. This has a few advantages (see initial discussion on #195):

  1. All updates to the block and voting tables (except appends) can be treated as incorrect. This is conceptually simpler.
  2. Votes are hashed like blocks now, providing a verifiable chain(?)
  3. The "update causes an update causes an update..." quagmire is avoided. Updates to vote and block tables need only verify the new value fits in the chain (e.g. hash of next/previous block is correct). This doesn't protect against the last vote being altered by the voting node, but once another vote is cast, it's locked.

Tasks:

  • BigchainDB setup initializes third table for votes -- should include a "genesis" vote
  • New votes append to votes table instead of block table. Primary key should be something like block_id + voting_node_id -- this is unique, and another insert with this id will be rejected.
  • New vote should include hash of previous vote. (Is this necessary?)
  • Queries will need to be updated, particularly those which depend on block validity. Block validity will be decided via query to the votes table rather than the block table
  • Some processes (block, voter) will need modification
  • Update documentation and WP
@ttmc
Copy link
Contributor

ttmc commented Jun 8, 2016

I'm not sure we need a hash chain of votes. Building that would be just as problematic as building the hash chain of blocks (a problem we solved by letting each node declare its own view of which block is the previous block, in its vote). If we don't need a hash chain of votes, then we also don't need a "genesis vote."

Each vote already includes pointers (hash/id) to the block-being-voted-on and the previous-block (from the voting node's point of view). That's what builds the "block chain." You can think of the votes as the chain links.

Votes are signed, so we can tell if one has been tampered-with (by someone other than the node which wrote it).

I'm not sure of the best way to build the vote id (primary key). It should mix block_id and voting_node_id in such a way that no other pair leads to the same vote id, and it should also be a "good" primary key, in some sense. (What makes for a good primary key in RethinkDB?)

@r-marques r-marques mentioned this issue Jun 8, 2016
17 tasks
@rhsimplex
Copy link
Contributor Author

So every vote should be allowed then? Because any node could write a vote with the correct primary key. Of course the signature will reveal it's false, but when should it happen? During voting? This will become voting on votes...

During query? The first correct valid votes? How do we prevent a node from from jamming incorrect ids in and preventing legitimate votes?

Maybe we shouldn't key the votes this way, then we only need the first valid votes from a node...?

@rhsimplex
Copy link
Contributor Author

rhsimplex commented Jun 8, 2016

Ok, talked to @ttmc here is the flow we propose for changes to the vote table:

Create

There are only two possiblities:

  1. nothing ↦ valid vote: do nothing
  2. nothing ↦ invalid vote: delete vote (triggers changefeed)

Update

There are four possibilities:

  1. valid ↦ valid: nothing. A node should not be changing its vote, but we cannot revert a valid to valid change. Why? Because this itself would trigger and update: valid ↦ valid.
  2. valid ↦ invalid: revert (triggers change)
  3. invalid ↦ valid: this change must be accepted because of the previous change
  4. invalid ↦ invalid: delete, something weird is going on

Delete

valid ↦ nothing: create (triggers changefeed)?
invalid ↦ nothing: do nothing

Any change except adding a new valid vote should trigger a warning to the federation of possible Byzantine behavior

@ttmc
Copy link
Contributor

ttmc commented Jun 9, 2016

I was thinking about this some more...

We really don't want to allow a valid vote to be changed (e.g. from "this block is valid" to "this block is invalid"), because that could cause the validity of the block to change (e.g. from valid to invalid), which could make decisions that had been made, based on the validity of that block, to become invalid... a snowball effect.

My original thought was to have the votes table be "append-only," i.e. where the only thing that can happen is insertion of votes (i.e. the database itself only allows Create to happen, not Update or Delete). Unfortunately, RethinkDB does allow Update and Delete... We could:

  1. switch to a truly-append-only / logging database for the votes table
  2. use our future RethinkDB wire-protocol sniffer firewall thingy to ignore all incoming requests to Update or Delete stuff in the votes table (and to sound the alarms).
  3. figure out a way to detect the infinite loop that happens if we revert a valid ↦ valid change. How can we detect that's happening and shut it down?
  4. something else

Before we explore option 3 (the most pragmatic one), there's another issue: even if we can revert changes somehow, there will be a brief period where the vote is changed and that can cause the above-mentioned snowball effect. In other words, allowing any kind of change, even briefly, is a recipe for misery.

Proposal: Assume that we will go with option 1. or 2. Pretend that the votes table is append-only. Allow both valid and invalid votes to be written to it. Ignore the invalid votes at query time. If a node ever notices an Update or Delete on the votes table, it should raise an exception / alarm.

@ttmc
Copy link
Contributor

ttmc commented Jun 9, 2016

Suggested by @r-marques and Bruce:

Maybe another option would be: 4. create a modified version of RethinkDB in which user tables don't allow Update or Delete, and use that for the bigchain and votes tables. (It's probably still necessary for the RethinkDB System Tables to allow Update and Delete.)

(The backlog table must allow Update and Delete operations. Maybe it can just live in its own vanilla RethinkDB database.)

@ttmc
Copy link
Contributor

ttmc commented Jun 10, 2016

I'm not sure if I should be amused or concerned... I was reading about append-only databases and stumbled across an interesting fact: RethinkDB started out as an append-only database! Apparently there were problems...

http://www.xaprb.com/blog/2013/12/28/immutability-mvcc-and-garbage-collection/

If you read that and became pessimistic about append-only databases, here's a 2015 article that's more optimistic (and which even responds to the above blog post): "The rise of immutable data stores"

http://www.pwc.com/us/en/technology-forecast/2015/remapping-database-landscape/immutable-data-stores--rise.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants