Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Whitepaper] Incorrect claims around BitShares #43

Open
xeroc opened this issue Nov 25, 2016 · 49 comments
Open

[Whitepaper] Incorrect claims around BitShares #43

xeroc opened this issue Nov 25, 2016 · 49 comments

Comments

@xeroc
Copy link

xeroc commented Nov 25, 2016

Hello there,
I recently went through the cosmos whitepaper and am quite impressed by your work. Adding collateral to block producers is a great idea to keep them honest. This is certainly an improvement over DPOS as used in BitShares. However, I would like to request correction of the nothing-at-stake problem you claim exists in BitShares, because it does not. Here is why:

The Nothing At Stake problem assumes certain properties of POS based upon the Peercoin design.

It is impossible for DPOS to produce an alternative longer chain without collusion of 51% of the delegates. With 51% collusion you are back to the 51% attack that all systems are subject to.

In this case the delegates also have something at stake: their job.

So how has DPOS changed the game?

  1. "Miners" are now generally public, known individuals rather than anonymous individuals.
  2. Secret attacks are no longer possible
  3. Alternative chains are not possible
  4. In the time it takes Bitcoin to generate 1 confirmation (10 mins on average) dozens of witnesses would have confirmed a block in BitShares/DPOS already

Some more details

Thus, the claims in the whitepaper:

By using and improving upon proven BFT algorithms developed at MIT in 1988 [20], the Tendermint team was the first to conceptually demonstrate a proof-of-stake cryptocurrency that addresses the nothing-at-stake problem suffered by first-generation proof-of-stake cryptocurrencies such as NXT and BitShares.

and

Though BitShares achieves high performance (100k tx/s, 1s latency) in ideal conditions, it is subject to double spend attacks by malicious witnesses which fork the blockchain without suffering an explicit economic punishment -- it suffers from the "nothing-at-stake" problem.

... are inaccurate. Besides that, you correctly noticed that forks are restricted to a certain maximum length by means of TaPOS (Transactions as proof of stake) by referring to the blockhash in each transactions.

Furthermore, there has been made a rather easy change that affects the whole discussion immensely and should maybe added to cosmo/tendermint aswell and that is irreversible block. Basically, once 2/3 (66%) -- or 75% if you want -- of the block producers have signed a block (by adding a block ontop of a particular fork) then the block is called irreversible. The software cannot revert/reorg the blockchain beyond this point. This makes an irreversible block become a checkpoint (in terms of bitcoin). If you compare pow with dpos, then you see that while bitcoin's pow generates 1 confirmation after 10 minutes (on avg.) which makes a reorg unlikely after 6 confirmations (60minutes), BitShares' DPOS makes a reorg impossible after ~60 seconds (30 witnesses, 2/3 makes it 20 witnesses, 3 seconds per block). A "fork" can thus be at most 60 seconds old. A double spend is only possible within these 60 seconds then. We have a similar recommendation to the "wait for 6 confirmations" on bitcoin which goes like this: "Use the irreversible head block to be on the safe side".

(Also, it is the committee that can change blockchain parameters, not the delegates.)

While I understand that you paper wants to highlight the improvements of cosmos/tendermint over existing blockchain technologies, I would kindly request to consider correcting the whitepaper for the sake of correctness.

Kind regards and thanks for reading my issue
-- Fabian

@tomtruitt
Copy link

"In this case the delegates also have something at stake: their job."

A job isn't very tangible and says little about what is at stake. Did they pay $10,000 for that job and now will lose the money they put in due nefarious activity? That seems to be an important part of having something at stake...

@shelby3
Copy link

shelby3 commented Nov 25, 2016

xeroc the white paper I am composing now will refute you. I can't make the entire white paper open source right now, and I don't have time to figure out which portions I could extract to make public now to try to convince you. Instead I will endeavor to make my project open source asap. Apology.

I'll offer this teaser:

https://bitcointalk.org/index.php?topic=1319681.msg16983107#msg16983107

@clayop
Copy link

clayop commented Nov 26, 2016

@tomtruitt
Witnesses will lose their position as well as accompanied income (block validation reward) and never get voted back again. They also put much efforts to become witness sometimes financially as well, which become their reputation in the community. IMO being a witness is like monetizing your reputation, and a nefarious behavior can destroy all this opportunity.

@jaekwon
Copy link
Contributor

jaekwon commented Nov 27, 2016

When the Tendermint whitepaper was out, in early 2014 it described how to incorporate a classical BFT algo into a blockchain to ensure that validators can come to agreement about the next block, safely with tolerance for up to 1/3 of Byzantine nodes, prior to signing commit votes that finalized the block. To become a validator, nodes have to post significant collateral, and there would be an unbonding period prior to the tokens being transferrable. The solution split the problem into short range and long range problems and provided a solution for the first, and a way to circumvent the second.

While Tendermint was published, Bitshares DPOS had no BFT algo nor unbonding period. The only thing at stake, as you said, was the job of the 101 chosen delegates, which you could get by paying a small application fee, and getting voted by the token holders. The problem with such a scheme is that general users of a platform don't have much information to go by in voting for or against candidates, and in fact I recall that quite a few of the chosen delegates were chosen by virtue of advertising some incentive for users to vote for them. Users didn't have much at stake either, regardless of who they vote for. And as I just mentioned, the application fee to become a candidate was low, only to prevent spam. So, strictly speaking, nothing much was at stake.

The second paragraph you quoted was referring to the graphene system. You mentioned something about irreversible blocks, I'd like to see a reference to a spec for that, if it applies to graphene.

Every block in Tendermint is final. We can get finality on the order of 1 second, depending on the number of validators. Each validator can have huge collateral behind it, and it does not get punished for as long as it follows the rules of the protocol. You can't get these properties without building around a solid BFT algorithm that can provide you with absolute fork accountability.

If you want to protest, please first post a link to a proof of safety and liveness, and the minimal conditions in which a fork or halt would happen. And finally, a proof of fork accountability, so proof that you can determine who are responsible for a fork or halt condition.

@MikeSofaer
Copy link

@xeroc I'd like to help find language for the white paper that you think is more appropriate. I don't fully understand your critiques, but it seems like

  1. there is some disagreement on what should count as a solution to the "nothing at stake" problem

and

  1. it's not clear there is agreement on what "subject to double spend" means.

Can you suggest some clarifying edits that preserve what Jae is trying to say?

@xeroc
Copy link
Author

xeroc commented Dec 1, 2016

@tomtruitt, that is true. No funds need to be put upfront to become a witnesses. Though it only changes the incentives and costs of a witnesses and thus the economics, not the security. Today, the fee involved in getting a witness account is some $150 that are removed from supply and don't give the witness a guaranteed ROI

@jaekwon, back in 2014, we were running BitShares in version 1.0 which had exactly the problems you described. Today, with Bitshares 2.0, we have proxy voting. It allows shareholders to direct their voting power to another account which then has more total voting power. This allows to establish a group of arbitrary size to be able to react quickly shall it be necessary (e.g. fire a witness)

You mentioned something about irreversible blocks, I'd like to see a reference to a spec for that, if it applies to graphene.

Goo point, we should probably have a spec about it somewhere, until then, I can refer to a forum thread of the devs: https://bitsharestalk.org/index.php?topic=18720.0

Each validator can have huge collateral behind it, and it does not get punished for as long as it follows the rules of the protocol. You can't get these properties without building around a solid BFT algorithm that can provide you with absolute fork accountability.

I don't fully understand your statement. You mention that each block is final, yet you tell me that you have accountability for forks. How does that fit together? How can you have finality in blocks just because you can "claim" the collateral of the validator? If I can make more money from creating a bad block than I loose as collateral, can I do that attack? Obvisouly not because you would need to do a double spend which your protocol certainly doesn't allow. It's the same thing in graphene, the validators cannot possible do anything except validate or not validate transactions. The worst thing that can happen in both cases is that validators can censor transactions if they all collude. correct?

If you want to protest

I don't mean to protest against the security model behind cosmos or tendermint at all. I'd like to request an update of the claims around BitShares and Graphene in the whitepaper.

please first post a link to a proof of safety and liveness, and the minimal conditions in which a fork or halt would happen.

This leads me to one of the major problems with DPOS that we haven't fixed for years now: There is no whitepaper that describes it thoroughly and that is also the reasons why I am posting here to make sure the documents that refer to DPOS are accurate.

And finally, a proof of fork accountability, so proof that you can determine who are responsible for a fork or halt condition.

Let me tell you about a feature that we had in one of our Graphene-based forks called STEEM. In Steem, validators are paid in SteemPower which is vested for some time and cannot be withdrawn immediately. This is similar to collateral except that they can earn their collateral. Now, we had a feature that allowed you to claim all the collateral if you could just proof that a witness has signed a block twice (since witnesses are order and have certain slots, this can easily be identified). However, it wasn't received very well mostly because a length-1 fork allows to claim everything already why not presenting any harm to the network. However, given that witnesses are required to be highly available, they happen to run two machines at the same time when upgrading the software which can lead to length-1 forks easily if not done right. It sounds inappropriate to claim all their stake due to length-1 forks that don't mean any harm to the network.

So, what I want to really say is that if you want collateralized witnesses, you can easily do so even with DPOS and even though it is not widely used among existing blockchains.

@MikeSofaer

Can you suggest some clarifying edits that preserve what Jae is trying to say?

Do you want me to send a Pull request?

@xeroc
Copy link
Author

xeroc commented Dec 1, 2016

@shelby3

I don't mean to start a massive discussion here but let me try to clarify on the points you published on bitcointalk

0:40 - "witness are under control of the shareholders". Incorrect. They can do whatever they want until they are replaced by an election.

There is no such thing as an election. Shareholders can vote at any time. If they don't have the resources to do so, they can redirect their voting power to a proxy.
Votes are tallied every 24h (a period that can easily be modified by the committee, whose members are shareholder approved aswell)

0:45 - "actually more secure than Bitcoin". Incorrect. I explain why in my white paper.

Can't comment much here, but the claim is at most inaccurate. It should go like this:
DPOS is more accurate than POW under short to medium time constraints, e.g. for 5-60 minute old transactions
The proof is quite obvious if you read about irreversible blocks in the opening post.

0:48 - "proof-of-work is ... burn most money is most secure ... proof-of-stake ... is also a scarce resource". Incorrect.

Well, stake in BTS is at most 3.7B, so it is limited and thus scarce.

1:40 - "all systems are vulnerable to 51% attack ... no such thing as avoiding 51% attack". Incorrect. PoS is economically vulnerable to double-spends, Bitcoin is not. I explain this in my white paper. Correct that all systems have some form of 51% attack, but PoS has additional 51% attack vectors.

You are mixing up two things here. One is 51% attack, the other is double spending. The claim you quoted is correct because that is the definition of "consensus".
Not sure what you mean by additional 51% attack vectors. That makes no sense to me -- either you have 51%+ of the stake, or you don't.
I would love to read your proof about why POS (and we are talking about DPOS here, btw) is "economically vulnerable to double spends" (what ever that even means).

2:10 - "where we get the advantage over proof-of-work is that the cost of acquiring the 51% is much higher in proof-of-stake". Incorrect. Much easier to borrow or rent 51% stake than to rent 51% of mining farms. The mining farms have too much at stake. The shareholders have nearly nothing-at-stake because of their shares being an undersupplied public good. Worse yet, it may be more profitable for the whales to double-spend and short, than to sell their stake straight up.

Whales cannot double spend. And the fact that the share price of a given DPOS-driven chain is low, doesn't mean you can easily acquire 51%+ of the stake.

2:20 - "because DPoS uses deterministic manner of producing blocks, we don't have to rely on random chance". Incorrect. He fundamentally does not understand resiliency and liveness. I explain this in detail in my white paper. Thus he doesn't understand why Graphene will never scale up to the world. Dunning-Kruger-esque.

Oh, it seems you need to read up on DPOS block generation process. Let me give you a quick TL;DR;

A set of witnesses is approved by the shareholders. Each witnesses can produce one block in a round. Once the round is finished, a new round with randomly ordered witnesses starts.
The randomness is pseudo random as it uses data from the chain to derive the new sorting. That makes it 'deterministic' as every participant knows who is going to produce the next block.

3:30 - "get non-linear growth in the ability to achieve things when you concentrate capital". Incorrect. He is touting the concentration of nodes to 110 witnesses as being some advantage, because he never figured out how to otherwise solve the propagation scaling issue that plagues an unbounded number of nodes. But it doesn't follow that a concentration, bounded, and permissioned, provides resiliency and liveness. He is conflating.

The number of witnesses has indeed been constraint in BitShares 1, but is no longer in BitShares 2. It is yet another shareholder voted parameter that could go from 11 up to 1100 (arbitrary number). The scaling constraints are in fact the time it takes for packages to travel around the world. If you had all the validators in one room, you could easily scale to billions of transactions with block confirmation times of less than 1sec, but a publicly distributed and decentralized network is located around the globe with worst case scenarios of 1-2secs for a roundtrip of packages.

Did you notice how I distinguish between the scaling problem being a latency thing and the concentration problem being a shareholder thing?

@MikeSofaer
Copy link

@xeroc I feel like this thread is picking up a lot of scope for a Github issue on Cosmos. Maybe a pull request is the way to go, so we don't get too distracted by high-level approach questions. No one wants to say things about Graphene that aren't true, but there are distinctions that we want to highlight.

@jaekwon
Copy link
Contributor

jaekwon commented Dec 1, 2016

@xeroc, first of all, thanks for coming here and opening the discussion. This is exactly why we have our whitepaper on Github.

@tomtruitt, that is true. No funds need to be put upfront to become a witnesses. Though it only changes the incentives and costs of a witnesses and thus the economics, not the security. Today, the fee involved in getting a witness account is some $150 that are removed from supply and don't give the witness a guaranteed ROI

What happens to a shareholder when a witness that they "approve" (delegate voting power to?) happens to launch an attack?

@jaekwon, back in 2014, we were running BitShares in version 1.0 which had exactly the problems you described. Today, with Bitshares 2.0, we have proxy voting. It allows shareholders to direct their voting power to another account which then has more total voting power. This allows to establish a group of arbitrary size to be able to react quickly shall it be necessary (e.g. fire a witness)

So can we change the wording to the following?

By using and improving upon proven BFT algorithms developed at MIT in 1988 [20], the Tendermint team was the first to conceptually demonstrate a proof-of-stake cryptocurrency that addresses the nothing-at-stake problem suffered by first-generation proof-of-stake cryptocurrencies such as NXT and BitShares-1.0.

It's hard to make any statement about BitShares-2.0/Graphene, without a spec. But I'm interested in learning more and discussing.

Goo point, we should probably have a spec about it somewhere, until then, I can refer to a forum thread of the devs: https://bitsharestalk.org/index.php?topic=18720.0

It seems that the design decisions of Graphene is still in flux. Is that true? (If so great, we might be able to help).

Each validator can have huge collateral behind it, and it does not get punished for as long as it follows the rules of the protocol. You can't get these properties without building around a solid BFT algorithm that can provide you with absolute fork accountability.

I don't fully understand your statement. You mention that each block is final, yet you tell me that you have accountability for forks. How does that fit together?
How can you have finality in blocks just because you can "claim" the collateral of the validator?

It's the other way around. We can have huge collateral because we can have finality of blocks & fork-accountability. The latter two come from the consensus algorithm. A block is "created" as soon as it gets signed by over 2/3 of validators (by voting power) such that the block is immutable. But that's an oversimplification... for more information on the consensus algorithm, check out http://github.com/tendermint/tendermint/wiki .

If I can make more money from creating a bad block than I loose as collateral, can I do that attack? Obvisouly not because you would need to do a double spend which your protocol certainly doesn't allow. It's the same thing in graphene, the validators cannot possible do anything except validate or not validate transactions. The worst thing that can happen in both cases is that validators can censor transactions if they all collude. correct?

Validator == witness in Graphene?
Double-spends are always possible, even in Bitcoin. It's just a matter of what it takes to perform such an attack. In Cosmos/Tendermint, a fork is not supposed to happen, but it can. And when it does, we can determine who is responsible. It seems that this term is confusing, because in Graphene, forks are allowed as long as it comes after the most recent "irreversible" block. When we say "fork", we're referring to the violation of "irreversibility". Does that make sense?

I don't mean to protest against the security model behind cosmos or tendermint at all. I'd like to request an update of the claims around BitShares and Graphene in the whitepaper.

Ok. Lets make corrections, and thanks for the contribution.

please first post a link to a proof of safety and liveness, and the minimal conditions in which a fork or halt would happen.

This leads me to one of the major problems with DPOS that we haven't fixed for years now: There is no whitepaper that describes it thoroughly and that is also the reasons why I am posting here to make sure the documents that refer to DPOS are accurate.

It's really hard / impossible to have a discussion about something that doesn't have a defined spec, because most likely it means that the spec is in flux. So I'm happy to refer to BitShares-2.0 as being different than BitShares-1.0, but we can't make any definitive statements about it.

And finally, a proof of fork accountability, so proof that you can determine who are responsible for a fork or halt condition.

Let me tell you about a feature that we had in one of our Graphene-based forks called STEEM. In Steem, validators are paid in SteemPower which is vested for some time and cannot be withdrawn immediately. This is similar to collateral except that they can earn their collateral. Now, we had a feature that allowed you to claim all the collateral if you could just proof that a witness has signed a block twice (since witnesses are order and have certain slots, this can easily be identified). However, it wasn't received very well mostly because a length-1 fork allows to claim everything already why not presenting any harm to the network. However, given that witnesses are required to be highly available, they happen to run two machines at the same time when upgrading the software which can lead to length-1 forks easily if not done right. It sounds inappropriate to claim all their stake due to length-1 forks that don't mean any harm to the network.

So, what I want to really say is that if you want collateralized witnesses, you can easily do so even with DPOS and even though it is not widely used among existing blockchains.

That's not quite what we're talking about when mentioning fork-accountability. I'll try to make an analogy to what I see about BitShares2.0 below.

From DanLarimer here: https://bitsharestalk.org/index.php?topic=18720.0 ,

Sort N witnesses by the last block number they signed, then take the highest block number that is lower than 66% of all other witnesses. This will indicate that said block has been confirmed by 66% of all witnesses and is clearly irreversible.

After 2/3 of all witnesses have signed, they cannot "go back" and sign something else. We know a fork to an earlier point is impossible without malicious collusion of witnesses.

It means that 2/3 of the witnesses have signed a block that built off of the block in question. It means that assuming the witnesses are cooperating they are all on the same page.

The entire protocol is based upon the assumption that the witnesses will not collude to produce a fake chain. Thus the last irreversible block is the last block that you can prove 2/3 the witnesses agreed upon.

A proof of safety will prove that if one person sees an irreversible block, then everyone sees that irreversible block. "This will indicate that said block has been confirmed by 66% of all witnesses and is clearly irreversible." What if a witness built off of a block that isn't a descendent of said block? Why is that impossible? "After 2/3 of all witnesses have signed, they cannot "go back"." Why can't they go back? Is this a rule enforced subjectively? What if one witness saw 2/3 sign after said block, but due to network latency many others witness doesn't see those 2/3 and decides to go back?

A proof of liveness will prove that the network will always make progress (eventually). We only care about fully committed "irreversible" blocks, so the proof needs to show that more "irreversible" blocks will be created. I can't comment more without learning more about the questions I asked above, but usually what happens is, people forget to take into account network problems / subjective views, and end up designing an algorithm that can come to a deadlock, e.g. because no witnesses can make a move that won't punish them.

A proof of accountability will prove that when the blockchain fails due to malicous actors, that those actors (or a significant portion, like 33%) can be determined, and thus can be punished. All blockchain consensus algorithms can fail (e.g. double-spend an immutable block) if enough actors are malicious. The trick is proving that when it happens, there's a way to make the attackers accountable for their actions. Here it's important to prove that honest behavior cannot result in being wrongfully framed. Only then would rational agents want to participate && post large amounts of collateral.

I'd like to modify the second contested quote, quoted below for convenience:

Though BitShares achieves high performance (100k tx/s, 1s latency) in ideal conditions, it is subject to double spend attacks by malicious witnesses which fork the blockchain without suffering an explicit economic punishment -- it suffers from the "nothing-at-stake" problem.

But first I need to know a few things, like, whether shareholders have anything at stake for backing the wrong witnesses. Is a validator a witness? And proofs of safety/liveness/accountability, as mentioned previously. I mean, otherwise, maybe the best we can do is to say that we don't know about BitShares2.0 because it isn't spec'd yet. The if that's true, we can help you construct these proofs.

@xeroc
Copy link
Author

xeroc commented Dec 2, 2016

What happens to a shareholder when a witness that they "approve" (delegate voting power to?) happens to launch an attack?

Nothing. No penalties what soever. I think I can see where this is heading. The point your trying to make is probably that there is not much incentive for shareholders to actively participate in monitoring witnesses. Together with voting apathy, we either end up in a setting where witnesses can act as maliciously as they can (which is basically just censorship) without having to fear any consequences in the short term. Fortunately we have proxy voting which adds a competitive political role to the game (as big proxies have more influence on witnesses and committee members).

I think, BitShares could benefit from requiring collateral from witnesses to add an extra layer of incentive to act honestly.

So can we change the wording to the following?

By using and improving upon proven BFT algorithms developed at MIT in 1988 [20], the Tendermint team was the first to conceptually demonstrate a proof-of-stake cryptocurrency that addresses the nothing-at-stake problem suffered by first-generation proof-of-stake cryptocurrencies such as NXT and BitShares-1.0.

Absolutely

It's hard to make any statement about BitShares-2.0/Graphene, without a spec. But I'm interested in learning more and discussing.

We desperately need more documentation about the consensus mechanics on bitshares! In the meantime I will try to answer all your questions!

It seems that the design decisions of Graphene is still in flux. Is that true? (If so great, we might be able to help).

There is graphene which is mostly a toolkit and there is BitShares (2.0) which builds on top of Graphene. Steem is another blockchain that builds ontop of graphene. BitShares is currently not much in flux as shareholders asked for more stability in the protocol while Steem is in constant flux due to it's less crypto-centric community.
However, the foundations of DPOS/TaPOS are not changing much (except for the financial aspects of validation).

A block is "created" as soon as it gets signed by over 2/3 of validators (by voting power) such that the block is immutable.

Oh, I missed that part. Would it be correct to say that in Tendermint, the validators first agree (by at least 2/3) and then build an (irreversible) block, while DPOS witnesses build blocks which become irreversible after 2/3 of the witnesses in the round have approved it (e.g. built a block after it)?
If so, then I would go as far as saying that if you only look at irreversible blocks (which we recommend to everyone doing transactions with medium to larget stakes), DPOS is pretty similar to the consensus scheme behind Tendermint (except for the incentives of the witnesses, e.g. collateralization vs. job position)

Validator == witness in Graphene?

Yes. More accurately, validators = block producing witnesses

Double-spends are always possible, even in Bitcoin. It's just a matter of what it takes to perform such an attack. In Cosmos/Tendermint, a fork is not supposed to happen, but it can. And when it does, we can determine who is responsible. It seems that this term is confusing, because in Graphene, forks are allowed as long as it comes after the most recent "irreversible" block. When we say "fork", we're referring to the violation of "irreversibility". Does that make sense?

It does make sense and is accurate. I should have said that double spends are not possible if you only look at blocks that are irreversible (again, a recommendation). In fact, the BitShares software has a so called delayed node that comes after a regular node and only carries the blockchain up to the irreversible block. From that point on, double spends are impossible. The blockchain (e.g. the software) prevents forking the blockchain prior to the irreversible block (that's why it is so similar to a checkpoint in bitcoin, except that there is an actual protocol about how to deal with those checkpoints and that they are not made up arbitrarily by the developers like in bitcoin).

It's really hard / impossible to have a discussion about something that doesn't have a defined spec, because most likely it means that the spec is in flux. So I'm happy to refer to BitShares-2.0 as being different than BitShares-1.0, but we can't make any definitive statements about it.

That would be great!

A proof of safety will prove that if one person sees an irreversible block, then everyone sees that irreversible block. "This will indicate that said block has been confirmed by 66% of all witnesses and is clearly irreversible." What if a witness built off of a block that isn't a descendent of said block? Why is that impossible? "After 2/3 of all witnesses have signed, they cannot "go back"." Why can't they go back? Is this a rule enforced subjectively? What if one witness saw 2/3 sign after said block, but due to network latency many others witness doesn't see those 2/3 and decides to go back?

In fact, there is no they, there is only the one witnesses that is supposed to build the subsequent block (deterministic order of validators). It happens from time to time that latency to the subsequent validator is higher than the block confirmation time, but it rarely happens that latency is higher than 2 or 3 block confirmation times. This means that if I don't see my previous block (due to latency), I can still build a block based on the block before that and broadcast it to the network. In the mean time, the other witnesses see the signed block and the notice that the witness linked to the wrong block (as they had lower latency) and neglect the signed block. The witnesses that comes right after signs a block and links to the original block making the high latency witnesses miss his block.
Does that make sense to you?

A proof of liveness will prove that the network will always make progress (eventually). We only care about fully committed "irreversible" blocks, so the proof needs to show that more "irreversible" blocks will be created. I can't comment more without learning more about the questions I asked above, but usually what happens is, people forget to take into account network problems / subjective views, and end up designing an algorithm that can come to a deadlock, e.g. because no witnesses can make a move that won't punish them.

Given that we don't have punishments, every witness can produce blocks as long as the software doesn't break.

A proof of accountability will prove that when the blockchain fails due to malicous actors, that those actors (or a significant portion, like 33%) can be determined, and thus can be punished. All blockchain consensus algorithms can fail (e.g. double-spend an immutable block) if enough actors are malicious. The trick is proving that when it happens, there's a way to make the attackers accountable for their actions. Here it's important to prove that honest behavior cannot result in being wrongfully framed. Only then would rational agents want to participate && post large amounts of collateral.

The question to me here is: what is an attack, actually. If you take a look at the blockchain starting at genesis up to the irreversible block, then all transactions/operations in that blockchain are valid per definition. Thus, attacks can only happen after the irreversible block and they will only result in forking blockchains that are shorter because the honest witnesses won't sign blocks that increment the attackers blockchain due to invalid transactions.

Of course, if enough actors are malicious, (e.g. >66%), then they can double spend, but only after the irreversible block!

But first I need to know a few things, like, whether shareholders have anything at stake for backing the wrong witnesses. Is a validator a witness? And proofs of safety/liveness/accountability, as mentioned previously. I mean, otherwise, maybe the best we can do is to say that we don't know about BitShares2.0 because it isn't spec'd yet. The if that's true, we can help you construct these proofs.

A validator is a (block producing) witness (non-block producing witnesses are witnesses that don't have sufficient approval by shareholders and can thus not sign blocks).
I would like to provide that proof to you, but am not sure what form it should take. Is the above rational description sufficient or do you need something else?

@shelby3
Copy link

shelby3 commented Dec 5, 2016

@xeroc wrote:

There is no such thing as an election...Votes are tallied every 24h (a period that can easily be modified by the committee, whose members are shareholder approved aswell)

Semantic shell games doesn't refute my correct statement, “They can do whatever they want until they are replaced by an election.

0:45 - “actually more secure than Bitcoin”. Incorrect. I explain why in my white paper.

DPOS is more accurate than POW under short to medium time constraints, e.g. for 5-60 minute old transactions
The proof is quite obvious if you read about irreversible blocks in the opening post.

DPoS due to TaPoS is more final in the medium-term. But you are incorrectly conflating finality of consensus with security.

DPoS it is less secure than “one chain to rule them all” PoW for numerous reasons which I cover in my white paper. An additional excerpt is:


6.8 Less Secure

As previously explained in the sub-section 4.1 Selfish Mining Example, a mining cartel controlling a majority of the hashrate in PoW which is mined on non-repurposable ASICs has a disincentive to double-spend. Whereas, as previously explained in the sub-section 5.7.2 Security, a cartel of coordinated whales controlling a majority of the stake may find that double-spends (possibly combined with shorting) are more profitable than any loss of liquid value of their stake. If there is a significant enough exchange transaction ecosystem, it might conceivably be the most liquid means for a whale cartel to exit their investment. The control of the majority stake can select all of the witnesses and thus can double-spend at will. There is nothing the minority stake can do to unelect the rogue witnesses. The majority stake attacker could even block the minority stake from selling thus trapping everyone else while the attacker exits with a short position.

Even in the case where there is no cartel of coordinated whales controlling a majority of the stake in order to maintain tight control over the witnesses, a majority of the witnesses could collude to double-spend. And previously mentioned in the section 2. Byzantine Agreement vs. Proof-of-Work Consensus, if more than ²/₃ of the witnesses collude to double-spend, it can be structured in such a way that the honest minority can’t even prove which witnesses are colluding to double-spend.[DPoS_cost]

In another form of insecurity, participants can lose value in other insidious ways due to malevolent witnesses as explained in the sub-section 6.7 Debasing and opportunity cost losses due to resiliency and liveness issues due to lack of asynchrony.


@xeroc wrote:

Well, stake in BTS is at most 3.7B, so it is limited and thus scarce.

If you don't even comprehend the link which was provided, then why bother to reply on that issue? I mean what you wrote is Not Even Wrong.

For one thing, there is nothing-at-stake in terms of electing a set of rogue witnesses by the attacker who controls 51% of the stake, which thus makes double-spending and shorting potentially viable. Whereas, for PoW, there is mining equipment and leases at stake which require commitment to the long-term income of mining. The stake is really not scarce, because it potentially costs nothing (or perhaps less than nothing, i.e. a positive gain relative to normal illiquidity of half of the money supply) to short it and destroy it. There is no way to generate income by destroying mining equipment and paying ongoing mining farm data center and power supply leases that can't be terminated.

You are mixing up two things here. One is 51% attack, the other is double spending.

Sorry you seem to be the one who is confused. A double-spend is achieved with a 51% attack. 51% collusion/control of the stake can select ALL of the witnesses and thus can double-spend at will. There is nothing the minority stake can do to stop it. The 51% attacker could even block the minority stake from selling thus trapping everyone else while they exit with a short position.

I would love to read your proof about why POS (and we are talking about DPOS here, btw) is "economically vulnerable to double spends" (what ever that even means).

I included above the excerpt from my whitepaper which explains that.

Whales cannot double spend. And the fact that the share price of a given DPOS-driven chain is low, doesn't mean you can easily acquire 51%+ of the stake.

By whales I meant collusion/control of 51% of the stake. I am not referring to acquiring 51%, rather to liquefying the illiquid 51% of the stake (held by the upper 1% due to power-law distribution of wealth) by double-spending and shorting the coin. What you don't seem to appreciate is that stake is a power vacuum that over time will always trend to a winner-take-all cartel or control. This is economics. And you do not seem to be very knowledgeable about economics (given you apparently did not comprehend the link you were provided).

Oh, it seems you need to read up on DPOS block generation process. Let me give you a quick TL;DR;

A set of witnesses is approved by the shareholders. Each witnesses can produce one block in a round. > Once the round is finished, a new round with randomly ordered witnesses starts.
The randomness is pseudo random as it uses data from the chain to derive the new sorting. That makes it 'deterministic' as every participant knows who is going to produce the next block.

That is irrelevant. I am writing about the fact that shit happens. Propagation hiccups. Witnesses end up out-of-order, malevolent, etc.. There is no way you can model resiliency and liveness of a naturally asynchronous system with a synchronized (synchronous) round-robin witness model. It is fundamentally broken. Will never scale up reliably.

Please don't presume I don't understand. You might start to consider that you don't understand.

The number of witnesses has indeed been constraint in BitShares 1, but is no longer in BitShares 2. It is yet another shareholder voted parameter that could go from 11 up to 1100 (arbitrary number). The scaling constraints are in fact the time it takes for packages to travel around the world.

The entire design is fundamentally flawed.

You'll understand better when my whitepaper is published.

@shelby3
Copy link

shelby3 commented Dec 5, 2016

@clayop wrote:

Witnesses will lose their position as well as accompanied income (block validation reward) and never get voted back again.

How can they lose their block validation reward if they remain elected for years? You mean you never pay them until they quit. (obviously not) You mean they lose some short-term portion of their rewards, but if they can do more damage (e.g. short the coin, while they drop transactions or fail to generate blocks messing up the speed and reliability) than the reward, then they don't care about the reward.

You don't always know who you are voting for. The witnesses can be Sybils or political hacks.

They also put much efforts to become witness sometimes financially as well, which become their reputation in the community.

Reputation is a clusterfuck of fighting and vested interests scuffles (as Bitshares 1 discovered). The only way DPoS works is with 51% control by a set of colluding whales as is the case for Steem. Wherein the whales are also the witnesses and vote for themselves as much rewards as the market will bear. It is a monopoly leeching power vacuum winner-take-all system.

IMO being a witness is like monetizing your reputation, and a nefarious behavior can destroy all this opportunity.

Naive. Please take some political science and economics classes at the university.

@shelby3
Copy link

shelby3 commented Dec 5, 2016

@jaekwon wrote:

The second paragraph you quoted was referring to the graphene system. You mentioned something about irreversible blocks, I'd like to see a reference to a spec for that, if it applies to graphene.

Afaik, this is primarily due to TaPoS (transactions as proof-of-stake). Over time this provides irreversible finality and solves the long-range nothing-at-stake problem. I was one of the people who helped Dan refine the initial design of TaPoS. This is cited in my whitepaper and I explain why it is secure.

Every block in Tendermint is final. We can get finality on the order of 1 second, depending on the number of validators. Each validator can have huge collateral behind it, and it does not get punished for as long as it follows the rules of the protocol. You can't get these properties without building around a solid BFT algorithm that can provide you with absolute fork accountability.

This is correct. And the Bitshares' replies here seem to indicate they don't understand the implications.

@jaekwon
Copy link
Contributor

jaekwon commented Dec 5, 2016

@shelby3, I appreciate the desire to debate, but your comments are out of the scope of this issue. Feel free to link to say a bitcointalk thread.

More importantly, we don't tolerate insults. E.g. calling someone naive, or telling people to take classes at a university. This is your only warning. Please keep it civil.

@shelby3
Copy link

shelby3 commented Dec 5, 2016

@jaekwon:
What comments are out-of-scope? I am replying on topic of all the points that have been made in this thread. Are you afraid you are been upstaged? Ego?

I didn't appreciate his implied insults saying that I am mixing up things, etc.. You are free to ban me. I will speak how I speak. If you can't tolerate free speech, then make your own closed system jail.

Your warning further insults me.

Open source is not a church. Grow some skin on your back son.

@jaekwon
Copy link
Contributor

jaekwon commented Dec 5, 2016

We welcome the debating of ideas. Criticism should not be about the person, but the ideas expressed by that person.

Rule #1 is, don't be an asshole. Sexism, racism, etc also won't be tolerated.

Just keep it civil and we won't have problems.

@shelby3
Copy link

shelby3 commented Dec 5, 2016

@MikeSofaer wrote:

there is some disagreement on what should count as a solution to the "nothing at stake" problem

Agreed. The NaS term has been used to describe many different attacks which result from the fact that stake is self-referential. In my prior comments, I have mentioned the long-range chain rewrite attack and also the shorting attack both of which nothing is risked at stake, i.e. there is no cost to attacking which exceeds the potential gain.

With PoW, one could argue that 51% hashrate could be rented to perform a double-spend that would exceed the block rewards for that attack duration, but the problem is that the mining farms aren't going to rent their hashrate to an attacker, because they have long-term sunk costs to recover.

@shelby3
Copy link

shelby3 commented Dec 5, 2016

Criticism should not be about the person, but the ideas expressed by that person.

Why are you introducing unnecessary noise into a technical discussion. You are imposing your apparently (Millennials?) politically correct culture on me. Please respect that I don't come from your generation. I have my own value system. I respect competence. When incompetent people make incorrect statements about economics and repeat them over and over again. After hearing that same crap over and over again on Bitcointalk and else where, it becomes easier to just say, "please stop talking nonsense about economics". I (and Paul Sztorc) don't have time to teach every person who thinks they are an expert on economics but aren't. Sometimes people need a little hint to please do their homework.

If you want to discourage participation then fine I will leave.

The Bitshares' people came into your Issues tracker and made incorrect accusations. So they are the ones who have not been humble. And their replies to you in this thread and their replies to me continue to reflect a hubris which exceeds their breadth of understanding. Their replies seem to indicate they didn't even comprehend what you are trying to say about the fact that without a BFT protocol then there is no near-term finality. I just see the discussion as a comedy of errors and you are being far too accommodating. They don't even have a specification.

But if you think it benefits your project to kick out the experts and cow-tail to the the noise makers, then by all means, I am gone.

@MikeSofaer
Copy link

I think there's value in helping people articulate their objections clearly so they can be addressed to everyone's satisfaction. If we avoid being brusque then this thread will be a better resource for people who make the same objection in the future.

@shelby3
Copy link

shelby3 commented Dec 5, 2016

@MikeSofaer I agree but perhaps you have more time and patience than I do. So I will leave so as to not offend. Perhaps we can do this a different time when I have more free time.

Edit: I was 5am for me when I wrote the replies above. I had been working since the prior morning. Sorry this is nature of high pressure work situations. That is nice that you guys have all the time to be so patient and accommodating. I don't disagree with the desire to get the issues discussed clearly. I don't disagree with the desire to have maximum participation by attempting to be amicable. Some where in there is reality of the imperfection of life... sorry I don't have nirvana to offer you...

Rule #1 is, don't be an asshole. Sexism, racism, etc also won't be tolerated.

One person's asshole is another person's freedom-of-expression, personality, or culture. You claim to be for diversity, but then try to force your intolerance for reality on others. Diversity is in fact discord, not concordance. As for assholes, everyone has and is one. The holier than thou stuff is great for lynchings. Vote on. Pile on. Groupthink ain't for me. The Millennials future is scary as fuck.[1]

I don't have the same age and wisdom as you young guys. Sorry. You are too "wise" for me. I know you guys have studied 5000 years of human history and know well the basis for your politically correct oppression.

[1] https://bitcointalk.org/index.php?topic=1703293.msg17080501#msg17080501
https://bitcointalk.org/index.php?topic=1703300.0
https://www.armstrongeconomics.com/uncategorized/what-socialism-destroyed-govt-shutdown/
https://www.armstrongeconomics.com/uncategorized/who-is-in-that-hated-top-1-could-it-be-you/
https://www.armstrongeconomics.com/uncategorized/its-not-the-rich-its-the-total-cost-of-govt-that-is-killing-the-economy/

jaekwon added a commit that referenced this issue Dec 9, 2016
@faddat
Copy link

faddat commented Jan 4, 2017

@shelby3

I regretfully agree. But these are surely not the worst offenders.... and worse still, while I agree with you, many who share your and my point of view do so out of a desire to be assholes.

It's complex, and it sucks.

@bytemaster
Copy link

bytemaster commented Jan 25, 2017

Hello everyone, I would like to take a moment to address this discussion and commend everyone involved for their efforts to describe technical details of DPOS. I have come to believe that much of the confusion comes from a combination of definitions and understanding the true sources of security in a consensus system.

In this thread the term "witness", "miner", "validator", and "producer" are used in imprecise ways. So let me include an excerpt from a paper I am writing:

Block Producer

A block producer is responsible for grouping transactions into a block and broadcasting it to the network.The number of block producers per confirmation period is limited by block production frequency and incentive structures.

Block Validator

Anyone running a full node is a block validator. A block is considered valid if and only if it follows the open source rules of the blockchain. It does not matter how many block producers collude, a block validator (and the rest of the world) will reject all blocks that fail to conform to internal consensus. In theory there are an unlimited number of block validators, limited only by the desire of individuals and businesses to independently verify block producer behavior. This motivation is often to prevent a business from being defrauded by following a fork that violates the consensus rules.

Last Irreversible Block

A block that has been widely acknowledged as being valid and immutable. This block must be accepted by all validators you trust and confirmed by the majority of block producers.

Now that those terms are out of the way we can conclude that the act of producing a block is independent of the act of validating a block. Under DPOS every produced block can be viewed as a proposal and nothing more. This is similar to the first step in a multi-party block producing process such as Tenderment or Ripple.

Under DPOS every subsequent block is both a confirmation of multiple prior blocks and a proposal for the next block. The set of blocks between the last irreversible block (confirmed by 2/3 of producers) and the head block is like a pipe-line of pending block proposals.

By using a pipeline approach, DPOS has an average latency until irreversibility of 2 * BLOCK_INTERVAL * WITNESSES / 3. Which for STEEM is 40 seconds. While the latency for a single transaction is high, the pipe line allows for higher throughput where new transactions are made irreversible every 3 seconds. Furthermore, the pipeline gives everyone variable probability of irreversibility as blocks grow from 1 signature to 2/3 of required signatures.

What we can conclude is that the last irreversible block is identical to advancing blocks in Ripple and Tenderment in that it is a proposal accepted by 2/3 of block producers.

Security Provided by Non Producing Validators

The act of producing blocks means nothing if the produced blocks are not accepted by everyone else in the world. There can be 100,000 independent validators all talking with each other (relaying blocks and transactions). Each of these validators is running a state machine that will not allow them to roll back past their own perception of the last irreversible block.

If the majority of the block producers collude to produce a longer chain in an attempt to fork beyond the last irreversible block, then no exchange, block explorer, merchant, or other validator would switch to that fork. The entire world will agree on the "first fork seen".

This means that the only way to effect a double spend / reversal beyond last irreversible block is to isolate your victim and partition the network + collusion of producers. With a small amount of non-consensus changes it would even be possible for a particular node to require proof of TaPoS from a majority of non-block-producing trusted peers that periodically broadcast transactions. Under this model, even colluding block producers who isolate one victim will not be able to incorporate transactions from the other parties. TaPoS does this organically, but it could be made explicit.

In terms of Collateral

A Job has a net-present-value equal to the value of the future revenue stream combined with a sunk cost of campaigning. Losing the job has a real economic cost. Getting fired has an even greater loss due to value of reputation.

If there was any way for the majority of block producers to collude and cause actual harm to someone running a full validating node with good, long-standing, connections to a large number of peers then I could see the need for additional collateral. But considering every peer is able to independently verify that they and everyone they do business with is on the same irreversible fork, then there is no ability to deceive a single node.

There are two things a block producer can do to "harm" the network:

  1. not produce a block
  2. skip the block producer before them (this will likely orphan the attacker rather than the attacked)

As a group the block producers can prevent the advancement of the last irreversible block until one of the potentially reversible forks is able to elect a majority of block producers who then advance the last irreversible block. This means that absent a clear majority, any minority of producers can successfully hold an election and keep the last irrevrsiible block advancing.

Aside from halting advancement of the last irreversible block, the majority of block producers can also:

  1. Ignore all minority producers and effectively increase the average block interval by 50%
  2. Ignore / censor transactions / hinder election process

In all conceivable attack scenarios, there is no potential for a double spend of a traditional transfer. Users are only ever at risk if they face financial loss due to the censorship of their transaction. This risk applies to every blockchain and is therefore pointless to consider.

Security, Accountability, and Liveliness

I believe I have proven the the block producers cannot defraud 1000's of independent validators without partitioning the network and while physical network partitions may be possible, logical network partitions defined by non-consensus TaPoS trust links cannot be partitioned. The most that could be said is that TaPoS trust links are not currently deployed as a pro-active defense against a network partition attack.

The only reason for bonded block producers is to enforce a penalty for the network partition attack. If it is possible to prevent it in the first place by TaPoS links then no bond is necessary.

The network of 1000's of validators provide accountability through the ability to detect and report production of fraud chains. The probability of getting caught is 100% and the consequences involve both job loss, reputation loss, and potentially legal consequences of theft/fraud (because the parties are known and the double spend involved and off-chain business transaction).

The blockchain will remain live (advancing the last irreversible block) so long as at least 1 block producer is able to process enough pending transactions to elect a new set of witnesses who then start producing the last irreversible block. Even a loss of 100% of block producers will not prevent the network from advancing assuming a hard fork to enable one witness to hold a new election.

Conclusion

There are no known strategies by which a well connecting full validating node can be defrauded and the damage any individual block producer or even collusive majority group can do is so insignificant that a bonding requirement beyond job/reputation loss is unnecessary.

Do bonds actually make a platform more secure? They act as a barrier to entry that keeps block production (and therefore censorship rights) in the hands of the rich. The power to censor is far more valuable than any microscopic probability for producing two alternative chains in an effort to defraud an isolated victim.

Additional Discussion here: https://steemit.com/steem/@dantheman/response-to-cosmos-white-paper-s-claims-on-dpos-security

@matthewniemerg
Copy link

I have some questions and concerns with the previous post's section on "In Terms of Collateral," specifically with your use of majority. Could you comment on how 'majority' still achieves a solution to the Byzantine Generals Problem?

@bytemaster
Copy link

A 51% majority is unable to advance last irreversible block, that requires a 2/3 majority. In that section I am referring to attempted attacks with any majority being ineffective other than censorship.

@xeroc
Copy link
Author

xeroc commented Jan 26, 2017

Maybe we should clarify that the requirement for the irreversible block to advances is that 2/3 *NUM_BLOCK_PRODUCERS different block producers produce blocks. That means that if 1/3 of all block producers go down at the same time, then the irreversible block will not advance, even though the blockchain does grow from the still running block producers!

@jaekwon
Copy link
Contributor

jaekwon commented Jan 26, 2017

Hi Bytemaster, and everyone from the BitShares community,

Thank you for the writeup, lets dive in -- happy to discuss.

I want to first point out that our design principle is to account for network partitions, so that needs to be taken into consideration when proving safety.

On TaPoS

I would argue that TaPoS can't be used for finalization of blocks. The objective is to create a block finalization (e.g. irreversibility) algorithm, and in our case (3 seconds) or BitShare's case (e.g. 40 seconds), the time is not sufficiently long to settle anything with transaction volume or coin-days-destroyed. TaPoS is fine for dealing with successful attacks or long range attacks and figuring out how to recover, nothing wrong with that. But for the sake of near-term block finality/irreversibility, TaPoS should not be discussed. In addition, during a network partition, it is also true that not all transactions will be visible, so it can't help there. In any case, I hope we can agree to put TaPoS aside, when we're discussing finality/irreversibility.

Next I want to bring up safety and liveness with regards to the BitShares scheme.

Safety and Liveness in BitShares

My understanding is that there is a designated signer for every block height, and that if the designated signer is not available, that height may be skipped. Please correct me if I'm wrong.

It's fine to say that every validator, locally, will compute and hopefully progress their "irreversible block", (which for specificity I will refer to as LIB ("local immutable/irreversible block"), and that it requires 2/3+ of different block producers to produce blocks. What's important is to prove that up to some tolerance threshold, no matter the network circumstances (e.g. partitions or not), that when one signer increments its LIB, that it is guaranteed that everyone else will also (eventually) increment the exact same LIB. That is required for safety (and liveness), for otherwise there may be situations where the LIB diverges and violates safety (or network deadlocks, and is unable to progress any further, violating liveness). And by liveness, I'm not referring to additional blocks being proposed... I'm referring specifically to the LIB progressing. The only thing that matters is the progression of the LIB, and that there is consensus about what that chain looks like. (Thus making "local irreversible block" == "global irreversible block")

In order to analyze this for Bitshares, I need to first ask,

  1. When a block signer signs a block Bzero for height H, is it OK for this block signer to, at a later time, sign a block Bx for height H+x where that block does not include Bzero in the chain?

I assume that the answer is yes, for otherwise it would be really easy for a signer to end up in a deadlock scenario. So then the followup question is,

  1. Due the nature of the internet, and especially given the desired block finality time of 40 seconds, it is not safe to assume that all block signers will see all the blocks proposed. So we need to take into account network partitions and really any strange network behavior. Given this, what mechanism is there to prevent block signers from switching back and forth, and end up deadlocking by ending up with different LIB? Here, I'll give a more concrete example below:

Given the scheme that I know so far for BitShare's PoS algorithm, and assuming that the answer to (1) is yes, then I can construct a situation where the LIB diverge and result in a deadlock, without any block signer being malicious at all, and only due to network circumstances.

For example, lets say there are 4 signers, S1, S2, S3, and S4. And 3 signatures from 3 different signers is required for a block to be irreversible.

For example, ideally:

B1(S1) -> B2(S2) -> B3(S3) ->B4(S4) -> B5(S1)

And in this case, if everyone had seen B5(S1), then everyone would have B3 as their LIB. On the other hand, if S4 was offline for B4, then B2 would be everyone's LIB.

B1(S1) -> B2(S2) -> B3(S3) ->B4(nil) -> B5(S1)

Now, lets assume everyone starts with B0 as the LIB ("local immutable/immutable block"), and network irregularities causes the following to happen:

B0 -> B1(nil) -> B2(S2) -> B3(nil) ->B4(S4) -> B5(S1) # B5 seen by S1 and S4
B0 -> B1(S1) -> B2(nil) -> B3(S3) ->B4(nil) -> B5(nil) -> B6(S2) # B6 seen by S2 and S3

S2 signed B2 but didn't see B3, B4, or B5. Then, S1/S4 will have B2 as the LIB, whereas S2/S3 will have B1 as the LIB, and the network has gotten deadlocked.

What prevents such a deadlock in BitShares?

On Stake and Bonding

More to say here, but we have to discuss the above first.

@bytemaster
Copy link

Thanks for putting so much thought into our algorithm and presenting an example for consideration.

For starters, I have mapped this example out sequentially and concluded the following:

Block 5 signed by S1 proves that S1 chose to switch from his prior fork where he produced B1 to the fork produced by S2 and that presumably S2 decided to switch to the fork started by S1. This particular case would never happen by nodes running the default algorithm because nodes only switch forks when they see a chain that is longer (skipped blocks don't count).

Given presented order S1 would see B1(S1)->B3(S3) first and not switch to B2(S2)->B4(S4) because it is not longer. Likewise, S2 would not switch because he would also see two forks of the same length so would not switch (preferring the first fork he saw).

So we can conclude that the pattern you demonstrated would only be possible under the condition of colluding malicious producers who broadcast B5(s1) and B6(S2) at the same time. Under that situation half of the observers would switch to one fork and half would switch to the other fork.

For simplicity I will rename your producers to A, B, C, and D (S1, S2, S3, S4). I assume round robin scheduling so B0 is produced by D.

Half would see DBDA and the other half would see DACB which would cause them to each advance to different local irreversible blocks. Because you only have 4 producers 2 of them being bad can cause a lot of harm. The required percentage of confirmation in order to advance the last irreversible block ends up rounding up for low numbers of producers. I may even be convinced that we should change it to 66% + 1 from 66%. It certainly wouldn't hurt to be extra conservative and say 75% + 1.

Assuming we keep the constraint that all trusted peers must agree on advancement then the attack I presented where two producers collude to broadcast at the same moment in an effort to split the network would fail. Half the nodes seeing peers on a different fork would result in them failing to advance their local LIB until there was a clear winning chain.

With respect to TAPOS.... it works if additional validators are broadcasting every 30 seconds and validators are configured to expect and respect those trusted validators in addition to the producers.

In effect, the consensus used by the validators need not be restricted to the block producers.

@jaekwon
Copy link
Contributor

jaekwon commented Jan 27, 2017

Hi Daniel,

B0 -> B1(nil) -> B2(S2) -> B3(nil) ->B4(S4) -> B5(S1) # B5 seen by S1 and S4
B0 -> B1(S1) -> B2(nil) -> B3(S3) ->B4(nil) -> B5(nil) -> B6(S2) # B6 seen by S2 and S3

This particular case would never happen by nodes running the default algorithm because nodes only switch forks when they see a chain that is longer (skipped blocks don't count).

Given presented order S1 would see B1(S1)->B3(S3) first and not switch to B2(S2)->B4(S4) because it is not longer. Likewise, S2 would not switch because he would also see two forks of the same length so would not switch (preferring the first fork he saw).

I'm assuming that S1 never saw B3,6 and S2 never saw B4,5; due to network conditions. Sorry if that wasn't clear. So, everyone is acting honest, and yet S1 and S2 switched chains.

Or in ABCD notation,

B0 -> B1(nil) -> B2(B) -> B3(nil) ->B4(D) -> B5(A)
B0 -> B1(A) -> B2(nil) -> B3(C) ->B4(nil) -> B5(nil) -> B6(B)

A: saw B0, signed B1, saw B2, saw B4 (switched chains), signed B5, made B2 local-immutable
B: saw B0, signed B2, saw B1 (late), saw B3 (switched chains), signed B6, made B1 local-immutable
C: saw B0, saw B1, signed B3, saw B6, made B1 local-immutable
D: saw B0, saw B2, signed B4, saw B5, made B2 local-immutable

@clayop
Copy link

clayop commented Jan 27, 2017

@jaekwon If you don't mind, can you cross-post on Dan's Steemit post?

@jaekwon
Copy link
Contributor

jaekwon commented Jan 27, 2017

I can't sign up because it won't accept my phone number. You can do it for me though!

@bytemaster
Copy link

@jaekwon thank you for your persistence in pursuing this, it is my hope that we can resolve this issue. I hope you don't mind me switching notation to simply imply producers ABC and D. I use '-' to indicate a missed block. Each time a Witness switches forks I start a new line to indicate the change.

Witness A: D A
         : D - B - D A

Witness B: D - B
         : D A - C - - B

Witness C: D A - C - - B

Witness D: D - B - D A

So the question at hand is what witnesses, if any, advanced the last irreversible block. What we can see here is that for witnesses A and D they have the sequence DBDA while B and C have DACB.

Edge cases can be problematic, so lets clarify that a block cannot confirm itself. The first block A only has 2 confirmation (C and B) and the first block B only has two confirmations (D and A).

The chains will remain forked producing DADADADA... and BCBCBC... until connectivity is restored at which point the two chains will eventually look like this:

1] DA-C-B--BC--BC--BC        <-- Length 10
2] D-B-DAD--DA--DA--DA       <-- Length 11

Once this happens the the peers on the shorter chain switch and you get

1] DA-C-B--BC--BC--BC        <-- Length 10
2] D-B-DAD--DA--DA--DAB       <-- Length 12 and LIB advances one block

Then you get:

2] D-B-DAD--DA--DA--DABC       <-- Length 13 and LIB advances 8 blocks (to last D)

Because the witnesses are shuffled every round we are guaranteed not to get stuck in an alternating pattern between two forks that never out grow the other.

After reviewing this example I have inspected our code and identified that Steem needs to swap two lines of code to prevent this theoretical edge case. We will incorporate this fix in the next hard fork.

All of that said, this contrived example assumes some very unlikely things:

  1. Only 4 witnesses and an even number (against protocol spec) at that
  2. Very sporadic and intermittent connectivity among all witnesses (complete down time more likely)
  3. Perfectly timed swap among two witnesses

After several years of operation of BitShares and 9 months of Steem, neither network has ever actually experienced this theoretical hick up and if it were to happen then at most ONE block would be mistakenly advanced.

I have taken a few other thoughts from this discussion:

  1. In the nominal case (99.99% of the time, maybe higher) every block ends up confirmed
  2. In extremely rare case this is not sufficient, you need to wait 40 seconds

Fortunately we have an additional metric that is applied, witness participation rate. In the nominal case this is 100%, in every network disruption event this number falls due to missed blocks. Each and every one of our nodes is configured to reject transactions if there are no confirmations in the last 8 seconds (2 full blocks). This means the P2P network will not relay transactions if there is a high probability that a node is on a minority fork. When the network is operating normally missing two blocks is a red flag and users are automatically protected because the chances of missing the 3rd is very high. Obviously, getting a single block allows propagation again so this system is mostly an early warning indicator of lost connectivity.

This is a design tradeoff... in theory each of our nodes could require 2/3 of witness signatures on every block before any witness will generate a new block. This would require network traffic equivalent to 5 transactions per second sustained (assuming 14 signatures every 3 seconds). This extra data wouldn't actually need to be recorded in the blockchain, because it is implicitly recorded like it is today. It would merely have to be maintained on all blocks that don't meet the current LIB standard.

Steem and BitShares accomplish 99.999% of the security for 93% reduction in network overhead when looking at 3 second confirmation. With 40 second confirmation you have equal security / immutability guarantees.

What About Bad Producers

Now that we have established a stable protocol under honest conditions, what happens if nodes misbehave. Assume 2/3 of producers collude to exclude the other 1/3 and then produce 2 chains in parallel. Due to TaPoS all transactions would be restricted to which ever fork a node happened to be on. The network would split and two chains with different LIB would emerge. It would require manual intervention via checkpoints to resolve this attack. Note that this is also true of all federated systems.

The solution is to elect new block producers and fire those who attempted double production. There is ample cryptographic evidence available to prove this kind of behavior. In fact, the original STEEM protocol had an operation that would enable such a witness to be fired if proof of producing two blocks in the same time slot were provided. This limited each witness to 1 vote per scheduled time slot. In this event the witness would lose their Steem Power which acted as a bond. Because witnesses are paid in Steem Power and it takes months to liquidate it, you could say that all witnesses were effectively bonded by 13 weeks of pay, they just didn't have to front the money.

So should we restore the operation to fire witnesses who double produce... maybe. It may be overkill. Even a large bond would not preclude the profit of pulling of a double spend by 2/3 of the witnesses. The best protection is actually to have the clients detect the double production and halt the advancement of the LIB until the double production stops. In this situation everyone is protected because double production has the same impact as lack-of-production. It effectively nullifies a block confirmation if there are two competing blocks for the same time slot.

Thanks again for engaging in this discussion and challenging my assumptions. I hope this process has been as educational to you all as it has been to me.

@xeroc
Copy link
Author

xeroc commented Jan 27, 2017

After reviewing this example I have inspected our code and identified that Steem needs to swap two lines of code to prevent this theoretical edge case. We will incorporate this fix in the next hard fork.

BitShares unaffected?

Also, I think in BitShares, the shareholders could technically vote for an even number of witnesses, am I correct?

Bad Producers

Another way of dealing with "bad producers" is to add the requirement of having at least 1 block per round be solved via proof of work. The current STEEM design has a slot reserved for a POW block, such that a POW miner can obtain a single-use ticket into the active set of witnesses at a specific time in the future. All those POW miners are managed in a POW queue. Once you are on the top spot, you basically become a witness for a single round. However STEEM does not require a pow block to be produced in that slot, it the miner misses to produce the block then the next regular witness would continue.
This setup has the disadvantage that the subsequent witness could ignore the block produced by the pow-miner and censor any transaction that comes with it.
If you required that a pow block was produced every round, you remove that disadvantage but get another disadvantage and that is an unpredictable and possibly abusable delay when the round is going to end and the blockchain needs to wait for a miner to produce a block. The network would need to go through the entire mining queue and try to obtain a pow block from everyone in there until (in the worst case) the mining queue is emptied (in the mean time, no new pow-tickets can be obtained because no block is generated).

This would make the network entirely censorship resistant (assuming POW achieves that) but it comes with a massive drawback of irregular block times

@jaekwon
Copy link
Contributor

jaekwon commented Jan 27, 2017

Edge cases can be problematic, so lets clarify that a block cannot confirm itself. The first block A only has 2 confirmation (C and B) and the first block B only has two confirmations (D and A).

Ok, I've changed the example so you need 1 more confirmation.

B0 -> B1(_) -> B2(B) -> B3(_) ->B4(D) -> B5(A) -> B6(_) -> B7(_) -> B8(D) -> B9(_) -> B10(B)
B0 -> B1(A) -> B2(_) -> B3(C) ->B4(_) -> B5(_) -> B6(B) -> B7(C) -> B8(_) -> B9(A) -> B10(_)

A: saw B0, signed B1, saw B2, saw B4 (switched chains), signed B5, saw B3 late, saw B6, saw B7, signed B9, made B1 local-immutable
B: saw B0, signed B2, saw B1 late, saw B3 (switched chains), signed B6, saw B4 late, saw B5 late, saw B8 (switched chains), signed B10, made B2 local-immutable
C: saw B0, saw B1, signed B3, saw B6, signed B7, saw B9, made B1 local-immutable
D: saw B0, saw B2, signed B4, saw B5, saw B8, saw B10, made B2 local-immutable

Only 4 witnesses and an even number (against protocol spec) at that

It's possible to extend any of our examples by e.g. saying, each letter represents many signers.

Perfectly timed swap among two witnesses

I admit it looks contrived, but maybe this is because we're only discussing 1 concrete contrived example. I think there are other sequences that arrive at the same deadlocked outcome.

Thanks again for engaging in this discussion and challenging my assumptions. I hope this process has been as educational to you all as it has been to me.

Thank you too. I hope you take a look the Tendermint spec too: https://github.com/tendermint/tendermint/wiki/Byzantine-Consensus-Algorithm

@bytemaster
Copy link

@jaekwon thank you for continuing your efforts, it takes a lot of work to contrive these examples. I only wish that you would contrive an example with an odd number of produces and immutability of 4 out of 5 or 2/3. One thing is certain, it isn't easy tracking all of this manually.

For the sake of argument, lets assume we have some nodes with the sequence: DBDADB and the others with the sequence DACBCA and therefore the bold block is considered immutable.

This means that half of all nodes will end up on a chain with CACACACA and the others with DBDBDBDBDB. In this case the last irreversible block (local) could not advance for either chain until one of the producers intervened to resolve the issue. The odds of getting stuck must be astronomical given the 10's of millions of blocks produced without deadlock in real life.

It is my belief that with 4 nodes you need 4/4 to avoid deadlock in the byzantine case. You require 1/1, 2/3, 4/5, 5/7, 6/9, 8/11, 9/13, 10/15, ... 14/21... in other words greater than 2/3 and an odd total to avoid deadlock in all cases. If you have an even number then you need more than 75%.

I will take a look at tendermint and see what I can learn. Thanks again for the engaging discussion.

@bytemaster
Copy link

I have reviewed the tendermint algorithm and see that +2/3 are required to advance consensus and that 1/3+ can halt the consensus process. This property appears to be shared between DPOS and TBCA (Tendermint BCA). I would be suspicious of any algorithm that could operate with less than 2/3.

The next thing I see is that there are designated proposers that must propose within a timeout window. In DPOS our timeout window is 3 seconds which accounts for propagation delay and network latency in the majority of cases.

Given that pre-vote and pre-commit messages are propagated on a P2P network we can assume they share similar latency issues with similar timeout requirements.

Correct me if I am wrong, but I would guess that TBCA takes at least 5 seconds to agree on a block on a distributed network?

Overall I would say that the algorithm looks solid and has no immediate faults.

@matthewniemerg
Copy link

I have reviewed the tendermint algorithm and see that +2/3 are required to advance consensus and that 1/3+ can halt the consensus process. This property appears to be shared between DPOS and TBCA (Tendermint BCA). I would be suspicious of any algorithm that could operate with less than 2/3.

It's an immediate consequence (or should be, rather) from Lamport, Shostak, and Pease's paper on the Byzantine Generals Problem that any consensus mechanism would have to require 2/3 majority to advance the process.

@jaekwon
Copy link
Contributor

jaekwon commented Jan 27, 2017

@jaekwon thank you for continuing your efforts, it takes a lot of work to contrive these examples. I only wish that you would contrive an example with an odd number of produces and immutability of 4 out of 5 or 2/3. One thing is certain, it isn't easy tracking all of this manually.

To make it work for an odd number, just do something like this...

Replace each A, B, C, D with sub-signers, like A = (A1,A2,A3), B = (B1,B2,B3), C = (C1,C2,C3), D = (D1,D2,D3,D4). Notice that D has 4 signers in its group, so overall there are an odd number of nodes. And imagine that when A "sees" a block, it means all A1, A2, A3 saw it, and when A "signs" a block, it means A1, A2, A3 signed in order.

B0 -> B1(A) -> B2(B) becomes B0 -> B1(A1) -> B2(A2) -> B3(A3) -> B4(B1) -> B5(B2) -> B6(B3) and so on.

This satisfies your conditions: odd number of producers (13), with immutability greater than 2/3 (actually 9/13).

@liondani
Copy link

@jaekwon consider to make a post on steemit with an introduction to cosmos whitepaper. Many of us will upvote it to say thank you for your above contributions/brainstorming and your project will be well promoted overall.

@matthewniemerg
Copy link

If you read the thread, you will easily find the following tidbit of valuable information ...

I can't sign up because it won't accept my phone number. You can do it for me though!

@bytemaster
Copy link

I have provided a final opinion here: https://steemit.com/blockchain/@dantheman/the-problem-with-byzantine-generals

My conclusion is that the LIB algorithm is similar to the 6 block algorithm of Bitcoin. If you adapted the rule to require 100% witness participation and 21 confirmations then you would never get stuck and would have finality within 63 seconds.

Lastly I have concluded the Cosmos algorithm to be vulnerable to deadlock with 33% failure where as DPOS continues to function with extremely high probability of confirmation (far better than Bitcoin). Likewise, DPOS enables a stakeholder election to replace failed generals and return to 100% production.

Pick your threat: double spend, network partition, malicious minority, etc and I am confident that DPOS provides sufficient guarantees and protections with 2 confirmations and 100% participation.

Meanwhile, attacks that would be annoyances on DPOS would result in complete failure on COSMOS (requiring a hard fork to change the generals).

@faddat
Copy link

faddat commented Mar 9, 2017

Hey here's a thought: Why not simply work on making tendermint chains talk with graphene chains? Having used both, I've got to say that while there are big implementation differences, most of those are due to the language used to implement, and not the actual purpose of the software. Both Tendermint and Graphene help people create fast, permissioned blockchains. Graphene has a well-developed quasi-democratic aspect to it, while tendermint lacks one (though there's some progress in that direction over at github.com/tendermint/basecoin.

I think tendermint is a bit easier to develop against, but this is nothing that is impossible to fix.

@bytemaster I hope you see the fact that TM closely mirrors graphene as a validaton of many of your ideas.
@jaekwon I've no idea where your inspiration comes from for building TM/cosmos (and actually I don't think it's graphene: lack of docs) but I do know you want to launch a Zerg swarm. And really, both of you should want a swarm.

Not entirely sure how one might build this, however. I have some thoughts but they're too nascent to bother sharing at this stage. Fact is, POS chains should work very well with one another. Derp. I had not read this recently enough. It seems that Jae and Dan are already beginning to make love, not war. The newer comments here are just fantastic! Thank both of you!

@shelby3
Copy link

shelby3 commented Nov 2, 2017

Let’s teach everyone about Byzantine fault tolerance and especially the DPoS people need an education:

https://busy.org/blockchain/@anonymint/consortium-blockchains-e-g-dpos-and-tendermint-can-t-internet-scale

https://medium.com/@shelby_78386/most-of-what-you-have-written-has-been-holistically-refuted-in-my-newest-blog-which-i-plan-to-also-747fcb9f3e6b

I’m not coming back for a debate. I’m busy.

@lichuan
Copy link

lichuan commented Jan 26, 2018

Recently, I'm developing blockchain-based project "askcoin", I want adopt DPOS, but discover some consensus problem of it too, so I feel that DPOS lacks security and decentrlized. DPOS doesn't resolve Byzantine problem at all.

@lichuan
Copy link

lichuan commented Jan 26, 2018

It just like traditional server-side programming, not decentralized programming. Besides, DPOS system like Bitshares, Steemit, they depend on central NTP time sync service, so strictly, they are not decentralized blockchain.

@liondani
Copy link

Because time doesn't exist in reality

@lichuan
Copy link

lichuan commented Jan 27, 2018

Real decentralized consensus doesn't need NTP TIME-SYNC service, and it should not. In blockchain area, there is not a real and accurate time tick, so the consensus should not depend on it.

@lichuan
Copy link

lichuan commented Jan 27, 2018

Bitcoin's consensus doestn't need NTP service. As far as I know, bitcoin is the most decentralized blockchain, because it doesn't depend on too much outer world service, and every node acts as the same role, but the DPOS blockchain depend on too much of human activaty and outer service.

@lichuan
Copy link

lichuan commented Jan 27, 2018

Satoshi is the most clever leader in blochain area.

@lichuan
Copy link

lichuan commented Jan 27, 2018

There are so many problems caused by various network conditions, can not be solved by DPOS.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests