
Getting rid of PoW (suggestion) #1064

Open
alandefreitas opened this issue Aug 14, 2018 · 7 comments
Labels
major This item indicates the need for or supplies a major or notable change question

Comments

@alandefreitas

Now that representatives will vote with musig (and only then the node will relay the final transaction with the vote aggregation to the network) couldn't we just get rid of PoW? Other DPoS currencies have done it.

Not trivially, of course:

A preliminary idea is that representatives would attribute a timestamp t for the hash and sign both. This would become the official and final timestamp for the transaction. After that, spam could be filtered in one of two ways:

  • Static: If the user initiates a new transaction before t+s, the transaction is not signed by the representative. s could be set to roughly the time PoW currently takes.
  • Dynamic: A user can initiate a transaction at any time. Representatives keep a list of unprocessed transactions in a Fibonacci heap (Boost implements one as a container), where the priority criterion is the distance from the last timestamp. Spam would simply sink to the end of the queue.

The dynamic approach would not only allow transactions to go as fast as the representatives can process them, it would also give a clear advantage to honest users, whose transactions would be processed almost instantly.
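The dynamic idea can be sketched in a few lines. This is a minimal illustration only: Python's binary `heapq` stands in for the Boost `fibonacci_heap` mentioned above, and all class and method names are made up for the example, not Nano's API.

```python
import heapq

class DynamicSpamFilter:
    """Sketch of the 'dynamic' idea: pending transactions are ordered by the
    time elapsed since the sender's last confirmed timestamp, so honest,
    infrequent senders jump the queue and rapid-fire spam sinks to the end."""

    def __init__(self):
        self._heap = []        # entries: (-elapsed, seq, account, tx)
        self._last_seen = {}   # account -> timestamp of last confirmed tx
        self._seq = 0          # insertion counter to break priority ties

    def submit(self, account, tx, now):
        elapsed = now - self._last_seen.get(account, 0.0)
        # Negate elapsed so the largest gap (most "honest-looking") pops first.
        heapq.heappush(self._heap, (-elapsed, self._seq, account, tx))
        self._seq += 1

    def pop_next(self, now):
        _, _, account, tx = heapq.heappop(self._heap)
        self._last_seen[account] = now   # record the official timestamp
        return account, tx
```

A spammer who just transacted has a tiny elapsed time and is processed after an honest account that has been quiet, regardless of submission order.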

The obvious attacks are:

  • A spammer could use multiple accounts:
    • Note that a user can use multiple accounts, but cannot gain an advantage by pre-computing anything, because what counts is the real timestamp. The problem of using multiple accounts to get around the timestamps can be addressed by adding (i) the value of the output, (ii) the history of the output, and (iii) the IP of the node initiating the transaction to the priority criteria of the Fibonacci heap. Over time, we could in principle add other useful criteria. It seems to me the best future criterion would be a Bayesian estimate of the probability that something is spam: all kinds of data could feed the estimate, and the cost of evaluating it is constant.
  • A simple DoS attack with many invalid transactions to take down the representative or make the heap bigger than the representative's capacity:
    • This problem is no different from what it would be if the unprocessed transactions were kept in a vector or any other data structure. In any case, we need logical cut-offs that can be evaluated in constant time, such as (i) accounts should only have one transaction on the list, and (ii) timestamp deltas that are very different from the network capacity could be discarded immediately as outliers.
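The constant-time cut-offs described above might look like the following. This is a sketch under stated assumptions: the function name, the transaction shape, and the `outlier_factor` threshold are all illustrative, and rule (ii) is interpreted here as rejecting deltas far outside a plausible range.

```python
def passes_cutoffs(tx, pending_accounts, elapsed, network_interval,
                   outlier_factor=1000.0):
    """Constant-time sanity checks run before a transaction is allowed into
    the heap at all: (i) one pending transaction per account, and
    (ii) timestamp deltas wildly out of line with network capacity are
    discarded as outliers. All O(1) work."""
    if tx["account"] in pending_accounts:   # rule (i): O(1) set lookup
        return False
    if elapsed < 0:                          # timestamp claims to be in the future
        return False
    if elapsed > network_interval * outlier_factor:  # rule (ii): extreme delta
        return False
    return True
```

Because every check is a hash lookup or a comparison, an attacker flooding the node with junk pays far more per message than the node pays to reject it.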

Regarding the capacity of representatives, they can also increase it by validating transactions in parallel according to the transaction hash. Representative k can be online at IPk1, IPk2, ..., IPkn. If a transaction is a preimage of the hash h such that h%n==i, that transaction is validated by IPk(i+1). The user can send the transaction directly to IPk(i+1). If the transaction goes to another IP such that h%n!=i, the representative forwards it to the proper IPk(i+1) and it is not included in that node's heap. Any representative can thus validate n times more transactions by setting up more nodes with the same private key.
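The h%n routing above is a one-liner in practice. A minimal sketch, assuming SHA-256 as a stand-in for Nano's actual block hash (the function name is illustrative):

```python
import hashlib

def shard_for(tx_bytes, n_endpoints):
    """Route a transaction to one of a representative's n endpoints by its
    hash: endpoint i (i.e. IPk(i+1)) handles transactions whose hash h
    satisfies h % n == i."""
    h = int.from_bytes(hashlib.sha256(tx_bytes).digest(), "big")
    return h % n_endpoints

# An endpoint that receives a mis-routed transaction forwards it instead of
# heaping it, so each of the n endpoints validates roughly 1/n of the traffic.
```

Since the routing depends only on the transaction hash, clients and endpoints compute the same shard independently, with no coordination needed.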

This would move NANO closer to how the internet (and e-mail) works. Undesirable transactions (like spam) would be deprioritized or discarded according to their properties, rather than imposing a cost on all honest users.

@kayabaNerve

Your static implementation kills off mass-transfers, like what services/exchanges would use. Your dynamic idea seems good except it encourages, and is easily defeated by, mass accounts.

You then cover defeating mass accounts via the limits-by-IP idea (which I like, except you only know the IP of the relayer) and via account linkage (which could penalize innocent people who just happen to transact frequently with a malicious actor). That said, I think it's a great idea.

I do want to note that it is harder to mount a DoS attack when you must pre-generate all the spam, which can take hours and fixes a time at which the attack will end.

I personally would not expect to see this in Nano soon; it seems more like something marked for implementation years down the line.

Finally, for your analogy: e-mail has centralized spam filters, and the internet uses paid gateways called ISPs. ;p

@alandefreitas

@kayabaNerve, thank you so much for your feedback. I've been thinking about something like this since I started studying how other DPoS currencies work. The idea is still very rudimentary, but I think it would be very useful to light clients if we came up with something smart. Some questions:

  • The static idea is really just an intermediary step towards the dynamic idea, and I don't really know how exchanges operate. Would the static idea be a problem for them because they send all transactions from the same address? The static idea doesn't involve IPs yet.
  • Anyway, the dynamic idea is what concerns me the most in the long run. When you say it would encourage mass accounts, do you mean the first idea of using the distance from the last transaction as a sorting criterion, the second idea of using Bayesian estimates, or both? I may not have made this clear, but a Bayesian estimate would not have cut-off limits by IP. It would be a Bayesian aggregate of the probability of a transaction being spam, adding up various criteria, just like old-school spam filtering. A first Bayesian estimate would position transactions in the Fibonacci heap (O(1)) and a second estimate could be used to discard very extreme outliers. There would be no limit by IP per se.
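The Bayesian aggregate described above is essentially old-school naive-Bayes spam scoring: each criterion contributes a likelihood ratio, and log-odds add up in constant time per feature. A minimal sketch; the feature names and probabilities below are made-up illustrations, not proposed protocol values.

```python
import math

def spam_probability(features, likelihoods, prior=0.5):
    """Naive-Bayes aggregation: each criterion maps its observed value to
    (P(value|spam), P(value|ham)); the log-odds of all criteria are summed
    and converted back to a probability. O(1) per feature."""
    log_odds = math.log(prior / (1.0 - prior))
    for name, value in features.items():
        p_spam, p_ham = likelihoods[name](value)
        log_odds += math.log(p_spam / p_ham)
    return 1.0 / (1.0 + math.exp(-log_odds))  # sigmoid back to probability

# Illustrative criteria: dust-sized outputs and rapid resends of the same
# output are each assumed more likely under spam than under honest traffic.
LIKELIHOODS = {
    "tiny_output": lambda v: (0.9, 0.1) if v else (0.1, 0.9),
    "rapid_resend": lambda v: (0.8, 0.2) if v else (0.2, 0.8),
}
```

The resulting probability could serve both purposes mentioned above: positioning a transaction in the heap, and (with a second, stricter threshold) discarding extreme outliers.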

When you say it is harder to mount a DoS attack when you must pre-generate all the spam, I do agree. What concerns me, more than whether a heap is enough to eliminate PoW, is what we can do to defend ourselves when attackers do pre-generate the spam. I understand there are two kinds of DoS spam:

  1. The first one is having many transactions that wouldn't be valid anyway.
    • This doesn't take PoW, but it can still bring down a (naive) node, which has to handle the incoming messages and run the hashing function.
    • This works with or without valid proof-of-work.
    • A solution is probably to disconnect the offending client for a while, as many servers do.
  2. The second one is DoS with PoW.
    • This is more problematic because the transactions are valid and the node has to do something about it.
    • Another problem with these attacks is that the cost of processing these transactions propagates to the network, and they do look like valid transactions.
    • A third problem is that any threshold without a Bayesian estimate would be arbitrary and open the node to exploitation because the transactions are, in fact, valid according to the protocol.
    • As I understand, nodes do nothing at the moment regarding pre-processed PoW, which is dangerous.
    • The economic incentive is useful but it's still dangerous. Many people have resources to generate enough PoW in advance for such an attack. One with access to a university computer or a number of hacked devices could generate PoW for months for "free" by externalizing the costs.
    • A Fibonacci heap of that kind could be implemented even with PoW as the second line of defense.
    • This would be mitigated by sending the transactions to the end of the heap (I'm ignoring the issue of finding out the best criteria for now)

From the perspective of the attacker, the first attack might just be cheaper than the second one because he could use many hacked devices. Which attack is cheaper will depend on contingent factors, such as the number of devices available, time, etc. That kind of heap could be a defense against the second attack with pre-generated PoW, and a partial defense against the first kind (too many invalid transactions). Maybe the required PoW could be reduced little by little if the heap itself allows the representatives to handle more transactions.

I know Nano wouldn't implement this anytime soon, but having the discussion seems worth it to me. I think some strategy like this could provide (i) extra security against pre-processed spam attacks even with PoW, and (ii) a better experience for light clients (if it's ever useful to reduce PoW difficulty). Thanks again for the feedback.

PS: I understand the disanalogy. I guess I have to stop referring to analogies because for every analogy there's a disanalogy. It seems like it always makes it harder to transmit the original idea. :P

@kayabaNerve

Both discourage mass transfers; the Bayesian idea, which I do like, is just more likely to not harm legit services like payment processors/exchanges. That said, since Nano doesn't have fees, it's important to remember I can bounce back 100 Nano infinitely (meaning the filter for spam would have to be very carefully crafted with balance across the various factors).

The first is just a DoS attack and not directly related to Nano. The second, however, is directly related to Nano, and is the trickier one. Unfortunately, I don't have enough knowledge to continue this. :P

@alandefreitas

I still don't understand. Which kind of mass transfers are important? Are there any kinds apart from exchanges? Why would these be discouraged?

How would you bounce back 100 Nano infinitely if nodes have a Bayesian estimate that penalizes according to output history? Unless I'm missing something, the velocity of the output/money would make these transactions very slow, so one wouldn't be able to do that. Exchanges could still operate because they work on different outputs rather than just sending the money back and forth.

I agree that the first DoS attack is not directly related to Nano. I just wanted to make the distinction clear because it seems like people are conflating DoS and spam all the time in this community.

It just seems to me like this is what other DPoS currencies are doing and I don't see many problems so far. Thanks for the feedback again.

@kayabaNerve

Mass withdrawals/payouts. They're discouraged because the first system (based on time since last TX) could treat a legit service's TXs worse than spam TXs.

The Bayesian estimate stops that if you have the correct criteria; that was my statement. It just needs to be set up properly.

@rkeene

rkeene commented Aug 31, 2018

See also Issue #506

@alandefreitas

For the record, when I say "Other DPoS currencies have done it", I mean EOS, BitShares, and Steem. They're based on Graphene and allocate bandwidth according to the "bandwidth model".

The bandwidth model guarantees a certain bandwidth to users. This is proportional to their stake. New users can use spare bandwidth, which is usually available. There are also some other details, like the possibility of renting bandwidth but that's not very relevant in our context.

My suggestion is a little different because it wouldn't explicitly depend on stakes, although it could also include stakes in the equation. The Bayesian estimate would primarily include things such as (i) the value of the output, and (ii) output history. Output history, in economic terms, is the equivalent of the velocity of money. Other criteria could be included to represent priorities of the community, such as (i) IP addresses, (ii) voting power, (iii) previous timestamp, (iv) you name it. We would have to be careful about these extra criteria because they involve trade-offs: for instance, IP addresses could be a problem for exchanges, while voting power could be a good extra incentive for people to run nodes.

Finally, note that this is only a solution to avoid a possible problem. These criteria would only effectively apply when the network is at its limit, so it's not as if choosing transactions from users X over transactions from users Y would benefit users X all the time.
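The output-history criterion mentioned above (velocity of money) could be measured very cheaply, for example as transfers of an output within a recent time window. A minimal sketch; the function name, window, and units are illustrative assumptions only.

```python
def output_velocity(timestamps, window=3600.0, now=None):
    """Illustrative 'velocity of money' feature: number of times an output
    moved within the recent window, per second of window. A single 100 Nano
    balance bounced back and forth scores high; an exchange paying out many
    distinct outputs scores low on each individual output."""
    if now is None:
        now = max(timestamps)
    recent = sum(1 for t in timestamps if now - t <= window)
    return recent / window
```

Such a number could then feed straight into the Bayesian estimate as one more criterion, penalizing bounce-back spam without touching ordinary payout traffic.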

@rkeene rkeene added question major This item indicates the need for or supplies a major or notable change labels Sep 10, 2018
@zhyatt zhyatt added this to the Research for Future Release milestone Dec 27, 2018