Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Too many newcomer blocks without transactions #1827

Closed
girazoki opened this issue Jan 18, 2021 · 10 comments
Closed

Too many newcomer blocks without transactions #1827

girazoki opened this issue Jan 18, 2021 · 10 comments
Labels
discussion 💬 Does not need implementing, only discussing

Comments

@girazoki
Copy link
Contributor

girazoki commented Jan 18, 2021

Recently there have been many blocks in the network that do not include any TX apart from the tally and the mint transactions (mandatory). After investigating these block producers, we have observed that all of them come from newcomers. The proportion of blocks coming from newcomers is bigger than we expected, so after reviewing our equations we have found that the probability for newcomers is wrongly calculated in accordance to our DoS protections. Currently, the eligibility probability is being calculated utilizing the active nodes as a denominator, but then only those blocks produced by reputed nodes are given priority over the rest. With current numbers this yields:

Active Nodes: ~6000

Reputed Nodes: ~2000

RF=3

P(no Reputed node produces block) = P(one reputed node does not produce block)^(number of reputed nodes) = (1 - (3/6000))^2000 = 0.37

In other words, currently there is a 37% chance that a newcomer can insert its node in the ARS, and thus also censor TXs by creating empty blocks. This is far from the numbers we analyzed, and in accordance the following preemptive measures have been discussed:

  • Utilize as a denominator for the eligibility the Reputed number of nodes, instead of the ARS length. This would increases the probability for reputed nodes to become eligible, and thus would make it more difficult for an outsider to censor TXs. The only caveat is that it increases the number of candidates.

  • Utilize as a tie-breaker the fact that a node IS IN THE ARS, instead of a node having reputation. This also decreases the number of odds for an outsider to censor TXs, as usually the ARS is larger than the number of reputed nodes.

  • Increase the replication factor while removing the back up factor. We are not even near of being epochs without block candidates, so the back up factor removal is something we should start considering. Further, an incerase on the replication factor gives more probability for the reputed/ARS nodes to find a valid candidate and thus prevent an outisder from continously proposing blocks.

  • Take into account the block weight in terms of data requests, giving of course a lower priority to empty candidates.

This should follow a proper discussion, and most likely yield a hard fork. I see option 1 and option 2 as exclusive, while 3 and 4can be added on top any of the first two

Resources: https://github.com/witnet/research/blob/master/reputation/docs/randpoe.md

@girazoki girazoki changed the title Too many empty newcomer blocks without transactions Too many newcomer blocks without transactions Jan 18, 2021
@drcpu-github
Copy link
Collaborator

1 and 2 seem pretty reasonable to me. However, we do need to be careful that we do not again introduce a balance as we saw in one of the testnets where almost all blocks and data requests are minted / solved by a very limited ARS. If newcomers have too small of a chance to mint blocks it could happen that it is not economically viable to start nodes when the network has been running for a longer period of time. In the long term, this will lead to serious centralization. Obviously a 37% chance for a new node to get its block elected is too high, but I would think something along the lines of 5% is reasonable?

As a question to point 4, does this not introduce a gameable weakness into the network? If you are eligible to mint a block and there are no DR's in the pool, you can quickly create one, include it in your proposed block and get priority over other miners. Obviously if multiple miners do this, the net result is no different than before, but it does give an advantage to a subset of miners.

@lrubiorod
Copy link
Contributor

lrubiorod commented Jan 19, 2021

For the 4 point I would use some measures related with RevealTransactions due that nobody can "invent" a new one. The point is that some miners could not broadcast some Reveals to increase their mining possibilities, but they would risk to be slashed in case of not be selected to create a block (something that it always happens if you do not control a big percentage of the network). So... maybe use the number of Reveals could be usefull, or at least something binary as "does the block contain reveals?"

@tmpolaczyk
Copy link
Contributor

Regarding point 4, data requests already have a commit_and_reveal_fee to incentivize miners to include the commits and reveals, but as we see it's not working. So maybe users need to pay higher fees? Otherwise if we give more priority to blocks that contain more commits and reveals, then why do we need a commit_and_reveal_fee?

Regarding possible attacks abusing this rule, an attacker needs to keep at least one "secret" reveal in order to have an advantage, so that other nodes see N reveals but the attackers' nodes see N+1. But as soon as they create a block that includes those "secret" reveals, the other nodes can include them in the mempool and create a block with N+1 reveals as well. So this would likely result in a race where the last node to send its secret reveal wins, similar to a last-revealer attack. But since the network is already very sensitive to blocks that arrive later than expected, because they cause forks, we should try to avoid incentivizing this behavior.

@drcpu-github
Copy link
Collaborator

drcpu-github commented Jan 19, 2021

@lrubiorod Okay, I interpreted the solution as checking for a data request transaction in a block, but I'd guess that using reveals is a safer solution as you cannot really "fake" those.

@tmpolaczyk I don't think the issue is necessarily with the fee not being high enough (I don't believe this is fee based censoring). I'd guess this is just because some nodes are not receiving the transactions fast enough leading to proposing empty blocks. If empty blocks are dropped in favor of more populated blocks, that solves the issue, I'd say.

@tmpolaczyk
Copy link
Contributor

@drcpu-github interesting, if it's only a networking issue then maybe we can fix it, I'm sure there is room for improvement there. But I would only expect this issue to affect commit transactions because they expire after 1 epoch. Reveal transactions only expire when their data request is resolved, so there should be more time for them to reach all the nodes. Maybe there is a bug in the code because it looks like we clear all the reveal transactions after consolidating one block, here:

// Remove reveals because they expire every consolidated block
transactions_pool.clear_reveals();

That comment from the code is not true, but maybe we needed to clear them because of some validation or something, what do you think @lrubiorod ?

@lrubiorod
Copy link
Contributor

It is true, if we handle the reveals cleaning in another way different to in each block consolidation, nodes with limited resources that would be failing including commits and reveals (assuming that it is not an attacker), they should more time to process and include them. The case is that now, the nodes that don't see their reveals, they will broadcast them again, so we should manage also this. In that case, the point 4 wouldnt be a hard fork change

@drcpu-github
Copy link
Collaborator

I can obviously not be sure that this is not someone censoring the network. However, based on the distribution of commit and reveal rounds from the small data request profile I did yesterday it seems more likely the root cause is transactions propagating slower through the network (otherwise someone is controlling a lot of nodes):
commit_rounds
One other reason why I believe it's the transaction propagation speed is because I saw 'erroneous' blocks from multiple different miners (based on the fact they employed different mint-splits) and I find it slightly unlikely multiple big miners are attacking the network at the same time with the same type of attack.
Giving nodes more time to include reveal transactions sounds like a reasonable approach (though I am not sure how widespread the implications for this are).

@lrubiorod
Copy link
Contributor

Telegram thoughts:

I suppose that when the ARS is lower (~5K), the probability to create a valid block is lower so miners which control several nodes with reputation will earn mining percentage. And when ARS is higher, it is more difficult for those miners, so newcomers would go to the system with maybe not so reliable configurations, so it could produce those rollbacks...

The existence of empty blocks, produce several slashed nodes, so it also produce delays for the data requests and reduce the number of identities with reputation, that it doesnt good enough for the network, unless some miners get more mining percentage and there are less rollbacks. The Witnet functionality is to solve requests, so whatever affects to this functionality it is a problem in my opinion.

I agree with gunray, that some days before the planned hard fork, this strange behavior starts. So it is possible that one or some miners start to put more nodes in the system, and their configuration could be not enough to manage them, and it can not handle the commits, and reveals transactions, and finally it creates empty blocks. Or the other possibility, that I think is less probable, it is an attacker that try to deny the data requests in the network.

Anyway, we detected that the probability of newcomer is higher than expected, so we are working in adjust that probabilities, to allow that the network could work properly with a small or big ARS, and it could reduce the quantity of empty blocks and rollbacks in a future

@mariocao mariocao added the discussion 💬 Does not need implementing, only discussing label Feb 8, 2021
@mariocao mariocao removed this from the Sprint #7 milestone Feb 8, 2021
@mariocao
Copy link
Contributor

There is a published WIP with a proposal to mitigate the problems identified in this discussion: https://github.com/witnet/WIPs/blob/master/wip-0009.md

@tmpolaczyk
Copy link
Contributor

WIP-0009 is now active in mainnet, solving this problem

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion 💬 Does not need implementing, only discussing
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants