TestNet Mempool Strange Behavior #638

Closed
saltyskip opened this Issue Mar 15, 2019 · 21 comments

Comments

5 participants
@saltyskip

saltyskip commented Mar 15, 2019

Over the past couple of weeks I have noticed some strange behavior in transaction processing.

Something seems to be going wrong with transaction propagation. Below are snapshots of the mempools that I took on March 15 at roughly 3:30 PM, listing the average mempool transaction count for the nodes in these three sets:

TestNet NGD Nodes:
ngd1 - 0
ngd2 - 0
ngd3 - 0

TestNet Neo Nodes (running 2.10):
neo5 - 2700
neo2 - 1300
neo3 - 800

TestNet CoZ Nodes:
test1 - 2756
test2 - 2700
test3 - 2700
test4 - 2746

As we can see, there are vast discrepancies between the mempool counts of the various node clusters. I don't know whether all consensus nodes on the testnet have been upgraded to 2.10, but based on my observations of the mempool over the past week there appear to be non-trivial issues.
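To put numbers on the discrepancy, here is a minimal sketch that summarizes the snapshot per cluster. The counts are copied from the lists in this comment; the function names `cluster_summary` and `spread` are illustrative only and not part of any NEO tooling.

```python
from statistics import mean

# Mempool transaction counts copied from the snapshot in this comment.
snapshot = {
    "NGD": {"ngd1": 0, "ngd2": 0, "ngd3": 0},
    "Neo": {"neo5": 2700, "neo2": 1300, "neo3": 800},
    "CoZ": {"test1": 2756, "test2": 2700, "test3": 2700, "test4": 2746},
}


def cluster_summary(clusters):
    """Return {cluster: (min, max, mean)} of mempool sizes (hypothetical helper)."""
    return {
        name: (min(nodes.values()), max(nodes.values()), mean(nodes.values()))
        for name, nodes in clusters.items()
    }


def spread(clusters):
    """Difference between the largest and smallest mempool across all nodes."""
    counts = [n for nodes in clusters.values() for n in nodes.values()]
    return max(counts) - min(counts)


if __name__ == "__main__":
    for name, (low, high, avg) in cluster_summary(snapshot).items():
        print(f"{name}: min={low} max={high} mean={avg}")
    print("spread across all nodes:", spread(snapshot))
```

If the nodes shared one view of pending transactions, the spread would be expected to stay small relative to the counts themselves.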

Transactions are often accepted but then neither confirmed nor propagated to the other mempools.

Does anyone have some insight into what's going on?

@saltyskip

Author

saltyskip commented Mar 15, 2019

Additionally, blocks appear to be accepting transactions in chunks.

image

You can see that several blocks contain only the miner transaction, followed by a burst of transactions in the subsequent blocks. Ideally there would be a steadier stream of confirmed transactions.

But currently I am unsure whose mempool I can trust 😓

@saltyskip

Author

saltyskip commented Mar 15, 2019

Does this relate to #620?

@saltyskip

Author

saltyskip commented Mar 15, 2019

This should also be relevant to the issue. When I submit a transaction to the NGD node, I receive this error:

image

However, submitting the exact same raw transaction to the neo.org node works fine. As mentioned, though, it appears to simply get stuck in the mempool of the neo node and never propagates properly.

@jsolman

Contributor

jsolman commented Mar 15, 2019

Nodes without the SimplePolicy plugin that don’t have the proper fee filters may keep transactions in their MemPool that won’t propagate to other nodes that have the policy set. Also CN nodes that have the policy obviously won’t validate such transactions.
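The effect described here can be illustrated with a toy model. This is only a sketch under loud assumptions: the policy is reduced to a single hypothetical `min_fee` threshold, and the `Node` class is invented for illustration. It is not the SimplePolicy plugin's actual logic or the neo node's relay code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(eq=False)
class Node:
    """Toy node. min_fee=None models a node running without a policy filter."""
    name: str
    min_fee: Optional[float] = None  # hypothetical stand-in for the fee filters
    mempool: Set[str] = field(default_factory=set)
    peers: List["Node"] = field(default_factory=list, repr=False)

    def accepts(self, fee: float) -> bool:
        # A node without a policy accepts everything.
        return self.min_fee is None or fee >= self.min_fee

    def receive(self, txid: str, fee: float) -> bool:
        """Add the tx to the mempool and relay it, unless known or rejected."""
        if txid in self.mempool or not self.accepts(fee):
            return False
        self.mempool.add(txid)
        for peer in self.peers:
            peer.receive(txid, fee)
        return True


if __name__ == "__main__":
    rpc = Node("rpc-without-policy")
    cn = Node("cn-with-policy", min_fee=0.001)
    rpc.peers.append(cn)
    cn.peers.append(rpc)

    print(rpc.receive("tx-free", 0.0))   # accepted by the permissive node
    print("tx-free" in cn.mempool)       # but never reaches the CN's mempool
```

In this model the submitter sees a success from the permissive node, while the transaction stays out of every mempool that enforces the threshold.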

@saltyskip, that unknown error means the transaction violated the policy. I think newer versions return something to that effect.

@saltyskip

Author

saltyskip commented Mar 15, 2019

This seems like rather dangerous behavior, as these nodes can essentially be viewed as malicious. They are happy to accept transactions but refuse to propagate them, and the user never learns why the transaction was not confirmed in a block.

As we can see, this can lead to thousands of unconfirmed transactions with no good recourse. How can we prevent this scenario?

@vncoelho

Member

vncoelho commented Mar 15, 2019

I agree, @saltyskip. I believe that soon all nodes will be updated to 2.10.0+, which will better organize the network. Mainnet CNs will probably follow only around April and Testnet nodes maybe next week; let's monitor the behavior there.

Currently, I think that some nodes limit their number of connected peers, which may create this loop. Maybe in the future we can design something that checks whether the RPC node is receiving CN consensus packages and, if not, switches connected nodes.

@jsolman

Contributor

jsolman commented Mar 15, 2019

> Does this relate to #620?

No, it does not.

@jsolman

Contributor

jsolman commented Mar 15, 2019

> How can we prevent this scenario?

Include the SimplePolicy plugin on your nodes. Don’t create transactions that don’t meet the policy requirements of the CN nodes.

@saltyskip

Author

saltyskip commented Mar 15, 2019

@jsolman I understand that this is the technical solution, but from a practical standpoint I still don't see how this upgrade to 2.10 can go smoothly.

We cannot simply assume that all nodes will upgrade to 2.10 (and include the SimplePolicy plugin) at the same time as the consensus nodes. Transactions have a high probability of getting stuck in limbo if there are nodes on the network that do not meet the policy requirements of the consensus nodes.

This matters especially because 2.10 is expected to fix the stability issues of the NEO network.

Nodes that do not meet the policy requirements of the consensus nodes need to somehow be barred from participating in the network.

Otherwise they will exhibit the malicious behavior of holding on to transactions forever.

Here is the potential feedback loop for a user on the testnet:

  1. Submit a transaction to a malicious node (one that doesn't meet the policy requirements).
  2. Receive a success message.
  3. The transaction is never processed.
  4. Attempt to resubmit the transaction.
  5. Transaction validation fails (it already exists in the mempool of the nodes).
  6. Retry continuously.
  7. Never succeed, because even if the transaction hits a node that does meet the policy requirements, the error may be unclear or the wallet may not support creating a tx that meets the policy requirements.

If we end up in this feedback loop, we will have thousands of unconfirmed transactions and angry users.
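One way a wallet or tool might break out of this loop is to sample several nodes before retrying blindly. The following is a rough sketch, not an existing API: `diagnose` is a hypothetical helper, and how the confirmation flag and the per-node mempool membership are obtained is left to the caller.

```python
from typing import Dict


def diagnose(confirmed: bool, in_mempool: Dict[str, bool]) -> str:
    """Classify a transaction from a sample of nodes (hypothetical helper).

    confirmed   -- whether the tx has appeared in a block
    in_mempool  -- node name -> whether that node holds the tx in its mempool
    """
    if confirmed:
        return "confirmed"
    holders = sorted(name for name, present in in_mempool.items() if present)
    if not holders:
        return "not seen by any sampled node"
    if len(holders) == len(in_mempool):
        return "pending on all sampled nodes"
    # Held by some nodes but not others: the pattern reported in this issue.
    return "likely stuck, held only by: " + ", ".join(holders)


if __name__ == "__main__":
    sample = {"neo5": True, "ngd1": False, "ngd2": False}
    print(diagnose(False, sample))
```

A "likely stuck" result would suggest checking the transaction against the consensus nodes' policy rather than resubmitting the same raw transaction.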

@saltyskip

Author

saltyskip commented Mar 15, 2019

I myself have been stuck in this feedback loop this week, and it is incredibly frustrating. I can only imagine what happens if your average user runs into it.

@vncoelho

Member

vncoelho commented Mar 15, 2019

You are right, Andrei. I believe that the transition will be smooth now because, in general, 2.10.0 will not handle stability. It is more focused on the dBFT safety and security.

The SimplePolicy filters were recently improved with the message that we discussed in the NeoCompiler issue (NeoResearch/neocompiler-eco#45).

But I think there should be a way to ensure that RPC nodes are connected to CNs. This might be achievable if each RPC node makes sure that CN payloads are reaching it (within a delay limit).
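As a rough illustration of that idea, here is a minimal watchdog sketch. Everything in it is an assumption made for illustration: the class name `ConsensusWatchdog`, the `max_silence` threshold, and the idea that the node calls `on_consensus_payload` whenever a CN payload arrives. None of this exists in the neo codebase as far as this thread shows.

```python
class ConsensusWatchdog:
    """Tracks how long it has been since a CN payload last arrived (hypothetical)."""

    def __init__(self, max_silence: float, started_at: float) -> None:
        self.max_silence = max_silence  # allowed delay in seconds (assumed value)
        self.last_seen = started_at     # start the clock when the node starts

    def on_consensus_payload(self, now: float) -> None:
        # Record the most recent arrival time; ignore out-of-order timestamps.
        self.last_seen = max(self.last_seen, now)

    def is_stale(self, now: float) -> bool:
        """True when no CN payload has arrived within the allowed delay."""
        return now - self.last_seen > self.max_silence


if __name__ == "__main__":
    import time

    watchdog = ConsensusWatchdog(max_silence=30.0, started_at=time.monotonic())
    watchdog.on_consensus_payload(time.monotonic())
    if watchdog.is_stale(time.monotonic()):
        print("no CN payloads recently: consider switching peers")
    else:
        print("CN payloads are arriving")
```

Passing timestamps in explicitly keeps the check easy to test; a real node would have to decide what "switching connected nodes" means in practice.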

@jsolman

Contributor

jsolman commented Mar 15, 2019

@saltyskip The CN nodes of mainnet already enforce the policy. As people upgrade to 2.10.0 with the plugin the problem can only get better, not worse than now, right?

@jsolman

Contributor

jsolman commented Mar 15, 2019

I think I did mention somewhere else that it might actually be better if the SimplePolicy plugin were distributed with the core. The policy should probably be agreed on by voting and sent through the network in the protocol, so that all nodes can enforce it in a way that ensures they use the same policy. I think this was also mentioned a long time ago, when the policy changes were first made.

@vncoelho

Member

vncoelho commented Mar 15, 2019

@jsolman, I partially agree. I think that the current voting system is already doing that implicitly, but it could be improved with an explicit declaration of its policy.

The CNs have the ability to propose the fees they want; if a given CN wants to enforce a SimplePolicy, it will try to push other nodes to accept it. Intrinsically, the voting system handles that, because NEO holders choose the CNs whose SimplePolicy they agree with.

Seed nodes should follow the CN SimplePolicy.
Something that might be useful is to query SP proofs from the CNs.

@jsolman

Contributor

jsolman commented Mar 15, 2019

@vncoelho I agree, the current voting is fine for the CN nodes. We just need a way for them all to sign something agreeing on what the policy should be, so that other nodes can query the policy, see that it is valid, and use it.
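A sketch of what such a check could look like on a querying node follows. It rests on several assumptions that are not settled in this thread: the policy is serialized as canonical JSON, the signature check is supplied by the caller as a `verify` callable, and the required number of signatures follows the dBFT-style threshold N - (N - 1) / 3. The function names are hypothetical.

```python
import hashlib
import json
from typing import Callable, Dict, List


def policy_digest(policy: dict) -> bytes:
    """Hash a canonical JSON encoding so every node hashes identical bytes."""
    encoded = json.dumps(policy, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).digest()


def quorum(validator_count: int) -> int:
    """Assumed threshold: N - (N - 1) // 3, e.g. 5 of 7 or 3 of 4."""
    return validator_count - (validator_count - 1) // 3


def policy_is_endorsed(
    policy: dict,
    signatures: Dict[str, bytes],
    validators: List[str],
    verify: Callable[[str, bytes, bytes], bool],
) -> bool:
    """True if enough known validators signed this exact policy.

    verify(validator, digest, signature) is a caller-supplied check; a real
    node would verify a public-key signature here.
    """
    digest = policy_digest(policy)
    valid = sum(
        1
        for validator in validators
        if validator in signatures and verify(validator, digest, signatures[validator])
    )
    return valid >= quorum(len(validators))
```

Signatures from unknown signers are ignored because only names in `validators` are counted, and any change to the policy changes the digest, which invalidates the existing signatures.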

@vncoelho

Member

vncoelho commented Mar 15, 2019

Exactly. Do you think that it needs a NEP?

@vncoelho

Member

vncoelho commented Mar 15, 2019

@shargon, we are thinking about a proposal for the CNs to sign and publish their SimplePolicy. What do you think?

@shargon

Member

shargon commented Mar 15, 2019

I think the best way is to store certain information (such as the SimplePolicy) in a native smart contract (stored in the blockchain). With this, they can vote for shared configurations.

@vncoelho

Member

vncoelho commented Mar 15, 2019

Sounds good.

@saltyskip

Author

saltyskip commented Mar 25, 2019

I'm OK with this being closed, as I now understand the root issues here. If there is auxiliary conversation about detecting consensus node plugins, perhaps it should be moved to a separate, more targeted discussion.

@vncoelho

Member

vncoelho commented Apr 9, 2019

@saltyskip, as you suggested, let's close this one.

Just one last thing: check PR #410.
Maybe these alerts will be enough for communicating the SimplePolicy; if not, we could design the NEP and a public contract for publishing this info.

@vncoelho vncoelho closed this Apr 9, 2019
