Transaction Pool Worker #3626

Closed
faustbrian opened this issue Mar 30, 2020 · 0 comments

An initial explanation of the issue and idea, based on input from @supaiku0. This will most likely need refinement as implementation progresses.


A few months ago @supaiku0 worked on a proof of concept:
https://github.com/ArkEcosystem/core/tree/wip/core-transaction-pool/worker

It is nowhere near production-ready, but works in principle. Currently, the POST /transactions endpoint can easily be abused to cause high load on nodes, because it validates all transactions on the main thread, which causes CPU spikes. Since the signature check is quite heavy, this can even be triggered by broadcasting invalid transactions targeted at specific nodes. This is why node operators are advised to secure their core-api access, or to disable it entirely if it runs in front of a forger.

The p2p endpoint already received a workaround that gives the main thread room to breathe: https://github.com/ArkEcosystem/core/pull/2848/files

So the problem now mainly manifests when using the core-api endpoint. Ideally, however, this workaround is replaced with a more generic solution that also covers core-api. This is where the pool worker comes in.

The flow right now is:
POST /transactions -> create new Processor instance -> validate -> addTransactionsToPool
-> return response (accepted, ignored, excess, error)
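
For illustration, here is a rough sketch of how that synchronous flow behaves. The names and types (verifySignature, addTransactionToPool, ProcessorResult) are stand-ins, not the actual core-transaction-pool API:

```ts
// Rough sketch of the current synchronous flow; names and types are
// illustrative, not the actual core-transaction-pool API.
interface TransactionData {
    id: string;
    signature: string;
}

interface ProcessorResult {
    accepted: string[];
    invalid: string[];
    errors: Record<string, string>;
}

// Stand-ins for the real core functions.
async function verifySignature(tx: TransactionData): Promise<boolean> {
    return tx.signature.length > 0; // placeholder for the expensive check
}
async function addTransactionToPool(tx: TransactionData): Promise<void> {}

async function postTransactions(payload: TransactionData[]): Promise<ProcessorResult> {
    const result: ProcessorResult = { accepted: [], invalid: [], errors: {} };

    for (const tx of payload) {
        // The heavy signature check runs on the main thread, so the request
        // blocks (and the CPU spikes) until every transaction is validated.
        if (await verifySignature(tx)) {
            await addTransactionToPool(tx);
            result.accepted.push(tx.id);
        } else {
            result.invalid.push(tx.id);
            result.errors[tx.id] = "Signature verification failed";
        }
    }

    return result; // response only goes out after all the work is done
}
```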

With the pool worker it would change to something like this:
POST /transactions -> enqueue transactions, which creates a job -> return response (ticketId, e.g. a sequentially increasing number)

The ticketId represents a job that is either in the queue (about to be sent to the worker), being processed (somewhere in the worker), or done (returned from the worker).

End users/clients can query the status by hitting an endpoint of the peer they broadcast to, using the ticket id; the response would resemble what the endpoint returns today.
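
A minimal sketch of what the ticket bookkeeping could look like; enqueueTransactions, getTicketStatus and the status route are hypothetical names, not a final API:

```ts
// Minimal sketch of the ticket bookkeeping; names and routes are hypothetical.
type JobState = "queued" | "processing" | "done";

interface Job {
    ticketId: number;
    state: JobState;
    transactions: object[];
    result?: { accepted: string[]; invalid: string[]; errors: Record<string, string> };
}

let nextTicketId = 0;
const jobs = new Map<number, Job>();

// POST /transactions: no validation here, just enqueue and hand back a ticket.
function enqueueTransactions(transactions: object[]): { ticketId: number } {
    const ticketId = nextTicketId++; // sequentially increasing
    jobs.set(ticketId, { ticketId, state: "queued", transactions });
    return { ticketId };
}

// Hypothetical status route, e.g. GET /transactions/status/{ticketId}:
// once the job is "done", the stored result resembles today's response.
function getTicketStatus(ticketId: number): Job | undefined {
    return jobs.get(ticketId);
}

// Usage:
const { ticketId } = enqueueTransactions([{ id: "abc" }]);
console.log(getTicketStatus(ticketId)?.state); // "queued"
```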

This ticket-based flow is a significant change API-wise and breaks all kinds of client software, which is why the worker has been postponed to 3.0.


Queued jobs are then pushed to the worker, which does all the heavy lifting and, once done, reports back to the main thread, which will add valid transactions to the transaction pool.
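
As a hedged sketch of that main-thread/worker split using Node's worker_threads (the PoC linked above may be structured quite differently):

```ts
// Sketch of the main-thread/worker split with Node's worker_threads.
// In core the worker would live in its own compiled file; a single file with
// isMainThread is used here only to keep the example self-contained.
import { Worker, isMainThread, parentPort } from "worker_threads";

interface PoolJob {
    ticketId: number;
    transactions: { id: string; signature: string }[];
}

interface WorkerReport {
    ticketId: number;
    valid: string[];
    invalid: string[];
}

if (isMainThread) {
    const worker = new Worker(__filename);

    // The worker reports back once it has done the heavy lifting; only the
    // cheap bookkeeping (add to pool, mark ticket done) stays on this thread.
    worker.on("message", (report: WorkerReport) => {
        console.log(`ticket ${report.ticketId} done`, report);
        // here: add report.valid to the transaction pool, mark the ticket as done
    });

    // Push a queued job to the worker.
    worker.postMessage({ ticketId: 1, transactions: [{ id: "a", signature: "sig" }] });
} else {
    // Worker thread: the expensive validation happens here, off the main thread.
    parentPort!.on("message", (job: PoolJob) => {
        const report: WorkerReport = { ticketId: job.ticketId, valid: [], invalid: [] };

        for (const tx of job.transactions) {
            // Placeholder for the real signature verification.
            (tx.signature.length > 0 ? report.valid : report.invalid).push(tx.id);
        }

        parentPort!.postMessage(report);
    });
}
```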


A nice benefit of this approach is that it also makes the frequency at which a node rebroadcasts transactions to other peers more deterministic. Right now, nodes rebroadcast whenever they finish validating the current batch of transactions (i.e. 1 request -> 1 broadcast), while a worker could report its finished jobs only every 100 ms, say, so that a well-behaved node would rebroadcast at most 10 times per second. This greatly reduces the snowball effect that can currently be observed when the network is flooded with many transactions. Also, the rate limit on the endpoint can then be properly calibrated.
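
A quick sketch of that batched rebroadcast idea; the 100 ms interval comes from the paragraph above, and broadcastToPeers is a hypothetical stand-in for the real p2p call:

```ts
// Sketch of batching rebroadcasts on a fixed interval; broadcastToPeers is a
// hypothetical stand-in for the real p2p broadcast.
const finishedSinceLastFlush: object[] = [];

// Called by the main thread whenever the worker reports a finished job.
function onWorkerReport(validTransactions: object[]): void {
    finishedSinceLastFlush.push(...validTransactions);
}

// Instead of rebroadcasting after every request (1 request -> 1 broadcast),
// flush at most every 100 ms, capping a well-behaved node at ~10 broadcasts/s.
setInterval(() => {
    if (finishedSinceLastFlush.length === 0) {
        return;
    }

    broadcastToPeers(finishedSinceLastFlush.splice(0));
}, 100);

function broadcastToPeers(transactions: object[]): void {
    // placeholder: send the batch to peers via the p2p layer
}
```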
