Make better use of multi-core servers #1079

Open
pmconrad opened this Issue Jun 21, 2018 · 7 comments

Comments

@pmconrad

pmconrad commented Jun 21, 2018

User Story
As a node operator I want to make better use of my server hardware so that my server can process more transactions per second.

Impacts
Describe which portion(s) of BitShares Core may be impacted by your request. Please tick at least one box.

  • API (the application programming interface)
  • Build (the build process or something prior to compiled code)
  • CLI (the command line wallet)
  • Deployment (the deployment process after building such as Docker, Travis, etc.)
  • DEX (the Decentralized EXchange, market engine, etc.)
  • P2P (the peer-to-peer network for transaction/block propagation)
  • Performance (system or user efficiency, etc.)
  • Protocol (the blockchain logic, consensus, validation, etc.)
  • Security (the security of system or user data, etc.)
  • UX (the User Experience)
  • Other (please add below)

Additional Context (optional)

@clockworkgr discovered this attempt to parallelize signature verification: neuronchain/neuronchain@3b545d2

I think a better approach would be to extract the crypto calculations from the database thread out into the network layers, i.e. handle the signature->pubkey conversion in the P2P and API layers. Keep in mind that a witness will apply a given transaction three times when signing a block:

  • when the TX is received before it's included in a block
  • when building the block
  • when applying the newly generated block

The above-mentioned patch doesn't help with steps 1 and 2.
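A minimal sketch of what I mean, using stand-in types and a hypothetical `recover_pubkey` helper rather than the actual Core API: the network layer recovers the signer keys once, and the database thread only consumes the cached result for all three application steps.

```cpp
// Sketch only -- all type and function names here are hypothetical stand-ins.
#include <string>
#include <vector>

using signature  = std::string;   // stand-in for the real compact signature type
using public_key = std::string;   // stand-in for the real public key type

struct transaction {
   std::vector<signature> signatures;
};

struct verified_transaction {
   transaction             trx;
   std::vector<public_key> signer_keys;   // recovered once, reused in steps 1-3
};

// Hypothetical placeholder for the expensive ECDSA signature -> pubkey recovery.
public_key recover_pubkey( const transaction& trx, const signature& sig )
{
   return "pubkey-for-" + sig;
}

// Runs in the P2P or API layer, not in the database thread.
verified_transaction precompute_signer_keys( transaction trx )
{
   verified_transaction out{ std::move(trx), {} };
   for( const auto& sig : out.trx.signatures )
      out.signer_keys.push_back( recover_pubkey( out.trx, sig ) );
   return out;
}
```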

CORE TEAM TASK LIST

  • Evaluate / Prioritize Feature Request
  • Refine User Stories / Requirements
  • Define Test Cases
  • Design / Develop Solution
  • Perform QA/Testing
  • Update Documentation
@pmconrad

pmconrad commented Jun 21, 2018

General remark: I think we need a "Performance" epic - #982 belongs in the same category.

@ryanRfox ryanRfox added this to New -Awaiting Core Team Evaluation in Project Backlog via automation Jun 21, 2018

@jmjatlanta

jmjatlanta commented Jun 21, 2018

Someone has invaded my head. I spent a good chunk of last night reading and pondering some of the same ideas. The big question that kept coming up as I thought about implementations was "would this be faster, or slower?"

Web servers have SSL offloading, which can be dedicated hardware that handles the SSL encryption/decryption, so the real server can serve pages. But we need authentication on both sides ("client side authentication" in web-speak).

An intermediate step is giving peers the option to talk over an encrypted, verified channel. Part of the handshake verifies public keys and negotiates stream ciphers. While the negotiation is painful up front, subsequent communication is less so.
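Very roughly, such a handshake exchange could look like this (all names are hypothetical, just to illustrate the shape of the negotiation, not any existing Core protocol):

```cpp
// Hypothetical handshake messages: peers exchange identity keys, prove ownership
// by signing a nonce, and agree on a stream cipher before regular traffic starts.
#include <string>

struct handshake_offer {
   std::string peer_public_key;     // long-term identity key of the connecting peer
   std::string nonce;               // random challenge to be signed by the other side
   std::string supported_ciphers;   // e.g. "aes-256-ctr,chacha20"
};

struct handshake_accept {
   std::string peer_public_key;     // responder's identity key
   std::string nonce_signature;     // proves possession of the matching private key
   std::string chosen_cipher;       // negotiated stream cipher for the session
};
```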

We could centralize (and perhaps optimize) signature verification by moving the incoming data through the verification step on the way in, and marking the internal representation of that data as "signed and verified by public key X". Perhaps there would be no more signature verification needed.
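As a sketch of that marking idea (hypothetical names, not existing Core code), a small thread-safe cache keyed by transaction digest could record which public keys have already been verified, so later processing stages can look the answer up instead of redoing the crypto:

```cpp
// Sketch only -- digest_type and public_key are stand-ins for the real types.
#include <mutex>
#include <set>
#include <string>
#include <unordered_map>

using digest_type = std::string;
using public_key  = std::string;

class verification_cache {
public:
   // Record that the transaction with digest d was verified with these signer keys.
   void mark_verified( const digest_type& d, std::set<public_key> signers )
   {
      std::lock_guard<std::mutex> lock(_mutex);
      _verified[d] = std::move(signers);
   }

   // Returns the cached signer keys, or nullptr if the transaction was never seen.
   const std::set<public_key>* find( const digest_type& d ) const
   {
      std::lock_guard<std::mutex> lock(_mutex);
      auto itr = _verified.find(d);
      return itr == _verified.end() ? nullptr : &itr->second;
   }

private:
   mutable std::mutex _mutex;
   std::unordered_map<digest_type, std::set<public_key>> _verified;
};
```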

Warning... Going off topic here...
Off topic, I know, but a possible side advantage is a "web of trust" that develops. Connections using a certain public key and a secure connection could be eligible for extra privileges. An example: Would it help to know that the server you are talking to is one of the active witness nodes? Are there optimizations we can do if we know for sure that we are talking to one? Comparing the public key they gave us to our list of public keys can help with that.

What if we want to only connect to a white list of servers? We can do that if we know (with a good amount of confidence) who is on the other side.

Another optimization is specialized protocols. Incoming blocks can quickly be routed somewhere different than incoming heartbeats, or connection requests, or ....

Back on topic (somewhat)...

There are a lot of possible optimizations here. But I keep bubbling back up to my original question: "Would this be faster or slower?" The current p2p code has some metrics. I think we're going to have to give that a hard look, gather stats (especially on what kind of processing is currently slowing our throughput), and spend some time in the laboratory.

@bangzi1001

bangzi1001 commented Jun 22, 2018

Would utilizing multiple cores help solve the missing blocks caused by the high latency of the maintenance block?

#504
#803

@pmconrad

pmconrad commented Jun 22, 2018

@jmjatlanta there's a difference between securing client and/or P2P connections and verifying transaction signatures. It's similar to sending PGP-encrypted email through a TLS-protected SMTP connection (that's routed through a VPN tunnel if you want). :-) They are different communication layers where encryption / authentication plays different roles.

P2P connections are already encrypted, but there is no authentication happening yet. Interesting topic, but out of scope here.

We could centralize (and perhaps optimize) signature verification by moving the incoming data through the verification step on the way in, and marking the internal representation of that data as "signed and verified by public key X".

This! ("centralize" in the sense of software code, it could be done "decentralized" in the sense of multi-core and perhaps as a future step in the sense of multi-server).

"Would this be faster or slower?" The current p2p code has some metrics. I think we're going to have to give that a hard look, gather stats (especially what kind of process is currently slowing our throughput), and spend some time in the laboratory.

Full ACK!

@jmjatlanta

jmjatlanta commented Jun 24, 2018

Would utilizing multiple cores help solve the missing blocks caused by the high latency of the maintenance block?

@bangzi1001 It would certainly help.

I would like to set up some kind of lab. I've got a fairly slow machine in NY, and a few around here. I'm hoping to develop some kind of testing framework that closely mimics the block producing process, and use it and a variety of machines to gather some metrics. We may already have such a framework, or at least the beginnings of one.

@pmconrad

pmconrad commented Sep 29, 2018

Currently applying some optimizations I've discovered during the research for my BitFest presentation.
Also working on off-loading crypto stuff into separate threads.
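For illustration only, here is a rough sketch of the off-loading idea using plain `std::async` and the same hypothetical `recover_pubkey` placeholder as above; the actual Core code would more likely build on the fc thread machinery rather than raw standard-library futures:

```cpp
// Sketch only -- recover each signature's public key on worker threads in parallel.
#include <future>
#include <string>
#include <vector>

using signature  = std::string;   // stand-in for the real signature type
using public_key = std::string;

// Hypothetical placeholder for the expensive ECDSA recovery.
public_key recover_pubkey( const signature& sig )
{
   return "pubkey-for-" + sig;
}

std::vector<public_key> recover_all_parallel( const std::vector<signature>& sigs )
{
   std::vector<std::future<public_key>> jobs;
   jobs.reserve( sigs.size() );
   for( const auto& sig : sigs )
      jobs.push_back( std::async( std::launch::async, recover_pubkey, sig ) );

   std::vector<public_key> keys;
   keys.reserve( jobs.size() );
   for( auto& job : jobs )
      keys.push_back( job.get() );  // block only when the result is actually needed
   return keys;
}
```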

@abitmore abitmore removed this from New -Awaiting Core Team Evaluation in Project Backlog Sep 29, 2018

@abitmore abitmore added this to To do in Feature release (201810) via automation Sep 29, 2018

@abitmore abitmore moved this from To do to In progress in Feature release (201810) Sep 29, 2018

@pmconrad

pmconrad commented Oct 10, 2018

Moving to the December release since this is only partially resolved at this time.

@pmconrad pmconrad removed this from In progress in Feature release (201810) Oct 10, 2018

@pmconrad pmconrad added this to To do in Feature Release (201812) via automation Oct 10, 2018

@pmconrad pmconrad removed this from the 201810 - Feature Release milestone Oct 10, 2018
