chore: add transports benchmark #521
Adds a benchmark that measures how long it takes to transfer 100M-1G of data between node, firefox and chrome using WebRTC, WebSockets and TCP.
Running this locally seems to get stuck for me on the 210 MB transfer in the TCP node.js -> kubo implementation test:
[11:18:32] tsc [started]
[11:18:33] tsc [completed]
Implementation, 105 MB, 210 MB, 315 MB, 419 MB, 524 MB, 629 MB, 734 MB, 839 MB, 944 MB, 1.05 GB
TCP (node.js -> node.js) filecoin defaults, 776, 1449, 1889, 3077, 4255, 5318, 7098, 5894, 6203, 6014
TCP (node.js -> kubo) filecoin defaults, 2541
WebSockets (node.js -> node.js) filecoin defaults, 1068, 1642, 2092, 2812, 4117, 4423, 6117, 7820, 7182, 7816
//... results here
```
3. Graph the CSV data with your favourite graphing tool
Any recommendations?
Not really, I use Google Sheets because I'm a luddite.
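For anyone who would rather script this than paste into a spreadsheet, here is a minimal sketch that turns the CSV into one series per implementation. It assumes the shape shown in the sample output above (a header row of sizes, then one row per implementation), and that a row may be shorter than the header when a run stalls. `parseBenchmarkCsv`, `Series` and `Point` are hypothetical names, not part of this PR.

```typescript
// Hypothetical helper, not part of the PR: parse the benchmark's CSV output
// into one series per implementation so it can be fed to a charting tool.
interface Point {
  size: string
  value: number
}

interface Series {
  name: string
  points: Point[]
}

function parseBenchmarkCsv (csv: string): Series[] {
  const rows = csv
    .split('\n')
    .map(line => line.trim())
    .filter(line => line.length > 0)
    .map(line => line.split(',').map(cell => cell.trim()))

  const header = rows[0]

  if (header == null) {
    return []
  }

  // the first header cell is the "Implementation" label, the rest are sizes
  const sizes = header.slice(1)

  return rows.slice(1).map(row => {
    const points: Point[] = []

    row.slice(1).forEach((cell, index) => {
      const size = sizes[index]
      const value = Number(cell)

      // skip empty cells and anything beyond the header's columns
      if (size != null && cell !== '' && Number.isFinite(value)) {
        points.push({ size, value })
      }
    })

    return { name: row[0] ?? '', points }
  })
}
```

Each `Series` can then be handed to whatever plotting library is to hand. The values are passed through untouched, so no assumption is made here about their units.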
logger,
addresses: {
  listen: [
    '/ip4/127.0.0.1/tcp/0/ws'
Can the transport the relay uses impact transfer speeds, or is this just for getting the nodes to connect directly?
Note: this is an early comment; I'll probably find the answer myself later.
Relays are only used for WebRTC, and only for connection establishment.
In all the tests we only measure the transfer time so with WebRTC the nodes have a direct connection by this point and the relay is no longer involved.
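To illustrate the point about only timing the transfer, here is a minimal sketch, not the code in this PR. `timeTransfer`, `connect` and `transfer` are hypothetical names: the idea is simply that connection establishment (including any relay hop) completes before the clock starts.

```typescript
// Hypothetical helper, not the PR's code: time only the transfer phase.
async function timeTransfer<T> (
  connect: () => Promise<T>,
  transfer: (connection: T) => Promise<void>
): Promise<number> {
  // connection establishment, relay included, happens before the clock starts
  const connection = await connect()

  const start = Date.now()
  await transfer(connection)

  // elapsed milliseconds for the transfer alone
  return Date.now() - start
}
```

With this shape, a slow relay would lengthen the untimed `connect` step but not the reported number.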
benchmarks/transports/src/tests.ts
Outdated
addTests('TCP', tcpImpls, output, relay)
addTests('WebSockets', webSocketimpls, output, relay)
addTests('WebRTC', webRTCimpls, output, relay)
Why no WebTransport?
Probably because Node.js doesn't have WebTransport yet.
That's the one.
I've added a WebTransport benchmark, though it only runs kubo -> kubo for the time being due to browser bugs.
benchmarks/transports/src/index.ts
Outdated
/*
'kubo defaults': {
  chunkSize: 256 * 1024,
  rawLeaves: false,
  cidVersion: 0,
  maxChildrenPerNode: 174
},
*/
Is the intent to re-enable this after resolving some issue, or should we remove it?
It can be removed if it's not useful.
blockstore: new FsBlockstore(`${repoPath}/blocks`),
datastore: new LevelDatastore(`${repoPath}/data`)
I've seen this a few times before. Why FsBlockstore but LevelDatastore? Are these the recommended stores for production IPFS with nodejs?
Just a rule of thumb: put data in a database and files in a filesystem. The access pattern is pretty similar though; we don't really do queries anywhere, so an FsDatastore would probably be OK?
Some measurements for a "best practices" blog entry might be quite nice.
// pull data from remote. this is going over HTTP so use pin in order to ensure
// the data is loaded by Kubo but don't skew the benchmark by then also
// streaming it to the client
await kubo.api.pin.add(cid, {
  recursive: true
})
I wonder if there's a better way to pull the data. Maybe with dag import?
DAG import doesn't do any network stuff: you import from a CAR file, so we'd be measuring transfer speed from the (RPC) client to Kubo, not Helia to Kubo.
The refs API might be an alternative, but the nice thing about the pin API is that it only sends us a tiny amount of data, so we don't skew the benchmark.
A better approach would probably be to port the recipient/sender scripts to Go and have it all run in-process, same as the Helia ones.
// use Helia's UnixFS tooling to create the DAG otherwise we are limited
// to 1MB block sizes
const fs = unixfs({
  blockstore: {
    async get (cid, options = {}) {
      return kubo.api.block.get(cid, options)
    },
    async put (cid, block, options = {}) {
      const opts: BlockPutOptions = {
        allowBigBlock: true
      }

      if (cid.version === 1) {
        opts.version = 1
        opts.format = FORMAT_LOOKUP[cid.code]
      }

      const putCid = await kubo.api.block.put(block, opts)

      if (!uint8ArrayEquals(cid.multihash.bytes, putCid.multihash.bytes)) {
        throw new Error(`Put failed ${putCid} != ${cid}`)
      }

      return cid
    },
    async has (cid, options = {}) {
      try {
        await kubo.api.block.get(cid, options)
        return true
      } catch {
        return false
      }
    }
  }
})
A kubo-wrapped blockstore for Helia is really cool.
I'm sad that we don't have kubo.api.block.has.
It is a curious omission.
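One possible stopgap, sketched under assumptions: if the RPC client exposes a `block.stat` call, `has` could be built on that rather than on `block.get`, so the block's bytes are not streamed back just to answer a yes/no question. The `BlockStatApi` interface and the `hasBlock` helper below are hypothetical and only model the part of the API this sketch needs. Whether `stat` stays local or goes to the network for a missing block is not addressed here.

```typescript
// Assumed shape, modelled on a `block.stat`-style RPC call; only the part
// this sketch needs. Swap in the real client type when wiring it up.
interface BlockStatApi<C> {
  stat: (cid: C) => Promise<{ size: number }>
}

// Hypothetical helper: report whether a block can be resolved without
// streaming its bytes back to the caller.
async function hasBlock<C> (api: BlockStatApi<C>, cid: C): Promise<boolean> {
  try {
    await api.stat(cid)
    return true
  } catch {
    // like the get-based version, any failure is treated as "not present"
    return false
  }
}
```

As with the `get`-based version in the diff, this cannot tell "block missing" apart from "request failed", so it is only suitable where that distinction doesn't matter.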
Co-authored-by: Russell Dempsey <1173416+SgtPooki@users.noreply.github.com>
This should fix the stuck test for TCP/WebSockets - #522. Still investigating WebRTC.
I'm going to merge this, since the benchmark runs now. There are some browser bugs that prevent all implementations being tested with each other but that's beyond the scope of this PR to fix.