Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: add transports benchmark #521

Merged
merged 14 commits into from
May 1, 2024
Merged

Conversation

achingbrain
Copy link
Member

Adds a benchmark that measures how long it takes to transfer 100M-1G of data between node, firefox and chrome using WebRTC, WebSockets and TCP.

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation if necessary (this includes comments as well)
  • I have added tests that prove my fix is effective or that my feature works

Adds a benchmark that measures how long it takes to transfer 100M-1G
of data between node, firefox and chrome using WebRTC, WebSockets
and TCP.
@achingbrain achingbrain requested a review from a team as a code owner April 26, 2024 09:02
Copy link
Member

@SgtPooki SgtPooki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

running this locally seems to get stuck for me on the 210 MB for the node.js -> kubo implementation test:

[11:18:32] tsc [started]
[11:18:33] tsc [completed]
Implementation, 105 MB, 210 MB, 315 MB, 419 MB, 524 MB, 629 MB, 734 MB, 839 MB, 944 MB, 1.05 GB
TCP (node.js -> node.js) filecoin defaults, 776, 1449, 1889, 3077, 4255, 5318, 7098, 5894, 6203, 6014
TCP (node.js -> kubo) filecoin defaults, 2541

WebSockets (node.js -> node.js) filecoin defaults, 1068, 1642, 2092, 2812, 4117, 4423, 6117, 7820, 7182, 7816
//... results here
```
3. Graph the CSV data with your favourite graphing tool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any recommendations?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, I use Google Sheets because I'm a luddite.

benchmarks/transports/README.md Outdated Show resolved Hide resolved
benchmarks/transports/README.md Outdated Show resolved Hide resolved
logger,
addresses: {
listen: [
'/ip4/127.0.0.1/tcp/0/ws'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the transport the relay uses impact transfer speeds or is this just for getting the nodes to connect directly?

note: this is an early comment i'll probably find an answer to myself later

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relays are only used for WebRTC, and only for connection establishment.

In all the tests we only measure the transfer time so with WebRTC the nodes have a direct connection by this point and the relay is no longer involved.

Comment on lines 100 to 102
addTests('TCP', tcpImpls, output, relay)
addTests('WebSockets', webSocketimpls, output, relay)
addTests('WebRTC', webRTCimpls, output, relay)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why no webtransport?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably because nodejs doesn't have webtransport yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a WebTransport benchmark, though it only runs kubo -> kubo for the time being due to browser bugs.

Comment on lines 38 to 45
/*
'kubo defaults': {
chunkSize: 256 * 1024,
rawLeaves: false,
cidVersion: 0,
maxChildrenPerNode: 174
},
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the intent to re-enable this after resolving some issue or should we remove this?

Copy link
Member Author

@achingbrain achingbrain Apr 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be removed if it's not useful.

Comment on lines 12 to 13
blockstore: new FsBlockstore(`${repoPath}/blocks`),
datastore: new LevelDatastore(`${repoPath}/data`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen this a few times before. Why FsBlockstore but LevelDatastore? Are these the recommended stores for production IPFS with nodejs?

Copy link
Member Author

@achingbrain achingbrain Apr 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a rule of thumb to put data in a database and files in a filesystem. The access pattern is pretty similar though, we don't really do queries anywhere so a FsDatastore would probably be ok?

Some measurements for a "best practices" blog entry might be quite nice.

Comment on lines 20 to 25
// pull data from remote. this is going over HTTP so use pin in order to ensure
// the data is loaded by Kubo but don't skew the benchmark by then also
// streaming it to the client
await kubo.api.pin.add(cid, {
recursive: true
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there's a better way to pull the data. maybe with dag import?

Copy link
Member Author

@achingbrain achingbrain Apr 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DAG import doesn't do any network stuff, you import from a CAR file so we'd be measuring transfer speed from the (RPC) client to Kubo, not Helia to Kubo.

The refs API might be an alternative, but the nice thing about the pin api is it only sends us a tiny amount of data so we don't skew the benchmark.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better approach would probably be to port the recipient/sender scripts to go and have it all run in-process same as the Helia ones.

Comment on lines +32 to +66
// use Helia's UnixFS tooling to create the DAG otherwise we are limited
// to 1MB block sizes
const fs = unixfs({
blockstore: {
async get (cid, options = {}) {
return kubo.api.block.get(cid, options)
},
async put (cid, block, options = {}) {
const opts: BlockPutOptions = {
allowBigBlock: true
}

if (cid.version === 1) {
opts.version = 1
opts.format = FORMAT_LOOKUP[cid.code]
}

const putCid = await kubo.api.block.put(block, opts)

if (!uint8ArrayEquals(cid.multihash.bytes, putCid.multihash.bytes)) {
throw new Error(`Put failed ${putCid} != ${cid}`)
}

return cid
},
async has (cid, options = {}) {
try {
await kubo.api.block.get(cid, options)
return true
} catch {
return false
}
}
}
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A kubo-wrapped blockstore for helia is really cool.

I'm sad that we don't have kubo.api.block.has

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a curious omission.

benchmarks/transports/README.md Outdated Show resolved Hide resolved
Co-authored-by: Russell Dempsey <1173416+SgtPooki@users.noreply.github.com>
@achingbrain
Copy link
Member Author

This should fix the stuck test for TCP/WebSockets - #522

Still investigating WebRTC.

@achingbrain
Copy link
Member Author

Generated graph:

image

@achingbrain
Copy link
Member Author

I'm going to merge this, since the benchmark runs now. There are some browser bugs that prevent all implementations being tested with each other but that's beyond the scope of this PR to fix.

@achingbrain achingbrain merged commit 55b9650 into main May 1, 2024
18 checks passed
@achingbrain achingbrain deleted the chore/add-transports-benchmark branch May 1, 2024 13:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants