perf: use a random pool #1986
Force-pushed from cffcc05 to a4adad6.
I like the idea in general, but I'm not a maintainer of the project. I hope your PR gets merged :)
A much simpler problem than I thought initially! I really do hope this gets merged!
Force-pushed from 237886f to 3cf487c.
Fingers crossed this gets through :)
```diff
@@ -81,13 +97,24 @@ class Sender {

     if (!options.mask) return [target, data];

-    randomFillSync(mask, 0, 4);
+    if (randomPool.length - randomPoolIdx < 4) {
+      randomPool = randomBytes(32);
```
I'm not convinced this is much better than the status quo. It would only take 8 `WebSocket`s, each sending a single message, to empty the pool, and there is a new `Buffer` allocation every time.
This is just the fallback path. We shouldn't hit this often, if at all.
I think you are misunderstanding something here...?
I think that if there are a lot of sockets, all of them sending data at the same time synchronously, then this does not change much. I would like to see a "before" and "after" comparison. Do you have a flamegraph?
Also, from my experience, generating random data is nothing compared to the actual masking of the data in terms of CPU time.
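For reference, the masking cost being discussed is a byte-wise XOR over the entire payload (RFC 6455, section 5.3), so it scales with message size while the random data needed is a fixed 4 bytes per frame. A minimal sketch of the operation (ws's real implementation can use the optional bufferutil addon instead):

```js
'use strict';

// Minimal sketch of WebSocket frame masking (RFC 6455, section 5.3):
// every payload byte is XORed with one of the 4 masking-key bytes.
// Masking and unmasking are the same operation, so applying it twice
// restores the original payload.
function applyMask(source, mask) {
  const output = Buffer.allocUnsafe(source.length);
  for (let i = 0; i < source.length; i++) {
    output[i] = source[i] ^ mask[i & 3];
  }
  return output;
}
```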
It would be best to skip masking for TLS WebSockets.
Not sure if it is spec compliant. What about proxies?
From my profiles this is not the case.
Why? We generate the random data asynchronously, in parallel off the main thread, so most of the overhead stays off the main thread.
Do you have any existing good benchmark we can use for before/after?
Is it possible to share the code/data? My tests were based on
This is not the case if we are near the end of the pool, or have already emptied it and need data now.
No, but a simple echo server with a bunch of clients sending data should work.
We dispatch the next parallel task before the pool has emptied.
I can imagine a server system initiating WebSocket connections to other servers, for example forming a mesh, but how big can a mesh be? Not really a problem in need of solving ATM.
@lpinca Does the mask need to be cryptographically random, or would pseudo-random numbers also be OK here?
This.

https://stackoverflow.com/questions/14174184/what-is-the-mask-in-a-websocket-frame
That's really weird. Is it the same with
Yeah, it's the same.
I solved this in production by doing
Actually... that doesn't work at all.
Yes, the server must close the connection if an unmasked frame is received.
Haha! I see a spot here for an "ignore-the-standards-and-be-fast" setup, if both client and server are willing to disobey the rules.
Another workaround: move ws to a separate thread.

```js
// socket-worker.js
'use strict'

const { Worker } = require('worker_threads')
const path = require('path')

class Socket {
  constructor(url) {
    this._worker = new Worker(path.join(__dirname, 'socket-worker.js'), {
      workerData: {
        url: url.href,
      },
    })
      .on('error', (err) => {
        this.onerror?.(err)
      })
      .on('exit', () => {
        this.readyState = this.CLOSED
      })
      .on('message', ({ event, data }) => {
        if (event === 'open') {
          this.readyState = this.OPEN
          this.onopen?.()
        } else if (event === 'error') {
          this.readyState = this.CLOSING
          this.onerror?.(data)
        } else if (event === 'close') {
          this.readyState = this.CLOSED
          this.onclose?.()
        } else if (event === 'data') {
          this.onmessage?.({ data: Buffer.from(data) })
        }
      })

    this.CONNECTING = 'CONNECTING'
    this.OPEN = 'OPEN'
    this.CLOSING = 'CLOSING'
    this.CLOSED = 'CLOSED'

    this.onopen = null
    this.onerror = null
    this.onclose = null
    this.onmessage = null

    this.readyState = this.CONNECTING
  }

  send(data) {
    // TODO (perf): Transfer Buffer?
    this._worker.postMessage(data)
  }

  close() {
    this.readyState = this.CLOSING
    this._worker.terminate()
  }
}

module.exports = Socket
```

```js
// socket.js
'use strict'

const { Worker } = require('worker_threads')
const path = require('path')

class Socket {
  constructor(url) {
    this._worker = new Worker(path.join(__dirname, 'socket-worker.js'), {
      workerData: {
        url: url.href,
      },
    })
      .on('error', (err) => {
        this.onerror?.(err)
      })
      .on('message', ({ event, data }) => {
        if (event === 'open') {
          this.readyState = this.OPEN
          this.onopen?.()
        } else if (event === 'error') {
          this.readyState = null
          this.onerror?.(data)
        } else if (event === 'close') {
          this.readyState = null
          this.onclose?.()
        } else if (event === 'data') {
          this.onmessage?.({ data: Buffer.from(data) })
        }
      })

    this.OPEN = 'OPEN'

    this.onopen = null
    this.onerror = null
    this.onclose = null
    this.onmessage = null

    this.readyState = null
  }

  send(data) {
    // TODO (perf): Transfer Buffer?
    this._worker.postMessage(data)
  }

  close() {
    this._worker.terminate()
  }
}

module.exports = Socket
```
We're good with disabling masking and/or the worker thread. That solves our perf issues.
@ronag are you running it on a resource constrained system? This is what I get on my machine:

```js
const crypto = require('crypto');

const buffer = Buffer.alloc(4);
let max = 0n;

for (let i = 0; i < 1e6; i++) {
  const start = process.hrtime.bigint();
  crypto.randomFillSync(buffer, 0, 4);
  const end = process.hrtime.bigint();
  const elapsed = end - start;

  if (elapsed > max) {
    max = elapsed;
  }
}

console.log(max);
```

That's 2 ms in the worst case.
Not really. It's an EPYC server. Node 16.13.
For correctness I have to take this #1986 (comment) back. Recent benchmarks show that masking ~10 KiB of data in plain JavaScript is now comparable to acquiring 4 bytes of random data.
If there is evidence (I'm still not convinced) that this improves things in production, then I'm fine with reopening and merging.
I think it gets slow once the entropy pool is emptied.
The `generateMask` option specifies a function that can be used to generate custom masking keys. Refs: websockets/ws#1986 Refs: websockets/ws#1988 Refs: websockets/ws#1989
In production `randomFillSync` is taking 8-9% of our CPU time. This tries to reduce the overhead by pooling random data.