
perf: use a random pool #1986

Closed

@ronag wants to merge 2 commits

Conversation

@ronag (Contributor) commented Dec 17, 2021

In production, randomFillSync is taking 8-9% of our CPU time. This tries to reduce the overhead by pooling random data.

@ronag force-pushed the random-poll branch 6 times, most recently from cffcc05 to a4adad6 (Dec 17, 2021 09:37)
@sdrsdr left a comment

I like the idea in general, but I'm not a maintainer of the project. I hope your PR gets merged :)

@sdrsdr left a comment

Much simpler problem than I thought initially! I really do hope this gets merged!

@ronag force-pushed the random-poll branch 2 times, most recently from 237886f to 3cf487c (Dec 17, 2021 10:41)
@sdrsdr commented Dec 17, 2021

Fingers crossed this gets through :)

@@ -81,13 +97,24 @@ class Sender {

     if (!options.mask) return [target, data];

-    randomFillSync(mask, 0, 4);
+    if (randomPool.length - randomPoolIdx < 4) {
+      randomPool = randomBytes(32);
Member

I'm not convinced this is much better than the status quo. It would only take 8 WebSockets each sending a single message to empty the pool, and there is a new Buffer allocation every time.

@ronag (Contributor, Author) Dec 17, 2021

This is just the fallback path. We shouldn't hit this often, if at all.

@ronag (Contributor, Author)

I think you are misunderstanding something here...?

@lpinca (Member) commented Dec 17, 2021

I think that if there are a lot of sockets, all of them sending data at the same time synchronously, then this does not change much. I would like to see a "before" and "after" comparison. Do you have a flamegraph?

@lpinca (Member) commented Dec 17, 2021

Also, from my experience, generating random data is nothing compared to the actual masking of the data in terms of CPU time.

@sdrsdr commented Dec 17, 2021

It will be best to skip masking for TLS WebSockets.

@lpinca (Member) commented Dec 17, 2021

> It will be best to skip masking for TLS WebSockets

Not sure if it is spec compliant. What about proxies?

@ronag (Contributor, Author) commented Dec 17, 2021

> Also, from my experience, generating random data is nothing compared to the actual masking of the data in terms of CPU time.

From my profiles this is not the case.

@ronag (Contributor, Author) commented Dec 17, 2021

> I think that if there are a lot of sockets, all of them sending data at the same time synchronously, then this does not change much.

Why? We generate the random data asynchronously, in parallel, off thread, so most of the overhead is off the main thread.

@ronag (Contributor, Author) commented Dec 17, 2021

Do you have any existing good benchmark we can use for before/after?

@lpinca (Member) commented Dec 17, 2021

> From my profiles this is not the case.

Is it possible to share the code/data? My tests were based on crypto.randomBytes() (sync), but I would be surprised if crypto.randomFillSync() is actually worse.

> Why? We generate the random data asynchronously, in parallel, off thread, so most of the overhead is off the main thread.

This is not the case if we are near the end, or have already emptied the pool and need data now.

> Do you have any existing good benchmark we can use for before/after?

No, but a simple echo server with a bunch of clients sending data should work.

@ronag (Contributor, Author) commented Dec 17, 2021

> This is not the case if we are near the end, or have already emptied the pool and need data now.

We dispatch the next parallel task before the pool has emptied (see RANDOM_POOL_REFRESH).

@ronag (Contributor, Author) commented Dec 17, 2021

[Screenshot 2021-12-17 at 13:09:04: 140 ms]

I have randomFillSync calls that take up to 300 ms 😢

@sdrsdr commented Dec 17, 2021

I can imagine a server system initiating WebSocket connections to other servers, for example forming a mesh, but how big can a mesh be? Not really a problem in need of solving ATM.

@ronag (Contributor, Author) commented Dec 17, 2021

@lpinca Does the mask need to be cryptographically random, or would pseudo-random numbers also be OK here?

@lpinca (Member) commented Dec 17, 2021

> cryptographically random

This.

Clients MUST choose a new masking key for each frame, using an
algorithm that cannot be predicted by end applications that provide
data. For example, each masking could be drawn from a
cryptographically strong random number generator.

@sdrsdr commented Dec 17, 2021

https://stackoverflow.com/questions/14174184/what-is-the-mask-in-a-websocket-frame

> Why is there masking at all? Because apparently there is enough broken infrastructure out there that lets the upgrade header go through and then handles the rest of the connection as a second HTTP request which it then stuffs into the cache. I have no words for this. In any case, the defence against that is basically a strong 32bit random number as masking key.

@lpinca (Member) commented Dec 17, 2021

> I have randomFillSync calls that take up to 300 ms.

That's really weird. Is it the same with crypto.randomBytes() (sync)? I guess the underlying code is the same.

@ronag (Contributor, Author) commented Dec 17, 2021

> Is it the same with crypto.randomBytes() (sync)? I guess the underlying code is the same.

Yea, it's the same.

@ronag (Contributor, Author) commented Dec 17, 2021

I solved this in production by doing send(data, { mask: false })

@ronag (Contributor, Author) commented Dec 17, 2021

> I solved this in production by doing send(data, { mask: false })

Actually... that doesn't work at all.

@lpinca (Member) commented Dec 17, 2021

Yes, the server must close the connection if an unmasked frame is received.

@sdrsdr commented Dec 17, 2021

Haha! I see a spot here for an "ignore-the-standards-and-be-fast" setup, if both client and server are willing to disobey the rules.

@ronag (Contributor, Author) commented Dec 17, 2021

Another workaround: move ws to a separate thread.

// socket.js (main-thread wrapper; spawns the worker that runs ws)
'use strict'

const { Worker } = require('worker_threads')
const path = require('path')

class Socket {
  constructor(url) {
    this._worker = new Worker(path.join(__dirname, 'socket-worker.js'), {
      workerData: {
        url: url.href,
      },
    })
      .on('error', (err) => {
        this.onerror?.(err)
      })
      .on('exit', () => {
        this.readyState = this.CLOSED
      })
      .on('message', ({ event, data }) => {
        if (event === 'open') {
          this.readyState = this.OPEN
          this.onopen?.()
        } else if (event === 'error') {
          this.readyState = this.CLOSING
          this.onerror?.(data)
        } else if (event === 'close') {
          this.readyState = this.CLOSED
          this.onclose?.()
        } else if (event === 'data') {
          this.onmessage?.({ data: Buffer.from(data) })
        }
      })

    this.CONNECTING = 'CONNECTING'
    this.OPEN = 'OPEN'
    this.CLOSING = 'CLOSING'
    this.CLOSED = 'CLOSED'

    this.onopen = null
    this.onerror = null
    this.onclose = null
    this.onmessage = null
    this.readyState = this.CONNECTING
  }

  send(data) {
    // TODO (perf): Transfer Buffer?
    this._worker.postMessage(data)
  }

  close() {
    this.readyState = this.CLOSING
    this._worker.terminate()
  }
}

module.exports = Socket

@ronag (Contributor, Author) commented Dec 17, 2021

We're good with disabling masking and/or a worker thread. That solves our perf issues.

@ronag closed this Dec 17, 2021
@lpinca (Member) commented Dec 17, 2021

@ronag are you running it on a resource-constrained system? This is what I get on my machine:

const crypto = require('crypto');

const buffer = Buffer.alloc(4);

let max = 0n;

for (let i = 0; i < 1e6; i++) {
  const start = process.hrtime.bigint();
  crypto.randomFillSync(buffer, 0, 4);
  const end = process.hrtime.bigint();

  const elapsed = end - start;

  if (elapsed > max) {
    max = elapsed;
  }
}

console.log(max);
$ node test.js 
2040128n

That's 2 ms in the worst case.

@ronag (Contributor, Author) commented Dec 17, 2021

Not really. It's an EPYC server, Node 16.13.

@lpinca (Member) commented Dec 17, 2021

For correctness I have to take this #1986 (comment) back. Recent benchmarks show that masking ~10 KiB of data in plain JavaScript is now comparable to acquiring 4 bytes of random data.

@lpinca (Member) commented Dec 17, 2021

If there is evidence (I'm still not convinced) that this improves things in production, then I'm fine with reopening and merging.

@ronag (Contributor, Author) commented Dec 17, 2021

I think it gets slow once the entropy pool is emptied.

@e3dio mentioned this pull request Dec 18, 2021
lpinca added commits that referenced this pull request (Dec 19-20, 2021):

The `generateMask` option specifies a function that can be used to
generate custom masking keys.

Refs: #1986
Refs: #1988
Refs: #1989
th37rose added a commit to th37rose/websocket_server that referenced this pull request May 24, 2022
The `generateMask` option specifies a function that can be used to
generate custom masking keys.

Refs: websockets/ws#1986
Refs: websockets/ws#1988
Refs: websockets/ws#1989