Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed feedback #57

Closed
chrisdone opened this issue Jun 5, 2019 · 10 comments
Closed

Speed feedback #57

chrisdone opened this issue Jun 5, 2019 · 10 comments
Assignees

Comments

@chrisdone
Copy link

A personal observation: when sending a payload of a PNG image from a web browser on localhost to the emacs-websocket server, the websocket-read-frame function takes 452ms to consume it.

websocket-read-frame: 0ms [29 times]
websocket-read-frame: 452ms

@ahyatt Are you aware of some obvious low-hanging fruit in the performance? It seems like O(n) time, perhaps it could be mitigated?

For my use-case I'm probably going to switch to making a POST request to Emacs and use a raw socket to consume the binary chunks. Theoretically it should take maybe a millisecond to receive 100KB. It seems like the websocket protocol is a bit of a pain to implement.

@ahyatt
Copy link
Owner

ahyatt commented Jun 5, 2019

I haven't looked at performance at all. Half a second to consume a PNG does excessive. Was the websocket library compiled?

@chrisdone
Copy link
Author

The result above is byte-compiled. Before that, it was around 700ms+. So byte-compiling brought it down to a reproducible 400ms~.

@ahyatt ahyatt self-assigned this Jun 6, 2019
@ahyatt
Copy link
Owner

ahyatt commented Jun 9, 2019

Can you check out the performance branch I recently created and try again? My changes should result in a substantial speedup (they did on my test).

@chrisdone
Copy link
Author

Sure, I still have the code for my test.

@chrisdone
Copy link
Author

That looks like a 10x improvement! From 400ms best case to about 40ms best case. Nice work! 👌

Before:

    websocket-read-frame: 0ms [30 times]
    websocket-read-frame: 637ms
    websocket-read-frame: 0ms
    create-image: 0ms
    insert-image & redisplay: 5ms
    
    websocket-read-frame: 0ms [30 times]
    websocket-read-frame: 402ms
    websocket-read-frame: 0ms
    create-image: 0ms
    insert-image & redisplay: 4ms
    
    websocket-read-frame: 0ms [11 times]
    websocket-read-frame: 258ms
    websocket-read-frame: 0ms [18 times]
    websocket-read-frame: 423ms
    websocket-read-frame: 0ms
    create-image: 0ms
    insert-image & redisplay: 3ms
    
    websocket-read-frame: 0ms [24 times]
    websocket-read-frame: 231ms
    websocket-read-frame: 0ms [5 times]
    websocket-read-frame: 395ms
    websocket-read-frame: 0ms
    create-image: 0ms
    insert-image & redisplay: 11ms
    
    websocket-read-frame: 0ms [30 times]
    websocket-read-frame: 408ms
    websocket-read-frame: 0ms
    create-image: 0ms
    insert-image & redisplay: 5ms

after

websocket-read-frame: 0ms [10 times]
websocket-read-frame: 244ms
websocket-read-frame: 0ms [19 times]
websocket-read-frame: 40ms
websocket-read-frame: 0ms
create-image: 0ms
insert-image & redisplay: 1ms

websocket-read-frame: 0ms [30 times]
websocket-read-frame: 38ms
websocket-read-frame: 0ms
create-image: 0ms
insert-image & redisplay: 1ms

websocket-read-frame: 0ms [30 times]
websocket-read-frame: 43ms
websocket-read-frame: 0ms
create-image: 0ms
insert-image & redisplay: 1ms

websocket-read-frame: 0ms [30 times]
websocket-read-frame: 40ms
websocket-read-frame: 0ms
create-image: 0ms
insert-image & redisplay: 1ms

@chrisdone
Copy link
Author

I wrote a simple http server to receive the bytes directly as a POST request:

Time to receive as a simple POST binary request:

Non-byte-compiled:

image-server: receive time: 31ms
image-server: receive time: 27ms
image-server: receive time: 32ms
image-server: receive time: 27ms
image-server: receive time: 22ms

Byte-compiled, not so different:

image-server: receive time: 18ms
image-server: receive time: 24ms
image-server: receive time: 21ms
image-server: receive time: 25ms
image-server: receive time: 22ms
image-server: receive time: 23ms

@chrisdone
Copy link
Author

My websocket code is like:

(progn
  (websocket-server-close server)
  (defvar server
    (websocket-server
     9009
     :host 'local
     :on-message
     (lambda (ws frame)
       (with-current-buffer (websocket-client-data ws)
         (insert (websocket-frame-payload frame))
         (when (websocket-frame-completep frame)
           (let ((bytes (buffer-string)))
             (with-current-buffer (get-buffer-create "*image-output*")
               (let ((img (measure-time "create-image" (create-image bytes 'jpeg t))))
                 (measure-time "insert-image & redisplay"
                               (progn (insert-image img)
                                      (redisplay)))))
             (erase-buffer))))
       )
     :on-open
     (lambda (ws) (message "Client connected.")
       (setf (websocket-client-data ws)
             (with-current-buffer
                 (get-buffer-create (generate-new-buffer "websocket"))
               (toggle-enable-multibyte-characters -1)
               (current-buffer))))
     :on-close
     (lambda (ws)
       (kill-buffer (websocket-client-data ws))
       (message "Client disconnected.")))))

@chrisdone
Copy link
Author

chrisdone commented Jun 9, 2019

I looked into a little bit how I'm buffering the strings.

(benchmark-run-compiled 100
  (length
   (with-temp-buffer
     (cl-loop for i from 0 to 10000
              do (insert "hello world hello worldhello worldhello worldhello worldhello world"))
     (buffer-string))))

(benchmark-run-compiled 100
  (length
   (let ((buffer (list)))
     (cl-loop for i from 0 to 10000
              do (setq buffer (cons "hello world hello worldhello worldhello worldhello worldhello world" buffer)))
     (mapconcat 'identity (reverse buffer) ""))))

(benchmark-run-compiled 100
  (length
   (let* ((buffer (list ""))
          (pointer buffer))
     (cl-loop for i from 0 to 10000
              do (setq
                  pointer
                  (setcdr pointer
                          (list "hello world hello worldhello worldhello worldhello worldhello world"))))
     (mapconcat 'identity buffer ""))))

(the length is so I can easily check they all have the same length)

Results

(1.641801722 4 0.908923355999832)
(1.631438268 4 0.9008467079997899)
(1.616854798 4 0.9054001909998988)

(1.331959202 5 1.1342625530000987)
(1.322703598 5 1.119752232999872)
(1.320611183 5 1.1191802010002903)

(1.101568966 4 0.9080151119997026)
(1.085339208 4 0.9026358060000348)
(1.095221923 4 0.9009243840000636)

It seems you can shave off 30% of the time by using a list. Between consing+reverse vs setcdr, it just seems that the latter shaves off a bit more.

@ahyatt
Copy link
Owner

ahyatt commented Jun 11, 2019

This is good data, thanks for sharing it.

I'm happy the speedup from the performance changes is substantial. It turns out the key to doing a great job on performance is developing the initial version without paying any attention to it :)

I'll go ahead and merge the performance changes in and create new version. I can continue to iterate on performance, but elisp is just not that fast, so I think it will be hard to be equivalent to non-elisp code.

@ahyatt ahyatt closed this as completed Jun 11, 2019
@chrisdone
Copy link
Author

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants