Consider debouncing ement-room-list-auto-update #197

Open
phil-s opened this issue Aug 3, 2023 · 7 comments

phil-s commented Aug 3, 2023

If I run (trace-function 'plz--respond), I'm seeing about 5 calls per second, consistently.

That really seems like a lot. (Especially if each of those is starting an independent curl process?)

Is that level of activity in line with expectations?

phil-s commented Aug 3, 2023

I guess ement-auto-sync just fires off each request as soon as it's handled the previous one?

I don't know whether multiple concurrent sync requests for the same session are getting stacked up here for some reason. If there isn't already protection against issuing a new sync while one is already in progress (or otherwise queued up), there probably should be.

I expect that some kind of customizable throttling would be a good idea? I'm imagining something like "poll every X seconds; but when a message is received (or sent!) then temporarily reduce the polling interval to Y seconds so that active conversations are processed in a timely fashion. Maintain the smaller value until such time as there have been no new messages for Z seconds, at which point revert the interval back to X." (With X, Y, Z being user options.)
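Roughly what I have in mind, as a minimal sketch; the option names, the my/ prefix, and the my/ement-note-activity entry point are all made up for illustration, not Ement's actual API:

(defcustom my/ement-poll-interval-idle 30
  "Base polling interval in seconds (X)."
  :type 'number
  :group 'ement)

(defcustom my/ement-poll-interval-active 2
  "Reduced polling interval in seconds while a conversation is active (Y)."
  :type 'number
  :group 'ement)

(defcustom my/ement-poll-active-window 120
  "Seconds without new messages before reverting to the idle interval (Z)."
  :type 'number
  :group 'ement)

(defvar my/ement-last-activity nil
  "Time of the most recent message sent or received, or nil.")

(defun my/ement-note-activity ()
  "Record that a message was just sent or received."
  (setq my/ement-last-activity (current-time)))

(defun my/ement-current-interval ()
  "Return the polling interval appropriate to recent activity."
  (if (and my/ement-last-activity
           (< (float-time (time-since my/ement-last-activity))
              my/ement-poll-active-window))
      my/ement-poll-interval-active
    my/ement-poll-interval-idle))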

In the meantime I've disabled ement-auto-sync and am instead running #177 (comment) on a shorter interval than the one in that code. (Which makes things really chatty in the *Messages* buffer due to the defsubst ement--sync-messages-p, so I've hacked that as well.) Emacs is no longer using most of a CPU core most of the time :)

alphapapa commented Aug 3, 2023

Matrix is implemented via an HTTP connection that remains open until the server has data to send to the client. That data might be a large chunk of events for multiple rooms, or it might be a tiny response indicating that a user is typing in a room. As soon as such a response is received, the client makes another request to listen for the next update. If the user is in several busy rooms, it's entirely possible, if not very likely, that multiple such requests and responses will be made in short periods of time.
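In other words, the client's sync loop looks schematically like this (a rough sketch using plz and the /sync endpoint from the Matrix client-server spec; my/matrix-sync-loop is a made-up name and this is not Ement's actual sync code):

(require 'plz)
(require 'url-util)

(defun my/matrix-sync-loop (homeserver token &optional since)
  "Long-poll HOMESERVER's /sync endpoint using access TOKEN.
SINCE is the next_batch token returned by the previous response, if any."
  (plz 'get (concat homeserver "/_matrix/client/v3/sync?timeout=30000"
                    (when since (concat "&since=" (url-hexify-string since))))
    :headers `(("Authorization" . ,(concat "Bearer " token)))
    :as #'json-read
    :then (lambda (response)
            ;; Process the events in RESPONSE here, then immediately
            ;; issue the next request with the new next_batch token.
            (my/matrix-sync-loop homeserver token
                                 (alist-get 'next_batch response)))))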

Ement already limits to one outstanding sync per session:

ement.el/ement.el, lines 492 to 499 at commit 60e9a50:

(when (map-elt ement-syncs session)
  (if force
      (condition-case err
          (delete-process (map-elt ement-syncs session))
        ;; Ensure the only error is the expected one from deleting the process.
        (ement-api-error (cl-assert (equal "curl process killed" (plz-error-message (cl-third err))))
                         (message "Ement: Forcing new sync")))
    (user-error "Ement: Already syncing this session")))

However, requests other than sync requests could be outstanding, e.g. sending messages, uploading files, retrieving images, etc. All of those may run in parallel.

I expect that some kind of customizable throttling would be a good idea? I'm imagining something like "poll every X seconds; but when a message is received (or sent!) then temporarily reduce the polling interval to Y seconds so that active conversations are processed in a timely fashion. Maintain the smaller value until such time as there have been no new messages for Z seconds, at which point revert the interval back to X." (With X, Y, Z being user options.)

I don't know of any reason such a thing would be needed. AFAIK incoming events are processed correctly and quickly already. This has worked fine ever since the first iteration of this client.

What is the problem you're reporting? That it seems like there are too many HTTP requests being made? If so, I'd suggest making a test account that's only in one room, where you're the only member of it, and see if the problem persists. I think you'll see that only one connection is made, and it doesn't receive any data unless the room receives an event, and the connection will time out and be recreated every 30-40 seconds as programmed.

(Especially if each of those is starting an independent curl process?)

You can check for yourself which curl process the sentinel is being called for, and see whether it's being called multiple times for the same process or once per process. But how often a sentinel is called is up to Emacs, and accounting for that in plz.el has been difficult; only in the latest version has it been fully handled correctly, after years of looking for solutions.
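For example, a throwaway bit of instrumentation like this would show which process each call is for (just an illustrative advice sketch; it makes no assumptions about plz--respond's argument list):

;; Log the arguments of each `plz--respond' call, so repeated calls for
;; the same process stand out in *Messages*.
(advice-add 'plz--respond :before
            (lambda (&rest args)
              (message "plz--respond: %S" args))
            '((name . my/log-plz-respond)))

;; Remove the instrumentation when done:
;; (advice-remove 'plz--respond 'my/log-plz-respond)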

phil-s commented Aug 3, 2023

Matrix is implemented via an HTTP connection that remains open until the server has data to send to the client.

Ah, good, that makes things less alarming!

What is the problem you're reporting?

My real problem was Emacs continually maxing out a CPU core, and when I profiled that I got this:

       79180  65% - timer-event-handler
       79172  65%  - apply
       78361  64%   - plz--respond
       78330  64%    - #<compiled -0x187abbe156b59d04>
       77275  63%     - #<compiled -0xade032038805cf0>
       77275  63%      - apply
       77275  63%       - ement--sync-callback
       77229  63%        - run-hook-with-args
       76069  62%         - ement-room-list-auto-update
       76065  62%          - revert-buffer
       76065  62%           - ement-room-list-revert
       76045  62%            - ement-room-list
       28118  23%             + taxy-magit-section-format-items
       21583  17%             + taxy-magit-section-insert
       21551  17%             + taxy-fill
        3366   2%             + taxy-sort
         735   0%             + taxy-make-take-function
          80   0%             + walk-windows

Disabling ement-room-list-auto-update helped of course, but (I'm pretty sure) I didn't see the CPU usage completely drop away until I implemented my makeshift throttling.

I am joined to a lot of rooms on my primary Matrix server (~180), so that's a huge factor, but it would be good if the user had a way to help Ement/Emacs cope better with heavier loads like this.

alphapapa commented

I see, thanks. Well, I'm in 103 rooms, and I don't have any performance problems with the room list updating, so I'm surprised that you're seeing that. From looking at the profile, it doesn't look like walk-windows is a factor, but I still wonder, if you remove that patch that does that (i.e. run the latest release of Ement unmodified), does the performance problem remain?

Anyway, if necessary, the room list auto update function could be modified to run a timer of something like 200ms and reset it each time, so that if the function is called in rapid succession, the room list would only get updated when 200ms had expired since the last time it was called; that could serve to "debounce" the updates.
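Schematically, something like this (a rough sketch; the my/ names and the hook wiring in the comments are illustrative, and only ement-room-list-auto-update itself is Ement's real function):

(defvar my/ement-room-list-update-timer nil
  "Pending timer for a debounced room-list update, or nil.")

(defun my/ement-room-list-auto-update-debounced (session)
  "Update the room list for SESSION at most once per burst of sync callbacks.
Each call cancels any pending update and schedules a new one, so the
room list is only reverted once 200ms have passed without another call."
  (when (timerp my/ement-room-list-update-timer)
    (cancel-timer my/ement-room-list-update-timer))
  (setq my/ement-room-list-update-timer
        (run-with-timer 0.2 nil
                        (lambda ()
                          (setq my/ement-room-list-update-timer nil)
                          (ement-room-list-auto-update session)))))

;; Assuming the auto-update function runs from a sync hook, the debounced
;; version would be substituted for it there, e.g.:
;; (remove-hook 'ement-sync-callback-hook #'ement-room-list-auto-update)
;; (add-hook 'ement-sync-callback-hook #'my/ement-room-list-auto-update-debounced)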

phil-s commented Aug 3, 2023

I'm in 103 rooms, and I don't have any performance problems with the room list updating, so I'm surprised that you're seeing that.

Ah, darn... I'd figured my case might be abnormal among ement.el users, but a factor of < 2x isn't so huge. Or not unless there are some exponential effects which are starting to really ramp up around the upper half of that range.

I guess I'll need to try to learn more. Which probably won't happen in a hurry, but leave it with me and I'll follow up when I have more info (and feel free to close this, I guess -- it can always be re-opened).

phil-s commented Aug 3, 2023

I've been alerted to a Matrix server bug which is probably one (or both) of the following, which was apparently resulting in an enormous increase in Matrix traffic on our network.

I strongly suspect what I was seeing yesterday was simply on account of this, as the timing fits.

I'll re-test with defaults later on (as presence has now been disabled for our server to work around this issue).

phil-s commented Aug 4, 2023

I haven't gone right back to defaults, but I've reinstated ement-auto-sync and reverted the other things I did for this, and tracing plz--respond now gets me around 1-2 entries in a 30 second span, rather than 4-5 every second :)

the room list auto update function could be modified to run a timer of something like 200ms and reset it each time, so that if the function is called in rapid succession, the room list would only get updated when 200ms had expired since the last time it was called; that could serve to "debounce" the updates.

Even though this case was a server problem, I think something like that would be a good idea, if only to protect against any similar cases in the future.

alphapapa changed the title from "plz is extremely busy... should it be?" to "Consider debouncing ement-room-list-auto-update" Aug 4, 2023
alphapapa self-assigned this Aug 4, 2023
alphapapa added the enhancement label Aug 4, 2023
alphapapa added this to the 0.12 milestone Aug 4, 2023
alphapapa modified the milestones: 0.12, 0.13 Aug 14, 2023
alphapapa modified the milestones: 0.13, 0.14 Oct 3, 2023
alphapapa modified the milestones: 0.14, 0.15 Oct 29, 2023
alphapapa modified the milestones: 0.15, 0.16 Mar 31, 2024