Hello!!
Thank you so much for developing Etherpad-lite!
I've been using it for about 1-2 years. Today, I have a performance issue that I can't solve/understand on my own.
My story became quite long, so here is a short summary first.
A. My etherpad-lite instance (MySQL) was suffering from random disconnections when 2-3 or more people used the same pad. Could this be caused by poor database I/O performance?
B. My pad performs great only when it runs with dirtyDB. Is this expected, or is something wrong on my side (i.e. it is not supposed to be like this)?
C. If I want to find where the performance bottleneck is, where should I start looking?
== long story ==
Originally, I used a MySQL-based Etherpad. It is mainly used for collaboration in small group meetings, but even with only 2-3 people working, users were often cut off without notice, so some data may not actually have been saved or synced. I couldn't resolve this, so I let it be and asked people to be prepared for data loss: before refreshing the page, copy whatever you have just written.
This situation was quite inconvenient and sad (if you lost your writing), so I searched hard for a clue as to why it happens. I couldn't find the cause, but I did find a package called etherpad-load-test: https://github.com/ether/etherpad-load-test. I ran this test against my instance. Comparing with JohnMcLear's related research and test results, I found that the performance of my instance fell far short of the example he gave: ether/etherpad-load-test#1
So I tried various things to reach similar performance (without really understanding my original issue). The configuration I arrived at after many twists and turns was to use dirtyDB and not to set NODE_ENV=production. Only this way was it possible to achieve a level of performance comparable to JohnMcLear's. I was happy to at least match the expected performance level, but I was not sure whether this would solve the original 'random disconnection' issue.
Then I recently ran a workshop in which more than 10 people connected and worked simultaneously for a week, and I don't think anyone experienced data loss from the pad. It was great, and I finally confirmed that the original issue is no longer present, at least for now.
That was good, but since this setup makes the database file (var/dirty.db) very large, I had to run clean-dirty-db.py periodically to keep the file size down. Now my db file is almost 4GB even after cleanup, so I really need to migrate to a proper database.
Today I reopened this box of problems to start troubleshooting for real.
I quickly set up 2 instances of etherpad-lite (v1.8.18): one with a Redis backend, the other with PostgreSQL. Both performed far worse than my original etherpad-lite (v1.8.16) with dirtyDB.
Some of the test results I got, for example:
Performance of App No. 1 (dirtyDB):
Load Test Metrics -- Target Pad https://....
Local Clients Connected: 161
Authors Connected: 41
Lurkers Connected: 120
Sent Append messages: 3372
Commits accepted by server: 3271
Commits sent from Server to Client: 426111
Current rate per second of Commits sent from Server to Client: 0
Mean(per second) of # of Commits sent from Server to Client: 7459
Max(per second) of # of Messages (SocketIO has cap of 10k): 16969
Number of commits not yet replied as ACCEPT_COMMIT from server 101
doohoyi@debian:~/node_modules/etherpad-load-test$
Which is very nice. But ...
Same server/same user
Performance of App No. 2 (Redis):
Load Test Metrics -- Target Pad https://....
Local Clients Connected: 52
Authors Connected: 13
Lurkers Connected: 39
Sent Append messages: 188
Commits accepted by server: 87
Commits sent from Server to Client: 2895
Current rate per second of Commits sent from Server to Client: 0
Mean(per second) of # of Commits sent from Server to Client: 283
Max(per second) of # of Messages (SocketIO has cap of 10k): 5055
Number of commits not yet replied as ACCEPT_COMMIT from server 101
This is far less, which in my experience is not acceptable because it leads to disconnections.
For PostgreSQL, I didn't save the result,
but the performance was almost the same or slightly better.
Summary of my server info:
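To make the two runs easier to compare, here is a minimal sketch that computes the share of sent append messages the server accepted. The function name `acceptance_ratio` and the `runs` dictionary are illustrative names of my own, not part of etherpad-load-test, and it assumes that "Commits accepted by server" divided by "Sent Append messages" is a fair comparison metric.

```python
def acceptance_ratio(sent: int, accepted: int) -> float:
    """Fraction of sent append messages that the server accepted."""
    if sent == 0:
        return 0.0
    return accepted / sent


# Figures copied from the two load-test runs above.
runs = {
    "dirtyDB": {"sent": 3372, "accepted": 3271},
    "redis": {"sent": 188, "accepted": 87},
}

for name, r in runs.items():
    ratio = acceptance_ratio(r["sent"], r["accepted"])
    pending = r["sent"] - r["accepted"]  # commits still waiting for ACCEPT_COMMIT
    print(f"{name}: {ratio:.1%} accepted, {pending} pending")
```

With these numbers, roughly 97% of commits were accepted on dirtyDB versus roughly 46% on Redis, while the count of pending commits is 101 in both runs.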
CPU: Intel(R) Core(TM) i5-4690K CPU @ 3.50GHz
RAM: 16GiB
Storage: SSD
(If I should post more specific info, please let me know.)
This issue is really frustrating for me.
I don't even know whether anyone else is suffering from a similar problem. (I couldn't find any similar issues/stories... So I might be doing something stupid... 😱)
Here is the question list again:
A. My etherpad-lite instance (MySQL) was suffering from random disconnections when 2-3 or more people used the same pad. Could this be caused by poor database I/O performance?
B. My pad performs great only when it runs with dirtyDB. Is this expected, or is something wrong on my side (i.e. it is not supposed to be like this)?
C. If I want to find where the performance bottleneck is, where should I start looking?
Thank you for reading.
- Dooho Yi