Profanity list: If string is too long, sogs fails to start properly #159
The same or a similar issue regarding the length of profanity phrases: when you have some longer phrases, like the ones in this profanity blocklist, sogs commands such as "sogs -Lv" take too long to execute, causing an OOM kill. When I sort my issue-causing profanity file by line length:
After shortening these, I think the sogs restart time decreased by about 2 seconds and there is no OOM kill on "sogs -Lv", although it is still very slow (a delay of about 8 seconds before output, likely because the profanity list is loaded, maybe uselessly; 8 s equals the sogs restart time, by the way). @jagerman @mdPlusPlus @majestrate Debian GNU/Linux 11 (bullseye), 5.10.0-18-amd64, PySOGS 0.3.7 |
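The sort-by-line-length diagnosis described above can be sketched in Python; the phrases below are made-up examples, not the actual blocklist:

```python
def sort_by_length(lines):
    # Longest lines first, so the phrases most likely to cause
    # the slowdown appear at the top of the output.
    return sorted(lines, key=len, reverse=True)

# Example with made-up phrases:
phrases = ["aa", "willing to do tributes", "short one"]
print(sort_by_length(phrases)[0])  # -> "willing to do tributes"
```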
Some of the following phrases also cause issues in the sogs profanity blocklist, adding 30-40 seconds to the sogs restart time. Maybe not every character has the same size and there is some maximum threshold per phrase where it starts causing the issue? |
On Thursday, 6 July 2023 07:21:16 EDT slrslr wrote:
Some of the following phrases also cause issues in the sogs profanity blocklist,
adding 30-40 seconds to the sogs restart time:
willing to do tributes - issue
willing to do tributess - issue
willing to do tribuees - issue
willing to do tribites - issue
willing to do tributey - ok (no increase)
willing to do tributet - ok (no increase)
willing to do tributeyy - ok (no increase)
aaaaaaa aa aa aaaaates - issue
aaaaaaa aaaa aaaaaaas - Job for sogs-proxied.service failed. (after ~30 seconds)
aaaaaaaaaaaaaaaates - Job for sogs-proxied.service failed. (after ~30 seconds)
aaaaaaaaaaaaa aaa tes - Job for sogs-proxied.service failed. (after ~30 seconds)
Maybe not every character has the same size and there is some maximum threshold per
phrase where it starts causing the issue?
I think the way the filter is implemented is really naive and results in
quadratic complexity in the number and length of the phrases.
This is something that could likely be implemented with a Bloom filter or
another probabilistic negative-lookup filter. Funnily enough, there is a very
fast one that is almost perfect for this:
https://github.com/NationalSecurityAgency/XORSATFilter
--
~jeff
|
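To illustrate the probabilistic negative-lookup idea mentioned above, here is a minimal Bloom-filter sketch. This is not PySOGS code and not the XORSAT filter itself; all names and parameters are illustrative:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: constant-time "definitely absent" checks.

    A lookup that returns False guarantees the item was never added, so
    the expensive exact phrase scan can be skipped for most messages;
    a True result may rarely be a false positive and needs confirmation.
    """

    def __init__(self, size_bits=8192, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive num_hashes independent bit positions from SHA-256.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

Usage: `bf = BloomFilter(); bf.add("tributes")` then `bf.might_contain("tributes")` is True, while an unrelated phrase almost certainly returns False, avoiding the full scan.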
This is a bad problem. With 1200 blocked phrases, the restart time is 1 minute 25 seconds and memory usage is about 1 GB plus 500 MB of swap, which is near the full capacity of my server. |
I suspect that the profanity filtering does not support longer phrases like:
privacy?public_key=118df8c6c471ac0468c7c77e1cdc12f24a139ee8a07c6e3bf4e7855640dad821
or
aaaaaaaaaaadaaaaaaa118df8c6c471ac0468c7c77e1cdc12f24a139ee8a07c6e3bf4e7855640dad821
When I add it and run "sudo systemctl restart sogs":
$ sogs --version
PySOGS 0.3.5
I am on Debian 11 Linux, installed from the .deb package.
Can you reproduce and fix it, please?