
Profanity list: If string is too long, sogs fails to start properly #159

Open
slrslr opened this issue Dec 13, 2022 · 4 comments

slrslr commented Dec 13, 2022

I suspect that the profanity filtering does not support longer phrases like:
privacy?public_key=118df8c6c471ac0468c7c77e1cdc12f24a139ee8a07c6e3bf4e7855640dad821
or
aaaaaaaaaaadaaaaaaa118df8c6c471ac0468c7c77e1cdc12f24a139ee8a07c6e3bf4e7855640dad821

When I add one of these and run "sudo systemctl restart sogs", I get:

Job for sogs-proxied.service failed.
See "systemctl status sogs-proxied.service" and "journalctl -xe" for details.

$ sogs --version
PySOGS 0.3.5

I am on Debian 11 Linux, with sogs installed from the .deb package.

Can you reproduce it and fix it please?


slrslr commented Jun 11, 2023

A same or similar issue with the length of the profanity phrases: when the list contains some longer phrases, as in this profanity blocklist, sogs commands like "sogs -Lv" take too long to execute and end in an OOM kill:
Out of memory: Killed process 24843 (uwsgi) total-vm:1100772kB, anon-rss:286844kB, file-rss:0kB, shmem-rss:180kB, UID:1000 pgtables:2152kB oom_score_adj:0
Yet when you empty the profanity blocklist, or just trim all lines to a maximum of 6 characters (cut -c -6 inputfile > outputfile; mv outputfile yourprofanityfile), the command executes very quickly and without an OOM kill.
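
The trimming workaround above can be wrapped in a small helper. This is only a sketch for confirming the cause, not a fix: `trim_blocklist` is a hypothetical name, the file names are placeholders, and dropping the duplicates that trimming creates is an added assumption. Trimmed phrases would likely match far more text than the originals.

```bash
#!/usr/bin/env bash
# Diagnostic only: write a copy of a blocklist with every line cut to at most
# MAX characters and exact duplicates removed (first occurrence kept).
# Usage: trim_blocklist INPUT OUTPUT [MAX]   (MAX defaults to 6)
trim_blocklist() {
    local input="$1"
    local output="$2"
    local max="${3:-6}"
    # cut shortens each line; awk prints a line only the first time it is seen.
    cut -c "-$max" "$input" | awk '!seen[$0]++' > "$output"
}

# Example (placeholder file names):
#   trim_blocklist profanity-block-list.txt trimmed.txt 6
```

Writing to a separate output file leaves the original blocklist untouched, so it can be restored after the test.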

When I sort the problematic profanity file by line length:
awk '{print length, $0}' /var/lib/session-open-group-server/profanity-block-list.txt | sort -n | cut -d " " -f2-
Longest lines are:

to have my limits pushed
send me something corrupt
New group, dm with example
DM me with sample for group
chat with me about anything
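
As a complement to the sort by line length above, a histogram of line lengths shows how many phrases sit at each length. This is a minimal sketch; `length_histogram` and `count_longer_than` are hypothetical helper names, and any limit passed to the second one is an arbitrary choice, not a threshold known from sogs.

```bash
#!/usr/bin/env bash
# Print "LENGTH COUNT" pairs for the lines of a file, shortest length first.
# Note: awk's length() counts characters or bytes depending on the awk
# implementation and locale; this only matters for non-ASCII phrases.
length_histogram() {
    awk '{ n[length($0)]++ } END { for (len in n) print len, n[len] }' "$1" | sort -n
}

# Print how many lines of FILE are longer than LIMIT.
# Usage: count_longer_than FILE LIMIT
count_longer_than() {
    awk -v limit="$2" 'length($0) > limit { c++ } END { print c + 0 }' "$1"
}

# Example, using the blocklist path from the command above:
#   length_histogram /var/lib/session-open-group-server/profanity-block-list.txt
#   count_longer_than /var/lib/session-open-group-server/profanity-block-list.txt 20
```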

After shortening these, I think the sogs restart time decreased by ~2 seconds, and there is no OOM kill on "sogs -Lv" anymore, although it is still very slow (a delay of about 8 seconds before any output, likely because the profanity list is being loaded, perhaps needlessly; 8 seconds is also the sogs restart time, by the way). @jagerman @mdPlusPlus @majestrate

Debian GNU/Linux 11 (bullseye), 5.10.0-18-amd64, PySOGS 0.3.7


slrslr commented Jul 6, 2023

Some of the following phrases also cause issues in the sogs profanity blocklist, increasing the sogs restart time by 30-40 seconds:
willing to do tributes - issue
willing to do tributess - issue
willing to do tribuees - issue
willing to do tribites - issue
willing to do tributey - ok (no increase)
willing to do tributet - ok (no increase)
willing to do tributeyy - ok (no increase)
aaaaaaa aa aa aaaaates - issue
aaaaaaa aaaa aaaaaaas - Job for sogs-proxied.service failed. (after like 30 seconds)
aaaaaaaaaaaaaaaates - Job for sogs-proxied.service failed. (after like 30 seconds)
aaaaaaaaaaaaa aaa tes - Job for sogs-proxied.service failed. (after like 30 seconds)
looking for other groups - ok (no increase)
from Central Maine - issue

Maybe not every character has the same size, and there is some maximum threshold per phrase beyond which it starts causing issues?
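
The per-phrase measurements above could be automated with a loop like the following. It is a sketch under several assumptions: `time_cmd` and `probe_phrases` are hypothetical helper names, the blocklist is assumed to be reread on restart, and the user running it needs write access to the blocklist. Each candidate phrase is tested as the only entry, so effects that appear only with a large combined list are not measured.

```bash
#!/usr/bin/env bash
# Run a command, discard its output, and print "SECONDS EXIT_STATUS".
# Whole-second resolution is coarse but enough for 30-second differences.
time_cmd() {
    local start end status
    start=$(date +%s)
    "$@" >/dev/null 2>&1 </dev/null
    status=$?
    end=$(date +%s)
    echo "$((end - start)) $status"
}

# For each non-empty line of CANDIDATES, make it the only entry in BLOCKLIST,
# run the given command, and print "SECONDS EXIT_STATUS<TAB>PHRASE".
# The original blocklist content is copied back afterwards.
# Usage: probe_phrases CANDIDATES BLOCKLIST COMMAND [ARGS...]
probe_phrases() {
    local candidates="$1"
    local blocklist="$2"
    local backup phrase
    shift 2
    backup=$(mktemp) || return 1
    cp "$blocklist" "$backup" || return 1
    # If the loop is interrupted, restore the blocklist from this file by hand.
    echo "blocklist backup: $backup" >&2
    while IFS= read -r phrase; do
        [ -n "$phrase" ] || continue
        printf '%s\n' "$phrase" > "$blocklist"
        printf '%s\t%s\n' "$(time_cmd "$@")" "$phrase"
    done < "$candidates"
    cp "$backup" "$blocklist"
    rm -f "$backup"
}

# Example, using the restart command and blocklist path quoted in this thread
# (candidates.txt is a placeholder file with one phrase per line):
#   probe_phrases candidates.txt \
#       /var/lib/session-open-group-server/profanity-block-list.txt \
#       sudo systemctl restart sogs
```

A non-zero exit status in the output would correspond to the "Job for sogs-proxied.service failed." cases, and a large seconds value to the slow restarts.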

Contributor

majestrate commented Jul 6, 2023 via email


slrslr commented Nov 8, 2023

This is a bad problem. With 1200 blocked phrases, the restart time is 1 minute 25 seconds and the memory usage is about 1 GB plus 500 MB of swap, which is near the full capacity of my server.
I wish the developers would fix how sogs handles the blocked words/blocklist. I can no longer add new phrases, since sogs would not start at all.
