[BUG] rspamd_language_detector_init: cannot compile stop words on Debian 10 #3124

Closed
predraggavrilovic opened this issue Oct 28, 2019 · 13 comments

Comments

@predraggavrilovic

Prerequisites

After a fresh install of, or upgrade to, rspamd "2.0-1~buster", the following warnings appear in rspamd.log:

2019-10-28 15:31:39 #2307(main) ; cfg; rspamd_language_detector_init: cannot compile stop words for 0 language group: pattern is not enclosed with : m\x{ed}
2019-10-28 15:31:39 #2307(main) ; cfg; rspamd_language_detector_init: cannot compile stop words for 1 language group: regexp parsing error: 'character value in \x{} or \o{} is too large' at position 6
2019-10-28 15:31:39 #2307(main) ; cfg; rspamd_language_detector_init: cannot compile stop words for 2 language group: regexp parsing error: 'character value in \x{} or \o{} is too large' at position 6
2019-10-28 15:31:39 #2307(main) ; cfg; rspamd_language_detector_init: cannot compile stop words for 3 language group: regexp parsing error: 'character value in \x{} or \o{} is too large' at position 6

Also, rspamadm configtest gives:

CPU doesn't have SSSE3 instructions set required for hyperscan, disable it
cannot compile stop words for 0 language group: pattern is not enclosed with : m\x{ed}
cannot compile stop words for 1 language group: regexp parsing error: 'character value in \x{} or \o{} is too large' at position 6
cannot compile stop words for 2 language group: regexp parsing error: 'character value in \x{} or \o{} is too large' at position 6
cannot compile stop words for 3 language group: regexp parsing error: 'character value in \x{} or \o{} is too large' at position 6
syntax OK

Steps to Reproduce

  1. Install rspamd 2.0-1~buster on Debian 10
  2. Check the logs or run rspamadm configtest

Versions

Additional Information

rspamd 2.0-1~buster on Debian 10.1

@predraggavrilovic
Author

Still present in rspamd 2.1-1~buster

@vstakhov
Member

I cannot reproduce it. Are you using pcre1 or pcre2? Anyway, the non-hyperscan build of Rspamd is not at the top of my priorities.

@pmontepagano

@vstakhov is the error about not being able to compile stop words related to hyperscan being disabled? I thought those were separate issues.

@vstakhov
Member

"CPU doesn't have SSSE3 instructions set required for hyperscan, disable it" - so it is caused by PCRE.

@pmontepagano

Thanks! Changing VM's CPU to enable SSSE3 instructions and re-enabling hyperscan solved it for me.
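
For anyone verifying the same fix, here is a minimal sketch of how to check whether the guest actually sees SSSE3 after the VM's CPU model has been changed. It assumes a Linux x86 system where `/proc/cpuinfo` lists CPU features on a `flags` line. The helper names `cpu_flags` and `has_ssse3` are made up for this illustration and are not part of Rspamd.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags of the first CPU listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_ssse3(cpuinfo_text):
    """True if the 'ssse3' flag is present (distinct from 'sse3', shown as 'pni')."""
    return "ssse3" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            print("SSSE3 available:", has_ssse3(fh.read()))
    except OSError:
        # Assumption: this only works where /proc/cpuinfo exists (Linux).
        print("cannot read /proc/cpuinfo")
```

If this reports that the flag is missing, Rspamd would be expected to print the "CPU doesn't have SSSE3 instructions set required for hyperscan" message quoted earlier in the thread.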

@predraggavrilovic
Author

Vsevolod, thank you for your time.
I have also enabled SSSE3 on my virtual machines and the errors are gone.
As for the pcre libs:
The rspamd binary packages from the rspamd site are linked against libpcre3, which Debian considers the old library; packages are supposed to link against libpcre2, which is newer despite the number in its name.
For example, the Debian repository has rspamd 1.8 and Debian buster backports has 1.9.4, both of which are old but not ancient, and both are linked against libpcre2-8-0, which is considered newer.
So enabling hyperscan solves it, but maybe a different pcre lib could provide a functional fallback.

Best regards,
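
As a side note on telling the two libraries apart: on Debian, the libpcre3 package ships the original PCRE as `libpcre.so.3`, while libpcre2-8-0 ships PCRE2 as `libpcre2-8.so.0`. The following is a minimal sketch, assuming `ldd` is available, that classifies the PCRE libraries a binary is dynamically linked against. The helper names `pcre_flavours` and `ldd_output` are hypothetical and exist only for this example.

```python
import re
import subprocess

# Sonames as shipped by Debian: libpcre.so.3 (PCRE1), libpcre2-8.so.0 (PCRE2).
PCRE1_SONAME = re.compile(r"\blibpcre(?:16|32)?\.so")
PCRE2_SONAME = re.compile(r"\blibpcre2-(?:8|16|32)\.so")


def pcre_flavours(ldd_text):
    """Return which PCRE generations appear in the given `ldd` output."""
    found = set()
    if PCRE1_SONAME.search(ldd_text):
        found.add("pcre1")
    if PCRE2_SONAME.search(ldd_text):
        found.add("pcre2")
    return found


def ldd_output(binary_path):
    """Run `ldd` on a binary and return its stdout (empty string on failure)."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=False
    )
    return result.stdout
```

A call such as `pcre_flavours(ldd_output("/usr/bin/rspamd"))` would then answer the "pcre1 or pcre2" question; the binary path here is an assumption and may differ between packages.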

@vstakhov
Member

I do not support Debian packages, please stop using them...

@vstakhov
Member

I wish I could convince Debian maintainers to stop providing Rspamd in Debian repos. Probably, I should change a license for that, but I'm really tired of that crap...

@predraggavrilovic
Author

Just to be clear, I am not using the Debian packages; I am using the binary packages from http://rspamd.com/apt-stable/
I was just noting that Debian uses a newer pcre lib for its packages than upstream rspamd does, and that the Debian libpcre3 page says "New packages should use the newer pcre2 packages, and existing packages should migrate to pcre2."

Packages from http://rspamd.com/apt-stable depend on libpcre3, not on libpcre2.

Best regards,

@vstakhov
Member

Ah, I see, sorry. Yes, all packages should be built with pcre2 - I will try to fix it.

@stale

stale bot commented Jan 11, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jan 11, 2020
@stale stale bot closed this as completed Jan 19, 2020
@Ryushin

Ryushin commented Jun 12, 2020

Can this be reopened? I'm setting up rspamd on a dual socket Opteron 6168 12 core system with 128GB of RAM. The Opteron 6168 does not have SSSE3. I'm using the Rspamd provided Debian packages.

/etc/init.d/rspamd restart
[....] Restarting rapid spam filtering system: rspamd
2020-06-12 10:32:20 #0(main) cfg; rspamd_config_post_load: CPU doesn't have SSSE3 instructions set required for hyperscan, disable it
2020-06-12 10:32:20 #0(main) <9senjh>; cfg; rspamd_language_detector_init: cannot compile stop words for 0 language group: regexp parsing error: 'character code point value in \x{} or \o{} is too large' at position 7; pattern: a\x{10d}koli
2020-06-12 10:32:20 #0(main) <9senjh>; cfg; rspamd_language_detector_init: cannot compile stop words for 1 language group: regexp parsing error: 'character code point value in \x{} or \o{} is too large' at position 6; pattern: \x{441}\x{430}\x{43c}
2020-06-12 10:32:20 #0(main) <9senjh>; cfg; rspamd_language_detector_init: cannot compile stop words for 2 language group: regexp parsing error: 'character code point value in \x{} or \o{} is too large' at position 6; pattern: \x{906}\x{92b}\x{942}
2020-06-12 10:32:20 #0(main) <9senjh>; cfg; rspamd_language_detector_init: cannot compile stop words for 3 language group: regexp parsing error: 'character code point value in \x{} or \o{} is too large' at position 6; pattern: \x{647}\x{646}\x{627}\x{644}\x{643}
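
One plausible reading of these messages, which the thread itself does not confirm: the 8-bit PCRE library rejects a `\x{...}` escape above 0xFF unless the pattern is compiled in UTF mode, and the failing stop-word patterns all contain such escapes. The sketch below is plain Python and does not call PCRE; it only locates the first escape above a limit in a pattern. The helper name `first_oversize_escape` and the 0xFF default are assumptions made for this illustration.

```python
import re

# Matches PCRE-style \x{HEX} escapes; group 1 holds the hex digits.
HEX_ESCAPE = re.compile(r"\\x\{([0-9A-Fa-f]+)\}")


def first_oversize_escape(pattern, limit=0xFF):
    # Returns (offset, code_point) for the first escape whose value exceeds
    # `limit`, or None. The offset is the index of the closing brace.
    for match in HEX_ESCAPE.finditer(pattern):
        code_point = int(match.group(1), 16)
        if code_point > limit:
            return match.end(1), code_point
    return None


# Patterns copied from the log lines quoted in this thread.
PATTERNS = [
    r"a\x{10d}koli",
    r"\x{441}\x{430}\x{43c}",
    r"\x{906}\x{92b}\x{942}",
    r"\x{647}\x{646}\x{627}\x{644}\x{643}",
    r"m\x{ed}",
]

if __name__ == "__main__":
    for pattern in PATTERNS:
        print(pattern, first_oversize_escape(pattern))
```

For the four patterns in the 2020 log the computed offsets are 7, 6, 6 and 6, which line up with the positions reported in the errors. `m\x{ed}` from the original report stays within 0xFF, which fits the fact that it failed with a different message.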

@vstakhov
Copy link
Member

That is likely fixed in master. JFYI, using such a CPU has a very bad impact on Rspamd performance.
