Proof of concept: Reject postings containing text written in Farsi. #128

Closed. ianmacd wants to merge 2 commits.

ianmacd (Contributor) commented on Sep 25, 2022

In its current state, the filter is of limited use:

  • The filter allows only a single group to be whitelisted from filtering. This makes it unusable if Farsi is an allowed language in more than one group on the server.

  • No error other than the failure to send is reported to the client.

The purpose of this PR is to spark discussion and pave the way for the implementation of a multilingual, per-group language filter.

--
Codepoint reference: https://github.com/mirhmousavi/Regex.Persian.Language
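
As a rough illustration of the kind of check discussed above, the following is a minimal sketch and not the code from this PR. The function name `should_reject`, the `exempt_group_ids` parameter, and the choice of codepoint ranges are assumptions made for the example; the ranges are the standard Unicode Arabic and Arabic Presentation Forms blocks, which include the Persian letters.

```python
import re

# Assumption: match any character in the Unicode Arabic block (U+0600 to U+06FF)
# or the Arabic Presentation Forms blocks. Persian is written in Arabic script,
# so its letters (including peh, tcheh, jeh and gaf) fall inside these ranges.
ARABIC_SCRIPT = re.compile(r"[\u0600-\u06FF\uFB50-\uFDFF\uFE70-\uFEFC]")


def should_reject(text, group_id, exempt_group_ids=frozenset()):
    """Return True if the posting should be rejected (hypothetical helper).

    Postings to exempt groups are never rejected; elsewhere, a single
    character in the ranges above is enough to reject the posting.
    """
    if group_id in exempt_group_ids:
        return False
    return ARABIC_SCRIPT.search(text) is not None


if __name__ == "__main__":
    print(should_reject("سلام", group_id=1))
    print(should_reject("hello", group_id=1))
    print(should_reject("سلام", group_id=18, exempt_group_ids={18}))
```

Note that a check like this detects a script rather than a language: text in Arabic or Urdu would be rejected as well.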

In its current state, the filter is of limited use:

* The filter is per server, not per group. This makes it unusable if
  Farsi is an allowed language in some server groups.

* The filter has no config file settings to disable it or control its
  behaviour.

* No error other than the failure to send is reported to the client.

ianmacd (Contributor, Author) commented on Sep 26, 2022

The filter can now be enabled with language_filter_farsi = yes in sogs.ini, and a single group can be exempted from filtering by specifying its id, for example language_whitelist_farsi = 18.

Obviously it would be much better to allow a list of group ids here.

language_filter_farsi = yes

Additionally, a group id may be specified to exempt it from filtering:

language_whitelist_farsi = 18
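
For illustration only, here is a sketch of how these two settings might be read, extended to accept the list of group ids suggested above. This is not the PR's implementation: the `[messages]` section name, the comma-separated list syntax, and the `load_filter_settings` helper are all assumptions made for the example. Only the two option names come from this PR.

```python
import configparser

# Hypothetical sogs.ini fragment. The section name and the comma-separated
# list are assumptions; the PR itself accepts a single group id.
SAMPLE_INI = """
[messages]
language_filter_farsi = yes
language_whitelist_farsi = 18, 23
"""


def load_filter_settings(ini_text, section="messages"):
    """Return (enabled, exempt_group_ids) parsed from ini-formatted text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # getboolean() accepts yes/no, true/false, on/off and 1/0.
    enabled = parser.getboolean(section, "language_filter_farsi", fallback=False)
    raw = parser.get(section, "language_whitelist_farsi", fallback="")
    # Split on commas and skip empty entries; int() tolerates surrounding spaces.
    exempt_group_ids = {int(part) for part in raw.split(",") if part.strip()}
    return enabled, exempt_group_ids


if __name__ == "__main__":
    print(load_filter_settings(SAMPLE_INI))
```

A single id such as language_whitelist_farsi = 18 parses to a one-element set under this scheme, so existing configurations would keep working.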
ianmacd (Contributor, Author) commented on Sep 26, 2022

I have now totally overhauled this in #129, which supersedes this pull request.

#129 more accurately addresses the issue of alphabet filtering, since that's what this PR was actually doing.

The new PR also supports filtering of multiple alphabets, with a separate whitelist for each alphabet.

jagerman (Member) commented

Closing (replaced with #129).

jagerman closed this on Sep 29, 2022