Fixes #2900 - Checks slur regex to see if it is too permissive #3146

ninanator · 2023-06-16T13:27:23Z

WHAT

Adds validation to check if the slur regex is too permissive (i.e., would match all text)
- Adds two new error keys depending on whether the slur regex:
  - is too permissive (permissive_regex)
  - couldn't compile (invalid_regex)
Adds a validation check for the site name, since I saw we were checking the site description length but not the name length
- Adds two new error keys:
  - site_name_length_overflow
  - site_name_required

lemmy-translations PR here: LemmyNet/lemmy-translations#65

PROOF

Updating site name

OLD - Attempting to clear site name

(Your site now has a name of empty string, which seems not great).

NEW - Attempting to clear site name

OLD - Site name is too long

(Your change had no effect, which is confusing)

NEW - Site name is too long

Updating the site with different regex patterns

Screen.Recording.2023-06-17.at.3.39.04.PM.mov

crates/api_common/src/utils/site_utils.rs

ninanator · 2023-06-16T23:08:56Z

@dessalines @Nutomic - It looks like my build is failing since I'm using features from 1.70.0 while Woodpecker is pinned to 1.67.0. I'm happy to adjust to what is available from 1.67.0, but just wanted to ask if you both had an opinion on what Rust version we should be using to develop locally?

Would you be OK with bumping to the 1.70.0 image if everything builds fine in the pipeline? I don't know if version upgrades in Rust are nerve-wracking or not, so I figured I'd ask. Thank you!

crates/api_common/src/utils/site_utils.rs

dessalines · 2023-06-17T12:29:33Z

crates/api_crud/src/site/create.rs

-    let icon = diesel_option_overwrite_to_url(&data.icon)?;
-    let banner = diesel_option_overwrite_to_url(&data.banner)?;
+    // Check that the slur regex compiles, and returns the regex if valid...
+    let slur_regex = build_and_check_regex(&local_site.slur_filter_regex.as_deref())?;


crates/api_crud/src/site/create.rs

dessalines · 2023-06-17T12:34:52Z

crates/api_crud/src/site/create.rs


-    if let Some(Some(desc)) = &description {
+    if let Some(desc) = &data.description {


One thing that scares me about this change is that the diesel_option_overwrite function has a special case: it treats an empty string as a request to set the DB item to null. In that case, this check shouldn't be run.

By removing that, it now runs these checks on empty strings, which might fail.

You can see this working as (I believe) expected here: https://github.com/LemmyNet/lemmy/pull/3146/files#diff-d9124d8135ec751589f580dd3a84eba50f3ddb2bd27bd2cfb7a436f6af9e9c54R276
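
The empty-string special case discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the project's actual helper: the names `option_overwrite` and `validate_if_set` and their signatures are assumptions made for this example. The point is that a field check should only run in the `Some(Some(value))` case, so a "clear this field" request never reaches it.

```rust
/// Hypothetical stand-in for the overwrite helper discussed above.
/// - `None`: field not sent, leave the DB value alone.
/// - `Some(None)`: empty string sent, set the DB value to NULL.
/// - `Some(Some(v))`: non-empty string sent, overwrite with `v`.
pub fn option_overwrite(opt: &Option<String>) -> Option<Option<String>> {
    match opt {
        None => None,
        Some(s) if s.is_empty() => Some(None),
        Some(s) => Some(Some(s.clone())),
    }
}

/// Runs `check` only when there is a real value to validate, so an
/// empty string (a "clear this field" request) never reaches the check.
pub fn validate_if_set<E>(
    field: &Option<Option<String>>,
    check: impl Fn(&str) -> Result<(), E>,
) -> Result<(), E> {
    match field {
        Some(Some(value)) => check(value),
        _ => Ok(()),
    }
}
```

With this shape, matching on `Some(desc)` against the raw payload would validate the empty string as well, which is the behaviour the comment above is worried about.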

dessalines · 2023-06-17T12:38:10Z

crates/api_crud/src/site/update.rs


    if let Some(desc) = &data.description {
      site_description_length_check(desc)?;
+      check_slurs_opt(&data.description, &slur_regex)?;


Same concern as above, and this also should probably be changed to if let Some(Some(... .

I hope it won't fail on empty strings.

I had started moving the validation portions into a function; based on this and your previous comments I'll just go ahead and do that! It'll let me write more targeted unit tests and show things should continue to work as expected 👍

You can see this working as (I believe) expected here: https://github.com/LemmyNet/lemmy/pull/3146/files#diff-fcc9fa5b2820776716027bd57d2b8c83dfd11b52b3274a1cb341e5845aa1feebR370

ninanator · 2023-06-17T17:16:13Z

crates/utils/src/utils/validation.rs

+          // may match against any string text. To keep it simple, we'll match the regex
+          // against an innocuous string - a single number - which should help catch a regex
+          // that accidentally matches against all strings.
+          if regex.is_match("1") {


This would be problematic if there is ever a community that rejects numbers; however, I feel like this would work for the majority of cases. Not sure if there's a better way to check that won't get complicated quickly.

This seems fine, a slur filter which matches a single character is almost certainly wrong.
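
A dependency-free sketch of the probe described above, under stated assumptions: the `compile` parameter is a hypothetical stand-in for a real regex constructor (in the PR the role is played by a compiled regex and its `is_match` method), and the `RegexError` enum is invented here to mirror the `invalid_regex` and `permissive_regex` keys from the description. It is not the code in validation.rs.

```rust
/// Errors mirroring the two new keys from the PR description.
#[derive(Debug, PartialEq)]
pub enum RegexError {
    InvalidRegex,
    PermissiveRegex,
}

/// `compile` is a hypothetical stand-in for a real regex constructor: it
/// returns a matcher closure on success, or `None` if the pattern is invalid.
pub fn build_and_check<M>(
    pattern: Option<&str>,
    compile: impl Fn(&str) -> Option<M>,
) -> Result<Option<M>, RegexError>
where
    M: Fn(&str) -> bool,
{
    match pattern {
        // No filter configured: nothing to validate.
        None => Ok(None),
        Some(p) => {
            let matcher = compile(p).ok_or(RegexError::InvalidRegex)?;
            // Probe with an innocuous string; a filter that flags "1"
            // would very likely flag everything.
            if matcher("1") {
                Err(RegexError::PermissiveRegex)
            } else {
                Ok(Some(matcher))
            }
        }
    }
}
```

The probe is a heuristic, as noted in the thread: a filter that deliberately rejects digits would be reported as too permissive.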

ninanator · 2023-06-17T17:18:14Z

crates/utils/src/utils/validation.rs

+
+/// Checks the site name length, the limit as defined in the DB.
+pub fn site_name_length_check(name: &str) -> LemmyResult<()> {
+  min_max_length_check(


I noticed that the site name is required in the database, so I added a check to also validate a minimum site name length. Otherwise the code is just moved from where it lived in site_utils.rs.

Good catch, thanks.
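
For illustration, a minimum and maximum length check of this kind might look like the sketch below. The `NameError` enum and the limit of 20 characters are assumptions for this example (the real limit is whatever the DB column defines); only the function name and the two error keys, `site_name_required` and `site_name_length_overflow`, come from this PR.

```rust
/// Hypothetical error type mirroring the two new error keys.
#[derive(Debug, PartialEq)]
pub enum NameError {
    SiteNameRequired,
    SiteNameLengthOverflow,
}

/// Assumed limit for illustration only; the real value comes from the DB schema.
const SITE_NAME_MAX_LENGTH: usize = 20;

pub fn site_name_length_check(name: &str) -> Result<(), NameError> {
    // Count characters rather than bytes so multi-byte names are measured fairly.
    let len = name.chars().count();
    if len < 1 {
        Err(NameError::SiteNameRequired)
    } else if len > SITE_NAME_MAX_LENGTH {
        Err(NameError::SiteNameLengthOverflow)
    } else {
        Ok(())
    }
}
```

Returning a distinct error for the empty case is what lets the UI show "name required" instead of silently saving an empty name.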

ninanator · 2023-06-17T17:20:27Z

crates/api_crud/src/site/update.rs

+    check_slurs_opt(&edit_site.description, &slur_regex)?;
+  }
+
+  if let Some(listing_type) = &edit_site.default_post_listing_type {


Should this check also be done when creating a site?

It's probably a good idea, for consistency, although lemmy-ui doesn't show nearly as many fields when creating the site, as for updating it.

ninanator · 2023-06-17T17:42:36Z

crates/api_crud/src/site/update.rs

+  let enabled_federation_with_private_instance = edit_site.federation_enabled == Some(true)
+    && edit_site.private_instance.unwrap_or(private_instance);
+
+  if enabled_private_instance_with_federation || enabled_federation_with_private_instance {


Same here - should this check also be done when creating a site?

Probably a good idea for consistency.
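
The check in the hunk above can be read as: reject the update if it would leave the site both private and federated, taking the current value for any flag the payload omits. A minimal standalone sketch follows; the `EditSiteFlags` struct and the `conflicts` function are hypothetical names for this example, and the first condition is assumed to be the mirror image of the one shown in the diff.

```rust
/// Hypothetical subset of the edit payload; only the two fields the check needs.
pub struct EditSiteFlags {
    pub private_instance: Option<bool>,
    pub federation_enabled: Option<bool>,
}

/// Returns true when applying `edit` would leave the site both private and federated.
pub fn conflicts(current_private: bool, current_federation: bool, edit: &EditSiteFlags) -> bool {
    // Turning on private mode while federation is (or stays) enabled.
    let enabled_private_instance_with_federation = edit.private_instance == Some(true)
        && edit.federation_enabled.unwrap_or(current_federation);
    // Turning on federation while private mode is (or stays) enabled.
    let enabled_federation_with_private_instance = edit.federation_enabled == Some(true)
        && edit.private_instance.unwrap_or(current_private);
    enabled_private_instance_with_federation || enabled_federation_with_private_instance
}
```

Note that an edit which enables federation and disables private mode in the same request is allowed, because the payload value takes precedence over the stored one.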

ninanator · 2023-06-17T17:43:47Z

crates/api_crud/src/site/update.rs

+  federation_enabled: bool,
+  private_instance: bool,
+  edit_site: &EditSite,
+) -> LemmyResult<()> {


The following comments are questions I have that I can address in a subsequent PR:

Do we need a check to make sure that the site does exist (opposite of the local_site.site_setup in create)? Not immediately clear to me if this would also work as a way to create the initial site.

Not really necessary IMO, because it never sets that site_setup field in the form builder.

ninanator · 2023-06-17T17:49:56Z

crates/api_crud/src/site/create.rs

+      }
+    }
+
+    let invalid_payloads = [(


I didn't add cases for functions from validation.rs because the unit tests in that file should cover those cases. Let me know if you think otherwise, and I can add more test cases here!

ninanator · 2023-06-17T18:14:39Z

crates/api_crud/src/site/create.rs

-    }
-
-    is_valid_body_field(&data.sidebar)?;
+    validate_create_payload(local_site.site_setup, local_site.slur_filter_regex, data)?;

    let application_question = diesel_option_overwrite(&data.application_question);


I want to move this into the validation fn in the next PR.

Sounds good

ninanator · 2023-06-17T20:50:19Z

@dessalines - Thank you for the initial feedback! I think this is ready to review again at your leisure. For additional proof, I've added some screenshots and a video of this working in the MR description. I also created a PR for the translations at LemmyNet/lemmy-translations#65.

ninanator · 2023-06-18T15:17:15Z

crates/db_views/src/comment_view.rs

@@ -600,21 +606,22 @@ mod tests {

    let read_comment_views_no_person = CommentQuery::builder()
      .pool(pool)
-      .sort(Some(CommentSortType::Hot))
+      .sort(Some(CommentSortType::Old))


This was the main change required to get the tests to not be flaky on both main and this branch. It seems like the hot ranking doesn't always update in time for the sort order Hot to pick the first comment on the post.

Yep that's perfectly fine, we should be using New or Old for sorting for the tests, to be predictable.

dessalines

This is great, and thank you for adding all these tests too!

dessalines · 2023-06-19T14:19:10Z

crates/api_crud/src/site/create.rs

-    }
-
-    is_valid_body_field(&data.sidebar)?;
+    validate_create_payload(local_site.site_setup, local_site.slur_filter_regex, data)?;

    let application_question = diesel_option_overwrite(&data.application_question);


Sounds good

dessalines · 2023-06-19T14:24:03Z

crates/api_crud/src/site/update.rs

+  federation_enabled: bool,
+  private_instance: bool,
+  edit_site: &EditSite,
+) -> LemmyResult<()> {


Not really necessary IMO, because it never sets that site_setup field in the form builder.

dessalines · 2023-06-19T14:25:59Z

crates/api_crud/src/site/update.rs

+    check_slurs_opt(&edit_site.description, &slur_regex)?;
+  }
+
+  if let Some(listing_type) = &edit_site.default_post_listing_type {


It's probably a good idea, for consistency, although lemmy-ui doesn't show nearly as many fields when creating the site, as for updating it.

dessalines · 2023-06-19T14:26:41Z

crates/api_crud/src/site/update.rs

+  let enabled_federation_with_private_instance = edit_site.federation_enabled == Some(true)
+    && edit_site.private_instance.unwrap_or(private_instance);
+
+  if enabled_private_instance_with_federation || enabled_federation_with_private_instance {


Probably a good idea for consistency.

dessalines · 2023-06-19T14:27:52Z

crates/db_views/src/comment_view.rs

@@ -600,21 +606,22 @@ mod tests {

    let read_comment_views_no_person = CommentQuery::builder()
      .pool(pool)
-      .sort(Some(CommentSortType::Hot))
+      .sort(Some(CommentSortType::Old))


Yep that's perfectly fine, we should be using New or Old for sorting for the tests, to be predictable.

dessalines · 2023-06-19T14:28:14Z

crates/utils/src/utils/validation.rs

+
+/// Checks the site name length, the limit as defined in the DB.
+pub fn site_name_length_check(name: &str) -> LemmyResult<()> {
+  min_max_length_check(


Good catch, thanks.

ninanator · 2023-06-19T23:46:21Z

@dessalines I did (what I think is) the last step and updated the submodules 🎉 Let me know if there's anything else I need to do! Thank you!

Nutomic · 2023-06-20T11:19:07Z

crates/api_crud/src/site/create.rs

+          }
+        }
+      },
+    );


I find it pretty confusing to loop through tests like this. Would be much clearer to have a helper function and call that explicitly with each set of params.

@Nutomic Trying to understand your feedback here - are you saying you'd rather have a test per case, and call

    match validate_update_payload(local_site_slur_filter_regex.clone(), false, true, edit_site) {
      Ok(_) => {
        panic!(
          "Got Ok, but validation should have failed with error: {} for invalid_payloads.nth({})",
          expected_err, idx
        )
      }
      Err(error) => {
        assert!(
          error.message.eq(&Some(String::from(expected_err))),
          "Got Err {:?}, but should have failed with message: {} for invalid_payloads.nth({})",
          error.message, expected_err, idx
        )
      }
    }

as the helper function?

Now that you write it out, it doesn't seem to make much difference. Anyway the errors are being logged so it should be fine. Just resolve the conflicts and we can merge.

I had started cleaning up the validation some more in anticipation of putting up a second PR, but at this point I just pushed it all up. I also updated the tests so that there's now a reason included in the tuples, which should make it easier to understand each case and not need to make a separate test for each one.
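
A rough sketch of the "reason in the tuple" pattern mentioned above, assuming a toy validator: `validate_name`, `run_invalid_cases`, and the 20-character limit are all hypothetical and exist only so the table has something to exercise. The actual tests in this PR call the real payload validators.

```rust
/// Hypothetical validator used only to have something to test.
pub fn validate_name(name: &str) -> Result<(), &'static str> {
    if name.is_empty() {
        Err("site_name_required")
    } else if name.chars().count() > 20 {
        Err("site_name_length_overflow")
    } else {
        Ok(())
    }
}

/// Runs every case and returns a description of each failure, so one
/// broken case does not hide the others.
pub fn run_invalid_cases() -> Vec<String> {
    let too_long = "a".repeat(21);
    // Each entry: (reason, input, expected error key).
    let invalid_payloads = [
        ("name is empty", "", "site_name_required"),
        (
            "name exceeds the assumed 20-char limit",
            too_long.as_str(),
            "site_name_length_overflow",
        ),
    ];

    invalid_payloads
        .iter()
        .enumerate()
        .filter_map(|(idx, (reason, input, expected))| match validate_name(input) {
            Ok(()) => Some(format!("case {idx} ({reason}): got Ok, expected {expected}")),
            Err(e) if e != *expected => {
                Some(format!("case {idx} ({reason}): got {e}, expected {expected}"))
            }
            Err(_) => None,
        })
        .collect()
}
```

A test can then assert that the returned list is empty and print it on failure, which keeps one test function while still naming the case that broke.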

Fixes LemmyNet#2900 - Checks slur regex to see if it is too permissiv…

a51ffe6

…e along with small validation organization

ninanator requested review from Nutomic and dessalines as code owners June 16, 2023 13:27

ninanator commented Jun 16, 2023

ninanator added 3 commits June 16, 2023 08:52

Clean up variable names, add handler for valid empty string usecase

ca6c519

Merge branch 'main' into lemmy-2900-check-slur-regex

69b9b79

Update tests

6836080

dessalines reviewed Jun 17, 2023

Create validation function and add tests

c01e944

ninanator commented Jun 17, 2023

ninanator mentioned this pull request Jun 17, 2023

Add fields for site creation and update validation LemmyNet/lemmy-translations#65

Merged

Test clean up

3569b0f

ninanator commented Jun 17, 2023

Use payload value vs local site value to prevent stunlocking

3526feb

ninanator requested a review from dessalines June 17, 2023 21:05

ninanator added 3 commits June 17, 2023 16:14

Remove println added while testing

aba61cf

Fall back to local site regex if not provided from request

753d50c

Attempt clean up of flaky comment_view tests

33eff7e

ninanator force-pushed the lemmy-2900-check-slur-regex branch from 9181f8f to 33eff7e Compare June 18, 2023 14:37

ninanator commented Jun 18, 2023

dessalines approved these changes Jun 19, 2023

Pull in latest submodule

cf1f0e4

Nutomic reviewed Jun 20, 2023

ninanator added 2 commits June 26, 2023 22:05

Merge in main

2613892

Merge main, resolve conflicts

9b883f2

ninanator requested a review from Nutomic June 27, 2023 03:46

ninanator and others added 2 commits June 27, 2023 01:08

Move application, post check into functions, add more tests and impro…

597d8f1

…ve test readability

Merge branch 'main' into lemmy-2900-check-slur-regex

cfb1f0f

Nutomic enabled auto-merge (squash) June 27, 2023 09:14

Merge branch 'main' into lemmy-2900-check-slur-regex

eea4b20

Nutomic merged commit e63aa80 into LemmyNet:main Jun 27, 2023
1 check passed

Nutomic mentioned this pull request Jun 27, 2023

[Bug]: PUT requests to /site fail if application_question is not provided, even if the question has already been configured for the site #3323

Closed

4 tasks

