Add tags and mute words #2968

estrattonbailey · 2024-02-22T20:09:57Z

No description provided.

* Add bare minimum hashtags support

  As atproto/api already parses hashtags, this is as simple as hooking it up like link segments.

  This is "bare minimum" because:

  - Opening hashtag "#foo" is actually just a search for "foo" right now to work around #2491.
  - There is no integration in the composer. This hasn't stopped people from using hashtags already, and can be added later.
  - This change itself only had to hook things up - thank you for having already put the hashtag parsing in place.

* Remove workaround for hash search not working now that it's fixed

pfrazee · 2024-02-23T00:59:05Z

src/lib/moderatePost_wrapped.ts

+
+      if (mutedWordNoPunc === wordNoPunc) return true
+    }
+  }


If we want to enable support for phrases, I think what we can do is this:

1. Instead of splitting the text into words, run your punctuation strip on it.

2. Convert the muteword into a regex like this (haven't tested this regex): new RegExp('[\s\t\n\r\f\,\.\;\!\?\_\-\"\(\)]' + stripPunctuation(muteWord) + '[\s\t\n\r\f\,\.\;\!\?\_\-\"\(\)]')

3. Just run the muteword regexes against the punctuation-stripped post text.
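A minimal sketch of the phrase-matching idea in the steps above, not the code that landed in this PR. The helper names (`escapeRegExp`, `stripPunctuation`, `buildMuteRegex`, `matchesPhrase`), the punctuation set, and the lowercasing step are assumptions made for illustration. Two details differ from the untested regex in the comment: backslashes are doubled because the pattern is built from a string, and the boundary also accepts the start and end of the text so a phrase at either edge still matches.

```typescript
// Hypothetical helpers for illustration; names are not taken from the PR.

// Escape regex metacharacters so the muted phrase is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// Assumed punctuation set, loosely based on the character class in the comment.
function stripPunctuation(text: string): string {
  return text.replace(/[,.;!?"()]/g, '')
}

// Build a regex that requires a boundary (whitespace, "_", "-", or the
// start/end of the text) on both sides of the muted phrase.
function buildMuteRegex(muteWord: string): RegExp {
  const boundary = '[\\s_-]'
  const body = escapeRegExp(stripPunctuation(muteWord.toLowerCase()))
  return new RegExp(`(?:^|${boundary})${body}(?:$|${boundary})`)
}

// Run the regex against the punctuation-stripped, lowercased post text.
function matchesPhrase(muteWord: string, postText: string): boolean {
  return buildMuteRegex(muteWord).test(stripPunctuation(postText.toLowerCase()))
}
```

With this sketch, `matchesPhrase('hot dog', 'I love a Hot Dog!')` matches the whole phrase, while `matchesPhrase('ai', 'my brain hurts')` does not, because "ai" inside "brain" has no boundary on either side.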

mary-ext · Feb 24, 2024

Here's how Mastodon is more or less doing its matching if you need another perspective:

https://github.com/mary-ext/langit/blob/70cf8ed820d27337370039f517f4968c6553cd11/app/desktop/components/settings/settings-views/keyword-filters/KeywordFilterFormView.tsx#L258

estrattonbailey · Feb 24, 2024

Thanks for the link! Will have a look and compare 🙏

pfrazee

Awesome work Eric. Really just into the tweaking stage.

src/components/TagMenu/index.web.tsx

haileyok

Everything here looks solid. Test cases look good; giving it a spin for the next few days will probably turn up something, but I say let's

pfrazee · 2024-02-27T04:33:58Z

Hot diggity dog

dolciss · 2024-02-28T10:41:04Z

Glad to see this PR merged🎉, but forgive me for commenting when it is already closed.

Mute words appear to split the body text at the following location:

social-app/src/lib/moderatePost_wrapped.ts

Line 62 in e2ae534

const words = postText.split(REGEX.WORD_BOUNDARY)

However, this often fails to mute words in languages that do not separate words with spaces, such as Japanese.

social-app/src/lib/__tests__/moderatePost_wrapped.test.ts

Lines 74 to 91 in e2ae534

    
it(`match: single character 希`, () => {
  /**
   * @see https://bsky.app/profile/mukuuji.bsky.social/post/3klji4fvsdk2c
   */
  const rt = new RichText({
    text: `改善希望です`,
  })
  rt.detectFacetsWithoutResolution()
  const match = hasMutedWord(
    [{value: '希', targets: ['content']}],
    rt.text,
    rt.facets,
    [],
  )
  expect(match).toBe(true)
})

This test covers muting a single kanji character, but we often mute whole Japanese words as well. For example, "ハッシュタグ" (meaning "hashtag") in "次のバージョンでハッシュタグが実装される！"
It would be great if you could consider a mute option that does not match word-by-word!

estrattonbailey · 2024-02-28T16:50:37Z

@dolciss no problem at all! Appreciate you bringing this to our attention. We've considered adding options for allowing users to select which type of matching they'd like to do, and we may continue that investigation.

For now, we just pushed a change that checks the post language. If it's a language that doesn't use spaces, or uses spaces in ways other than separating words, then we check for the string directly instead of word-by-word.

In English, this direct matching would return false positives, e.g. muting "AI" would also mute the word "brain".

If you don't mind, may I ask: with this new change, do you happen to know if this type of partial-match issue will be a concern in Japanese as well?
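A rough sketch of the language check described in this comment, not the code from the follow-up change. The language list `NO_SPACE_LANGS`, the function name `isMuted`, and the whitespace-only split are assumptions; the thread does not say which languages are covered, and the real word-boundary handling also deals with punctuation.

```typescript
// Hypothetical list of languages that do not separate words with spaces.
// The actual set used by the app is not stated in this thread.
const NO_SPACE_LANGS = new Set(['ja', 'zh', 'th'])

// Returns true if the muted word should hide the post.
// `langs` holds the post's language tags, e.g. ['ja'] or ['en-US'].
function isMuted(mutedWord: string, postText: string, langs: string[]): boolean {
  const word = mutedWord.toLowerCase()
  const text = postText.toLowerCase()

  // Compare only the primary subtag, so 'ja-JP' is treated like 'ja'.
  const direct = langs.some(l =>
    NO_SPACE_LANGS.has(l.toLowerCase().split('-')[0] ?? ''),
  )

  // Direct substring match for languages without spaces between words.
  if (direct) return text.includes(word)

  // Otherwise match whole words only (simplified: whitespace split).
  return text.split(/\s+/).some(w => w === word)
}
```

Under these assumptions, muting "ハッシュタグ" hides "次のバージョンでハッシュタグが実装される！" when the post is tagged `ja`, while muting "ai" leaves "my brain hurts" visible when the post is tagged `en`. Tagging the same English text as `ja` reproduces the false positive mentioned above.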

dolciss · 2024-02-29T00:12:20Z

@estrattonbailey Thank you very much for your improvement in #3018!

> If you don't mind, may I ask: with this new change, do you happen to know if this type of partial-match issue will be a concern in Japanese as well?

Yes, partial matches are a concern in Japanese as well (e.g. muting ネット would also mute インターネット and ネットワーク).
However, solving this problem is difficult because it requires complex processing, such as morphological analysis.
(I am sorry, but I am not that familiar with morphological analysis.)

DavidBuchanan314 · 2024-03-15T23:58:22Z

src/lib/moderatePost_wrapped.ts

+    // rest of the checks are for `content` only
+    if (!mute.targets.includes('content')) continue
+    // single character, has to use includes
+    if (mutedWord.length === 1 && postText.includes(mutedWord)) return true


This checks the UTF-16 code-unit length, which may give an unexpected result for certain international scripts and/or emojis (maybe it's fine? I'm not sure I totally understand the purpose of this check)

estrattonbailey · Mar 19, 2024

Good shout. Yeah the original intention was to handle international scripts with this. In response to the comments above I added a check a couple weeks ago for a few languages that commonly don't use spaces as well.

But what you've called out here is still an issue: trying to match a single character (potentially with length > 1) in a language not part of our exceptions may fail unless we can split the text at the word boundary. Interesting...

DavidBuchanan314 · Mar 19, 2024

As a practical test case: if someone mutes "💩", it'll match when it's on its own, but not as part of a "word".

Muting an emoji is probably quite a common thing to want to do, and I think counting graphemes would give the expected behavior in most cases (which is normally an annoying thing to do, but I know the app already has grapheme counting logic in it elsewhere)
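To illustrate the code-unit issue raised in this thread, here is a small sketch. It counts Unicode code points with `Array.from` rather than full graphemes, which is a cheaper approximation and not the grapheme logic the app uses elsewhere. The function names `isSingleCharacter` and `containsMutedCharacter` are hypothetical.

```typescript
// "💩" is one code point but two UTF-16 code units, so a check such as
// `mutedWord.length === 1` skips it.

// Counts code points instead of UTF-16 code units.
// Assumption: code points are "good enough" here. Emoji built from several
// code points (ZWJ sequences, skin-tone modifiers) would still count as more
// than one and would need real grapheme segmentation.
function isSingleCharacter(s: string): boolean {
  return Array.from(s).length === 1
}

// Single-character mutes fall back to a substring check, as in the diff above.
function containsMutedCharacter(mutedWord: string, postText: string): boolean {
  return isSingleCharacter(mutedWord) && postText.includes(mutedWord)
}
```

With this version, a muted "💩" also matches when it appears inside a longer run of text such as "holy💩moly", whereas the `length === 1` check never reaches the `includes` call for it.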

kisaragi-hiu and others added 30 commits February 22, 2024 17:36

Add RichTextTag and TagMenu

2c40db2

Sketch

1d5743a

Remove hackfix

b89e7cf

Some cleanup

7d41667

Sketch web

4174c74

Mobile design

0eccf90

Mobile handling of tags search

321860f

Web only

d071e71

Fix navigation woes

ac5a2f0

Use new callback

cb1bcc9

Hook it up

0b0f07f

Integrate muted tags

c8113db

Fix dropdown styles

dfcf553

Type error

883eeee

Use close callback

de6b9ef

Fix styles

0b0e6d6

Cleanup, install latest sdk

f8ec2bb

Quick muted words screen

9f2d358

Targets

4fe4bfa

Dir structure

e2e1028

Icons, list view

47efab5

Move to dialog

5775606

Add removal confirmation

e64aa52

Swap copy

e51c905

Improve checkboxees

a6aa198

Update matching, add tests

163b8a6

Moderate embeds

66f44c0

Create global dialogs concept again to prevent flashing

1352228

Add access from moderation screen

7fbb675

pfrazee reviewed Feb 23, 2024

estrattonbailey added 10 commits February 23, 2024 09:56

Rename toggles and adjust logic

a45ee1a

Icon update

711c9fa

Load states

5b75c62

Improve regex

d3baa90

Improve regex

f4039d1

Improve regex

1948ed3

Revert link test

f367900

Hyphenated words

ca6e9d5

Improve matching

d1779d0

Enhance

7206af5

mary-ext reviewed Feb 24, 2024

src/components/TagMenu/index.web.tsx (outdated, resolved)

estrattonbailey added 8 commits February 26, 2024 10:55

Some tweaks

a7cf77c

Muted words modal changes

619d83c

Handle invalid handles, handle long tags

dd95a81

Remove main regex

177eab3

Better test

aa395f2

Space/punct check drop to includes

e5bc997

Lowercase post text before comparison

45c14cd

Add better real world test case

fae6797

haileyok approved these changes Feb 26, 2024

pfrazee merged commit 58aaad7 into main Feb 27, 2024
4 checks passed

pfrazee deleted the eric/tags branch February 27, 2024 04:33

kawadah mentioned this pull request Mar 2, 2024

Make tags in profile descriptions clickable #3081

Closed

DavidBuchanan314 reviewed Mar 15, 2024

estrattonbailey commented Feb 22, 2024

pfrazee Feb 23, 2024

mary-ext Feb 24, 2024 (edited)

estrattonbailey Feb 24, 2024

pfrazee left a comment

haileyok left a comment

pfrazee commented Feb 27, 2024

dolciss commented Feb 28, 2024

estrattonbailey commented Feb 28, 2024

dolciss commented Feb 29, 2024

DavidBuchanan314 Mar 15, 2024

estrattonbailey Mar 19, 2024

DavidBuchanan314 Mar 19, 2024
