Failed to detect number substitutions #5

priyankagv1 · 2019-03-19T20:38:00Z

When trying to identify profane words, sh1t is not identified as profane.
The Levenshtein approach should have matched this variation to the original profane word.
Also, I see that sh1t is listed in the profane word dictionary. Could you please see where the problem is?
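
For readers unfamiliar with the Levenshtein approach mentioned above, here is a minimal sketch of edit-distance matching against a dictionary. The one-entry `DICTIONARY`, the `MAX_DISTANCE` threshold, and the `looks_profane` helper are assumptions made for illustration; they are not taken from the library's code.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


# Hypothetical dictionary and threshold, for illustration only.
DICTIONARY = {"sh1t"}
MAX_DISTANCE = 1


def looks_profane(token: str) -> bool:
    """True if the token is within MAX_DISTANCE edits of a dictionary word."""
    return any(levenshtein(token.lower(), word) <= MAX_DISTANCE
               for word in DICTIONARY)
```

Under these assumptions a whole token such as sh5t is one substitution away from the dictionary entry and would match, while a fragment such as sh5 is two edits away and would not. Matching of this kind only works if the word reaches the comparison in one piece.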

rominf · 2019-03-20T18:27:51Z

Thank you for the report. The problem was that the spaCy tokenizer split sh1t into the tokens sh1 and t. I fixed this by adding all profane words to the tokenizer's special cases. Please use the latest version from PyPI.
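
To illustrate the idea of tokenizer special cases, here is a toy sketch in plain Python. It is not spaCy's tokenizer and does not use spaCy's API: the regular expression is an assumed rule chosen only because it reproduces the reported sh1 / t split, and `tokenize` and `special_cases` are hypothetical names.

```python
import re

# Toy rule (an assumption, NOT spaCy's real logic): letters followed by
# digits form one token, so "sh1t" falls apart into "sh1" and "t".
TOKEN_RE = re.compile(r"[A-Za-z]+\d+|[A-Za-z]+|\d+|\S")


def tokenize(text, special_cases=frozenset()):
    """Split on whitespace, then apply TOKEN_RE unless the chunk is a special case."""
    tokens = []
    for chunk in text.split():
        if chunk.lower() in special_cases:
            tokens.append(chunk)  # special case: keep the chunk whole
        else:
            tokens.extend(TOKEN_RE.findall(chunk))
    return tokens


print(tokenize("this is sh1t"))
print(tokenize("this is sh1t", special_cases={"sh1t"}))
```

With the special case registered, sh1t survives as a single token and can be looked up in the dictionary directly. A variant that is not in the special-case table is still split by the default rule.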

priyankagv1 · 2019-03-20T18:41:46Z

Thank you so much. Will try and let you know!

rominf · 2019-03-20T18:46:44Z

Forgot to mention: with my improvements sh1t is detected fine (because it's in the profane word dictionary), but sh5t is still split into two tokens and is therefore not detected. I don't know how to fix this yet.

rominf · 2019-03-20T18:59:29Z

I've got an idea. Will try it tomorrow.

priyankagv1 · 2019-03-21T13:17:43Z

Thank you!

rominf · 2019-03-22T13:08:52Z

I've got a better idea and I need more time.

rominf · 2019-03-24T08:41:24Z

Blocked by #14.

rominf · 2019-03-28T07:59:19Z

@priyankagv1, finally solved it. Please try the latest version from PyPI.

priyankagv1 · 2019-03-28T13:33:10Z

Sure, thank you!

rominf added the bug label Mar 20, 2019

rominf closed this as completed Mar 20, 2019

rominf reopened this Mar 20, 2019

rominf mentioned this issue Mar 24, 2019

Minimize profane word dictionaries for deep analysis usage #8

Open

rominf closed this as completed in dfec911 Mar 28, 2019

rominf added a commit that referenced this issue Mar 28, 2019

Add SpacyProfanityFilterComponent and ProfanityFilter.spacy_component

709d0d8

Fixes #5.