This repository has been archived by the owner on Apr 29, 2024. It is now read-only.
Running censor on the text "fuck" returns "f***", as expected. The problem appears once we add a markdown parser such as Earmark. If we run censor before as_html, the asterisks it inserts get parsed as markdown. If we run censor after as_html, self-censored text would in some cases already have been parsed as markdown, sabotaging the filter. Running it afterwards also turns "<p>fuck</p>" into "<p>f****/p>". We could iterate over the HTML and censor only the text in it, but we would then run into the same issue with self-censoring.
Possible solutions:
1. Use a different character than "*" to solve "</p>" -> "/p>"
2. Dispute this issue with Earmark's developers
3. Implement our own markdown parser
4. Use a different markdown parser (probably a Rust crate like pulldown)
5. Don't support markdown officially
I'm probably going to try pulldown before implementing our own parser.
UPDATE: Closer testing reveals that running censor after as_html does not affect self censoring. I will only be forwarding the "</p>" -> "/p>" section of this to finnbear's repo.
EDIT: Solution 1 won't work, since the filter detects all characters, not just "*". Either add whitespace at the end of the string (probably not, that just wastes characters) or remove the tags that Earmark/pulldown adds before censoring.
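For illustration, here is a minimal sketch of one way to keep the tags away from the filter: walk the rendered HTML, pass only the text between tags to the censor, and copy the tags through untouched. The names censor_html_text and toy_censor are hypothetical. toy_censor is only a stand-in for the real filter, not its API, and the sketch assumes well-formed HTML with no literal "<" or ">" inside text or attribute values.

```rust
/// Apply `censor` to the text between HTML tags and copy the tags through unchanged.
/// Assumption: well-formed HTML with no literal '<' or '>' in text or attribute values.
fn censor_html_text<F>(html: &str, censor: F) -> String
where
    F: Fn(&str) -> String,
{
    let mut out = String::with_capacity(html.len());
    let mut rest = html;
    while !rest.is_empty() {
        if rest.starts_with('<') {
            // Tag: copy everything up to and including the closing '>' untouched.
            let end = rest.find('>').map(|i| i + 1).unwrap_or(rest.len());
            out.push_str(&rest[..end]);
            rest = &rest[end..];
        } else {
            // Text run: everything up to the next tag goes through the filter.
            let end = rest.find('<').unwrap_or(rest.len());
            out.push_str(&censor(&rest[..end]));
            rest = &rest[end..];
        }
    }
    out
}

/// Hypothetical stand-in for the real filter; it only handles the example word from this issue.
fn toy_censor(text: &str) -> String {
    text.replace("fuck", "f***")
}
```

Calling censor_html_text("<p>fuck</p>", toy_censor) gives "<p>f***</p>", because the filter never sees the "</p>" that follows the word. Whether this interacts badly with self-censoring would still need to be checked against the real filter.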