-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MarkdownFixer] Do not escape username with underscores in code or codeblocks #5785
[MarkdownFixer] Do not escape username with underscores in code or codeblocks #5785
Conversation
d486f87
to
6591cb7
Compare
6591cb7
to
5315367
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, thanks for the fix!
I assume this PR is good because all tests are passing now, but @JuanitoFatas let me know if I'm mistaken. :) |
5315367
to
477f152
Compare
Rebased with master and verified it works ✅ and tests passed 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @JuanitoFatas, markdown parsing bugs are always complicated to fix, your solution seems very clever!
I left notes about adding comments to the new code as node traversal and regexp aren't usually super intuitive for future new readers.
ps. don't forget to prefix your branches with your name in the future, you can checkout the contributing guide for details.
477f152
to
432885f
Compare
Thanks for the review. I‘ve added comments and changed the implementation to a less clever one 😉 . I also pulled the node traverser into a separate file, hope that make it easier to read. Verified this works and updated the screenshot in the Pull Request accordingly 🎉 |
app/labor/markdown_fixer.rb
Outdated
@@ -47,11 +47,29 @@ def split_tags(markdown) | |||
end | |||
|
|||
def underscores_in_usernames(markdown) | |||
markdown.gsub(/(@)(_\w+)(_)/, '\1\\\\\2\\\\\3') | |||
return markdown unless markdown.include?("@") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This guard clause has to go through all of markdown
to determine whether or not to return, so I don't think it saves us much work. If we skip this check we go through the document once, if we keep it we go through the document once + the percentage of text preceding the username.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are number of tests that pass ""
, front matters, or text that do not have username at all (without @
) to here. These make sense in real world. Hence checking for @
here to avoid doing work for irrelevant cases and all tests are happy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the reason for the check, my only slight concern was this:
Case 1: markdown
doesn't contain an @
: we're still going to compare every character in the document to @
for the include?
check. I'd guess a majority of articles won't have usernames in them.
Case 2. markdown
contains an @
: we read markdown
once until we find the @
, then traverse again.
Maybe the guard clause should actually make use of the regular expression you defined, something like
return markdown unless markdown =~ USERNAME_WITH_UNDERSCORE_REGEXP
as that's actually what we care about, since the @
sign does not necessarily indicate a username (in some languages it's used for array concatenation for example).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey checking against the regexp makes sense. I‘ve updated the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🏓
app/labor/markdown_fixer.rb
Outdated
@@ -47,11 +47,29 @@ def split_tags(markdown) | |||
end | |||
|
|||
def underscores_in_usernames(markdown) | |||
markdown.gsub(/(@)(_\w+)(_)/, '\1\\\\\2\\\\\3') | |||
return markdown unless markdown.include?("@") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are number of tests that pass ""
, front matters, or text that do not have username at all (without @
) to here. These make sense in real world. Hence checking for @
here to avoid doing work for irrelevant cases and all tests are happy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a bunch more comments, but don't let any of them stop you.
It only runs when markdown has underscored usernames. Skip escaping underscore when content is in code or codeblock. It works by going through all lines of a markdown content. Find underscored username. Escape if we are not in the codeblock. The (?<!) is negative lookbehind which rules out the case when a underscored username is present in code, e.g., `@_dev_` will not be escaped.
0e92d06
to
2ac92e8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good now, thanks for updating the guard clause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very awesome solution. LGTM!
What type of PR is this? (check all applicable)
Refactor
Feature
Bug Fix
Optimization
Documentation Update
Tests are passing
bundle exec rspec spec/labor/markdown_fixer_spec.rb spec/requests/api/v0/articles_spec.rb spec/services/articles/creator_spec.rb spec/services/mentions/create_all_spec.rb
I‘ve verified it works on development
starts the server at local, sign in, new post, copy the following to preview and see if things work as expected. You should not see
\
in code and codeblock, and underscored username renders correctly.Code for testing at development
Description
The underscored username got escaped in code and code blocks. We probably do not want that. The fix is only to escape underscored username if they are not in code or code blocks.
Other possible approaches:
pre
orcode
Related Tickets & Documents
Fixes #5739.
Screenshots
Current on dev (2019.01.28, 26e4412)
After this Pull Request (tested against 432885f)
Added to documentation?
[optional] What gif best describes this PR or how it makes you feel?