fix escaped symbols breaking markdown checks #209
lpapazow wants to merge 1 commit into openhab:master
75f6e88 to 1154f49
public interface LineFormatterCallback {
    /**
     * Used in AbstractStaticCheck's findLineNumber method to apply rules
     * about escaping special characters to a line in the .mardown file
Typo: .mardown -> .markdown
But maybe .md is better, as this is the file extension that we are using in the check. Or just say "from the markdown file" (without the dot).
     * Used in AbstractStaticCheck's findLineNumber method to apply rules
     * about escaping special characters to a line in the .mardown file
     *
     * @param line - an unparsed line from the .markdown file
martinvw left a comment
I don't think this is really an optimal solution. I also do not see sufficient tests which prove that this actually resolves the problem and does not have some nasty corner cases where it might make things worse.
@Override
public String formatLine(String line) {
    Pattern needsEscape = Pattern.compile("\\\\" + Escaping.ESCAPABLE);
But you cannot be sure that all characters which have to be escaped are actually escaped. Are you sure this will suffice for the most common, or better, all cases? Should we maybe determine the line number in a different way instead of trying to match it again?
Given such changes I would expect new tests, or maybe changed tests?
The Escaping.ESCAPABLE constant is defined in the external markdown library and is used in the markdown parsing logic. It will cover all the cases in which the markdown parser will make changes to the original text.
The reason I didn't include any new tests is that when I use the verify method provided by com.puppycrawl.tools.checkstyle.BaseCheckTestSupport and pass no error messages to that method, it doesn't fail the test; it only logs the error. Thus every test that I include would pass even if it shouldn't.
Do you have any ideas how to create tests that could fail?
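To make the escaping step under discussion concrete, here is a minimal sketch of a formatLine that strips the escaping backslash from a raw line. The local ESCAPABLE constant is an assumption: it is a stand-in built from the symbol list in the PR description and may differ from the library's Escaping.ESCAPABLE. The class name UnescapeSketch is hypothetical.

```java
import java.util.regex.Pattern;

final class UnescapeSketch {
    // Assumption: stand-in for Escaping.ESCAPABLE, built from the symbols
    // listed in the PR description; the library constant may differ.
    static final String ESCAPABLE = "[!\"#$%&'()*+,./:;<=>?@\\]\\[^_`{|}~-]";

    // A backslash followed by one escapable symbol; the symbol is captured.
    static final Pattern NEEDS_ESCAPE = Pattern.compile("\\\\(" + ESCAPABLE + ")");

    // Drops the escaping backslash so a raw line can be compared with parsed text.
    static String formatLine(String line) {
        return NEEDS_ESCAPE.matcher(line).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(formatLine("Channel `temp\\_out` \\(optional\\)"));
    }
}
```

A backslash in front of any other character (for example a letter) is left untouched, which is one of the corner cases a real test would need to pin down.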
Can you please discuss it with some of your colleagues first?
 */
protected int findLineNumber(FileText fileContent, String searchedText, int startLineNumber)
        throws NoResultException {
protected int findLineNumberFormatted(FileText fileContent, String searchedText, int startLineNumber,
Imho this is not the best way forward. It looks a little over-designed to me with this callback logic; isn't there an easier way to solve this?
It also seems that the source from which we receive the string does not have the same origin as the lines we are searching through, which seems dangerous for a lot of reasons imho.
The sources differ because the searchedText is generated by the markdown parsing logic, while the fileContent is the raw text in the markdown file.
Yes, I did understand that, but isn't this part of the problem, and the thing we would actually like to resolve? Can't we just request the parsed text from markdown and search through that?
@martinvw I will try to jump in here, because we discussed it in the office with @lpapazow and @doandzhi. Originally the check reports errors against the raw markdown file, and for this reason we want to find the exact line in the raw file and report the error there. If we took the parsed text and reported the errors against its lines, we would probably hit an inconsistent line. For this reason we decided to do a little parsing of the special symbols (taken from the parser) in order to find the correct raw content. What do you think, do you tend to agree or disagree?
I understand what we are trying to achieve, but it feels a little fragile to me. In most languages there are more characters which can be escaped than the ones you have to escape, so mismatches are possible. Without proper tests that is hard to tell unless I dive into the code myself.
I found an issue referring to the actual underlying problem that the markdown Nodes do not have a reference back to the original location: commonmark/commonmark-java#1
Given that the issue is there, it seems this cannot be done (easily), so we might go for the current solution, but not without proper tests.
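The approach settled on here (format each raw line, then look for the parsed text in it) can be sketched as follows. This is only an illustration under stated assumptions: it uses a plain List<String> instead of checkstyle's FileText, a UnaryOperator<String> in place of the PR's callback, a local stand-in NoResultException, and a simplified formatter that handles only the underscore.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

final class LineSearchSketch {
    // Hypothetical stand-in for the NoResultException used by the check.
    static final class NoResultException extends Exception {
        NoResultException(String message) {
            super(message);
        }
    }

    // Returns the 1-based number of the first raw line, at or after the
    // 0-based startLineNumber, whose formatted form contains searchedText.
    static int findLineNumber(List<String> rawLines, String searchedText,
            int startLineNumber, UnaryOperator<String> formatter) throws NoResultException {
        for (int i = startLineNumber; i < rawLines.size(); i++) {
            if (formatter.apply(rawLines.get(i)).contains(searchedText)) {
                return i + 1;
            }
        }
        throw new NoResultException("No line contains: " + searchedText);
    }

    public static void main(String[] args) throws NoResultException {
        List<String> raw = Arrays.asList("# Title", "", "Use the temp\\_out channel");
        // Simplified formatter: only removes a backslash before an underscore.
        UnaryOperator<String> unescape = line -> line.replace("\\_", "_");
        System.out.println(findLineNumber(raw, "temp_out channel", 0, unescape));
    }
}
```

The reported line number still refers to the raw file, which is the property the check needs; the fragility raised above lies entirely in how faithfully the formatter mirrors what the parser does.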
Note that a good example where escaping is not always performed can be seen here:
http://docs.openhab.org/addons/bindings/netatmo/readme.html
https://github.com/openhab/openhab2-addons/edit/master/addons/binding/org.openhab.binding.netatmo/README.md on line 49 there is an unescaped _ which does not cause problems in some views but does in others. The docs render fine and the GitHub view renders fine, but diffs don't.
 *            search of the beginning of the text the startLineNumber should be 0
 * @return the number of the line starting from 1, where the searched text occurred for the first
 *         time
 * @throws NoResultException when no match was found
I don't like null. I would have preferred a no-op LineFormatterCallback if we go for this solution; then you also do not need the null check. Just let it explode if you are called with a null.
Thank you for this suggestion, it does seem to make the code more robust. I will rewrite it in the suggested way.
    }
};
MarkdownVisitor visitor = new MarkdownVisitor(callBack, fileText);
MarkdownVisitor visitor = new MarkdownVisitor(callBack, fileText, new LineFormatterCallback() {
If the design stays as is, I would make the LineFormatterCallback a FunctionalInterface
Great suggestion, I will use it.
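The two review suggestions (a no-op callback instead of null, and a functional interface) could look roughly like the sketch below. The interface name and its formatLine method come from the diff; the NO_OP constant, the CallbackSketch class and its apply helper are hypothetical names introduced only for illustration.

```java
import java.util.Objects;

final class CallbackSketch {
    // Hypothetical consumer: rejects null instead of branching on it.
    static String apply(LineFormatterCallback formatter, String rawLine) {
        return Objects.requireNonNull(formatter, "formatter").formatLine(rawLine);
    }

    public static void main(String[] args) {
        // With a functional interface, a lambda replaces the anonymous class.
        LineFormatterCallback unescapeUnderscore = line -> line.replace("\\_", "_");
        System.out.println(apply(LineFormatterCallback.NO_OP, "temp\\_out"));
        System.out.println(apply(unescapeUnderscore, "temp\\_out"));
    }
}

@FunctionalInterface
interface LineFormatterCallback {
    String formatLine(String line);

    // Assumed name: a no-op formatter for callers that search the raw text unchanged.
    LineFormatterCallback NO_OP = line -> line;
}
```

Callers that need no formatting pass the no-op, so the search code has a single path and a null argument fails fast.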
a6ed210 to 01c9beb
01c9beb to b43b1f6
@lpapazow, do you have any updates on this PR?
Replaced by #245
Creates a fix for issue #205.
Fixes the case where a single backslash before a special symbol (!"#$%&'()*+,./:;<=>?@][^_`{|}~-) was not treated as escaping that symbol in .markdown files.