[SPARK-25093][SQL] Avoid recompiling regexp for comments multiple times #22135
SGTM, but is it worth addressing the similar issues at once?
Test build #94897 has finished for PR 22135 at commit
thanks for the comment @kiszk, I am doing it!
Test build #94926 has finished for PR 22135 at commit
("""([ |\t]*?\/\*[\s|\S]*?\*\/[ |\t]*?)|""" + // strip /*comment*/
"""([ |\t]*?\/\/[\s\S]*?\n)""").r // strip //comment
val codeWithoutComment = commentReg.replaceAllIn(input, "")
val codeWithoutComment = commentRegexp.replaceAllIn(input, "")
codeWithoutComment.replaceAll("""\n\s*\n""", "\n") // strip ExtraNewLines
This line also compiles a regex on every call and could be replaced!
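To illustrate the point of this comment: Java documents `String.replaceAll(regex, replacement)` as equivalent to `Pattern.compile(regex).matcher(str).replaceAll(replacement)`, so the pattern is compiled on each invocation. The sketch below contrasts that with a pattern compiled once. The object and member names (`ReplaceAllSketch`, `extraNewLines`, `perCall`, `precompiled`) are hypothetical and chosen only for this example; they are not taken from the PR.

```scala
import java.util.regex.Pattern

object ReplaceAllSketch {
  // Hypothetical name: compiled once, when the object is initialized.
  val extraNewLines: Pattern = Pattern.compile("""\n\s*\n""")

  // String.replaceAll compiles the pattern internally on every call.
  def perCall(code: String): String =
    code.replaceAll("""\n\s*\n""", "\n")

  // Reuses the precompiled pattern; only the matcher is created per call.
  def precompiled(code: String): String =
    extraNewLines.matcher(code).replaceAll("\n")
}
```

Both methods should return the same string for any input; the difference is only in how often the pattern is compiled.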
There is still one place that compiles the regex again and again.
Test build #95017 has finished for PR 22135 at commit
retest this please
Test build #95038 has finished for PR 22135 at commit
thanks, merging to master!
What changes were proposed in this pull request?
The PR moves the compilation of the regexp for code formatting outside the method which is called for each code block when splitting expressions, in order to avoid recompiling the regexp every time.
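A minimal sketch of the idea, not the actual patch: the regexps are held in fields of an object, so they are compiled once at initialization instead of on every call. The two pattern strings come from the review diff; the names `CodeFormatterSketch`, `extraNewLinesRegexp`, and `stripExtraNewLinesAndComments` are assumptions made for this illustration.

```scala
object CodeFormatterSketch {
  // Compiled once, when the object is first initialized.
  private val commentRegexp =
    ("""([ |\t]*?\/\*[\s|\S]*?\*\/[ |\t]*?)|""" + // strip /*comment*/
      """([ |\t]*?\/\/[\s\S]*?\n)""").r           // strip //comment

  // Hypothetical name: precompiled pattern for runs of blank lines.
  private val extraNewLinesRegexp = """\n\s*\n""".r

  // Called once per code block; performs no regexp compilation itself.
  def stripExtraNewLinesAndComments(input: String): String = {
    val codeWithoutComment = commentRegexp.replaceAllIn(input, "")
    extraNewLinesRegexp.replaceAllIn(codeWithoutComment, "\n")
  }
}
```

For example, `CodeFormatterSketch.stripExtraNewLinesAndComments("int a = 1; /* c */\n\n\nint b = 2;\n")` removes the block comment and collapses the blank lines.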
Credit should be given to Izek Greenfield.
How was this patch tested?
Existing unit tests.