[SPARK-25093][SQL] Avoid recompiling regexp for comments multiple times #22135

Closed
wants to merge 3 commits into from


mgaido91
Contributor

What changes were proposed in this pull request?

This PR moves the compilation of the regexp used for code formatting out of the method that is called for each code block when splitting expressions, so that the regexp is no longer recompiled on every call.
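
The idea can be sketched as follows. This is a minimal illustration, not the actual patch: the object name `CommentStripper` and the method name `stripComments` are hypothetical, and only the regexp text is taken from the diff in this conversation. The pattern is held in a `val`, so it is compiled once when the object is initialized rather than on each call.

```scala
import scala.util.matching.Regex

// Hypothetical holder object, used only for illustration.
object CommentStripper {
  // Compiled once, at object initialization, and reused by every call.
  private val commentRegexp: Regex =
    ("""([ |\t]*?\/\*[\s|\S]*?\*\/[ |\t]*?)|""" + // strip /*comment*/
      """([ |\t]*?\/\/[\s\S]*?\n)""").r           // strip //comment

  // Called once per code block; no regexp compilation happens here.
  def stripComments(input: String): String =
    commentRegexp.replaceAllIn(input, "")
}
```

Had the `.r` call stayed inside `stripComments`, each invocation would build a new `Regex` (and the underlying `java.util.regex.Pattern`) before matching.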

Credit should be given to Izek Greenfield.

How was this patch tested?

existing UTs

@mgaido91
Contributor Author

cc @cloud-fan @gatorsmile @kiszk

@kiszk
Member

kiszk commented Aug 17, 2018

SGTM, but is it worth addressing the similar issues at once?
Even under src/main/..., we can see this pattern in several places.

@SparkQA

SparkQA commented Aug 17, 2018

Test build #94897 has finished for PR 22135 at commit e39c855.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91
Contributor Author

Thanks for the comment, @kiszk. I am doing it!

@SparkQA

SparkQA commented Aug 18, 2018

Test build #94926 has finished for PR 22135 at commit 5731825.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

("""([ |\t]*?\/\*[\s|\S]*?\*\/[ |\t]*?)|""" + // strip /*comment*/
"""([ |\t]*?\/\/[\s\S]*?\n)""").r // strip //comment
- val codeWithoutComment = commentReg.replaceAllIn(input, "")
+ val codeWithoutComment = commentRegexp.replaceAllIn(input, "")
codeWithoutComment.replaceAll("""\n\s*\n""", "\n") // strip ExtraNewLines


This line also compiles a regex and could be replaced!
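
For context, `String.replaceAll` compiles its pattern argument on every invocation, so the `"""\n\s*\n"""` call above pays the compilation cost each time. A minimal sketch of the suggested replacement, assuming a precompiled `Regex` held in a `val` (the names `NewLineStripper`, `extraNewLinesRegexp`, and `stripExtraNewLines` are hypothetical and chosen here only for illustration):

```scala
import scala.util.matching.Regex

// Hypothetical holder object, used only for illustration.
object NewLineStripper {
  // Compiled once instead of inside every String.replaceAll call.
  private val extraNewLinesRegexp: Regex = """\n\s*\n""".r

  // Collapses runs of blank lines into a single newline.
  def stripExtraNewLines(code: String): String =
    extraNewLinesRegexp.replaceAllIn(code, "\n")
}
```

The behavior is intended to match `code.replaceAll("""\n\s*\n""", "\n")`; only the point at which the pattern is compiled changes.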


@igreenfield igreenfield left a comment


There is still one place that compiles the regex again and again.

@SparkQA

SparkQA commented Aug 21, 2018

Test build #95017 has finished for PR 22135 at commit 643d432.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91
Contributor Author

retest this please

@SparkQA

SparkQA commented Aug 21, 2018

Test build #95038 has finished for PR 22135 at commit 643d432.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

thanks, merging to master!

@asfgit asfgit closed this in 55f3664 Aug 22, 2018