[SPARK-8505][SparkR] Add settings to kick lint-r from ./dev/run-test.py #7883
# R style check should be executed after `install-dev.sh`.
# Since warnings about `no visible global function definition` appear
# without the installation. SEE ALSO: SPARK-9121.
run_cmd([os.path.join(SPARK_HOME, "dev", "lint-r")])
If we place the lint check in run_sparkr_tests, it may end up running after other tests, which makes it take longer to discover R style failures in scenarios where non-R tests had to be run. As a result, I wonder if we should place this linting check near line 488, by the existing run_python_style_checks() call.
Yeah that's a valid point, but the R lint tests have this problem where they don't resolve internal / private functions unless the corresponding package is included (i.e. SparkR in this case). This is SPARK-9121, which is described in the comment.
FWIW I think we can do this right after the Maven build finishes as the SparkR package would have been built at that point -- so this will be before running Scala unit tests at least.
@JoshRosen Sounds good. We could also separate this into four steps: (1) check that R is installed, (2) install SparkR, (3) check style, and (4) run the R tests. I'll show you an alternative implementation soon. |
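The four-step split suggested in this comment could look roughly like the sketch below. It is only an illustration, not the code from this PR: `Step`, `run_steps`, `sparkr_pipeline`, and the four callables passed in are hypothetical names, and the real dev script may be organized quite differently.

```python
from typing import Callable, List, NamedTuple, Optional


class Step(NamedTuple):
    name: str
    action: Callable[[], bool]  # returns True on success


def run_steps(steps: List[Step]) -> Optional[str]:
    """Run steps in order; return the name of the first failing step, or None."""
    for step in steps:
        if not step.action():
            return step.name  # stop early so later steps are skipped
    return None


def sparkr_pipeline(r_installed: Callable[[], bool],
                    install_sparkr: Callable[[], bool],
                    lint: Callable[[], bool],
                    run_tests: Callable[[], bool]) -> List[Step]:
    # Order matters: the lint step needs SparkR installed (SPARK-9121),
    # and running it before the R tests surfaces style failures sooner.
    return [
        Step("check R is installed", r_installed),
        Step("install SparkR", install_sparkr),
        Step("check R style", lint),
        Step("run R tests", run_tests),
    ]
```

Keeping each step as a separate callable would let the style check be moved earlier or later in the overall run without touching the install or test logic.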
Test build #39501 has finished for PR 7883 at commit
|
Also @yu-iskw I am not sure I understand why we need to install [1] https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39501/consoleFull |
@shivaram As you said, lintr will be installed on every run in Jenkins. We need to add some validation rules to the lintr package, since it does not yet cover our style. So it is better to install it on every run. What do you think? |
I don't think we need to do it on every run. If we need a new version of lintr on Jenkins, then we can ask @JoshRosen or @shaneknapp to install a new version. |
Yeah, we could also maintain lintr the way you mentioned. However, that seems a little bothersome to me, and there is a risk that it becomes technical debt in the future, since it would depend on whoever is in charge of Jenkins. From other developers' point of view, it would also be hard to understand the mechanism without any documentation. FWIW, it seems that PEP8, a tool for checking Python code, is installed on every run with What does @JoshRosen think about the way to update lintr? |
I think that @shaneknapp is the right person to manage Jenkins-wide library / package installs since he manages the AMPLab scripts and tooling for that. Since I think Shane is still out this week, I think we'll want to loop in Jon or Matt for help (should be fairly easy given Shane's scripts, but I don't actually know how to deploy those scripts myself). |
Also, I don't have a super-strong opinion on whether this should be downloaded on every run vs. installed in Jenkins; downloading on each run makes things easier to change / makes the test setup a little easier to reproduce on fresh Jenkins boxes, but adds a bit of complexity. |
i'm actually here until tomorrow, so i can get this installed pretty On Mon, Aug 3, 2015 at 1:06 PM, Josh Rosen notifications@github.com wrote:
|
i'm installing lintr now on all the workers |
@JoshRosen thank you for your comment. @shaneknapp I appreciate your support. How did you install lintr? Since the CRAN version is too old to match our style guide, we should use the GitHub version. @shivaram how about downloading on each run first? If the complexity increases significantly, I think we should manage lintr as a system library. |
@yu-iskw -- i did indeed use the cran version: Rscript -e 'install.packages("lintr", repos="http://cran.stat.ucla.edu/")' got a link to the more recent version? |
@shaneknapp Alright. We could install the latest version with
Hmm, could you wait for @shivaram's comment? Once we decide how to manage lintr, we will let you know. Thanks! |
Other than the downloading, the thing that bothers me is that lintr has 33 dependencies, which all get downloaded and built from source (you can see the log file I linked to above to see what I mean). Given all the existing flakiness with builds, I don't want to introduce a new source of flakiness. How about this for a solution: on the Jenkins machine, we ask @shaneknapp to install from devtools GitHub now. In the lint-r script, we check if the package is available and, if not, we re-install from devtools. BTW, in the future we can extend this to check for a specific lintr version / git-hash if we want a specific version. Does that sound good? |
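A rough sketch of the "check first, install only if missing" idea from this comment, written in Python for illustration. Everything here is an assumption rather than the actual lint-r script: `run_rscript` and `ensure_r_package` are hypothetical helpers, and the install expression is left to the caller because the thread does not show the exact command.

```python
import subprocess
from typing import Callable


def run_rscript(expr: str) -> int:
    """Evaluate an R expression with Rscript and return its exit code."""
    return subprocess.call(["Rscript", "-e", expr])


def ensure_r_package(package: str, install_expr: str,
                     rscript: Callable[[str], int] = run_rscript) -> str:
    """Install `package` only when R cannot load it.

    `install_expr` is whatever R expression the maintainers choose for the
    install (hypothetical here; not taken from the PR).
    """
    # Make R exit non-zero when the package namespace cannot be loaded.
    check = 'if (!requireNamespace("%s", quietly = TRUE)) quit(status = 1)' % package
    if rscript(check) == 0:
        return "already-installed"  # nothing downloaded or built
    if rscript(install_expr) != 0:
        return "install-failed"
    # Re-check so a silently broken install is still reported.
    return "installed" if rscript(check) == 0 else "install-failed"
```

With this shape, a worker that already has the package pays only for one quick `Rscript` call, which is the flakiness concern raised above; a version or git-hash check could be added to the same `check` expression later.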
That's a great idea. I will revert the source code for installing lintr. Thanks! @shaneknapp Could you install lintr with |
yep, i'm on it. |
alright, lintr is installed. |
@shaneknapp thanks! |
Test build #40005 has finished for PR 7883 at commit
|
Test build #40008 has finished for PR 7883 at commit
|
Test build #40002 has finished for PR 7883 at commit
|
@yu-iskw This one actually failed a lint test
|
@shivaram Alright. I'll fix it. Thanks! |
On Mon, Aug 17, 2015 at 10:11 AM, Shivaram Venkataraman <
i'll go through all of the workers/spark build dirs first thing tomorrow |
found one directory on amp-jenkins-worker-01 that's polluted -- deleting it On Mon, Aug 17, 2015 at 9:36 PM, shane knapp ☠ incomplete@gmail.com wrote:
|
welcome back @shaneknapp ! |
Jenkins, retest this please |
Jenkins, retest this please |
Test build #41199 has finished for PR 7883 at commit
|
So the R lint passed! Change LGTM -- @JoshRosen let me know if it's okay to merge this just to master branch. |
@shivaram thank you for your support! |
@JoshRosen now that |
It seems fine to me, although let's re-test quickly to make sure nothing has broken. |
Jenkins, retest this please. |
Test build #41644 has finished for PR 7883 at commit
|
Aha, we did break the style rules in the last week! @yu-iskw we can fix them in this PR or a separate PR |
@shivaram sure. I'll send another PR to fix them soon. |
Getting rid of some validation problems in SparkR
#7883

cc shivaram

```
inst/tests/test_Serde.R:26:1: style: Trailing whitespace is superfluous.
^~
inst/tests/test_Serde.R:34:1: style: Trailing whitespace is superfluous.
^~
inst/tests/test_Serde.R:37:38: style: Trailing whitespace is superfluous.
expect_equal(class(x), "character")
^~
inst/tests/test_Serde.R:50:1: style: Trailing whitespace is superfluous.
^~
inst/tests/test_Serde.R:55:1: style: Trailing whitespace is superfluous.
^~
inst/tests/test_Serde.R:60:1: style: Trailing whitespace is superfluous.
^~
inst/tests/test_sparkSQL.R:611:1: style: Trailing whitespace is superfluous.
^~
R/DataFrame.R:664:1: style: Trailing whitespace is superfluous.
^~~~~~~~~~~~~~
R/DataFrame.R:670:55: style: Trailing whitespace is superfluous.
df <- data.frame(row.names = 1 : nrow)
^~~~~~~~~~~~~~~~
R/DataFrame.R:672:1: style: Trailing whitespace is superfluous.
^~~~~~~~~~~~~~
R/DataFrame.R:686:49: style: Trailing whitespace is superfluous.
df[[names[colIndex]]] <- vec
^~~~~~~~~~~~~~~~~~
```

Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>

Closes #8474 from yu-iskw/minor-fix-sparkr.
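The lint messages quoted in this commit message appear to follow a `path:line:column: type: message` shape. As a small illustration, the sketch below pulls those fields out of such output. The pattern is inferred only from the lines shown here, not from lintr's documentation, and `Lint` and `parse_lints` are hypothetical names.

```python
import re
from typing import List, NamedTuple


class Lint(NamedTuple):
    path: str
    line: int
    column: int
    kind: str
    message: str


# Assumed shape, based only on the messages quoted in this thread.
_LINT_RE = re.compile(
    r"^(?P<path>[^:\s]+):(?P<line>\d+):(?P<col>\d+): (?P<kind>\w+): (?P<msg>.+)$"
)


def parse_lints(output: str) -> List[Lint]:
    """Return one Lint per message line; source and caret lines are skipped."""
    lints = []
    for raw in output.splitlines():
        m = _LINT_RE.match(raw.strip())
        if m:
            lints.append(Lint(m["path"], int(m["line"]), int(m["col"]),
                              m["kind"], m["msg"]))
    return lints
```

A wrapper script could then report `len(parse_lints(output))` or fail the build when the list is non-empty, though whether the real script works this way is not shown in the thread.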
Jenkins, retest this please. |
Test build #41664 has finished for PR 7883 at commit
|
Jenkins, retest this please. |
Test build #41673 has finished for PR 7883 at commit
|
@JoshRosen All tests have passed. Is it ok to merge this to |
Alright, I'm going to merge this, as it's better to do so before more breaking style changes get in. Will watch Jenkins for the next couple of hours to make sure things are fine |
@shivaram thank you for merging it. I will keep watching Jenkins for the next couple of hours. If all goes well, I will inform the community about this lint script. |
@JoshRosen we'd like to check the SparkR source code with the dev/lint-r script on Jenkins. I tried to incorporate the script into dev/run-test.py. Could you review it when you have time?

@shivaram I modified dev/lint-r and dev/lint-r.R to install the lintr package into a local directory (R/lib/) and to exit with a lint status. Could you review it?