Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-8505][SparkR] Add settings to kick lint-r from ./dev/run-test.py #7883

Closed
wants to merge 6 commits into from

Conversation

yu-iskw
Copy link
Contributor

@yu-iskw yu-iskw commented Aug 3, 2015

@JoshRosen we'd like to check the SparkR source code with the dev/lint-r script on the Jenkins. I tried to incorporate the script into dev/run-test.py. Could you review it when you have time?

@shivaram I modified dev/lint-r and dev/lint-r.R to install lintr package into a local directory(R/lib/) and to exit with a lint status. Could you review it?

# R style check should be executed after `install-dev.sh`.
# Since warnings about `no visible global function definition` appear
# without the installation. SEE ALSO: SPARK-9121.
run_cmd([os.path.join(SPARK_HOME, "dev", "lint-r")])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we place the lint check in run_sparkr_tests then this may result in us running those lint checks after running other tests, which will make it take longer to discover R style failures in scenarios where non-R tests had to be run. As a result, I wonder if we should place this linting check near line 488, by the existing run_python_style_checks() call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah thats a valid point, but the R lint tests have this problem where they don't resolve internal / private functions unless the corresponding package is included (i.e. SparkR in this case) . This is SPARK-9121 that is described in the comment.

FWIW I think we can do this right after the Maven build finishes as the SparkR package would have been built at that point -- so this will be before running Scala unit tests at least.

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 3, 2015

@JoshRosen Sounds good. We could also separate 1.checking R is installed, 2.install SparkR, 3. check style and 4. run R tests. I'll show you alternative implementation soon.

@SparkQA
Copy link

SparkQA commented Aug 3, 2015

Test build #39501 has finished for PR 7883 at commit d654fc8.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor

shivaram commented Aug 3, 2015

Also @yu-iskw I am not sure I understand why we need to install lintr in the same library location as SparkR. That will mean that lintr will be installed on every run in Jenkins and this will take some time, fill up the logs (See text after lintr here [1]) ?

[1] https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39501/consoleFull

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 3, 2015

@shivaram As you said, lintr will be installed on every run in Jenkins. We need to add some validation rules into lintr package, since it is not enough for our style yet. So it is better of installing it on every run. What do you think?

@shivaram
Copy link
Contributor

shivaram commented Aug 3, 2015

I don't think we need to do it on every run. If we need a new version of lintr on Jenkins, then we can ask @JoshRosen or @shaneknapp to install a new version.

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 3, 2015

Yeah, we could also maintain lintr with the way as you mentioned. However, it is a little bothersome to me. And there is a risk that it will be technical debt in future. Since it will depend on persons in charge of the Jenkins. And from the other developer's point of view, it is a little hard for them to understand the mechanism without any document.

FWIW, it seems that PEP8 which is a tool for checking python code is installed on every run with dev/lint-python.

What does @JoshRosen think about the way to update lintr?

@JoshRosen
Copy link
Contributor

I think that @shaneknapp is the right person to manage Jenkins-wide library / package installs since he manages the AMPLab scripts and tooling for that. Since I think Shane is still out this week, I think we'll want to loop in Jon or Matt for help (should be fairly easy given Shane's scripts, but I don't actually know how to deploy those scripts myself).

@JoshRosen
Copy link
Contributor

Also, I don't have a super-strong opinion on whether this should be downloaded on every run vs. installed in Jenkins; downloading on each run makes things easier to change / makes the test setup a little easier to reproduce on fresh Jenkins boxes, but adds a bit of complexity.

@shaneknapp
Copy link
Contributor

i'm actually here until tomorrow, so i can get this installed pretty
easily...

On Mon, Aug 3, 2015 at 1:06 PM, Josh Rosen notifications@github.com wrote:

I think that @shaneknapp https://github.com/shaneknapp is the right
person to manage Jenkins-wide library / package installs since he manages
the AMPLab scripts and tooling for that. Since I think Shane is still out
this week, I think we'll want to loop in Jon or Matt for help (should be
fairly easy given Shane's scripts, but I don't actually know how to deploy
those scripts myself).


Reply to this email directly or view it on GitHub
#7883 (comment).

@shaneknapp
Copy link
Contributor

i'm installing lintr now on all the workers

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 3, 2015

@JoshRosen thank you for your comment.

@shaneknapp I appreciate your support. How did you install lintr? Since the CRAN version is too old to match our style guide, we should use the Github version.

@shivaram how about downloading on each run first? If the complexity will be increased significantly, I think we should be going to manage lintr as a system library.

@shaneknapp
Copy link
Contributor

@yu-iskw -- i did indeed use the cran version:

Rscript -e 'install.packages("lintr", repos="http://cran.stat.ucla.edu/")'

got a link to the more recent version?

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 3, 2015

@shaneknapp Alright. We could install the latest version with install_github from https://github.com/jimhester/lintr.

devtools::install_github("jimhester/lintr")

Hmm, could you wait for @shivaram comment? When we decide to manage lintr, we will let you know. Thanks!

@shivaram
Copy link
Contributor

shivaram commented Aug 3, 2015

Other than downloading the thing that bothers me is that lintr has 33 dependencies which all get downloaded and built from source (you can see the log file I linked to above to see what I mean). Given all the existing flakiness with builds I don't want to introduce a new source of flakiness

How about this for a solution -- In the Jenkins machine, we ask @shaneknapp to install from devtools github now. In the lint-r script we check if the package is available and if not we re-install from devtools. BTW in the future we can extend this to check for a specific lintr version / git-hash if we want a specific version. Does that sound good ?

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 3, 2015

That's a great idea. I will revert the souce code for installing lintr. Thanks!

@shaneknapp Could you install lintr with devtools::install_github as above?

@shaneknapp
Copy link
Contributor

yep, i'm on it.

@shaneknapp
Copy link
Contributor

alright, lintr is installed.

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 6, 2015

@shaneknapp thanks!

@SparkQA
Copy link

SparkQA commented Aug 6, 2015

Test build #40005 has finished for PR 7883 at commit fad3c95.

  • This patch fails some tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 6, 2015

Test build #40008 has finished for PR 7883 at commit 4c8da4d.

  • This patch fails some tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 6, 2015

Test build #40002 has finished for PR 7883 at commit d2f2c30.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor

@yu-iskw This one actually failed a lint test

R/functions.R:74:1: style: lines should not be more than 100 characters.
            jc <- callJStatic("org.apache.spark.sql.functions", "lit", ifelse(class(x) == "Column", x@jc, x))
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
lintr checks failed.
[error] running /home/jenkins/workspace/SparkPullRequestBuilder/dev/lint-r ; received return code 1

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 18, 2015

@shivaram Alright. I'll fix it. Thanks!

@shaneknapp
Copy link
Contributor

On Mon, Aug 17, 2015 at 10:11 AM, Shivaram Venkataraman <
notifications@github.com> wrote:

@JoshRosen https://github.com/JoshRosen There seems to be some problem
on some of the Jenkins workers and we get errors which look like

running git clean -fdx
warning: failed to remove 'target/'
Removing target/
Build step 'Execute shell' marked build as failure

I've seen this in other PRs as well -- Any ideas what is causing this ?

somehow the spark builds are creating directories w/the wrong permissions
(missing the owner write bit), meaning that the directory created from a
previous build can't be deleted and thereby fails the build.

i'll go through all of the workers/spark build dirs first thing tomorrow
and fix this.

@shaneknapp
Copy link
Contributor

found one directory on amp-jenkins-worker-01 that's polluted -- deleting it
now, and this should fix any builds that run there.

On Mon, Aug 17, 2015 at 9:36 PM, shane knapp ☠ incomplete@gmail.com wrote:

On Mon, Aug 17, 2015 at 10:11 AM, Shivaram Venkataraman <
notifications@github.com> wrote:

@JoshRosen https://github.com/JoshRosen There seems to be some problem
on some of the Jenkins workers and we get errors which look like

running git clean -fdx
warning: failed to remove 'target/'
Removing target/
Build step 'Execute shell' marked build as failure

I've seen this in other PRs as well -- Any ideas what is causing this ?

somehow the spark builds are creating directories w/the wrong permissions
(missing the owner write bit), meaning that the directory created from a
previous build can't be deleted and thereby fails the build.

i'll go through all of the workers/spark build dirs first thing tomorrow
and fix this.

@shivaram
Copy link
Contributor

welcome back @shaneknapp !

@shivaram
Copy link
Contributor

Jenkins, retest this please

1 similar comment
@shivaram
Copy link
Contributor

Jenkins, retest this please

@SparkQA
Copy link

SparkQA commented Aug 19, 2015

Test build #41199 has finished for PR 7883 at commit 54365fc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor

So the R lint passed ! Change LGTM -- @JoshRosen let me know if its okay to merge this just to master branch.

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 19, 2015

@shivaram thank you for your support!

@shivaram
Copy link
Contributor

@JoshRosen now that branch-1.5 is mostly stable, is it okay to merge this to master ?

@JoshRosen
Copy link
Contributor

It seems fine to me, although let's re-test quickly to make sure nothing has broken.

@JoshRosen
Copy link
Contributor

Jenkins, retest this please.

@SparkQA
Copy link

SparkQA commented Aug 26, 2015

Test build #41644 has finished for PR 7883 at commit 54365fc.

  • This patch fails R style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor

Aha we did break them style rules in the last week ! -- @yu-iskw we can fix them in this PR or a separate PR

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 27, 2015

@shivaram sure. I'll send another PR to fix them soon.

asfgit pushed a commit that referenced this pull request Aug 27, 2015
Getting rid of some validation problems in SparkR
#7883

cc shivaram

```
inst/tests/test_Serde.R:26:1: style: Trailing whitespace is superfluous.

^~
inst/tests/test_Serde.R:34:1: style: Trailing whitespace is superfluous.

^~
inst/tests/test_Serde.R:37:38: style: Trailing whitespace is superfluous.
  expect_equal(class(x), "character")
                                     ^~
inst/tests/test_Serde.R:50:1: style: Trailing whitespace is superfluous.

^~
inst/tests/test_Serde.R:55:1: style: Trailing whitespace is superfluous.

^~
inst/tests/test_Serde.R:60:1: style: Trailing whitespace is superfluous.

^~
inst/tests/test_sparkSQL.R:611:1: style: Trailing whitespace is superfluous.

^~
R/DataFrame.R:664:1: style: Trailing whitespace is superfluous.

^~~~~~~~~~~~~~
R/DataFrame.R:670:55: style: Trailing whitespace is superfluous.
                df <- data.frame(row.names = 1 : nrow)
                                                      ^~~~~~~~~~~~~~~~
R/DataFrame.R:672:1: style: Trailing whitespace is superfluous.

^~~~~~~~~~~~~~
R/DataFrame.R:686:49: style: Trailing whitespace is superfluous.
                    df[[names[colIndex]]] <- vec
                                                ^~~~~~~~~~~~~~~~~~
```

Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>

Closes #8474 from yu-iskw/minor-fix-sparkr.
@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 27, 2015

Jenkins, retest this please.

@SparkQA
Copy link

SparkQA commented Aug 27, 2015

Test build #41664 has finished for PR 7883 at commit 54365fc.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 27, 2015

Jenkins, retest this please.

@SparkQA
Copy link

SparkQA commented Aug 27, 2015

Test build #41673 has finished for PR 7883 at commit 54365fc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 27, 2015

@JoshRosen All tests have passed. Is it ok to merge this to master?

@shivaram
Copy link
Contributor

Alright I'm going to merge this as its better to do so before more breaking style changes get in. Will watch Jenkins for the next couple of hours to make sure things are fine

@asfgit asfgit closed this in 1f90c5e Aug 28, 2015
@yu-iskw
Copy link
Contributor Author

yu-iskw commented Aug 28, 2015

@shivaram thank you for merging it. I keep watching the Jenkins in a couple of hours. If it will go well, I will inform the community about this lint script.

brennonyork pushed a commit to brennonyork/spark that referenced this pull request Oct 16, 2015
kiszk pushed a commit to kiszk/spark-gpu that referenced this pull request Dec 26, 2015
Getting rid of some validation problems in SparkR
apache/spark#7883

cc shivaram

```
inst/tests/test_Serde.R:26:1: style: Trailing whitespace is superfluous.

^~
inst/tests/test_Serde.R:34:1: style: Trailing whitespace is superfluous.

^~
inst/tests/test_Serde.R:37:38: style: Trailing whitespace is superfluous.
  expect_equal(class(x), "character")
                                     ^~
inst/tests/test_Serde.R:50:1: style: Trailing whitespace is superfluous.

^~
inst/tests/test_Serde.R:55:1: style: Trailing whitespace is superfluous.

^~
inst/tests/test_Serde.R:60:1: style: Trailing whitespace is superfluous.

^~
inst/tests/test_sparkSQL.R:611:1: style: Trailing whitespace is superfluous.

^~
R/DataFrame.R:664:1: style: Trailing whitespace is superfluous.

^~~~~~~~~~~~~~
R/DataFrame.R:670:55: style: Trailing whitespace is superfluous.
                df <- data.frame(row.names = 1 : nrow)
                                                      ^~~~~~~~~~~~~~~~
R/DataFrame.R:672:1: style: Trailing whitespace is superfluous.

^~~~~~~~~~~~~~
R/DataFrame.R:686:49: style: Trailing whitespace is superfluous.
                    df[[names[colIndex]]] <- vec
                                                ^~~~~~~~~~~~~~~~~~
```

Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>

Closes #8474 from yu-iskw/minor-fix-sparkr.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants