[SPARK-22344][SPARKR] clean up install dir if running test as source package #19657
Conversation
odd build failure
Jenkins, retest this please
need to update the vignettes build... and I'm realizing that we are downloading the Spark jar twice, once for tests and once for vignettes
retest this please
Test build #83442 has finished for PR 19657 at commit
retest this please
Test build #83458 has finished for PR 19657 at commit
@shivaram could you take a look? I think this would do it.
Thanks @felixcheung - I think the approach looks good. I had some minor questions
```
@@ -152,6 +152,9 @@ install.spark <- function(hadoopVersion = "2.7", mirrorUrl = NULL,
  })
  if (!tarExists || overwrite || !success) {
    unlink(packageLocalPath)
    if (success) {
```
why should this be inside this `if` block for `overwrite || !success` -- can't we just have it outside this `if`, as an `else` below?
basically, if `success` is TRUE then only the other cases matter: `!tarExists || overwrite` -- that's exactly the condition under which the download would have occurred on L126, which is the else case of the `tarExists && !overwrite` check on L123
Hmm, ok, that still looks weird in the code. Maybe add a comment before this of the form "If we downloaded a tarfile or overwrote it, set the sparkDownloaded flag"?
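The flag-setting being discussed can be sketched roughly as follows. This is a simplified, illustrative sketch, not the exact merged code: `recordDownload` is a hypothetical wrapper, and `.sparkREnv` stands in for SparkR's internal package environment.

```r
# Illustrative sketch of the flag logic (not the merged SparkR code).
.sparkREnv <- new.env()

recordDownload <- function(tarExists, overwrite, success) {
  if (!tarExists || overwrite || !success) {
    # On this path the tarball was (re)downloaded rather than reused.
    if (success) {
      # If we downloaded the tarfile or overwrote it, set the flag
      # so a later cleanup step knows the install is removable.
      assign(".sparkDownloaded", TRUE, envir = .sparkREnv)
    }
  }
}

recordDownload(tarExists = FALSE, overwrite = FALSE, success = TRUE)
exists(".sparkDownloaded", envir = .sparkREnv)  # TRUE
```

The point of nesting the flag inside the outer `if` is that a cached, untouched tarball (`tarExists && !overwrite`) never sets it, so cleanup never deletes a Spark install the user already had.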
R/pkg/vignettes/sparkr-vignettes.Rmd
Outdated
````
```{r cleanup, include=FALSE}
# clean up if Spark was downloaded
# get0 not supported before R 3.2.0
sparkDownloaded <- mget(".sparkDownloaded"[1L],
````
Can we make this an internal util function and call it from here?
yes! but both call sites are technically outside of the package, and we would need to call the private function with `SparkR:::`, which is kinda ugly...
I guess we could wrap the entire cleanup thing into a private/internal function (since it has to access a private flag anyway...)
Yeah, let's do that. I might even be fine with exposing an external function, `uninstallSpark` or `uninstallDownloadedSpark`?
since this needs to go into 2.2, let's not add a public method for now; we can revisit this for 2.3
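The internal helper agreed on here could take roughly the following shape. The names and details below are hypothetical (this is not the merged SparkR code), but it shows why `mget()` with `ifnotfound` is used instead of `get0()`, which is unavailable before R 3.2.0.

```r
# Hypothetical internal cleanup helper (illustrative only).
# .sparkREnv stands in for SparkR's internal package environment.
.sparkREnv <- new.env()

uninstallDownloadedSpark <- function(sparkHome) {
  # mget() with ifnotfound works on R < 3.2.0; get0() does not exist there.
  flag <- mget(".sparkDownloaded", envir = .sparkREnv,
               ifnotfound = FALSE)[[1]]
  if (isTRUE(flag)) {
    # Only delete the install if this session actually downloaded it.
    unlink(sparkHome, recursive = TRUE)
  }
}
```

Keeping this private (called via `SparkR:::` from the test runner and vignette) avoids committing to a public API before 2.2, per the comment above.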
R/pkg/tests/run-all.R
Outdated
```
parentDir <- SparkR:::sparkCachePath()
dirs <- list(parentDir, dirname(parentDir), dirname(dirname(parentDir)))
lapply(dirs, function(d) {
  if (length(list.files(d, all.files = TRUE, include.dirs = TRUE, no.. = TRUE)) == 0) {
```
One consequence of this is that if we run `R CMD check --as-cran`, we will do the download twice -- once for the unit tests and once for the vignettes
yes, it would, as commented above in #19657 (comment)
The problem is we have no idea whether the vignettes build is going to happen or not (it could easily be disabled via the command line)
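The two scenarios being weighed look like this from the command line (standard `R CMD check` flags; the tarball name is illustrative):

```shell
# Full check: unit tests and vignettes each trigger install.spark,
# so the Spark tarball may be downloaded twice.
R CMD check --as-cran SparkR_2.2.1.tar.gz

# Vignette building can be skipped, in which case only the
# unit-test download happens.
R CMD check --no-build-vignettes SparkR_2.2.1.tar.gz
```

Since the package code cannot tell in advance which of these was invoked, each phase conservatively cleans up after itself, at the cost of a possible second download.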
Test build #83542 has finished for PR 19657 at commit
Test build #83544 has finished for PR 19657 at commit
R/pkg/tests/fulltests/test_utils.R
Outdated
```
        "c:\\Users\\user\\AppData\\Local\\Apache\\Spark",
        "c:\\Users\\user\\AppData\\Local\\Apache")
} else {
  dirs <- traverseParentDirs("/Users/user/Library/Caches/spark/spark2.2", 1)
```
can we also test the linux one (`/home/user/.cache`)? Just want to make sure we will not miss hidden files / directories.
sure, but hopefully the implementation is not platform dependent; otherwise we would need to test linux as well as osx
(and it doesn't check whether the path is valid/present)
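By analogy with the macOS case in the excerpt above, a Linux variant of the test would exercise the hidden `.cache` directory. This sketch depends on SparkR's private `traverseParentDirs()` helper, so it is illustrative rather than runnable stand-alone, and the expected values are inferred from the macOS case, not copied from the merged test:

```r
# Illustrative Linux case (hypothetical expected values):
# traverseParentDirs(path, n) should return the path itself plus
# up to n parent levels, without being confused by the hidden
# ".cache" component.
dirs <- SparkR:::traverseParentDirs("/home/user/.cache/spark/spark2.2", 1)
# expected, by analogy with the macOS case:
#   "/home/user/.cache/spark/spark2.2"
#   "/home/user/.cache/spark"
```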
Thanks @felixcheung -- The AppVeyor test seems to have failed with the following error
@HyukjinKwon could you also take a quick look at this PR?
Test build #83553 has finished for PR 19657 at commit
Test build #83554 has finished for PR 19657 at commit
AppVeyor still has an error
Will take a look today (KST).
Yup, I just checked it too and was writing a comment... The current change should pass :).
Test build #83582 has finished for PR 19657 at commit
build failure this time
@HyukjinKwon hey, I think the AppVeyor test run is just timing out after 1 hr 30 min - is there a way to up the timeout?
I actually took a look at decreasing the build time (as you know) and am currently away from it. If I remember correctly, what I observed was that a single(?) particular test takes 20-ish(?) minutes. It was related to ML in R. Let me try to take a look again first, and I will leave some comments about what I investigated in SPARK-21693 if I can't deal with it myself (probably due to my limited ML knowledge). If it's actually not that simple, then let me ask to increase the timeout to 2 hours (like on my own account).
ok thanks. In that case, would you mind cherry-picking these changes into your account to run under AppVeyor - fixing the test run is lower priority than getting this merged to kick off 2.2.1... :) thanks
Sure, I can! ... but it seems something has gone wrong with package installation... Let me trigger it from my account anyway. I can retrigger it.
Build started: [SparkR]
ouch
Looks like mine was set back to 1.5 hours recently...
Oh, but the test passed now with the Apache account here.
…package

## What changes were proposed in this pull request?
remove spark if spark downloaded & installed

## How was this patch tested?
manually by building package
Jenkins, AppVeyor

Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #19657 from felixcheung/rinstalldir.
(cherry picked from commit b70aa9e)
Signed-off-by: Felix Cheung <felixcheung@apache.org>
thanks! merged to master, 2.2. |
What changes were proposed in this pull request?
Remove the Spark install directory if Spark was downloaded and installed during the run.
How was this patch tested?
Manually, by building the package; also via Jenkins and AppVeyor.