Isolate conda R from system user library.#65
Isolate conda R from system user library.#65jdblischak wants to merge 5 commits intoconda-forge:mainfrom
Conversation
|
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR ( |
|
Thanks so much for doing this @jdblischak and sorry for the slow review. Do you (or any of @conda-forge/r) think we should add an env var around all of this as an escape hatch to bring back the old behaviour? |
|
An env solution to invoke the old behavior would be welcome, but not a deal breaker (I've lost track of the number of people who have been bitten by libraries in their home directories). |
@mingwandroid I think that's fine. However, I think it would only work if the user has explicitly set A few follow-up questions:
And a general question, what is the purpose of the subsection |
Yeah, I propose adding env() checks to your patch. I'll try to schedule some time when I next update default's R to build on top of this PR.
We do not want any difference between R in conda-forge and the R in defaults.
Not sure! I guess conda-forge should document it and we should also document it.
This specifies which files (from the recipe, and not from ${SRC_DIR}, those are specified via |
@mingwandroid That'd be great. I'm not familiar with env() checks.
100% agree.
Yeah, that is an Issue. Especially for conda-forge since its docs are about the channel, not how to use any given package that it distributes. The Anaconda docs has a section on using R, so that could be a potential options. https://docs.anaconda.com/anaconda/user-guide/tasks/use-r-language/
OK. Makes sense. Thanks for the explanation! |
|
@mingwandroid I was just reminded about this PR from a colleague whose code broke because the R packages were being loaded from his user library instead of the conda environment. I think this isolation is very important for conda R users. If the |
|
I've added support for an environment variable |
|
@msarahan @jjhelmus @katietz Could you please review this PR to r-base? I am attempting to isolate R installed in a conda environment from the user's local R library (which always takes precedence when searching for R packages). Any feedback would be much appreciated. If you approve of the approach, I will start working on support for Windows. Thanks! |
|
Sounds fine to me. Sorry not a detailed review in the least! Pinging @katietz |
|
@jdblischak This is used to evaluate whether the parameter is set. ${var+x} is a parameter expansion which evaluates to nothing if var is unset, and substitutes the string x otherwise. See this Stackoverflow post for details. |
@halldc Thanks for the explanation and link. The key for me from that SO answer was that using |
|
Just got the email from the R Project: "R 3.6.1 scheduled for July 5". It would be nice if we could get this feature into the conda-forge release of r-base 3.6.1. |
|
For any users that find this thread and need a short-term solution, @dpryan79 provided a good solution in conda-forge/r-rjags-feedstock#6 (comment): This will replace your default user library with the system library path (which is where conda R packages are installed). To automate this, you could put it inside of an |
|
Can this be retried now that 3.6.1 is out? |
katietz
left a comment
There was a problem hiding this comment.
Sorry, that review took a bit. Change looks sensible to me.
As 3.6.1 is out now, is that patch applying to new version too?
Btw in my opionion we should update soon to 3.6.1 version
@h-vetinari I rebased the PR against R 3.6.1. Could you please test it out? |
@katietz Thanks for the review! Can I interpret this as you being open to the same patch to the r-base recipe in the defaults channel? I'd prefer if the two behaved the same in regards to the user library. |
I can build on my personal laptop, but in the environment where I actually need it, I can't pull stuff from github (but we have a proxy for conda-forge)... There's some complicated ways around it, but even then I'd have to wait until conda-forge supports an r=3.6.1 install. That migration is in progress, at least... |
|
Yes. I am openminded to put patch also in r-base default channel for 3.6.1 |
|
@h-vetinari The migration is effectively finished if you can try things out. |
|
@dpryan79 In a fresh conda install, doing fails with Removing |
|
@dpryan79 *Or so I thought; actually, all the packages that make up |
This section of PEP 405 defines the behavior of EDIT: there was a better subsection to link to |
|
I acknowledge that as both a package manager and environment manager, it may be impossible for conda to perfectly handle both of these situations as there isn't a strong distinction between a "root" environment and a normal environment. So it is difficult to really disentangle these competing situations. |
|
I've needed to use both cases for R. For example, I configured the rstudio server at my old institute to use various versions of conda-installed r-base. This worked quite well and was much more convenient than compiling and installing various versions manually. Users then used the normal methods to install packages in their home directories (the default locations). At the same time our pipelines used r-base in preconfigured envs and then just had the following in their Rscripts: Then users could run scripts and everything in their home directories would be ignored. So you don't need an environment variable, if you want only-conda libraries loaded just add that one line. |
@isuruf They could use
Thus I'd prefer the default behavior to work well with the recommended installation guidelines, and then provide workarounds for the non-standard system installation of conda (especially since a systems administrator installing R for a group of users should be comfortable with the extra steps).
@isuruf Do you have suggestions for the workaround? Here is a potential workaround I came up with. Users could put the following in their
@dpryan79 While this works, I don't like that this solution requires R to be installed with conda. I don't think an R script should be concerned with implementation details like whether or not R was installed via conda or not. |
That sounds like a good idea and maybe a conda-forge package just for that might fix this. We can warn in the activate script here and ask the user to install that package. |
|
The following workaround doesn't work on Windows, because |
@jamesmyatt Thanks for reporting this shortcoming with the proposed solution. Below is an updated version that should work on both Windows and Unix-alike operating systems. I tested it on Windows 10.
You're correct that |
|
A quick update on this PR: As seen above, there is a debate about whether this "isolation" feature should be implemented. The main issue is that because conda is used in so many different ways, it is difficult to find the right behavior that will be intuitive for all of the use cases. I've abandoned this effort for now because the situation is actually fixed well enough for me on Linux. I don't know why or how the change was made, but now the conda recipe changes the value of the "platform" variable when installing R on Linux. Because the default path to the user library uses this platform variable, when I activate a conda environment in Ubuntu, it ignores my user packages. Thus it's properly isolated. Unfortunately conda R is still not isolated in the following situations:
My macbook died, so I don't know what the current behavior is on macOS. I'd appreciate it if someone could run |
|
Might be worth mentioning that the |
@mfansler Yes! That is very relevant. It's a shame that it's been around for over 3 years, and I had no idea about it. Unfortunately I assume the only way to really learn about it is through word of mouth. Thanks for sharing! To clarify though, this solves the problem in the situation where a user has explicitly set Lastly, for the benefits of others that like me were not aware of |
I currently have access to a macOS (Catalina 10.15.7) machine through work. I checked the platform, and unfortunately "conda" is not inserted into the name. However, you could get lucky if your version of macOS differs from the version in which the r-base binary was built. In my case, the user-installed packages wouldn't conflict, but they very easily could on someone else's machine. # homebrew-installed R
# R 4.1.3
> R.version$platform
[1] "x86_64-apple-darwin19.6.0"
# conda-installed R
# r-base==4.1.2=h2b051ba_0
> R.version$platform
[1] "x86_64-apple-darwin13.4.0" |
|
A few updates relevant to this PR:
https://github.com/wch/r-source/blob/tags/R-4-2-0/doc/NEWS.Rd#L324
I'm not sure what to do with this PR. On the one hand, I'm aware at this point that it is going on almost 4 years, and has zero chance of being merged. Thus I'm not planning on updating it. On the other hand, I think it serves as very useful documentation. Users of conda-installed R are typically surprised when they find out that their conda envrionments are using their locally installed packages (and are thus not at all isolated or reproducible). I know I was. Furthermore, this PR makes it clear that it's not enough to simply temporarily clear |
|
Here 4 years after the last comment, 8 years after this PR was opened, I have been tripped up by this too many times to count now. Could there be any interest in revisiting this? Possibly as a new PR, but maybe revisiting this discussion? |
|
See #65 (comment) |
|
I'm well aware of conda-ecosystem-user-package-isolation, but it falls short on MacOS: It sets And I think fundamentally it makes sense that a conda package operates with what is provided inside an environment - not what's outside of it. |
|
We can introduce an environment variable to disable that here and set that variable in |
|
I have made an isolated The recipe is really a modified branch of the this feedstock (https://github.com/roaldarbol/r-base-feedstock), so changes made here will make it across to this build easily. |
💯
Nice! Thanks for updating my patch for the latest version of R. From a quick skim, my only concern is that you modified my test scripts. It's been 8 years now, so things might have changed, but I recall that the problem of the default user library path would only surface when it already existed. Thus this will pass in CI, but on an R user's machine where they have the default user library path, then local packages could leak in again (if not properly isolated). In other words, I suspect that |
I'm not following how this would work. The default user library path is defined when R is built in this feedstock. How could we introduce an environment variable that would make this build-time configuration option be influenced by a conda package installed at run-time? |
|
I saw some activity on this thread and wanted to mention that we recently had a report of a new flag We're still using our own activate/deactivate hooks (they got a bit better recently). Cheers from the past and likely future. |
Checklist
0(if the version changed)conda-smithy(Use the phrase@conda-forge-admin, please rerenderin a comment in this PR for automated rerendering)It's a constant frustration for me that conda R uses the R packages installed in the R user library. This is also a major limitation when trying to convince new conda users about the reproducibility benefits of conda.
This PR attempts to do the following:
Note to future self: I created the patch by 1) cloning r-source, 2) editing and committing etc/Renviron.in, and 3) running
git format-patch HEAD~1.This issue has previously been discussed in #37.
cc: @mingwandroid @ebolyen @bgruening @isuruf