[R][Doc] minor error in Linux installation documentation ('conda' option) for R on CRAN #32512

Closed
asfimport opened this issue Jul 27, 2022 · 3 comments · Fixed by #34298

@asfimport
Collaborator

The documentation for the Linux installation for the r-arrow binary for R is at:

    https://cran.r-project.org/web/packages/arrow/vignettes/install.html

The documentation indicates that the 'conda' installation syntax should be:

    conda install -c conda-forge --strict-channel-priority r-arrow

I can't get that to work.  What works for me is:

    conda config --set channel_priority strict
    conda install -c conda-forge r-arrow

I'm wondering if the syntax presented in the documentation is either deprecated or incorrect.
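
For reference, the two commands that worked can be wrapped in a small script. This is only a minimal sketch, assuming `conda` is on the PATH; the `run` helper, the `install_r_arrow` function and the `DRY_RUN` variable are hypothetical names introduced here for illustration and are not part of conda. By default the script only prints the commands so they can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch: set strict channel priority in the conda config, then install r-arrow.
# DRY_RUN, run() and install_r_arrow() are illustrative names, not conda features.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Always print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

install_r_arrow() {
    # Persist strict priority in the conda configuration
    # instead of passing a per-command flag.
    run conda config --set channel_priority strict
    # Install r-arrow from the conda-forge channel.
    run conda install -c conda-forge r-arrow
}

install_r_arrow
```

Running it with `DRY_RUN=0` would execute the commands for real. Note that `conda config --set` changes the user's conda configuration, so the setting also applies to later installs.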

Environment: Ubuntu 20.04
Reporter: Wayne Smith
Assignee: Jacob Wujciak / @assignUser

Note: This issue was originally created as ARROW-17224. Please see the migration documentation for further details.

@asfimport
Collaborator Author

Jacob Wujciak / @assignUser:
Hello, thanks for the ticket. I have replicated this on Ubuntu 20.04: the command does eventually solve the environment, but it takes a very long time (>1h), which is of course not acceptable. I don't know why this happens but will look into it. Even if there is a fix, we probably want to update the docs...

@asfimport
Collaborator Author

Wayne Smith:
Jacob, I concur. And running conda update -y -n base conda (or similar) beforehand, as is suggested quite often on StackOverflow, doesn't help (and also takes a long time).

The first suggestion in the docs for installing r-arrow on Linux, i.e., installing directly from RStudio (now Posit), is the fastest and works. I just hope the link to the binaries isn't brittle or unreliable (you might want to check that too).

I've also gotten it to work with the 'nightly' version hosted on Apache. The compilation is much slower than the RStudio instructions (again, now Posit) and also needs (as the docs say) the libcurl4-openssl-dev package. However, my experience is that some (non-sudo) users can't install that package on their distro.

One more issue. The RStudio package pull is actually for Ubuntu 18.04, not Ubuntu 20.04 (or even 22.04). It's not clear to me whether that is a bug or a feature over the long run, and it should be documented by RStudio. Even if it is, we might consider documenting that subtle point in the Arrow/Linux/R docs too (just my $.02).

Best,

Wayne

 

@asfimport
Collaborator Author

Jacob Wujciak / @assignUser:

> I just hope the link to the binaries isn't brittle or unreliable (you might want to check that too)

Which link do you mean, the RSPM link ("https://packagemanager.rstudio.com/all/__linux__/focal/latest")? This will always give you the newest version. If you want to pin a certain version, you can check the RSPM docs on how to create a time-stamped link. (But I would probably rather use renv or conda-lock if you require a reproducible environment.)
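
As a rough sketch, installing from that RSPM repository from the command line could look like the script below. It assumes `Rscript` is on the PATH; `RSPM_REPO`, `rspm_install_expr` and `DRY_RUN` are hypothetical names used only for illustration. The `focal` path segment in the URL corresponds to Ubuntu 20.04, so other releases would presumably need their own codename. By default the script only prints the command.

```shell
#!/bin/sh
# Sketch: install the arrow R package from the RSPM repository quoted above.
# RSPM_REPO, rspm_install_expr() and DRY_RUN are illustrative names.
RSPM_REPO="https://packagemanager.rstudio.com/all/__linux__/focal/latest"
DRY_RUN="${DRY_RUN:-1}"

rspm_install_expr() {
    # Build the R expression that is passed to Rscript -e.
    printf 'install.packages("arrow", repos = "%s")' "$RSPM_REPO"
}

if [ "$DRY_RUN" = "0" ]; then
    # Actually run the installation.
    Rscript -e "$(rspm_install_expr)"
else
    # Only show what would be run.
    echo "Rscript -e '$(rspm_install_expr)'"
fi
```

Because the URL ends in `latest`, this always resolves to the newest available version. For a pinned version, the URL would have to be swapped for a time-stamped one as described in the RSPM docs.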

> I've also gotten it to work with the 'nightly' version hosted on Apache. The compilation is much slower than the RStudio instructions

Yes. While we have pre-compiled libarrow binaries and a script that detects which one best matches your distro, we still need to compile the actual R package, which takes ~5 minutes, whereas RSPM (PPM soon :D) supplies package binaries that don't require any compilation. An important note regarding the nightlies: these are 100% brittle, as we only ever keep 14 versions/days around and delete everything else. So if you require reproducibility, I would advise against using them.

I have talked to some conda-forge users, and they all recommend using mamba with conda-forge packages, as it has a much faster solver. That is something that might need to be added to the docs.
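
A minimal sketch of what that could look like: prefer `mamba` when it is installed and fall back to `conda` otherwise. It assumes mamba accepts the same `install -c conda-forge r-arrow` arguments as conda; `pick_installer` and `DRY_RUN` are hypothetical names for illustration. By default the script only prints the command.

```shell
#!/bin/sh
# Sketch: prefer mamba when available, otherwise fall back to conda.
# pick_installer() and DRY_RUN are illustrative names.
DRY_RUN="${DRY_RUN:-1}"

pick_installer() {
    # Print the first installer found on PATH; return non-zero if none exists.
    for tool in mamba conda; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

installer="$(pick_installer)" || installer=""
if [ -z "$installer" ]; then
    echo "neither mamba nor conda found on PATH" >&2
elif [ "$DRY_RUN" = "0" ]; then
    # Actually run the installation with the chosen tool.
    "$installer" install -c conda-forge r-arrow
else
    # Only show what would be run.
    echo "+ $installer install -c conda-forge r-arrow"
fi
```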

thisisnic pushed a commit that referenced this issue Feb 27, 2023
As shown in #34297, using the CLI flag is not enough to avoid the influence of the default channel, which causes extremely long solve times or conflicts.
* Closes: #32512

Authored-by: Jacob Wujciak-Jens <jacob@wujciak.de>
Signed-off-by: Nic Crane <thisisnic@gmail.com>
@thisisnic added this to the 12.0.0 milestone Feb 27, 2023