Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new dependency types merge broke R dependency resolution #1337

Closed
glennpj opened this issue Jul 22, 2016 · 19 comments
Closed

new dependency types merge broke R dependency resolution #1337

glennpj opened this issue Jul 22, 2016 · 19 comments

Comments

@glennpj
Copy link
Contributor

glennpj commented Jul 22, 2016

The following change from #378 broke dependency resolution in R packages.

 -        for d in extension_spec.traverse():
 +        for d in extension_spec.traverse(deptype=nolink, deptype_query='run'):

After a package is built for R it loads it in a session. It is required that the dependency chain be on the R_LIB path for that to succeed. That broke with the above change and so all second level dependencies would have to be specified as direct dependencies. That then makes dependency specification out of sync with what is on CRAN.

I have another change that I need to make to the R package and I can revert the above change in that PR. However, the above change was also made for the python and lua packages and I do not know if there is a similar issue with those.

@adamjstewart
Copy link
Member

@mathstuf

@mathstuf
Copy link
Contributor

Ah, this is probably because R dependencies should also be nolink rather than the (default) build, link. Is that the right model for R (like Python where modules don't actually link to each other, but load each other at runtime)?

@adamjstewart
Copy link
Member

I have a Python package that depends on numpy, scipy, and matplotlib. I haven't specified the dependency types because I'm not exactly sure what they should be. Anyway, when I activate my Python package, it activates numpy but not scipy or matplotlib. When I try to import my package, it complains that scipy doesn't exist. Is this a bug? Am I just missing the proper dependency types? What should they be?

@glennpj
Copy link
Contributor Author

glennpj commented Jul 22, 2016

@mathstuf
The dependencies that I have in the r- package files are 'import' dependencies of R. Translated to spack, that means they are both build and run dependencies. There are also link dependencies in R, such as for Rcpp. However, in the case of Rcpp it is almost always imported as well so should be treated as a build and run dependency. There are two other dependency categories in R, 'suggested' and 'enhances'. Those would be translated to run time dependencies in spack. In my opinion though, specifying those starts to have spack replicating the R package manager too much. People can always load environment modules of R packages that work together but do not depend on each other. That said, specifying those may be useful as proposed in #1249.

@adamjstewart
It sounds like python might have the same issue as R.

@adamjstewart
Copy link
Member

Adding the nolink deptype to my python package fixed the problem. Sorry about the confusion. I think Python is handled correctly. Seems like the problem may be restricted to R.

@mathstuf
Copy link
Contributor

Yeah, adding type=nolink to all of the R depends_on should work.

@glennpj
Copy link
Contributor Author

glennpj commented Jul 22, 2016

So every r- package, and presumably, every python package, would need to have that specified for each dependency?

@mathstuf
Copy link
Contributor

Yeah, the default is build, link since that is the default for regular libraries. Not sure we would want to change the default based on the dependee name…

@glennpj
Copy link
Contributor Author

glennpj commented Jul 22, 2016

I guess I am not entirely clear on how the deptype searching works but would changing the deptype_query setting in the R package itself be an option?

@glennpj
Copy link
Contributor Author

glennpj commented Jul 22, 2016

I should note that I am not opposed to making the change in the existing package files. What concerns me though going forward is that specifying the deptype may be forgotten when developing extension packages and then lead to odd problems. That may lead to "fixing" the problem by over specifying dependencies in the respective package files. That in turn will lead to diverging views between spack and CRAN about what the dependencies are for R packages. A similar problem could arise with python and pip.

@mathstuf
Copy link
Contributor

It would work, but the thing is that then R packages would be considered for things like rpath by default which could make the paths overly long.

@tgamblin
Copy link
Member

Spack originally had a single deptype that was build, link, and run all in one. With the switch to deptypes we're trying to avoid pulling unnecessary things into the runtime environment, particularly build dependencies. We also made the default deptype build/link since, as @mathstuf said, that's what a typical library wants.

It seems like we didn't make things intuitive enough. We actually punted on R because we did not know what R deptypes should be. The meanings of the deptypes are roughly as follows:

  • build: needed in the PATH at build time (e.g., cmake)
  • link: package needs to link to this. It should be in RPATHs added to the package, and -L$prefix/lib* and -I$prefix/include args are added by Spack's compiler wrappers.
  • run: needs to be in the PATH, PYTHONPATH, etc. at runtime for a package to work, but doesn't need to be linked in or RPATH'd (i.e. doesn't require any changes to linking)

I think packages should ideally be explicit, but I think it's clear this isn't intuitive enough right now, and it might be hard to make this intuitive for beginners. It's a barrier for new packagers.

Would making the default deptype build/link/run make sense? It might reduce confusion for packagers if they do NOT yet understand dep types. This addresses Glenn's concern above.

@glennpj
Copy link
Contributor Author

glennpj commented Jul 22, 2016

@mathstuf
Good point. I will need to try to figure out how things like Rcpp should be specified when packages link to it.

@mathstuf
Copy link
Contributor

If Rcpp is found using R's module loading logic, then it's a run dependency (e.g., numpy is compiled, but Python loads it). If it is expected to be found by ld.so or whatever OS X uses to load libraries, then it is a link dependency.

@glennpj
Copy link
Contributor Author

glennpj commented Jul 22, 2016

@tgamblin
It is not so much a lack of an understanding of deptypes but simply the fact that the extension packages are different from the "standard" packages where the default setting kicks in. Specifying the deptype is something that is likely to be forgotten. In the current scheme it will only show up as breakage on packages with multiple levels of dependencies. Since many extensions do not have multiple levels of dependencies then having it default to build/link/run would have most extensions adding unnecessarily to the environment, which is not what we want.

What about adding a comment to the boilerplate template reminding packagers to set the deptype for extension packages like R? That is probably sufficient to address the concern that I raised, at least for packages created with spack create.

@becker33
Copy link
Member

I agree that if the concern is mostly about misuse, documentation (and specifically a reminder in the template package) is the way to handle it.

From: Glenn Johnson <notifications@github.commailto:notifications@github.com>
Reply-To: LLNL/spack <reply@reply.github.commailto:reply@reply.github.com>
Date: Friday, July 22, 2016 at 10:55 AM
To: LLNL/spack <spack@noreply.github.commailto:spack@noreply.github.com>
Subject: Re: [LLNL/spack] new dependency types merge broke R dependency resolution (#1337)

@tgamblinhttps://github.com/tgamblin
It is not so much a lack of an understanding of deptypes but simply the fact that the extension packages are different from the "standard" packages where the default setting kicks in. Specifying the deptype is something that is likely to be forgotten. In the current scheme it will only show up as breakage on packages with multiple levels of dependencies. Since many extensions do not have multiple levels of dependencies then having it default to build/link/run would have most extensions adding unnecessarily to the environment, which is not what we want.

What about adding a comment to the boilerplate template reminding packagers to set the deptype for extension packages like R? That is probably sufficient to address the concern that I raised, at least for packages created with spack create.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHubhttps://github.com//issues/1337#issuecomment-234611695, or mute the threadhttps://github.com/notifications/unsubscribe-auth/ANUwcDwvSBOF1ZBnloDBIirRRtTg_ewoks5qYQP_gaJpZM4JSu10.

@adamjstewart
Copy link
Member

What about adding a comment to the boilerplate template reminding packagers to set the deptype for extension packages like R? That is probably sufficient to address the concern that I raised, at least for packages created with spack create.

I can do that!

@glennpj
Copy link
Contributor Author

glennpj commented Jul 22, 2016

I will add the type=nolink to the r- package dependencies, reinstall the stack to verify, and submit a PR.

@tgamblin

I think packages should ideally be explicit,...

For the non-r dependencies, the ones that are build, link, should those get specified explicitly or left empty to pick up the default?

Thanks.

@adamjstewart
Copy link
Member

Leave them blank if you just want the default.

olupton pushed a commit to olupton/spack that referenced this issue Feb 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants