
Add two options diy #2771

Closed · wants to merge 1 commit

Conversation


@krafczyk krafczyk commented Jan 7, 2017

Fixes #2573.

Add -j to diy, which accepts the number of jobs to use for building.

Add -d, which accepts the source path to use for the DIY build.


@tgamblin tgamblin left a comment


Thanks! A couple of minor change requests.

@@ -91,13 +101,21 @@ def diy(self, args):
        tty.msg("Uninstall or try adding a version suffix for this DIY build.")
        sys.exit(1)

    source_path = None


You don't actually need this line -- ifs in Python don't have their own scope.
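A minimal illustration of that point (not the PR's code; the names below are made up):

import os

arg = None                          # stands in for args.source_path

if arg is None:
    source_path = os.getcwd()       # first assignment happens inside the if
else:
    source_path = arg

print(source_path)                  # still defined here; the if introduced no new scope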

    '-j', '--jobs', action='store', type=int,
    help="Explicitly set number of make jobs. Default is #cpus.")
subparser.add_argument(
    '-d', '--source-path', dest='source_path',


can add default=None here to simplify the logic below...
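A small sketch of that suggestion (the option names come from the diff above; the standalone parser and the help text are just for illustration):

import argparse

parser = argparse.ArgumentParser('diy')   # stand-in for the real subparser
parser.add_argument(
    '-d', '--source-path', dest='source_path', default=None,
    help="path to the source directory (illustrative help text)")

args = parser.parse_args([])
assert args.source_path is None           # with -d omitted, the default applies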

@@ -91,13 +101,21 @@ def diy(self, args):
        tty.msg("Uninstall or try adding a version suffix for this DIY build.")
        sys.exit(1)

    source_path = None
    if args.source_path is None:


This can be shortened to:

source_path = args.source_path
if source_path is None:
    source_path = os.getcwd()

@@ -62,6 +68,10 @@ def diy(self, args):
    if not args.spec:
        tty.die("spack diy requires a package spec argument.")

    if args.jobs is not None:


Since this is a common argument with install, can you factor it (and the corresponding arg from install) into spack.cmd.common.arguments? Look at how the find, install, module, setup, and spec commands do it for examples.
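The general shape of that pattern, as a sketch only (the module path is the one named above, but the helper name and the exact argument spec are assumptions, not Spack's verbatim API):

# hypothetical spack/cmd/common/arguments.py: one shared definition of -j/--jobs
_shared_args = {
    'jobs': (('-j', '--jobs'),
             {'action': 'store', 'type': int, 'dest': 'jobs',
              'help': 'explicitly set number of make jobs; default is #cpus'}),
}

def add_common_arguments(subparser, names):
    """Attach the named shared options to a command's subparser."""
    for name in names:
        flags, kwargs = _shared_args[name]
        subparser.add_argument(*flags, **kwargs)

# each command (install, diy, ...) would then just call:
#     add_common_arguments(subparser, ['jobs'])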



Such factoring has already been done (extensively) in #2664. Please base your changes off of that PR.


krafczyk commented Jan 7, 2017

@tgamblin I've incorporated your suggested changes. -j is now a common argument and the source_path logic has been simplified.

Actually, I've started to wonder why diy is a separate command from install at all. An option to pass a source directory could be added to install, which would duplicate the functionality of diy as far as I can tell. diy only seems to create a DIYStage object with the path of the source directory. The user is still in charge of picking a spec for that build, though. Thoughts?


citibeth commented Jan 7, 2017

@krafczyk Please see #2664. It suggests plans to re-do spack diy in ways similar to how spack setup is refactored there. Tell me if you want to contribute DIY stuff directly to #2664 and I'll get you write access to the #2664 branch. If not, I'm happy to eventually do it myself (including the -j flag from this PR).

The changes to spack setup, which I'd like to also see for spack diy, include:

  1. As much code as possible is factored out and re-used from spack install. In particular, that includes all the command-line arguments (inconsistent args between spack install, spack setup and spack diy have been a problem).

  2. I'd like to see spack diy integrated so that it is no longer (necessarily) a separate command. This lets you do spack diy on a whole set of packages within a DAG, not just the top one. For example, suppose A -> B -> C -> D. You might then run:

spack install --diy B /path/to/b --diy D /path/to/d A

This would install A and C "normally" while doing a spack diy install on B and D.

These changes and functionality are already working for spack setup. They just need to be ported to spack diy.
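Just to make the parsing of such a command line concrete, an argparse sketch (option names taken from the example above; this is not #2664's actual implementation):

import argparse

parser = argparse.ArgumentParser('install')
parser.add_argument('--diy', nargs=2, action='append', default=[],
                    metavar=('PKG', 'PATH'),
                    help='build PKG from the existing source tree at PATH')
parser.add_argument('specs', nargs='+', help='specs to install normally')

args = parser.parse_args(
    ['--diy', 'B', '/path/to/b', '--diy', 'D', '/path/to/d', 'A'])

diy_paths = dict(args.diy)       # {'B': '/path/to/b', 'D': '/path/to/d'}
print(diy_paths, args.specs)     # ['A'] installs normally; B and D are DIY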


@citibeth citibeth left a comment


I would like to see the ideas here integrated into #2664 and submitted together as one PR.


tgamblin commented Jan 8, 2017

@citibeth: this PR is a relatively small change, and I think #2664 is a much larger one that I haven't had a chance to look at. I would favor accepting this and rebasing #2664 on top of it for now. Especially the -j option here seems mostly orthogonal to #2664, so I wouldn't put such a huge requirement on @krafczyk to integrate with it...

Also, IMHO the semantics of DIY are very different from those of install. I can believe diy needs a better name, but given that the two have very different argument formats I think they can remain separate commands.

All that said, I do really like the multi-setup stuff in #2664, but I wonder if it would be better implemented as a project file, and not just command line options to diy. For example, Bundler has really caught on for installing package dependencies -- not just in Ruby but in other package managers as well. I think there's some overlap there with multi-setup. Conda environments, which have a YAML format, are another example of a way we could potentially improve some of that UI. What do you think?


krafczyk commented Jan 8, 2017

Just to give you guys a bit of perspective of where I'm coming from, I created a small python script a while back which assisted me in building and installing software from source here: https://bitbucket.org/krafczyk/buildpkg

Essentially I used it as a kind of adapter to unify the build methods and arguments from several different build systems (only cmake and autotools at the moment). It seemed to me that there were 3 locations the user might want control over:

  1. The source directory
  2. The build directory
  3. The install directory

spack already takes care of 1. in the normal mode, but provides the user the opportunity to define it with DIY builds. 2. is similar to but not the same as the stage directory. This is because it may be desirable to do an out of tree build for some packages. This is especially important for DIY builds since build files may pollute the source directory without this option. 3. is of course already managed by spack, so I don't think it needs to change, but maybe the odd user would want the option to set the install directory anyway. I can see this being accomplished with spack view, though.

The install process in spack feels like it should be essentially the same except it does magic to manage all the packages for you. This amounts to automatically finding source, managing the build directory and managing install directories and keeping track of what is and is not installed.

Then, the only difference between my buildpkg script and spack install is that buildpkg must be called by the user in exactly the right way each time, while spack does that for you and much more. So the build process is like this:

  1. Build a DAG for the package (or packages) (spack does this)
  2. Build and install the necessary dependencies (do 1-5 for each dependency) (spack does this)
  3. Get the source (user should be able to define where i.e. DIY build) (spack and buildpkg do this)
  4. Build the source possibly out of tree (user should be able to tell spack to use an out of tree build instead of the usual stage variety) (spack kinda does this, buildpkg does this)
  5. Install target package (spack and buildpkg do this)

To me, spack install, spack diy, and spack setup are essentially the same command except they do different parts of the process differently.

spack diy allows the user to choose where to get the source directory,
spack setup does everything except install the ultimate package and provides a script for that instead. (@citibeth correct me if I'm wrong about this)

@tgamblin: I certainly agree the semantics of spack diy and spack install are currently different. However I see no reason why they ought to be. They're doing essentially the same thing except where they're getting the source from is different.

So, to me the bitrot is quite unnecessary. Everything ought to go in one command spack install with appropriate options to trigger the desired effects similar to what @citibeth proposed above, and what buildpkg does.

You could build A from source and get a setup script for B with the same command with something like:
spack install A --source-dir=<path/to/A> B --setup-script-only
would be my preferred way of doing it with descriptive options, but
spack install --diy A <path/to/A> --setup B
would work as well.

This would also allow the possibility of requesting multiple packages at the same time from install and essentially joining their DAGs where possible to create a consistent runtime environment.

spack install --join-dags A B C

A, B, and C are not necessarily dependent on each other, but perhaps share dependencies somewhere in their trees. I suspect, based on what is in #2664, that @citibeth has probably already implemented much of what is needed there.

Finally, to tack on the end, spack should allow the ability to do out-of-tree builds, especially for DIY-style building. This is useful especially for large projects with long build times, where you may need to rebuild to test new features and don't want the whole build process to run each time.

It would look something like this:

spack install --out-of-tree --keep-build-directory A


tgamblin commented Jan 8, 2017

@krafczyk: That sounds like a pretty good breakdown of the problem to me.

Thoughts I had while reading this:

To me, spack install, spack diy, and spack setup are essentially the same command except they do different parts of the process differently.

I agree that these are all different ways to tweak the build pipeline. You may have already seen that package.py now lets subclasses declare arbitrary sets of phases (thanks to @alalazo). Examples of that are in lib/spack/spack/build_systems -- we've got MakefilePackage, CMakePackage, AutotoolsPackage, etc. That came in via #1186. So, most of the infrastructure is consolidated in spack.package.Package and subclasses.
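In rough outline, a package under that scheme looks something like this (a sketch; CMakePackage is one of the base classes named above, but the fields and hooks shown here are illustrative rather than Spack's verbatim API):

from spack import *   # package DSL: CMakePackage, version, depends_on, ...

class Example(CMakePackage):
    """Hypothetical package: the base class supplies the cmake/build/install phases."""

    homepage = "https://example.org"
    url      = "https://example.org/example-1.0.tar.gz"

    version('1.0', 'deadbeefdeadbeefdeadbeefdeadbeef')

    def cmake_args(self):
        # only the configure phase is customized; the other phases are inherited
        return ['-DENABLE_FOO:BOOL=ON']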

So, to me the bitrot is quite unnecessary. Everything ought to go in one command spack install with appropriate options to trigger the desired effects similar to what @citibeth proposed above, and what buildpkg does.

I think setup, diy, and install could be consolidated more. I guess I'm up in the air on the syntax, but I'm open to suggestions. I'm pretty sure the current syntax isn't the best one. I do like the fact that I can run spack fetch, spack stage, and now spack configure (for some packages) independently. It's useful for debugging and I think it's more intuitive than adding a bunch of options to install. e.g., @alalazo had suggested spack install --stop-at PHASE but I thought that wording required the user to understand an awful lot about the build pipeline to really know what it meant.

I like the fact that the syntaxes you and @citibeth are proposing let the user pick the parts of a build they want to work on. --diy A /path/to/A --diy B /path/to/B is powerful. I could install different versions of C and diy a couple of its dependencies. It seems like it would also be useful if the command would just check out the source in those locations if the paths don't exist yet. Or maybe that should be a flag, e.g. --stage-if-not-present. Maybe you'd also want something like --force-restage. Cramming all that information into one command might be hard.

Worst case we can probably add the args you're suggesting to spack install and make diy and setup thin wrappers (and maybe rename them).

(2) is similar to but not the same as the stage directory. This is because it may be desirable to do an out of tree build for some packages. This is especially important for DIY builds since build files may pollute the source directory without this option

There are a lot of packages that do "out of tree" builds in the sense that they make a separate directory (grep for with working_dir in the packages). But, it's still "in" the source tree, and we assume a stage directory per build. This is mostly because there are some builds that just don't work out of source (because their authors either don't know how to or haven't had occasion to care).
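That existing "separate directory, but still inside the stage" pattern looks roughly like this in a package's install() (a sketch; working_dir is the helper mentioned above, while cmake, make, and std_cmake_args stand for the build-time commands and flags Spack makes available -- treat the specifics as illustrative):

from llnl.util.filesystem import working_dir

def install(self, spec, prefix):
    # build in a sibling directory so object files never land next to the sources
    with working_dir('spack-build', create=True):
        cmake('..', *std_cmake_args)
        make()
        make('install')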

That said, I think it would be great to try to make most packages build out of tree by default, and to figure out how to modify the package API to support that. The issues I see are:

  1. Arguments to commands like cmake have to change for out of source builds.
  2. We'd need to change the way we create the configure Executable object for autotools (it needs to point to $SRCDIR/configure even if install() starts in the build dir).
  3. You're probably screwed for many Makefile-based builds.
  4. I'd have to look into what Python, R, and other types of builds do.

I think packages will end up having to say whether or not they support out of tree builds. But if we do this, then we can have packages do things like create build stages with consistent names -- most packages use 'spack-build' right now -- and that should help for things like spack setup if the author ends up wanting to build a package many times for one source directory.

Thoughts?


citibeth commented Jan 9, 2017

To me, spack install, spack diy, and spack setup are essentially the same command except they do different parts of the process differently.

I agree that these are all different ways to tweak the build pipeline.

Well... yes and no. spack setup is not just a rearrangement of existing build phases; it creates a closure of one of the phases that the user may run at a later date.

Or to put it more formally... if install() normally looks like this:

def install():
    fetch()
    stage()
    configure()
    build()
    write_db()
    write_module()

then spack setup does this:

def setup():
    write_db()
    write_module()
    return configure

It then leaves the user to run the returned configure() function (as represented by spconfig.py). The user also has to run the build() phase manually; but that is as easy as typing make install.

One particularly unusual thing about spack setup is that it "installs" the package to the database and module directory before it's actually been built. This allows other packages to be set up on top of it (but not installed on top of it, unless the user completes the process).

You may have already seen that package.py now lets subclasses declare arbitrary sets of phases (thanks to @alalazo).

We should probably move spconfig.py generation into one of these arbitrary phases. Formerly it was a hack. #2664 moves it to a method on CMakePackage / AutotoolsPackage, etc. But it sounds like we should go all the way and make it a phase like the others.

So, to me the bitrot is quite unnecessary. Everything ought to go in one command spack install with appropriate options to trigger the desired effects similar to what @citibeth proposed above, and what buildpkg does.
I think setup, diy, and install could be consolidated more.

The initial motivating factor of #2664 was that spack setup had been broken. Such breakage had happened at least once before: someone modifies spack install without taking the appropriate steps on spack setup as well. By sharing as much code as possible with spack install, I minimize the chance that I will have to keep re-writing spack setup as it is broken again and again.

More recently, I decided I need a spack setup that works on more than one package at once. The only feasible way to do this that I could think of was to make the setup process an integral part of Package.do_install(). At that point, spack setup and spack install are essentially the same command. I left a separate spack setup for kicks if you just want to install one item, but it just calls through to the main do_install() method used by everyone else.

I guess I'm up in the air on the syntax, but I'm open to suggestions. I'm pretty sure the current syntax isn't the best one.

I agree, the current syntax isn't really thought through. But I'm not inclined to think more deeply about that UI issue. If someone has an alternate syntax that they like better, I'm happy to use it.

I like the fact that the syntaxes you and @citibeth are proposing let the user pick the parts of a build they want to work on. --diy A /path/to/A --diy B /path/to/B is powerful. I could install different versions of C and diy a couple of its dependencies.

This isn't just a nice byproduct of the syntax; it is a necessary feature that any syntax needs to support.

It seems like it would also be useful if the command would just check out the source in those locations if the paths don't exist yet. Or maybe that should be a flag, e.g. --stage-if-not-present. Maybe you'd also want something like --force-restage.

Yes, that sounds like a logical feature, but I'm not sure I really need it or would use it. I would want to hear from someone who REALLY WANTS it; then we know that implementing it actually solves at least ONE person's problem.

Cramming all that information into one command might be hard.

Not any harder than cramming a zillion variants and ^ clauses on a Spack command line, before packages.yaml became more functional. Actually, it's easier.

OTOH... I agree, some kind of "project setup" file would be nice. Then we just give the project setup file on the command line. How about if we do it this way:

  1. Create a configuration YAML file that can specify this stuff (or add it to packages.yaml). Specifically... it would specify which packages or package versions to setup or diy instead of install. If someone thinks this grammar through carefully, I'd be happy to use it.

  2. Allow users to (optionally) put all YAML files from a single scope in one file, rather than in separate files (I believe the YAML grammars are already set up to enable this). This makes it more convenient to have a zillion small, special-purpose scopes (for example, a scope whose sole purpose is to tell Spack which packages we want to setup instead of install).

  3. Allow users to specify YAML scopes on the command line (Command Line Scopes #2686)

With these three steps, I could create a "project" file that tells Spack (for example) which packages I intend to setup vs. DIY vs. regular-build.
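Purely to make the idea concrete, a hypothetical grammar for such a file (nothing like this exists yet; every key name below is invented):

# project.yaml -- a small scope passed on the command line (see #2686)
build_modes:
  setup:
    - mypackage                        # generate spconfig.py, register it, build it later
  diy:
    mylibrary: /home/me/src/mylibrary  # use this existing source tree
  # everything else in the DAG installs normally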

Allowing YAML to proliferate like this could be powerful. Of course, not all YAML options or config files would be appropriate in a "permanent" configuration (such as in ~/.spack). For example, a file telling Spack to setup (not install) package Y is probably NOT appropriate for a "permanent" configuration. But it would go well in configurations provided on the command line. I'd prefer that we indicate to users our opinions on these issues, but don't try to stop them from putting configurations where they want them.

Your thoughts on this?

Worst case we can probably add the args you're suggesting to spack install and make diy and setup thin wrappers (and maybe rename them).

#2664 already did that. I suppose, then, that #2664 is worst case.

and that should help for things like spack setup if the author ends up wanting to build a package many times for one source directory.

If you want to build a (CMake) package many times for one source directory, just do spack setup, which gives you an spconfig.py file. Think of spconfig.py as a pre-configured stand-in for the cmake command. At that point, you can run spconfig.py -DCMAKE_INSTALL_PREFIX=xyz as many times as you like, setting up builds in different build directories and different install directories. If you leave off -DCMAKE_INSTALL_PREFIX, it will install in the location that Spack configured.
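Conceptually, the generated file behaves like a tiny wrapper of roughly this shape (a sketch of the idea only, not the real spconfig.py that Spack writes; the angle-bracketed values are placeholders):

#!/usr/bin/env python
import os, subprocess, sys

# environment and arguments baked in when `spack setup` ran
env = dict(os.environ, CC='<spack compiler wrapper>', CXX='<spack compiler wrapper>')
baked_args = ['-DCMAKE_INSTALL_PREFIX=<prefix Spack configured>']

# user-supplied flags come last, so a later -DCMAKE_INSTALL_PREFIX overrides the baked-in one
sys.exit(subprocess.call(['cmake'] + baked_args + sys.argv[1:], env=env))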

I use this feature of spack setup heavily in the modele-control system: https://github.com/citibeth/modele-control/blob/develop/lib/ectl/setup.py

It currently only works with CMake, but there's no reason it can't work with any package; someone just has to write the appropriate build phase.

@krafczyk

@citibeth, sorry for the late reply. If you give me write access to your #2664 branch, I'll remake these changes for your branch. Do you think it would be better to keep this PR or should I just make the changes in your PR and have everything merged in one PR?

@citibeth

If you give me write access to your #2664 branch, I'll remake these changes for your branch.

You should be good to go.

Do you think it would be better to keep this PR or should I just make the changes in your PR and have everything merged in one PR?

I think it's best to make all the changes in #2664.

On UI... I'm coming to the conclusion that we should configure which packages are built with install vs diy vs setup in a YAML file. When combined with #2686 this will allow us to create configurations files, on a per-project basis, that we include on the command line as needed.

This change will probably happen along with our overall review of Spack environments. In the meantime, we have the current command-line interface. I would recommend just doing spack install mypackage --diy mypackage: it's consistent with what's currently there for spack install --setup, and it will all get ripped out soon anyway.

Also add -j to the common arguments
@luigi-calori

@krafczyk I was using this PR to specify the package source folder with --source-path.
Is this PR still active? In that case, would it be possible to rebase it on current develop?
@tgamblin any chance of having it merged?

@krafczyk

@luigi-calori Originally we were going to add this in with #2664. However, that one was never merged, and a new PR was created (#5043) which was supposed to be a 'second attempt', but that hasn't been finished either.

I'll actually revive this one because it's such a small change, and I don't want to wait for work to be finished on those branches any more. Thanks for reminding me!


alalazo commented Nov 21, 2017

Superseded by #5963

@alalazo alalazo closed this Nov 21, 2017