Tup can not cope with programmer tools that output random file names. #113

Open
mstewartgallus opened this issue May 6, 2013 · 13 comments


@mstewartgallus

There does not seem to be a way to work with programmer tools that may output random file names. For example, one tool I use is rustc, the compiler for the Rust language. When this compiler builds a shared library, it puts a hash of the library and a version number in the output file name. For instance, the source for a library cool.rs might be compiled to libcool-68a2c114141ca-0.0.4.so. Unfortunately, there does not seem to be a way to work with such tools using Tup.

@ppannuto
Contributor

ppannuto commented May 6, 2013

AFAICT, tup does not currently support this, nor would it be a trivial addition. Tup depends on knowing the full dependency hierarchy before it does any actual work; this is what allows tup to be fast and efficient.

More generally...

What build system does support this behavior? Indeed, what do you want the build system to do? In general, you ask your build system, ultimately, to create all of the desired output files. This means you need to be able to tell the build system what the desired output files are.

You could imagine something like a glob rule that says the final output file is output-*.exe, but what do you do with the old outputs from previous builds? Does tup store somewhere what was built last time and delete it? Does that always make sense? What does tup do if there are multiple files in the directory that happen to match output-*.exe; does it assume that the newest file is the current one? Indeed, how do you as a programmer know which of the many hashes is the newest? Are you constantly sorting by time-modified to figure out which executable to test?
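To make the ambiguity concrete, here is a minimal C sketch, assuming a POSIX system where `fnmatch` is available. The helper name `count_matches` and the second library hash in the usage example are made up for illustration; nothing here is part of tup. It only shows that a glob over outputs can match more than one file, with no way to tell which one is current.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical helper: count how many file names match a shell-style
 * pattern such as "libcool-*.so". */
size_t count_matches(const char *pattern, const char *const *names, size_t n)
{
    size_t hits = 0;

    assert(pattern != NULL && names != NULL);
    for (size_t i = 0; i < n; i++) {
        /* fnmatch returns 0 when the name matches the pattern. */
        if (fnmatch(pattern, names[i], 0) == 0)
            hits++;
    }
    return hits;
}
```

If a directory held libcool-68a2c114141ca-0.0.4.so from this build and a stale libcool-0123456789abc-0.0.4.so (an invented name) from an earlier one, `count_matches("libcool-*.so", ...)` would report two matches, and the pattern alone cannot say which file the build system should treat as the output.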

While I appreciate the merit of semantic naming for versioning, I'm not convinced it makes sense in this context and I'm not sure it's something a build system should support.

@mstewartgallus
Author

Thank you for your kind and prompt response. The usual way to support this with other build tools is to use the touch program to create a dummy file, libexample.dummy, which is used as a proxy for the real output. However, tup does not support this approach: it warns about files that are created but not mentioned in the Tupfile. Can some workaround be found using this approach?

@bradjc
Contributor

bradjc commented May 7, 2013

Currently, Tup strictly has to know what files will be created from any command and does not support glob characters on output file names. The only workaround I can think of is to use the run command on a script to generate the rules. The script would have to build the binary to determine the output name and then provide that to the Tup rule. Yes, this is very strange and would require building everything twice, but it's the only hack I can think of.
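As a rough illustration of the last step of that workaround, the sketch below formats one rule line in the usual `: inputs |> command |> outputs` Tupfile shape, which a generator script could print once it has discovered the real output name. The function name `format_rule` is hypothetical, and how the script discovers the output name (by building first) is left out entirely.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write one rule of the form
 *   : <input> |> <command> |> <output>
 * into buf. Returns what snprintf returns: the length the full line
 * needs (excluding the terminating NUL), or a negative value on error. */
int format_rule(char *buf, size_t size, const char *input,
                const char *command, const char *output)
{
    assert(buf != NULL && input != NULL && command != NULL && output != NULL);
    return snprintf(buf, size, ": %s |> %s |> %s\n", input, command, output);
}
```

A generator would call this with the source file, the compiler command line, and the file name the first build actually produced, then write the buffer to standard output for tup to parse.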

More generally, I think that to @ppannuto's point Tup could use * globs in the output filenames. I believe this would work conceptually, but I don't think the current implementation allows for the symbolic graph building you would need to allow for input and output filenames that are unknown at parse time.

@ppannuto
Contributor

ppannuto commented May 7, 2013

Also, from a quick skim of the man page, rustc supports -o to name the output file. Is there a reason not to just use that?

Working from: https://github.com/mozilla/rust/blob/master/man/rustc.1

@mstewartgallus
Author

Unfortunately, rustc --lib -o mylib -o mysrc.rs will output something along the lines of libmylib-68a2c114141ca-0.0.4.so.

@gittup
Owner

gittup commented Jun 10, 2013

On Wed, May 8, 2013 at 5:06 PM, sstewartgallus notifications@github.com wrote:

> Unfortunately rustc --lib -o mylib -o mysrc.rs will output something
> along the lines of libmylib-68a2c114141ca-0.0.4.so .

Is it possible to patch rustc to support not adding the hash to the
filename? If it's really necessary maybe it could be added as a special
symbol to the .so

I've tried to implement arbitrary file outputs in the past, but the problem
I've run into is that tup tracks dependencies on files that
don't exist yet, combined with its graph building algorithm.

For example, the command "gcc -c foo.c -o foo.o -Ipublic -Iprivate", where
foo.c has:

#include "foo.h"

and only private/foo.h exists, tup will create a placeholder node in the
database to track the dependency on public/foo.h (as well as the obvious
dependency on private/foo.h). If public/foo.h is later created, tup knows
to re-execute this command.

The graph building algorithm essentially does:

  1. Scan (or use the file monitor) to detect file modifications
  2. Build partial DAG
  3. Walk through DAG, executing commands

If we allow arbitrary file outputs, then it is possible that a command
executed in part 3) will write to a placeholder file. The only way to
really handle that properly is to either report it as an error, or abort
the build and start over from step 2). Neither way seems very promising, so
that is why it is not implemented.
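The conflict described above can be sketched in a few lines of C. This is only an illustration under assumed names (`node_kind`, `struct node`, `first_placeholder_write` are all hypothetical) and does not reflect tup's actual data structures. It shows the check that step 3 would need after each command: did the command write to a node that exists only as a placeholder for a dependency?

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds: a declared output, or a placeholder for a
 * file that does not exist yet but that some command depends on. */
enum node_kind { NODE_OUTPUT, NODE_PLACEHOLDER };

struct node {
    const char *path;
    enum node_kind kind;
};

/* Return the index of the first written node that is a placeholder,
 * or -1 if the command only wrote to declared outputs. */
int first_placeholder_write(const struct node *const *written, size_t n)
{
    assert(written != NULL);
    for (size_t i = 0; i < n; i++) {
        if (written[i]->kind == NODE_PLACEHOLDER)
            return (int)i;
    }
    return -1;
}
```

A non-negative result is exactly the awkward case: the partial DAG built in step 2 is now out of date, so the caller can only report an error or throw the graph away and restart from step 2.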

What does the hash in the shared library provide? A build system like tup
will track the dependencies on things that use the library and try to
re-link them as necessary, so adding a hash presumably for some sort of
runtime check seems redundant.

-Mike

@eddyp

eddyp commented Aug 28, 2013

I have a similar issue with a temporary file a java application creates.

The build just errors out and I would like to ignore the temporary file, but I found no way of doing that. The error is:

 *** tup errors ***
tup error: File 'C:\DOCUME~1\test\LOCALS~1\Temp\hsperfdata_test\644' was written to, but is not in .tup/db. You probably should specify it as an output
 *** Command ID=9733 ran successfully, but tup failed to save the dependencies.
 *** tup: 1 job failed.

Obviously, the file bears a different name every time, so it's impossible to guess it.
The fix would probably be to ignore things created in a directory defined by environment variable TMP or TEMP.

Note: this is on Windows.

@ppannuto
Contributor

Interesting. Would it be acceptable for tup to simply set the TMP and TEMP variables to the temp directory that tup already creates instead of trying to shim another arbitrary directory? I imagine that would be much easier from an implementation standpoint.

@gittup
Owner

gittup commented Aug 30, 2013

On Thu, Aug 29, 2013 at 6:47 PM, Pat Pannuto notifications@github.com wrote:

> Interesting. Would it be acceptable for tup to simply set the TMP and TEMP
> variables to the temp directory that tup already creates instead of trying
> to shim another arbitrary directory? I imagine that would be much easier
> from an implementation standpoint.

Are TMP or TEMP set to that location? I wonder how it's deciding to use
that as a temporary location. Eddy, can you also try to run this program
and see what it prints?

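#include <stdio.h>
#include <windows.h>

int main(void)
{
    char buffer[4096];
    /* GetTempPath returns the length of the path it wrote, or 0 on failure. */
    DWORD len = GetTempPath(sizeof(buffer), buffer);

    printf("Gettmp: %lu\n", (unsigned long)len);
    /* Only print the buffer if the call succeeded and filled it in. */
    printf("Path: '%s'\n", len ? buffer : "");
    return 0;
}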

If any of those are actually that directory, we should be able to ignore
the temp files.

-Mike

@eddyp

eddyp commented Aug 30, 2013

@gittup: The output of the code is:

Gettmp: 31
Path: 'C:\DOCUME~1\test\LOCALS~1\Temp\'

Both TEMP and TMP environment variables are set to this same value.

It seems tup takes into account the %TMP% variable on Windows XP, because I just changed TEMP to another value and tup reported the change of the %TEMP% variable, but the temporary files were still created in the old location.

Strangely enough, when %TMP% was also set to a new value (and to a directory that does NOT exist), the step in question executed w/o problems :-/ .

@gittup
Owner

gittup commented Aug 31, 2013

On Fri, Aug 30, 2013 at 7:05 PM, eddyp notifications@github.com wrote:

> @gittup: The output of the code is:
>
> Gettmp: 31
> Path: 'C:\DOCUME~1\test\LOCALS~1\Temp\'
>
> Both TEMP and TMP environment variables are set to this same value.

I've pushed a patch where it will ignore outputs in the temp directory
(from GetTempPath) on Windows. Can you try it out and let me know? I'm not
sure what exact tool / command-line you're using so I can't easily
reproduce it myself.
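For readers curious what such a check might look like, here is a minimal sketch. It is not the actual patch; the function name `path_is_under` is hypothetical, and the choices to compare case-insensitively and to treat '/' and '\' as the same separator are assumptions about how Windows paths are usually handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Treat '/' and '\\' as the same path separator. */
static int is_sep(char c)
{
    return c == '/' || c == '\\';
}

/* Hypothetical check: return 1 if `path` lies inside directory `dir`,
 * 0 otherwise. The comparison ignores case. */
int path_is_under(const char *path, const char *dir)
{
    size_t i = 0;

    assert(path != NULL && dir != NULL);
    for (; dir[i] != '\0'; i++) {
        if (path[i] == '\0')
            return 0;               /* path is shorter than dir */
        if (is_sep(dir[i]) && is_sep(path[i]))
            continue;
        if (tolower((unsigned char)dir[i]) != tolower((unsigned char)path[i]))
            return 0;
    }
    /* dir matched as a prefix. Require a separator boundary so that
     * "C:\\Temp" does not match "C:\\Temporary\\x". */
    if (i > 0 && is_sep(dir[i - 1]))
        return path[i] != '\0';
    return is_sep(path[i]);
}
```

With the directory reported earlier in the thread, a write to C:\DOCUME~1\test\LOCALS~1\Temp\hsperfdata_test\644 would be classified as inside the temp directory and could be skipped instead of raising the "not in .tup/db" error.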

Thanks,
-Mike

@eddyp

eddyp commented Sep 4, 2013

@gittup : Sorry for the delay, I've been busy lately.

I just tested tup v0.6.5-182-gd0dfe0a and it correctly gets past the problematic rule.

Thanks,
Eddy

@makmanalp

I have a similar use case: I have files that are being rsync'd down. Rsync will sometimes detect changes and write files, and sometimes it won't. What would be a good way to handle that?


6 participants