touch some files when we use them #6477
rust-highfive assigned ehuss on Dec 22, 2018
rust-highfive commented on Dec 22, 2018:
r? @ehuss (rust_highfive has picked a reviewer for you, use r? to override)
This comment was marked as outdated.
Message updated!
Eh2406 force-pushed the Eh2406:add-a-timestamp-file branch from c700f8d to 4ae5abc on Dec 22, 2018
I think this would be useful. I'm not sure if it really needs a dedicated file. I did some tests on an old Windows machine, and I didn't see a noticeable performance difference. However, creating lots of small files generally isn't desirable. Would it be possible to just adjust the mtime of one of the other files? Touching dep-KIND-PKG-HASH may make #2426 worse, so that may not be a good idea, but I think the other two files would be possible options. It should probably get a test, too.
Good points. It would be nice if there was a reliable way to know whether the fingerprint was made with a cargo that maintains the mtime, or whether falling back to the atime is required.
The tooling can decide to be too conservative, not support older cargos, or do version-number-based feature detection (assuming we don't add a way to opt out). A separate file makes it very clear whether Cargo is making this data available to the tool (file mtime if the file exists, else max atime). If there is another signal you would prefer, let me know. I will add a test once we have decided what we want the behavior to be and the next time I look at the code.
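For a cleanup tool, the rule in parentheses above (use the dedicated file's mtime when the file exists, otherwise fall back to the largest atime) might be sketched roughly as follows. This is only an illustration using the Rust standard library: the marker file name `last-used` and the function `last_use` are hypothetical and do not come from this PR, and only the entries directly inside the directory are inspected.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Hypothetical marker file name; the thread does not name the file.
const TIMESTAMP_FILE: &str = "last-used";

/// Best guess at when `dir` was last needed by a build.
/// Prefers the marker file's mtime; otherwise falls back to the newest
/// atime among the entries directly inside `dir`.
/// Returns `Ok(None)` for an empty directory without a marker.
fn last_use(dir: &Path) -> io::Result<Option<SystemTime>> {
    let marker = dir.join(TIMESTAMP_FILE);
    if marker.is_file() {
        // Marker present: assume the directory was written by a Cargo
        // that keeps this file's mtime up to date.
        return fs::metadata(&marker)?.modified().map(Some);
    }
    // No marker: fall back to the largest access time of any entry.
    let mut newest: Option<SystemTime> = None;
    for entry in fs::read_dir(dir)? {
        let accessed = entry?.metadata()?.accessed()?;
        newest = Some(match newest {
            Some(t) if t >= accessed => t,
            _ => accessed,
        });
    }
    Ok(newest)
}
```

The fallback branch inherits every weakness of atime discussed in this thread, so a tool built this way would still be guessing for directories produced by an older Cargo.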
Eh2406 force-pushed the Eh2406:add-a-timestamp-file branch from 4ae5abc to 920f4db on Dec 27, 2018
Eh2406 referenced this pull request on Dec 27, 2018: Rebuild on mid build file modification #6484 (Merged)
Thinking about it more, I don't think there is a reasonable way for a cleaner script to handle cargo sometimes opting out of this. Version number detection sounds approximately good enough. What is the best way to touch a file cross-platform?
Could we perhaps touch the mtime of the artifacts themselves? (like all the rlibs, binaries, etc). I think that Cargo doesn't compare the mtime on those artifacts (even historical Cargos) and that may make it easiest for other tools too? IIRC we already check for existence of all artifacts on nop builds, so opening up a file handle to set the mtime on it may not take too much longer. If this becomes a performance concern we could always add configuration/env vars to turn it off, but it seems reasonable to me to have it on by default.
In what way? It would make a simple tool, like the published version of cargo-sweep, that does not know about the structure of the target dir closer to being correct, but not really usable. I think a simple tool would start leaving the outputs and deleting the fingerprints, leading to a complete rebuild. A tool that knows about the structure, like my branch of cargo-sweep, doesn't really care which file we pick as the authoritative one.
Oh I was just thinking that if the output artifacts had mtimes on them then a tool could just delete any artifact older than N days, and the .fingerprint metadata would need cleaning eventually but it's not really large enough to warrant lots of scrutiny
Ok, I am convinced that that could be a good next step!
Feels kinda dependent on the
Hm that's what I would naively say we should do, but I'll admit I have no idea how the filesystem clock and
Eh2406 referenced this pull request on Jan 4, 2019: fix cargo not doing anything when the input and output mtimes are equal #5919 (Merged)
Eh2406 force-pushed the Eh2406:add-a-timestamp-file branch 2 times, most recently from 41b80df to b144ad3 on Jan 4, 2019
@alexcrichton where do we check for existence of all artifacts on nop builds?
Eh2406 referenced this pull request on Jan 9, 2019: Automatically purge target directories after reaching max size #346 (Open)
I think here
Eh2406 force-pushed the Eh2406:add-a-timestamp-file branch from b144ad3 to 41bf6f2 on Jan 10, 2019
Eh2406 force-pushed the Eh2406:add-a-timestamp-file branch from 41bf6f2 to 3eaa70e on Jan 10, 2019
This has been updated. I rebased. I also switched to touching existing files instead of adding a new one. Touch was implemented with
The files being touched are:
When CI is green, I will update the title and OP.
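The comment above does not preserve how the touch was implemented or which files are touched. As a minimal sketch of what "touching an existing file" means, the following uses only the Rust standard library (`File::set_modified`, available since Rust 1.75). That choice and the helper name `touch` are assumptions made for illustration, not the code from this PR.

```rust
use std::fs::File;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Set `path`'s modification time to "now" without changing its contents.
/// Illustrative only: uses `File::set_modified` from std (Rust 1.75+),
/// which may differ from what the PR actually calls.
fn touch(path: &Path) -> io::Result<()> {
    // Open for writing without truncating; write access is what allows
    // timestamps to be updated on platforms that check for it.
    let file = File::options().write(true).open(path)?;
    file.set_modified(SystemTime::now())
}
```

A build tool calling something like this on every up-to-date check would probably want to ignore the returned error, since a failed touch should not fail the build.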
Eh2406 changed the title from "adds a timestamp file in the fingerprint folder" to "touch some files when we use them" on Jan 10, 2019
Eh2406 added some commits on Jan 13, 2019
Added tests. Do people have thoughts, or is this good to go?
ehuss approved these changes on Jan 16, 2019
Seems good. Alex?
@bors: r=ehuss
bors added a commit that referenced this pull request on Jan 16, 2019
Eh2406 commented on Dec 22, 2018 (edited)
This is a small change to improve the ability of a third-party subcommand to clean up a target folder. I consider this part of the push to experiment with out-of-tree GC, as discussed in #6229.
how it works?
This updates the modification time of a file in each fingerprint folder and the modification time of the intermediate outputs every time cargo checks that they are up to date. This allows a third-party subcommand to look at the modification time of the timestamp file to determine the last time a cargo invocation required that file. This is far more reliable than the current practice of looking at the accessed time. The accessed time is not available or is disabled on many operating systems, and is routinely set by arbitrary other programs.
is this enough to be useful?
The current implementation of cargo sweep on master will automatically use this data with no change to the code. With this PR, it will work even on systems that do not update the accessed time. This also allows a crude script to clean some of the largest subfolders based on each file's modification time.
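A crude cleaner of the kind described above could look roughly like the sketch below. It assumes that Cargo keeps mtimes fresh for files it still needs. The helper names `stale_files` and `sweep` are hypothetical, only files directly inside one directory are considered, and this is not how cargo-sweep itself is implemented. The current time is passed in so that the age check is easy to test.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{Duration, SystemTime};

/// Collect files directly inside `dir` whose mtime is more than `max_age`
/// before `now`. Nothing is deleted here; the caller decides what to do.
fn stale_files(dir: &Path, max_age: Duration, now: SystemTime) -> io::Result<Vec<PathBuf>> {
    let mut stale = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if !meta.is_file() {
            continue; // subdirectories are skipped in this sketch
        }
        // `duration_since` fails when the mtime is after `now`;
        // treat such files as brand new.
        let age = now.duration_since(meta.modified()?).unwrap_or(Duration::ZERO);
        if age > max_age {
            stale.push(entry.path());
        }
    }
    stale.sort(); // deterministic order for reporting
    Ok(stale)
}

/// Delete the stale files and return how many were removed.
fn sweep(dir: &Path, max_age: Duration, now: SystemTime) -> io::Result<usize> {
    let stale = stale_files(dir, max_age, now)?;
    for path in &stale {
        fs::remove_file(path)?;
    }
    Ok(stale.len())
}
```

Calling `sweep(dir, Duration::from_secs(30 * 24 * 60 * 60), SystemTime::now())` would remove files that have not been touched in 30 days. On a Cargo that does not update mtimes on use, the same call would also delete files that are still needed but were simply built long ago.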
is this worth adding, or should we just build clean --outdated into cargo?
I would love to see a clean --outdated in cargo! However, I think there is a lot of design work before we can make something good enough to deserve the cargo team's stamp of approval. Especially as an in-tree version will have to work with many use cases, some of which are yet to be designed (like distributed builds). Even just including cargo-sweep's existing functionality opens a full bike shop about what arguments to take, and in what form (cargo-sweep takes a days argument, but maybe we should have a minutes or an ISO standard time or ...). This PR, or equivalent, allows out-of-tree experimentation with all different interfaces, and is basically required for any LRU-based system. (For example, Crater wants a GC that cleans files in an LRU manner to maintain a target folder below a target size. This is not a use case that is widely enough needed to be worth adding to cargo, but it is one supported by this PR.)
what are the downsides?