Skip to content
This repository was archived by the owner on Jul 24, 2024. It is now read-only.

purge_olean.sh: a few small improvements#661

Merged
cipher1024 merged 10 commits intomasterfrom
improve-purge
Feb 3, 2019
Merged

purge_olean.sh: a few small improvements#661
cipher1024 merged 10 commits intomasterfrom
improve-purge

Conversation

@spl
Copy link
Copy Markdown
Collaborator

@spl spl commented Jan 31, 2019

  • Delete empty directories
  • Comments
  • Reduce echos to reduce noise and speed up the script
  • Make the script relocatable by using the git top-level directory instead of .

@spl spl requested a review from cipher1024 January 31, 2019 13:28
@jcommelin
Copy link
Copy Markdown
Member

@spl Welcome back! Long time no see!

@spl
Copy link
Copy Markdown
Collaborator Author

spl commented Jan 31, 2019

@jcommelin Thanks! 😺

Copy link
Copy Markdown
Collaborator

@rwbarton rwbarton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look good to me apart from one small question.

purge_olean.sh Outdated
# Delete all empty directories. An empty directory may have been created if it
# does not contain any .lean files and all of its .olean files were deleted.

find $dir -type d -empty -delete
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will also delete empty subdirectories of .git which I'm not sure is a good idea. How about

Suggested change
find $dir -type d -empty -delete
find $dir -type d -not -path '*/.git/*' -empty -delete

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rwbarton Good suggestion. I went with the approach of conservatively including only the src and test directories in 360e8e2. I don't believe we will be find .olean files anywhere else, so this should be safe.

@cipher1024
Copy link
Copy Markdown
Collaborator

Why is it faster?

@spl
Copy link
Copy Markdown
Collaborator Author

spl commented Feb 1, 2019

Why is it faster?

@cipher1024 Because it doesn't print out every .olean file. Printing to the screen takes time, and the script currently prints every .olean file name, whether or not that file is rm'd.

If I understand things correctly, this script will only rm an .olean if the corresponding .lean has been (re-)moved. Since this doesn't happen very often (relative to the number of CI runs), the script does not do much work on average. So I don't think we need to print the entire list of .olean files on every CI run.

@johoelzl
Copy link
Copy Markdown
Collaborator

johoelzl commented Feb 1, 2019

@cipher1024 Because it doesn't print out every .olean file. Printing to the screen takes time, and the script currently prints every .olean file name, whether or not that file is rm'd.

This shouldn't be a problem. There are 338 lean files. If it is slow to write these in a log (which i guess should happen in Travis) we have a serious problem. But I agree to only show the deleted files, otherwise it is hard to read the log files.

purge_olean.sh Outdated
for olean_file in `find $dirs -name "*.olean"`
do
echo olean file: $olean_file
lean_file=`echo $olean_file | sed "s/\.olean/.lean/"`
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
lean_file=`echo $olean_file | sed "s/\.olean/.lean/"`
lean_file=${olean_file/%.olean/.lean}

We don't need sed, bash can do the same (I hope other shells too)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ambivalent about which is better. Changed in de65ffb.

@spl
Copy link
Copy Markdown
Collaborator Author

spl commented Feb 1, 2019

This shouldn't be a problem. There are 338 lean files. If it is slow to write these in a log (which i guess should happen in Travis) we have a serious problem.

@johoelzl It's not necessarily slow on Travis, but there is the web log interface, which (I'm guessing) is updated on demand from a log. So, for every line (or batch of lines) outputted to the log, there is the added latency of transferring that line from the server to the client and rendering it. Thus, the more output this script produces, the longer one who is watching has to wait to see the rest of the output. (Aside: network latency is noticeable here in South Africa.)

Nevertheless, speed was never really a major reason for this PR. It only came up locally for me because I was updating my mathlib git repository, and I ran purge_olean.sh to deal with the move of all the .lean files to the src and test directories. I'd rather not dwell further on this relatively insignificant aspect.

@digama0
Copy link
Copy Markdown
Member

digama0 commented Feb 1, 2019

Actually, I would like to see more information in the travis log, as much as is useful. It is often difficult to figure out what the build environment is, especially with regard to the cache state, which can cause huge variation in build times and behavior between the different build stages. I think it would be really helpful to have a report of the state of the cache before running lean --make. It's not clear to me whether this script is doing that exactly, but if it is, it's not a bad thing, even if it slows down the build a bit. (Printing latency is the least of my worries when it comes to lean build time.)

@spl
Copy link
Copy Markdown
Collaborator Author

spl commented Feb 1, 2019

@digama0 This script currently tells you (a) all of the existing .oleans and (b) for each .olean, whether or not there exists a corresponding .lean. It does not tell you that, given a .lean, there exists a corresponding .olean. As of the current state of this PR, it tells you only the set of .oleans that did not have a corresponding .lean.

@digama0
Copy link
Copy Markdown
Member

digama0 commented Feb 1, 2019

I guess the relevant information is which .lean files have up to date .oleans, or which don't.

@cipher1024
Copy link
Copy Markdown
Collaborator

@digama0 I think the problem you point out is worth addressing but ultimately, it's outside the scope of this PR. If everyone is satisfied, I think we can merge this.

@cipher1024
Copy link
Copy Markdown
Collaborator

Update: as far as I can see, all that's missing to merge is for @spl to change the last line to:

find $dir -type d -not -path '*/.git/*' -empty -delete

@johoelzl
Copy link
Copy Markdown
Collaborator

johoelzl commented Feb 1, 2019

I prefer the current line, with find $dirs ...

I prefer that solution, as it walks into a explicit subset of the directories, in the current case src and test, where we have problems with empty directories. The other only excludes the .git directory, but may walk into some other problematic directory.

@digama0
Copy link
Copy Markdown
Member

digama0 commented Feb 1, 2019

I agree that we should only look for lean files in src and test rather than blacklisting .git.

@cipher1024 cipher1024 merged commit 3109c4b into master Feb 3, 2019
@cipher1024 cipher1024 deleted the improve-purge branch February 3, 2019 00:04
cipher1024 pushed a commit that referenced this pull request Feb 8, 2019
* purge empty directories
* Only print if an .olean is rm'd. This reduces the noise and reduces the
script run time.
* use git top-level dir to make the script relocatable
* only affect src and test dirs
* use bash instead of sed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants