New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up man page generation #1892
Conversation
Makefile
Outdated
| @@ -59,7 +59,7 @@ man-pages: $(HELP_ALL:=.ronn) $(HELP_ALL) $(HELP_ALL:=.txt) | |||
| groff -Wall -mtty-char -mandoc -Tutf8 -rLL=$(TEXT_WIDTH)n $< | col -b >$@ | |||
|
|
|||
| %.1: %.1.ronn bin/ronn | |||
| bin/ronn --organization=GITHUB --manual="Hub Manual" share/man/man1/*.ronn | |||
| bin/ronn --organization=GITHUB --manual="Hub Manual" $< | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi thank you for trying to help out with this! But I don't get this optimization. Before, this patch, this command reports taking under 4.5s:
git clean -fdx share; time make man-pagesAfter your patch, it takes more than 12s.
So, I'm seeing 3x slowdown instead of 3x speedup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was testing in openSUSE with hub-2.3.0-pre10
There we do make bin/hub man-pages
I think, that might make a difference, because git only contains one .ronn file as input, but I saw more. Those extras are generated in some step.
strace shows hub-remote.1.ronn being created by
["/bin/sh", "-c", "bin/hub help hub-remote --plain-text | script/format-ronn hub-remote share/man/man1/hub-remote.1.ronn"]
["script/format-ronn", "hub-remote", "share/man/man1/hub-remote.1.ronn"]I'm wondering though, where your slowdown comes from, because it should be exactly the same when you just have 1 .ronn file from git as input.
Can you reproduce the slowdown?
Do you get 24 entries under /usr/share/man/man1/hub*.1* in both cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe your measurement/cleanup did not account for the one-time creation of bin/hub ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've ensured that make bin/hub is already run before measurement. Your optimization should apply only to the man-pages step, so including bin/hub build step in the measurement doesn't make sense.
What do you get with the following script on your branch?
make bin/hub
git clean -fdxq share; time make man-pages >/dev/null 2>&1
git checkout HEAD^ -- Makefile
git clean -fdxq share; time make man-pages >/dev/null 2>&1There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to simulate it, because I cannot get it to build from git
> make
script/build -o bin/hub
main.go:8:2: cannot find package "github.com/github/hub/commands" in any of:
/usr/lib64/go/1.10/src/github.com/github/hub/commands (from $GOROOT)
/home/abuild/go/src/github.com/github/hub/commands (from $GOPATH)
/usr/share/go/1.10/contrib/src/github.com/github/hub/commands
main.go:9:2: cannot find package "github.com/github/hub/github" in any of:
git bisect run bash -c "make clean ; make
says, build was broken by commit 00e1b99
I noticed that we sed bin/ronn to ronn, using the system's ruby2.5-rubygem-ronn-0.7.3 but OTOH 0.7.3 is also the latest.
cp -a Makefile.unpatched Makefile ; rm -f share/man/man1/* ; cp -a orig/* share/man/man1/ ; time make man-pages >/dev/null 2>&1
real 0m13.063s
user 0m13.020s
sys 0m1.235s
cp -a Makefile.patched Makefile ; rm -f share/man/man1/* ; cp -a orig/* share/man/man1/ ; time make man-pages >/dev/null 2>&1
real 0m4.131s
user 0m3.987s
sys 0m0.451sUsing a local revert of 00e1b99, I even got your lines to work.
git checkout make
git clean -fdxq share; time make man-pages >/dev/null 2>&1
real 0m4.329s
user 0m4.173s
sys 0m0.477s
git checkout master
git clean -fdxq share; time make man-pages >/dev/null 2>&1
real 0m1.230s
user 0m1.323s
sys 0m0.227s
There seems to also be a downside content-wise in rendering each page indiviually:
-<p><a class="man-ref" href="hub-am.1.html">hub-am<span class="s">(1)</span></a>, <a class="man-ref" href="hub.1.html">hub<span class="s">(1)</span></a>, <span class="man-ref">git-apply<span class="s">(1)</span></span></p>
+<p><span class="man-ref">hub-am<span class="s">(1)</span></span>, <span class="man-ref">hub<span class="s">(1)</span></span>, <span class="man-ref">git-apply<span class="s">(1)</span></span></p>but then I found that parallelism plays a role there:
git checkout master
git clean -fdxq ; mkdir -p bin; ln -sf /usr/bin/ronn bin/ronn ; time make -j2 man-pages >/dev/null 2>&1 ; ls share/man/man1/|wc -l
real 0m8.037s
user 0m15.778s
sys 0m1.555s
100
> git checkout make
Switched to branch 'make'
> git clean -fdxq ; mkdir -p bin; ln -sf /usr/bin/ronn bin/ronn ; time make -j2 man-pages >/dev/null 2>&1 ; ls share/man/man1/|wc -l
real 0m2.633s
user 0m4.913s
sys 0m0.578s
100
|
I think, both the original code and my current patch are not doing everything right. |
31ae133
to
9d1172f
Compare
|
I pushed a new version, that works nicely with any level of parallelism. The code might be a bit less straight-forward, though. I was wondering, if we could drop the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P.S. to support hub building from git checkout, you need to check it out to
your $GOPATH/src/github.com/github/hub.
Makefile
Outdated
|
|
||
| %.txt: %.ronn | ||
| groff -Wall -mtty-char -mandoc -Tutf8 -rLL=$(TEXT_WIDTH)n $< | col -b >$@ | ||
|
|
||
| %.1: %.1.ronn bin/ronn | ||
| %.1: %.1.ronn bin/ronn $(HELP_ALL:=.ronn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, this is not as straightforward as before.
What's the overall benefit of this change? On my computer, I don't see an improvement in your branch runing make man-pages, and with running make -j2 man-pages I see a 1s improvement. I'm not sure if supporting the parallelism case is worth it.
How often are you building man-pages? Is the current speed unacceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the patch was not meant so much for speedup but rather for correctness.
The old Makefile told ronn to render all manpages - once per input file, so if you have 24 input .ronn files, it could run ronn 24 times, writing every output file 24 times. That is what I dislike.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only see it run one time, since a single bin/ronn invocation is used to pass in all share/man/man1/*.ronn files
$ git clean -fdxq share; time make -d man-pages 2>/dev/null | grep ^bin/ronn
bin/ronn --organization=GITHUB --manual="Hub Manual" share/man/man1/*.ronn
real 0m3.109s
user 0m2.102s
sys 0m2.024s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try
git clean -fdxq share ; rm -f bin/hub ; time make -j2 man-pages 2>/dev/null | grep ^bin/ronnIt seems to make a difference that bin/hub is getting a new timestamp in the process and parallelism is involved via -j2.
with -j1 I get 1 ronn call and
real 0m1.752s
user 0m2.077s
sys 0m0.337s
but with -j2 I get 25 ronn calls and
real 0m8.042s
user 0m15.800s
sys 0m1.508s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try
git clean -fdxq share ; rm -f bin/hub ; time make -j2 man-pages 2>/dev/null | grep ^bin/ronn
Ah yes, thank you for that. I can see the problem now. Is this something that people will likely encounter, though? I see this as an edge case that might not be trivial nor worth it for us to fix. We've already spent a disproportionate amount of time discussing and debugging this when compared to the actual impact it has on our users.
If you have a solution that's straightforward in the Makefile, I would gladly accept it though. Thanks for your patience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've seen it in our package builds (because people think, using parallelism on today's 32-core systems is a good thing)
https://build.opensuse.org/package/live_build_log/openSUSE:Factory/hub/standard/x86_64
I can experiment some more with make to come up with something nicer.
|
When looking at the bison example I would think that this should do the trick, but it does not. $(HELP_ALL): bin/ronn $(HELP_ALL:=.ronn)
bin/ronn --organization=GITHUB --manual="Hub Manual" share/man/man1/*.ronnMay be a bug in GNU make? |
|
Filed a GNU make bug with a minimal reproducer based on what hub did. Also pushed a workaround where man-pages are generated in a non-parallel make run. |
2c05d41
to
aaef748
Compare
by a factor of 3 Without this patch, using make -j built every man page and html page 25 times Now, man pages are always generated in a single ronn call after all .ronn files are made. Workarounds GNU make bug https://savannah.gnu.org/bugs/?54854
|
Further refined the workaround with the help of my SUSE colleagues. Looks much nicer now. Can you please test/review/merge this? |
|
ping. anything left to do? |
|
@bmwiedemann Thank you for your contribution and assistance! |
Speed up man page generation by a factor of 3 (from 18 to 6 seconds in the single-CPU case)
by not building every man page and html page several times (e.g. 24x)