some low-hanging rustdoc optimizations #44613

Merged
2 commits merged into rust-lang:master from QuietMisdreavus:rustdoc-perf on Oct 15, 2017

@QuietMisdreavus
Member

QuietMisdreavus commented Sep 15, 2017

There were a few discussions earlier today in #rust-internals about the syscall usage and overall performance of rustdoc. This PR is intended to pick some low-hanging fruit and try to rein in some of the performance issues of rustdoc.

Collaborator

rust-highfive commented Sep 15, 2017

r? @frewsxcv

(rust_highfive has picked a reviewer for you, use r? to override)

Member

QuietMisdreavus commented Sep 15, 2017

cc @retep998 since they offered to help profile on windows - winapi was one of the inspirations for this PR

cc @bluss since they were also in the discussion and itertools was the other inspiration

@frewsxcv frewsxcv assigned GuillaumeGomez and unassigned frewsxcv Sep 15, 2017

Member

alexcrichton commented Sep 21, 2017

ping @QuietMisdreavus, just want to make sure this doesn't fall off your radar!

Member

QuietMisdreavus commented Sep 21, 2017

I tried to find other things to work on by profiling rustdoc, but Visual Studio didn't think that symbols were made for rustdoc itself. Even without that, there's one more thing i'd like to try on this branch (keeping a single write buffer and handing that around when rendering all the pages). Otherwise i'd like to ask @retep998 to make sure this lowers the number of WriteFile syscalls that are done for redirect pages.

@QuietMisdreavus QuietMisdreavus changed the title from [WIP] rustdoc optimizations to some low-hanging rustdoc optimizations Sep 23, 2017

Member

QuietMisdreavus commented Sep 23, 2017

So here's where this PR stands:

Right now there are two basic changes here:

  • Create the directory structure for the documentation ahead of time, instead of calling create_dir_all for every file that gets written.
  • Proxy every call to layout::render or layout::redirect through a wrapper that uses a shared buffer before writing to whatever writer was asked for. (And strip out all of the ad-hoc buffering that was happening beforehand.) This helps cut down on the number of allocations that would get made during a doc rendering, and also buffers each file write (most of the file writes were already buffered, but redirect pages were being written directly, causing ~nine syscalls per redirect page.) (This slightly cuts down on the parallelizability of the rendering process, but if truly necessary we can move the buffer into TLS instead of the SharedContext.)

I would like to look at more structural optimization opportunities, but whenever i try to attach Visual Studio to rustdoc it refuses to acknowledge any debuginfo that would let it use the source files in librustdoc. It sees the symbols for rustdoc-tool-binary, which actually includes things like function names, but it doesn't see any source info that it can use to assist a debug or profile session. As such, i'm calling this PR ready to go, and i'll defer any further work until i can figure out what's going on with that.
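A minimal sketch of the shared-buffer wrapper described in the second bullet above, for readers who want to see the shape of the idea. This is not the code from the PR: `SharedContext` here is a trimmed-down stand-in, and `scratch`, `render_buffered` and `render_redirect` are hypothetical names. The renderer writes into one reused `Vec<u8>`, and the destination writer then receives the whole page in a single `write_all` call.

```rust
use std::cell::RefCell;
use std::io::{self, Write};

/// Hypothetical stand-in for the shared state; only the scratch buffer is modelled.
struct SharedContext {
    scratch: RefCell<Vec<u8>>,
}

impl SharedContext {
    fn new() -> Self {
        SharedContext { scratch: RefCell::new(Vec::new()) }
    }

    /// Render into the reused buffer, then hand the bytes to `dst` in one call.
    fn render_buffered<W, F>(&self, dst: &mut W, render: F) -> io::Result<()>
    where
        W: Write,
        F: FnOnce(&mut Vec<u8>) -> io::Result<()>,
    {
        // Panics if the buffer is already borrowed, which is the cost of
        // sharing a single buffer through a RefCell.
        let mut buf = self.scratch.borrow_mut();
        buf.clear(); // drops old contents but keeps the allocation
        render(&mut *buf)?;
        dst.write_all(&buf)
    }
}

/// Hypothetical redirect-page renderer that issues several small writes.
fn render_redirect(out: &mut Vec<u8>, url: &str) -> io::Result<()> {
    out.write_all(b"<html><head>")?;
    write!(out, "<meta http-equiv=\"refresh\" content=\"0;URL={}\">", url)?;
    out.write_all(b"</head></html>")
}
```

A caller would write something like `shared.render_buffered(&mut file, |buf| render_redirect(buf, url))`. Because the buffer lives behind a `RefCell`, only one page can render at a time, which is the parallelism trade-off mentioned above.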

Member

QuietMisdreavus commented Sep 23, 2017

travis failure was some rustdoc output tests failing:

[01:05:53] failures:
[01:05:53]     [rustdoc] rustdoc/extern-links.rs
[01:05:53]     [rustdoc] rustdoc/inline_local/glob-extern.rs
[01:05:53]     [rustdoc] rustdoc/inline_local/glob-private.rs
[01:05:53]     [rustdoc] rustdoc/issue-34025.rs

Gonna figure out what i broke.

Member

QuietMisdreavus commented Sep 24, 2017

Looks like i misjudged the checks for making sure a rendered item is empty, and wound up emitting empty files when the file shouldn't exist in the first place. I'm gonna try running these tests locally - hopefully my laptop fares better than my server, which has never been able to locally run a test in my experience >_>

Member

QuietMisdreavus commented Sep 24, 2017

Turns out, i had the control flow wrong. I assumed the buffer checks i wrote into the places that checked for zero-sized writes were equivalent, but in some cases it doesn't even go through the write call, which is when that buffer would have been empty in the first place. I squashed the commits up with the fix. The tests that failed last time passed on my machine; let's see if travis agrees...

Member

QuietMisdreavus commented Sep 28, 2017

ping @ollie27 and @GuillaumeGomez, just wanted to make sure this doesn't fall off everyone's radar.

Member

GuillaumeGomez commented Sep 28, 2017

Do you have some numbers to allow us to compare? A before/after would be very appreciated. :)

Member

QuietMisdreavus commented Sep 28, 2017

This is from when @retep998 compared this branch to the latest nightly at the time:

PM 094225 <WindowsBunnyDoesSupportIndexing> misdreavus: 114 seconds in nightly to 107 seconds using semi recent version of your PR

That was when rendering winapi on windows, which was the worst-case inspiration for this PR. (The last time i tried comparing on my own system things kept going wrong, but i can give it another shot tonight.)

Member

retep998 commented Sep 28, 2017

Keep in mind that out of that time, 17 seconds is spent on compiling winapi, and another 78 seconds is spent on unavoidable NtCreateFile/NtWriteFile/NtCloseFile calls (unless rustdoc decides to stop creating so many files). If you exclude those two things, the difference is much more significant: 19 seconds to 12 seconds.
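The arithmetic behind those figures can be checked with a short sketch. The helper names `remaining` and `percent_saved` are made up for illustration; the inputs are the timings quoted in this thread. Subtracting the 17 s compile and the 78 s of file syscalls from each total gives 19 s and 12 s, so the same 7 s saving is roughly 6% of the whole run but roughly 37% of the part rustdoc can actually influence.

```rust
/// Time left after subtracting the fixed costs quoted above (all values in seconds).
fn remaining(total: u32, compile: u32, file_syscalls: u32) -> u32 {
    total - compile - file_syscalls
}

/// Percentage reduction going from `before` to `after`.
fn percent_saved(before: u32, after: u32) -> f64 {
    100.0 * f64::from(before - after) / f64::from(before)
}
```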

bors added a commit that referenced this pull request Oct 3, 2017

Auto merge of #44949 - QuietMisdreavus:rustdoctest-dirs, r=nikomatsakis
let htmldocck.py check for directories

Since i messed this up during #44613, i wanted to codify this into the rustdoc tests to make sure that doesn't happen again.

Member

shepmaster commented Oct 6, 2017

Ping @rust-lang/docs. It's been over 6 days since we last heard from @GuillaumeGomez. It may be time to assign a new reviewer!

Member

QuietMisdreavus commented Oct 6, 2017

Might be better/faster to tag @rust-lang/dev-tools? This doesn't necessarily deal with the appearance or structure of docs, so it's more of a dev-tools concern than a docs one.

Member

GuillaumeGomez commented Oct 7, 2017

Ah right, I thought I already said it was good for me but I didn't. My bad... I'd just prefer that the @rust-lang/dev-tools take a look at it first.

Member

aidanhs commented Oct 12, 2017

Hi @fitzgen, you're the lucky random person from the dev tools team I've decided to additionally assign this PR to during triage! Would you be able to take a look at this, or select a more appropriate member of your team?

Member

fitzgen commented Oct 12, 2017

@michaelwoerister agreed to take a look on irc.

Contributor

michaelwoerister commented Oct 12, 2017

r? @michaelwoerister (so I don't forget)

@michaelwoerister michaelwoerister self-assigned this Oct 12, 2017

@michaelwoerister

This is a review of just the first commit. My general impression is that the complications of introducing globally shared mutable state outweigh the potential performance gains here. I would suggest just making sure that file access is wrapped in BufWriters with enough capacity everywhere and testing whether that solves the problem too.

If it is indeed memory allocation that is a bottleneck here, I would suggest implementing a pool of Vec<u8> instead of sharing a single vector in a RefCell. Rust's move semantics make these perfectly safe and the pool's get method can take care of clearing the buffer before returning it.
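One way the suggested pool could look, as a rough sketch rather than anything proposed in the PR itself. `BufferPool`, `get` and `put` are hypothetical names. Buffers are moved out of the pool and moved back in, so no two users ever alias the same vector, and `get` clears the buffer before handing it out while keeping its capacity.

```rust
use std::cell::RefCell;

/// Hypothetical pool of reusable byte buffers.
struct BufferPool {
    free: RefCell<Vec<Vec<u8>>>,
}

impl BufferPool {
    fn new() -> Self {
        BufferPool { free: RefCell::new(Vec::new()) }
    }

    /// Hand out an empty buffer, reusing a returned allocation when one is available.
    fn get(&self) -> Vec<u8> {
        let mut buf = self.free.borrow_mut().pop().unwrap_or_default();
        buf.clear(); // length becomes 0, capacity is untouched
        buf
    }

    /// Give a buffer back so a later `get` can reuse its capacity.
    fn put(&self, buf: Vec<u8>) {
        self.free.borrow_mut().push(buf);
    }
}
```

The `RefCell` is only borrowed for the duration of a `pop` or `push`, so several buffers can be checked out at once without the re-entrancy hazard of a single shared vector. A multi-threaded renderer would need a `Mutex` or a thread-local pool instead.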

@michaelwoerister

EDIT: This is the review of the second commit.

It seems to me that the logic in fn recurse could very easily go out of sync with what directories the subsequent code expects.

If the goal is to avoid redundant calls to fs::create_dir_all(), you could also just keep an in-memory cache of directories already created, in SharedContext for example. Then you could have a method like this:

impl Context {
    fn ensure_dir_exists(&self, dir: &Path) -> Result<(), Error> {
        // `insert` returns true only the first time this path is recorded.
        if self.shared.dirs_created.borrow_mut().insert(dir.to_path_buf()) {
            try_err!(fs::create_dir_all(dir), dir);
        }
        Ok(())
    }
}

That way you don't have to keep two complicated trees of decision logic in sync.

Member

QuietMisdreavus commented Oct 13, 2017

Re: ensure_dir_exists, that sounds more reasonable at this point. I think i was really anxious to avoid the allocations for the PathBufs and HashSet while writing it, but looking back, that's probably much less of a cost than the actual directory creation calls. I'll go ahead and do that.

Member

QuietMisdreavus commented Oct 13, 2017

I've force-pushed an update that massively strips down this PR:

  • The global write buffer is gone. The main culprit that that commit was intended to address was the multiple write calls when writing a redirect page, so now that commit only wraps those files in BufWriters. Everything else went through a buffer in one way or another, so i left it alone.
  • The advance directory creation logic is gone. In its place is ensure_dir, written nearly exactly like @michaelwoerister suggested. The create_dir_all calls that were taken out of the first commit have been replaced with calls to ensure_dir instead. (Except for one of them - that directory is always going to exist by the time that line comes up, so i just left that one out.)
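To illustrate why wrapping the redirect-page file in a `BufWriter` matters, here is a small sketch with a counting writer standing in for the file. `CountingSink`, `write_redirect` and `count_writes` are hypothetical, and the page is written in four pieces purely for illustration (the thread reports about nine syscalls per real redirect page). Each `write` call on the sink models one write syscall.

```rust
use std::io::{self, BufWriter, Write};

/// Test double that counts how many times the underlying sink is written to.
struct CountingSink {
    writes: usize,
}

impl Write for CountingSink {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        self.writes += 1; // one call here stands in for one write syscall
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Hypothetical redirect page emitted as several small pieces.
fn write_redirect<W: Write>(out: &mut W, url: &str) -> io::Result<()> {
    out.write_all(b"<!DOCTYPE html><html><head>")?;
    out.write_all(b"<meta http-equiv=\"refresh\" content=\"0;URL=")?;
    out.write_all(url.as_bytes())?;
    out.write_all(b"\"></head><body></body></html>")
}

/// Write one redirect page and report how many writes reached the sink.
fn count_writes(buffered: bool) -> io::Result<usize> {
    let mut sink = CountingSink { writes: 0 };
    if buffered {
        let mut w = BufWriter::new(&mut sink);
        write_redirect(&mut w, "../new/index.html")?;
        w.flush()?; // the whole page reaches the sink in one write
    } else {
        write_redirect(&mut sink, "../new/index.html")?;
    }
    Ok(sink.writes)
}
```

In this sketch the unbuffered path reaches the sink four times and the buffered path once, because the page is far smaller than `BufWriter`'s default capacity.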

Member

retep998 commented Oct 13, 2017

Unfortunate to see that the shared buffer is gone, considering that a significant portion of the CPU time that wasn't spent on IO syscalls was spent on heap allocation.

impl SharedContext {
    fn ensure_dir(&self, dst: &Path) -> io::Result<()> {
        let mut dirs = self.created_dirs.borrow_mut();
        if !dirs.contains(dst) {

@michaelwoerister

michaelwoerister Oct 15, 2017

Contributor

This nicely avoids allocating the PathBuf if the directory is already present 👍
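The excerpt above is cut off after the `contains` check. A self-contained sketch of how such a method could continue is given below. Everything after the `if` line is an assumption rather than the merged code, and `SharedContext` is reduced to the one field the excerpt uses. The point is that `HashSet<PathBuf>::contains` accepts a borrowed `&Path`, so the `to_path_buf()` allocation only happens on a cache miss.

```rust
use std::cell::RefCell;
use std::collections::HashSet;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical, trimmed-down stand-in; only the directory cache is modelled.
struct SharedContext {
    created_dirs: RefCell<HashSet<PathBuf>>,
}

impl SharedContext {
    fn ensure_dir(&self, dst: &Path) -> io::Result<()> {
        let mut dirs = self.created_dirs.borrow_mut();
        // `contains` takes a borrowed `&Path`, so a cache hit allocates nothing.
        if !dirs.contains(dst) {
            // Assumed continuation: create the directory, then remember it.
            fs::create_dir_all(dst)?;
            dirs.insert(dst.to_path_buf());
        }
        Ok(())
    }
}
```

Recording the path only after `create_dir_all` succeeds means a failed attempt is retried on the next call instead of being cached as done.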

Contributor

michaelwoerister commented Oct 15, 2017

@QuietMisdreavus Thanks a lot for updating the PR. I think it's worth keeping things simple.

@bors r+

Contributor

bors commented Oct 15, 2017

📌 Commit 2c9d452 has been approved by michaelwoerister

Contributor

michaelwoerister commented Oct 15, 2017

> Unfortunate to see that the shared buffer is gone, considering that a significant portion of the CPU time that wasn't spent on IO syscalls was spent on heap allocation.

There's still the option of implementing a pool of re-usable buffers which should have pretty much the same performance characteristics but without the architectural downsides.

Contributor

bors commented Oct 15, 2017

⌛️ Testing commit 2c9d452 with merge c4f489a...

bors added a commit that referenced this pull request Oct 15, 2017

Auto merge of #44613 - QuietMisdreavus:rustdoc-perf, r=michaelwoerister
some low-hanging rustdoc optimizations

There were a few discussions earlier today in #rust-internals about the syscall usage and overall performance of rustdoc. This PR is intended to pick some low-hanging fruit and try to rein in some of the performance issues of rustdoc.

Contributor

bors commented Oct 15, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: michaelwoerister
Pushing c4f489a to master...

@bors bors merged commit 2c9d452 into rust-lang:master Oct 15, 2017

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
homu Test successful

@QuietMisdreavus QuietMisdreavus deleted the QuietMisdreavus:rustdoc-perf branch Feb 26, 2018
