Skip to content

Conversation

graydon
Copy link
Contributor

@graydon graydon commented Apr 17, 2018

This is a followup to rdar://39332957 in which it was observed that a batch mode run with -Xfrontend -debug-time-function-bodies seemed to be slowing down quite a bit.

Turns out that the driver's task queue polling loop, while it does poll for readiness on each of its child fds, then assumes that initial readiness on an fd means all data the child is going to write is now available, so does blocking reads on the pipe until it ends as soon as the first byte is available.

This is Very Wrong. It means that if multiple subprocesses trickle output, the poll loop switches to serializing them, draining the entirety of the first subprocess' pipe while the others are blocked on full pipes, then moving on to the second, and so forth.

This happens independently of batch mode, and is unrelated to the error-stream separation (turns out we only separate error streams when running xcrun anyways).

Long term the polling loop here has become quite a mess and needs substantial refactoring / rewriting. Short term, this can be hacked-around by passing a flag to the two non-terminal read loops to cause them to exit after the first successful read they execute (thus returning to the poll loop and continuing to interleave reads from other children).

rdar://39504759

@graydon
Copy link
Contributor Author

graydon commented Apr 17, 2018

@swift-ci please test

/// \p untilEnd is true, reads until the end of the stream; otherwise
/// reads once (posslby with a retry on EINTR), and returns.
/// \returns true on error, false on success.
bool readFromPipes(bool untilEnd=true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't seen that formatting of = without spaces before. Does git-clang-format let it pass?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I did not clang-format this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine with me, but maybe others will object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think convention definitely skews towards including the spaces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -253,15 +255,17 @@ static bool readFromAPipe(int Pipe, std::string &Output) {
}

Output.append(outputBuffer, readBytes);
if (!untilEnd)
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this now reads only 1k at a time normally. That seems a bit small according to my intuition. Rather than stop after one read, would it work better to stop whenever readBytes != sizeof(outputBuffer) (modulo EINTR)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait--I had assumed the pipe was in non-blocking mode. Hmmm..... If blocking, you'd need to check that data are available before another read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair. I can extend the condition to if (!(readBytes == sizeof(outputBuffer) || untilEnd)), good enough?

Copy link
Contributor

@davidungar davidungar Apr 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the pipe is in blocking mode, you probably need to do an ioctl OR fcntrl ahead of the read to ensure you don't block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the whole point here is to only read on fds that are readable. Which, hm, yeah, is why I wrote it to be only one read per poll in the first place.

It's crude, but it works as-is and we're only digesting diagnostic streams here anyways, they're not exactly high volume.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. We can tweak later.

@@ -0,0 +1,48 @@
// RUN: %empty-directory(%t)
// RUN: %empty-directory(%t/foo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better to use the name of the test instead of "foo" here in case anyone ever needs to back trace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%t already includes the name of the test. I'd still be inclined to give it a stronger name, though.

(Also, no need to do %t first; %empty-directory is a form of mkdir -p.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

@davidungar davidungar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried that only reading 1k at a time might be quite inefficient. What do you think?

let a = [v, 1, ${i}, 3, ${N}, 4]
return a.reduce(0, {$0 + $1})
}
% end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clever test!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there is still the round-robin issue in the code, I'm wondering if the test would be better-named something about fairness, or not-blocking on pipes? Or non-sequential pipes??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind, what we deduced while examining the code is that it's written as though it's cycling fairly between all the readable pipes (i.e. it does actually want to be fair, with a for (struct pollfd &fd : PollFds) loop wrapped around the reads), but that it's just overcommitting to one of the elements it finds in that outer loop, and so this patch is restoring the round-robin-ness of the existing design to a functional form.

In our earlier conversation we discussed maybe trying to impose fairness on the loop by maintaining an index into the fds that we're scanning between calls to poll, but in retrospect I think that won't really work (the poll-set returned after each poll() call will be different) and I think so long as it's not doing a read that will block, it's going to get to all the readable fds in order on each loop.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

bool readFromPipes();
/// \brief Reads data from the pipes, if any is available. If
/// \p untilEnd is true, reads until the end of the stream; otherwise
/// reads once (posslby with a retry on EINTR), and returns.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type: "posslby". Also, does this really belong in the "brief" description?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

/// \p untilEnd is true, reads until the end of the stream; otherwise
/// reads once (posslby with a retry on EINTR), and returns.
/// \returns true on error, false on success.
bool readFromPipes(bool untilEnd=true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think convention definitely skews towards including the spaces.

@@ -0,0 +1,48 @@
// RUN: %empty-directory(%t)
// RUN: %empty-directory(%t/foo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%t already includes the name of the test. I'd still be inclined to give it a stronger name, though.

(Also, no need to do %t first; %empty-directory is a form of mkdir -p.)

// serialization) so this should catch it.
//
// RUN: %swiftc_driver -j 4 -module-name foo -c -stats-output-dir %t/stats-pre %t/foo/*.swift
// RUN: %swiftc_driver -j 4 -module-name foo -c -stats-output-dir %t/stats-post -Xfrontend -debug-time-function-bodies %t/foo/*.swift
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be %target-build-swift so that they work for cross-compilation, and may as well use -typecheck instead of -c to do a bit less work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

@@ -440,7 +440,12 @@ def evaluate(args):
ident = re.compile('(\w+)$')
for (k, v) in merged.stats.items():
if k.startswith("time.") or '.time.' in k:
continue
# In merge-timers mode, we flatten timer names so they can be
# used as idents; otherwise we omit them since they're messy.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated, but sure. I can't really review it because I don't know what problem it's solving.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just plumbing to make it possible to evaluate the ratio I use in the test.

Copy link
Contributor

@davidungar davidungar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice small change.

@graydon graydon force-pushed the rdar-39504759-keep-demultiplexing-after-first-read branch from 29107fc to aa2ec1a Compare April 18, 2018 07:26
@graydon
Copy link
Contributor Author

graydon commented Apr 18, 2018

@swift-ci please test and merge

1 similar comment
@graydon
Copy link
Contributor Author

graydon commented Apr 18, 2018

@swift-ci please test and merge

@graydon graydon force-pushed the rdar-39504759-keep-demultiplexing-after-first-read branch from aa2ec1a to c1682fa Compare April 18, 2018 18:52
@graydon
Copy link
Contributor Author

graydon commented Apr 18, 2018

(Updated with new, more stable testing strategy of ratio-of-reads-to-polls)

@graydon
Copy link
Contributor Author

graydon commented Apr 18, 2018

@swift-ci please test and merge

1 similar comment
@graydon
Copy link
Contributor Author

graydon commented Apr 18, 2018

@swift-ci please test and merge

@swift-ci swift-ci merged commit ea39cea into swiftlang:master Apr 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants