New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eagerly fetch stderr in remote process execution #5735

Merged
merged 5 commits into from Apr 26, 2018

Conversation

Projects
2 participants
@ity
Copy link
Member

ity commented Apr 22, 2018

Problem

As defined in #5511 - Right now when remote execution finishes, we just return whether it was successful, and any inline stdout/stderr.

Instead, we should fetch stderr if it's not inline, and populate those fields.

Solution

  • fetch stderr and populate using digest/raw
  • extract stdout and stderr into functions
  • add test for stderr

@ity ity added the remoting label Apr 22, 2018

@ity ity self-assigned this Apr 22, 2018

@ity ity requested a review from illicitonion Apr 22, 2018

@ity ity changed the title Ity/5511 eagerly fetch stderr in remote process execution Apr 22, 2018

@ity ity added this to In progress in Remoting Apr 22, 2018

@illicitonion
Copy link
Contributor

illicitonion left a comment

Looks great :) Just a couple of clean-ups to do

@@ -207,121 +208,165 @@ impl CommandRunner {
);
// TODO: Log less verbosely
debug!("Got (nested) execute response: {:?}", execute_response);
let stdout = self.extract_stdout(execute_response.clone());

This comment has been minimized.

@illicitonion

illicitonion Apr 23, 2018

Contributor

It's a little sad to be cloning ExecuteResponse so many times, and I think we can avoid it, but there are a few steps:

  1. Modify the extract_ methods to take references to ExecuteResponses, rather than taking mutable ownership of them. In order to do this, rather than calling take_result().take_stdout_raw() in the extract_ methods we'd need to call .get_result().get_stdout_raw(). The reason it was taking before is because otherwise the stdout_raw Vec needed copying, which was sad, but stdout_raw is now a Bytes rather than a Vec, which means copying is now incrementing a reference count rather than copying an underlying buffer.
  2. We need one consistent owner of ExecuteResponse (because we're not cloning it any more). There are a couple of ways that we could nicely do this. One is to chain our extractions, having each extract_ method take ownership over the ExecuteResponse, and then return it out:
self.extract_stdout(execute_response)
  .and_then(move |(stdout, execute_response)| {
    let stderr = runner.extract_stderr(execute_response));
    stderr.and_then(move |(stderr, execute_response)| {
      match grpcio...
    })
  })

This is kind of ugly because of the whole "take ownership of, and then return" part.

Another option would be to move ExecuteResponse onto the heap into an Arc, and have the extract_ methods use that as a reference. Nothing else in the method actually needs to modify the ExecuteResponse, so this is pretty clean and safe (but we need to up-front clone the Arc a couple of times, because Arcs + Futures aren't very ergonomic. Also, it allows us to parallelise the stdout and stderr extraction:

let execute_response_arc = Arc::new(execute_response_arc);
let execute_response_for_stdout = execute_response_arc.clone();
let execute_response_for_stderr = execute_response_arc.clone();

futures::join_all(self.extract_stdout(execute_response_for_stdout), self.extract_stdout(execute_response_for_stderr))
  .and_then(move |(stdout, stderr) { <the contents of the inner and_then, but referencing execute_response_arc not execute_response> })

This comment has been minimized.

@ity

ity Apr 25, 2018

Member

thank you for this explanation - makes complete sense. also read https://carllerche.github.io/bytes/bytes/struct.Bytes.html to solidify understanding there. And talking to @stu realized that execute_response didnt need to be cloned after all and could be passed in as a reference. this cleaned up the code a fair bit I think.

return stdout;
}

fn extract_stderr(

This comment has been minimized.

@illicitonion

illicitonion Apr 23, 2018

Contributor

We can re-use the logic between extract_stdout and extract_stderr by taking three functions as arguments here. I think the signature would end up looking like:
fn extract_output(&self, mut execute_response: ExecuteResponse, Fn(&bazel_protos::remote_execution::ActionResult) -> bool, Fn(&bazel_protos::remote_execution::ActionResult) -> &bazel_protos::remote_execution::Digest, Fn(&bazel_protos::remote_execution::ActionResult) -> bytes::Bytes), but there could be some reference lifetime fun. But you'd just call:

self.extract_output(
  execute_response,
  |action| action.has_stderr_digest(),
  |action| action.get_stderr_digest()
  |action| action.get_stdout_raw_for_reflect().clone()
)

This comment has been minimized.

@ity

ity Apr 25, 2018

Member

I thought about this, but decided against it since readability wise, this didnt look great to me. happy to change though, if you insist.

@ity ity force-pushed the ity:ity/5511 branch from 0549d3e to e817770 Apr 25, 2018

@illicitonion
Copy link
Contributor

illicitonion left a comment

Looks great, thanks!

@@ -415,6 +453,13 @@ mod tests {
Raw(String),
Digest(Digest),
}

#[derive(Debug, PartialEq)]
enum StderrType {

This comment has been minimized.

@illicitonion

illicitonion Apr 26, 2018

Contributor

I'm 50:50 on renaming StdoutType to just OutputType and re-used it for both vs the type safety of different arguments having different types... :)

illicitonion added some commits Apr 26, 2018

@illicitonion illicitonion merged commit ff50cf7 into pantsbuild:master Apr 26, 2018

1 check failed

continuous-integration/travis-ci/pr The Travis CI build failed
Details

Remoting automation moved this from In progress to Done Apr 26, 2018

@ity ity deleted the ity:ity/5511 branch Apr 26, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment