Skip to content

Conversation

@ndokos
Copy link
Member

@ndokos ndokos commented May 16, 2016

Neither pbench-register-tool, nor pbench-register-tool-set
was checking for errors. In addition, the ssh | sed pipeline
in pbench-register-tool was returning the sed status which was
0.

Set the pipefail option in base: that way pipelines will return
the status of the last failed command in the pipeline, not that of the
last command. In particular, ssh failures will be caught.

pbench-register-tool now exits with the status of the pipeline.
pbench-register-tool-set counts pbench-register-tool failures
and exits with status equal to the number of failed pbench-register-tool
calls.

@ndokos ndokos force-pushed the wip-ssh-status branch 2 times, most recently from 6bae2b7 to 2a52de9 Compare May 16, 2016 17:21
@ndokos
Copy link
Member Author

ndokos commented May 16, 2016

@atheurer: can you please review this?

@portante
Copy link
Member

Looks good to me.

Could we add a unit test that catches hidden pipeline failures?

@ndokos ndokos added this to the v0.39 milestone Jun 10, 2016
@portante portante self-assigned this Jun 17, 2016
@ndokos ndokos force-pushed the wip-ssh-status branch 2 times, most recently from e2a9dd7 to bb68731 Compare July 11, 2016 22:25
@ndokos
Copy link
Member Author

ndokos commented Jul 11, 2016

@portante: I'll add a couple more tests but the underlying structure is sound, I think.

@@ -1,4 +1,5 @@
#!/bin/bash
# -*- mode: shell-script; indent-tabs-mode: t; sh-basic-offset: 8; sh-indentation: 8; sh-indent-for-case-alt: + -*-
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we specify real tabs instead of spaces for emacs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this is what indent-tabs-mode: t does. The doc says

Indentation can insert tabs if this is non-nil.

@ndokos
Copy link
Member Author

ndokos commented Jul 12, 2016

I'm going to squash the three non-unit-test commits into one and combine the commit messages.

@ndokos ndokos force-pushed the wip-ssh-status branch 2 times, most recently from ff20a01 to a55f87e Compare July 14, 2016 18:53
ndokos added 3 commits July 15, 2016 09:08
Scripts where pipelines like this: ssh remote command | process output
are used do not deal with ssh errors properly. The pipeline returns
the status of the last command.

Set the pipefail option in base: that way pipelines will return the
status of the last failed command in the pipeline, not that of the
last command.

o pbench-register-tool now exits with the correct status of the
  pipeline.  pbench-register-tool-set counts pbench-register-tool
  failures and exits with status equal to the number of failed
  pbench-register-tool calls.

o Extend ssh status checking to pbench-clear-tools.

  There are two cases to consider: pbench-clear-tools is called
  with or without a --name=foo option.

  In the first case, the intent is to clear a single
  tool. pbench-clear-tools will try to ssh to all the remotes and
  clear the tool; it will also try to count how many tools are left
  (with another ssh) and if none, it will delete the local @Remote
  entry.  If the first ssh fails, we do not continue to the second: we
  just count it as a failure, but since we cannot find out the state
  of the remote, we assume that it still has tools left and we do not
  remove the local entry for the remote.

  In the second case, the intent is to clear all tools, so even if the
  ssh fails, we remove the local entry for the remote. IOW,
  pbench-clear-tools with no --name argument might leave junk lying
  around on the remote if we can't get to it, but it clears everything
  locally.

  In either case, we return the number of ssh failures as the exit
  status of the command.

o pbench-start-tools now checks its ssh pipeline for errors and returns
  status properly.

  In addition, pbench-start-tools is modified to call
  pbench-kill-tools before starting a new run. It has often been the
  case that tools from earlier runs are not cleaned up properly,
  particularly when the run is interrupted. Although that is a problem
  that should be resolved in the calling script, using "trap
  'pbench-kill-tools' INT QUIT EXIT", this is an attempt to make tool
  handling more robust and avoid the most common failure scenarios.
Replace wait with a loop of "wait $pid" for each background process
created, so we can get the exit status of the process. Each non-zero
status counts as an error. The total number of errors is then returned
as the status of pbench-postprocess-tools.
Modify mock ssh to return failures on a given host name ("fubar").
Modify unittests to pass when the test is expected to fail
and does so with an expected non-zero status code.
@portante portante merged commit 65abb91 into distributed-system-analysis:master Jul 15, 2016
@ndokos ndokos deleted the wip-ssh-status branch July 15, 2016 15:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants