@portante
Member

@portante portante commented Nov 17, 2020

Add support for collecting system configuration information via the Tool Meister sub-system.

The pbench-collect-sysinfo CLI has been modified to send a message to the Tool Meisters asking them to collect all the requested system configuration information. Just as each Tool Meister collects tool data and sends it back to the Tool Data Sink, each Tool Meister now collects the system information and sends it back to the Tool Data Sink, which stores it in the requested location.

This means that pbench-collect-sysinfo no longer uses ssh to collect and copy the system information from remote hosts. As with tools running on the local host, the local Tool Meister writes the collected data directly into the local volume without sending it through the Tool Data Sink.
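
A minimal sketch of the local-versus-remote delivery described above. The names `SysinfoResult`, `deliver_sysinfo`, `send_to_sink`, and the `sysinfo.dat` file name are hypothetical and exist only for illustration; the actual Tool Meister code may be structured quite differently.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SysinfoResult:
    """Hypothetical container for one host's collected system information."""

    hostname: str
    payload: bytes


def deliver_sysinfo(result, local_hostname, local_dir, send_to_sink):
    """Store the result locally or hand it to the sink callback.

    Returns "local" or "sink" so a caller can see which path was taken.
    """
    if result.hostname == local_hostname:
        # Local Tool Meister: write straight into the local volume.
        target = Path(local_dir) / result.hostname
        target.mkdir(parents=True, exist_ok=True)
        # "sysinfo.dat" is an assumed file name, not one used by pbench.
        (target / "sysinfo.dat").write_bytes(result.payload)
        return "local"
    # Remote Tool Meister: send the data back through the Tool Data Sink.
    send_to_sink(result)
    return "sink"
```

Passing the sink as a callback keeps the sketch independent of whatever transport the real Tool Data Sink uses.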

This change allows us to delete the pbench-remote-sysinfo-dump trampoline interface since the Tool Meister takes care of remote operations.

Each benchmark-script has been modified to invoke pbench-collect-sysinfo after the pbench-tool-meister-start call and before the pbench-tool-meister-stop call.

We fixed a small bug in pbench-trafficgen, which was overwriting the "beginning" (.../sysinfo/beg) collected system information when collecting the "ending" (.../sysinfo/end) information.
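
One way to picture the fix: each collection phase gets its own directory, so the "end" collection can never land on top of the "beg" data. This is only a sketch; `sysinfo_dir`, `VALID_PHASES`, and the `run_dir` argument are hypothetical names, and pbench-trafficgen itself is a shell script.

```python
from pathlib import Path

# The two phases named in the description above.
VALID_PHASES = ("beg", "end")


def sysinfo_dir(run_dir, phase):
    """Return the per-phase sysinfo directory under an assumed run directory."""
    if phase not in VALID_PHASES:
        raise ValueError(f"unknown sysinfo phase: {phase!r}")
    return Path(run_dir) / "sysinfo" / phase
```

Deriving the target directory from the phase name, rather than reusing a single variable, makes the overwrite described above impossible by construction.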

The first 2 commits are small, isolated changes: one stops sub-classing object in Python 3 (now the default), and the other cleans up an instance of using f-strings in logging statements.
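
For readers unfamiliar with these two cleanups, a short illustration. The class and logger names are made up, and the sketch assumes the f-string cleanup means switching to lazy %-style logging arguments, which is the usual motivation.

```python
import logging


# Python 3: every class already inherits from object, so
# "class ToolState(object):" and "class ToolState:" are equivalent.
class ToolState:
    def __init__(self, name):
        self.name = name


logger = logging.getLogger("sysinfo-example")


def report(state):
    # Pass the value as an argument instead of using an f-string, so the
    # message is only formatted if the record is actually emitted.
    logger.info("collected sysinfo for %s", state.name)
```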

@portante portante added enhancement Agent tools Of and related to the operation and behavior of various tools (iostat, sar, etc.) labels Nov 17, 2020
@portante portante added this to the v0.71 milestone Nov 17, 2020
@portante portante self-assigned this Nov 17, 2020
@lgtm-com

lgtm-com bot commented Nov 17, 2020

This pull request introduces 1 alert when merging 9e56fca into f0c9aa0 - view on LGTM.com

new alerts:

  • 1 for Unreachable code

Member

@dbutenhof dbutenhof left a comment

Ah; the foundation looks good. Time to build the addition!

@portante portante force-pushed the sysinfo-tool-meister branch from 9e56fca to aa73324 Compare November 17, 2020 13:10
@lgtm-com

lgtm-com bot commented Nov 17, 2020

This pull request introduces 1 alert when merging aa73324 into f0c9aa0 - view on LGTM.com

new alerts:

  • 1 for Unreachable code

@portante portante force-pushed the sysinfo-tool-meister branch from aa73324 to 05a70ad Compare November 17, 2020 13:39
@lgtm-com

lgtm-com bot commented Nov 17, 2020

This pull request introduces 1 alert when merging 05a70ad into 8bb4bef - view on LGTM.com

new alerts:

  • 1 for Unreachable code

@portante portante force-pushed the sysinfo-tool-meister branch from 05a70ad to 916c20a Compare November 18, 2020 22:06
@lgtm-com

lgtm-com bot commented Nov 18, 2020

This pull request introduces 1 alert when merging 916c20a into 6986c8d - view on LGTM.com

new alerts:

  • 1 for Unreachable code

@portante portante force-pushed the sysinfo-tool-meister branch 3 times, most recently from 317d993 to 3d85748 Compare November 19, 2020 03:19
@portante portante marked this pull request as ready for review November 19, 2020 03:20
ln -s "$(basename "${rdir}")" "$(dirname "${rdir}")/reference-result"
done

${script_path}/postprocess/user-benchmark-wrapper "${benchmark_run_dir}" "${total_duration}"
Member Author

In all the other bench-scripts we invoke pbench-metadata-log, then pbench-collect-sysinfo, and then pbench-tool-meister-stop. This change corrects pbench-user-benchmark to do the same.

However, we used to run the user-benchmark-wrapper after pbench-metadata-log and then pbench-tool-meister-stop, immediately before pbench-collect-sysinfo. Preserving that sequence was not possible, so we now run the user-benchmark-wrapper before we run the finalization sequence.

I don't think that will be a problem. Running it after the whole finalization sequence would make users wait longer than before for their optional script, so this way it runs as soon as possible.
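
The resulting order can be summarized in a small sketch. The bench-scripts themselves are shell scripts; `FINALIZATION_STEPS`, `finalize`, and the `run_step` callback are hypothetical names used only to make the sequence explicit, and no command arguments are shown.

```python
# Order described in the comment above: the optional user script first,
# then the same finalization sequence the other bench-scripts use.
FINALIZATION_STEPS = (
    "user-benchmark-wrapper",
    "pbench-metadata-log",
    "pbench-collect-sysinfo",
    "pbench-tool-meister-stop",
)


def finalize(run_step):
    """Call run_step(name) once per step, in the order listed above."""
    for step in FINALIZATION_STEPS:
        run_step(step)
```

The key constraint is that pbench-collect-sysinfo runs before pbench-tool-meister-stop, because the Tool Meisters must still be running to collect the data.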

  self._tool_dir = None
  # The temporary directory to use for capturing all tool data.
- self._tmp_dir = os.environ.get("_PBENCH_TOOL_MEISTER_TMP", "/var/tmp")
+ self._tmp_dir = os.environ["pbench_tmp"]
Member Author

This prevents us from using /var/tmp during the test runs, and brings the Tool Meister in line with the other code using the ${pbench_run}/tmp directory.
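
The behavioral difference between the two lookups can be seen in isolation. The wrapper functions `old_tmp_dir` and `new_tmp_dir` are hypothetical and take the environment as a plain mapping so the sketch can be exercised without touching the real process environment.

```python
def old_tmp_dir(environ):
    # Previous behavior: silently fall back to /var/tmp when unset.
    return environ.get("_PBENCH_TOOL_MEISTER_TMP", "/var/tmp")


def new_tmp_dir(environ):
    # New behavior: pbench_tmp is required; a missing value raises KeyError
    # instead of quietly writing somewhere outside the run directory.
    return environ["pbench_tmp"]
```

With the new lookup, a missing `pbench_tmp` fails loudly rather than sending test data to /var/tmp.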

Member Author

@portante portante left a comment

@RobertKrawitz, this adds support for collecting system information via Tool Meister, so now we no longer use ssh for any of that data collection.

dbutenhof
dbutenhof previously approved these changes Nov 19, 2020
Member

@dbutenhof dbutenhof left a comment

A few minor comments, but looks good.

Member

@Maxusmusti Maxusmusti left a comment

Sorry for the delay. This looks good. I agree with Dave that there may be a better way to handle the ("send", "sysinfo") tuples in the future, but that's not the most important thing.

Only issue I encountered was that it seems to be impossible to build rpms off this branch (would require changes to Makefile). This is the point it currently gets stuck:
cp -a cdm-get-iterations pbench-add-metalog-option pbench-agent-config-activate pbench-agent-config-ssh-key pbench-avg-stddev pbench-cleanup pbench-clear-results pbench-collect-sysinfo pbench-copy-results pbench-copy-result-tb pbench-end-tools pbench-get-iteration-metrics pbench-get-metric-data pbench-get-primary-metric pbench-import-cdm pbench-init-tools pbench-kill-tools pbench-list-tools pbench-log-timestamp pbench-make-result-tb pbench-metadata-log pbench-move-results pbench-output-monitor pbench-postprocess-tools pbench-postprocess-tools-cdm pbench-register-tool pbench-register-tool-set pbench-register-tool-trigger pbench-remote-sysinfo-dump pbench-send-tools pbench-start-tools pbench-stop-tools pbench-sysinfo-dump pbench-tool-meister-client pbench-tool-meister-start pbench-tool-meister-stop pbench-tool-trigger README require-rpm tool-meister /tmp/opt/pbench-agent-0.71.0/agent/util-scripts
cp: cannot stat 'pbench-remote-sysinfo-dump': No such file or directory

dbutenhof
dbutenhof previously approved these changes Nov 19, 2020
Member

@dbutenhof dbutenhof left a comment

Looks good!

@portante
Member Author

@dbutenhof, @Maxusmusti, I've collapsed the code review comments commit into the others to maintain the three separate commits, and I've updated the 3rd commit's description, as well as this PR description.

@portante portante force-pushed the sysinfo-tool-meister branch from c731fc5 to aad6209 Compare November 20, 2020 20:43
The `pbench-collect-sysinfo` CLI interface has been modified to send a
message to the Tool Meisters to have them collect all the requested
system configuration information.  Much like tool data is collected and
sent back to the Tool Data Sink from each Tool Meister, the system
information is collected by each Tool Meister and sent back to the Tool
Data Sink to be stored in the requested location.

This means that `pbench-collect-sysinfo` no longer uses `ssh` to collect
and copy the system information from remote hosts.  As it is for tools
running on the local host, the local Tool Meister writes the collected
data into the local volume directly without sending it through the Tool
Data Sink.

This change allows us to delete the `pbench-remote-sysinfo-dump`
trampoline interface since the Tool Meister takes care of remote
operations.

Each benchmark-script has been modified to invoke the
`pbench-collect-sysinfo` following the `pbench-tool-meister-start` call,
and before the `pbench-tool-meister-stop` call.

We fixed a small bug for `pbench-trafficgen` which was overwriting the
"beginning" (`.../sysinfo/beg`) collected system information when
collecting the "ending" (`.../sysinfo/end`) information.
@portante portante force-pushed the sysinfo-tool-meister branch from aad6209 to c77461c Compare November 20, 2020 21:24
Member

@ndokos ndokos left a comment

We are getting rid of pbench-remote-sysinfo-dump? What's the world coming to?

Member

@Maxusmusti Maxusmusti left a comment

Built and tested, it all works!

@portante portante merged commit 455b9dc into distributed-system-analysis:master Nov 20, 2020
portante added a commit to portante/pbench that referenced this pull request Nov 24, 2020
The change to rename `tool-scripts/README` to `tool-scripts/README.md`
in commit 6d1ccbf broke the RPM builds.
Here we backport the fix on `master` from PR distributed-system-analysis#1981, in commit
455b9dc.
portante added a commit that referenced this pull request Nov 30, 2020
The change to rename `tool-scripts/README` to `tool-scripts/README.md`
in commit 6d1ccbf broke the RPM builds.
Here we backport the fix on `master` from PR #1981, in commit
455b9dc.
Labels

Agent enhancement tools Of and related to the operation and behavior of various tools (iostat, sar, etc.)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Move collection of system configuration information to the Tool Meisters

4 participants