This page compares ClusterShell with the well-known
pdsh, which clush aims to replace while providing more extended features.
The standard clush command line is the same:
$ pdsh -w foo[1-5] echo "Hello World"
$ clush -w foo[1-5] echo "Hello World"
- Host selection options are supported (-w -x -g -X)
- ssh-related options are supported (-f -t -u -l)
- File copies are supported. Equivalent to
rpdcp are available through
And others. Any simple
pdsh command can be adapted by simply changing the command name to
clubak is a replacement tool for
dshbak, which is commonly used with
pdsh to regroup similar outputs.
The clubak feature is directly available in
clush, so you do not have to call another external tool.
If you need it anyway:
But there are plugins
Pdsh lets you add plugins to connect to nodes or to select them. Those plugins must be dynamic libraries written against the pdsh C interface. ClusterShell provides 3 ways to extend its features, which can be simple shell commands or Python extensions.
- NodeGroups provides an easy way to plug clush into any external node database.
- Software used to connect to other nodes can easily be supported by implementing a new Python class.
pdsh plugin feature could be available with
But ClusterShell does not aim to reimplement
pdsh in Python. There are many more features!
Group of node handling
ClusterShell introduces the nodeset command and its backend, which make it easy to manipulate ranges of nodes.
$ nodeset -c nova[0-7,32-159]
136
$ nodeset -f nova[0-7,32-159] nova[160-163]
nova[0-7,32-163]
$ nodeset -f @oss,@mds
node[2-9]
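To make the counting and folding shown by the nodeset commands above more concrete, here is a minimal standard-library sketch of the idea. It is not ClusterShell's implementation: the helper names `expand`, `fold` and `union` are made up for this illustration, and it only handles a single `prefix[ranges]` pattern without zero padding or group syntax such as `@oss`.

```python
import re


def expand(pattern):
    """Split 'nova[0-7,32-159]' into its prefix and the set of node indexes."""
    match = re.fullmatch(r"([A-Za-z_]+)\[([\d,-]+)\]", pattern)
    if match is None:
        raise ValueError(f"unsupported pattern: {pattern!r}")
    prefix, ranges = match.groups()
    indexes = set()
    for part in ranges.split(","):
        low, _, high = part.partition("-")
        # A lone index such as "5" is treated as the range 5-5.
        indexes.update(range(int(low), int(high or low) + 1))
    return prefix, indexes


def fold(prefix, indexes):
    """Fold a set of indexes back into compact range notation."""
    if not indexes:
        raise ValueError("cannot fold an empty set of nodes")
    ordered = sorted(indexes)
    ranges = []
    start = prev = ordered[0]
    for value in ordered[1:]:
        if value != prev + 1:  # gap found: close the current range
            ranges.append((start, prev))
            start = value
        prev = value
    ranges.append((start, prev))
    if len(ranges) == 1 and ranges[0][0] == ranges[0][1]:
        return f"{prefix}{ranges[0][0]}"  # single node, no brackets
    body = ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)
    return f"{prefix}[{body}]"


def union(*patterns):
    """Merge several patterns that share one prefix into a folded pattern."""
    prefix, merged = None, set()
    for pattern in patterns:
        current, indexes = expand(pattern)
        if prefix is None:
            prefix = current
        elif current != prefix:
            raise ValueError("this sketch only merges patterns with the same prefix")
        merged |= indexes
    return fold(prefix, merged)


node_count = len(expand("nova[0-7,32-159]")[1])        # 136 nodes
folded = union("nova[0-7,32-159]", "nova[160-163]")    # "nova[0-7,32-163]"
```

Keeping the indexes in a set is what makes the union trivial here; the folding step only has to walk the sorted indexes once and close a range whenever it meets a gap.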
For various reasons, it is common to cancel a
pdsh execution because a node is hung. If you are also using
dshbak, the output of all nodes will be lost because of the pipe.
$ pdsh -w foo[1-5] ls /remote/nfs/ | dshbak -c
Now hit Ctrl-C. No output will be printed, even if all nodes have successfully run the command.
- Output is not lost even if you hit Ctrl+C
$ clush -b -w foo[1-5] uname -r
Warning: Caught keyboard interrupt!
---------------
foo[2-4] (3)
---------------
22.214.171.124-145.fc11
---------------
foo5
---------------
2.6.18-164.11.1.el5
Keyboard interrupt (foo1 did not complete).
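The behaviour above can be pictured with a small standard-library sketch: when gathering happens inside the same process that runs the commands, an interrupt can be caught and the results collected so far are still available. This is only an illustration of the idea, not clush's actual code. The function `gather`, the generator `simulated_results` and the output strings `kernel-A` and `kernel-B` are all hypothetical.

```python
from collections import defaultdict


def gather(results, expected_nodes):
    """Group nodes by identical output and keep partial results on Ctrl-C.

    `results` is any iterable yielding (node, output) pairs as nodes finish.
    Returns (groups, pending_nodes, interrupted).
    """
    groups = defaultdict(list)
    interrupted = False
    try:
        for node, output in results:
            groups[output].append(node)  # identical outputs share one entry
    except KeyboardInterrupt:
        interrupted = True               # keep what has been gathered so far
    done = {node for nodes in groups.values() for node in nodes}
    pending = [node for node in expected_nodes if node not in done]
    return dict(groups), pending, interrupted


def simulated_results():
    """Hypothetical result stream: four nodes answer, then the user hits Ctrl-C."""
    yield "foo2", "kernel-A"
    yield "foo3", "kernel-A"
    yield "foo4", "kernel-A"
    yield "foo5", "kernel-B"
    raise KeyboardInterrupt  # stands in for Ctrl-C while foo1 is hung


groups, pending, interrupted = gather(
    simulated_results(), ["foo1", "foo2", "foo3", "foo4", "foo5"]
)
```

With the pdsh and dshbak pipeline, the gathering process sits on the other side of a pipe and is interrupted together with the producer, which is why nothing gets printed there.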
ClusterShell improves administrator experience with several new features like:
- Automatic same output merging
- Stdout and stderr handling
- Nodeset size, colors, ...
Easy modification of ssh options
$ clush '-o -X' -w foo[1-5] xterm
Supports stdin forwarding
- Compare /etc/motd content with the same file on a group of nodes
$ cat /etc/motd | clush -b -w foo[1-5] diff - /etc/motd
- Binary content is supported:
$ tar -cf - /tmp | clush -w foo[1-5] tar xfv -
ClusterShell was first intended to be an event-based, distributed command execution library written in Python. All command line tool features are accessible through the Python API, which makes it easy to write sequential or event-based programs.
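As a rough illustration of the sequential style, the sketch below runs one command on a set of nodes and returns the gathered output. It assumes a recent ClusterShell release running on Python 3 and working ssh access to the nodes. The wrapper name `run_on_nodes` is made up for this example, and the import is done inside the function only so that the snippet can be loaded on a machine where ClusterShell is not installed.

```python
def run_on_nodes(command, nodes):
    """Run `command` on `nodes`; return {folded node set: output text}.

    Assumption: ClusterShell is installed and the nodes are reachable over ssh.
    """
    # Imported lazily so this file can be loaded without ClusterShell.
    from ClusterShell.NodeSet import NodeSet
    from ClusterShell.Task import task_self

    task = task_self()
    task.shell(command, nodes=nodes)  # schedule the command on every node
    task.resume()                     # run until all nodes have completed

    gathered = {}
    for buf, nodelist in task.iter_buffers():
        message = buf.message()
        # Depending on the version, the buffer may hold bytes or text.
        text = message.decode() if isinstance(message, bytes) else str(message)
        gathered[str(NodeSet.fromlist(nodelist))] = text
    return gathered
```

A call such as `run_on_nodes("uname -r", "foo[1-5]")` would then return one entry per distinct output, keyed by the folded set of nodes that produced it.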
Some of the possibilities are presented in the following topics:
And it is very fast!
Some might argue that, since ClusterShell is a Python library, it must be slow.
Here is a short benchmark comparing a
clush command and a
pdsh command, measuring the time each needs to run a simple command on a large number of nodes.
As you can see, ClusterShell outperforms
pdsh most of the time. As soon as more than 100 nodes are involved, ClusterShell is faster and scales better. The more nodes you add, the larger the difference.
There is a very small overhead due to the Python interpreter, which becomes insignificant when you are running real commands. Moreover, the Python language makes developing ClusterShell much easier, where raw C could be a real pain.