This page compares ClusterShell with the well-known
pdsh, which clush aims to replace while providing more extended features.
The standard clush command line is the same:
$ pdsh -w foo[1-5] echo "Hello World"
$ clush -w foo[1-5] echo "Hello World"
-w -x -g -X)
-f -t -u -l)
rpdcp are available through
And others. Any simple
pdsh command can be adapted by simply changing the command name to
clubak is a replacement tool for
dshbak, which is commonly used with
pdsh to regroup similar outputs.
The clubak feature is directly available in
clush, so you do not have to call another external tool.
If you need it anyway:
pdsh lets you add plugins to connect to nodes or to select them. Those plugins must be dynamic libraries written against the pdsh C interface. ClusterShell provides 3 ways to extend its features, which can be simple shell commands or Python extensions.
clush to any external node database.
pdsh plugin feature could be available with
But ClusterShell does not aim to reimplement
pdsh in Python. There are many more features!
ClusterShell introduces the nodeset command and its backend, which make it easy to manipulate ranges of nodes.
$ nodeset -c nova[0-7,32-159]
136
$ nodeset -f nova[0-7,32-159] nova[160-163]
nova[0-7,32-163]
$ nodeset -f @oss,@mds
node[2-9]
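To make the counting and folding shown above concrete, here is a minimal standard-library sketch of the idea. It is not ClusterShell's implementation and does not use its API: the helper names `expand` and `fold` are hypothetical, and the sketch assumes a single prefix, one bracketed range list per pattern, and no zero padding.

```python
import re


def expand(pattern):
    """Expand a pattern such as 'nova[0-7,32-159]' into a list of node names."""
    match = re.fullmatch(r"([^\[\]]+)\[([0-9,\-]+)\]", pattern)
    if match is None:
        return [pattern]  # plain node name, nothing to expand
    prefix, ranges = match.groups()
    nodes = []
    for part in ranges.split(","):
        start, _, end = part.partition("-")
        end = end or start  # a single index such as "5" is the range 5-5
        for index in range(int(start), int(end) + 1):
            nodes.append(f"{prefix}{index}")
    return nodes


def fold(nodes):
    """Fold names sharing one prefix back into 'prefix[a-b,c-d]' form.

    Assumes a non-empty list of names of the form <letters><digits>.
    """
    parsed = [re.fullmatch(r"(\D+)(\d+)", name) for name in nodes]
    prefix = parsed[0].group(1)
    indexes = sorted({int(m.group(2)) for m in parsed})
    spans = []
    start = prev = indexes[0]
    for index in indexes[1:]:
        if index != prev + 1:  # gap found: close the current span
            spans.append((start, prev))
            start = index
        prev = index
    spans.append((start, prev))
    if len(spans) == 1 and spans[0][0] == spans[0][1]:
        return f"{prefix}{spans[0][0]}"  # single node, no brackets needed
    body = ",".join(str(a) if a == b else f"{a}-{b}" for a, b in spans)
    return f"{prefix}[{body}]"


if __name__ == "__main__":
    print(len(expand("nova[0-7,32-159]")))
    print(fold(expand("nova[0-7,32-159]") + expand("nova[160-163]")))
```

The real nodeset command handles much more than this sketch (node groups such as @oss, exclusions, and so on); the point here is only the set arithmetic behind counting and folding.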
For various reasons, it is common to cancel a
pdsh execution because a node hangs. If you are also using
dshbak, the output of all nodes will be lost because of the pipe.
$ pdsh -w foo[1-5] ls /remote/nfs/ | dshbak -c
Now hit Ctrl-C. No output will be printed, even if all nodes have successfully run the command.
$ clush -b -w foo[1-5] uname -r
Warning: Caught keyboard interrupt!
---------------
foo[2-4] (3)
---------------
126.96.36.199-145.fc11
---------------
foo5
---------------
2.6.18-164.11.1.el5
Keyboard interrupt (foo1 did not complete).
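The gathering behaviour illustrated above can be sketched in a few lines of standard-library Python. This is only an illustration of the idea, not how clush or clubak is implemented; the function name `gather` and the sample outputs are hypothetical. Because results are collected per node as they arrive, a node that never answered is simply absent and the others can still be printed.

```python
from collections import defaultdict


def gather(outputs):
    """Group node names by identical output, in first-seen output order.

    `outputs` maps a node name to the text it returned; nodes that did
    not complete are simply not present in the mapping.
    """
    groups = defaultdict(list)
    for node, text in outputs.items():
        groups[text].append(node)
    return [(sorted(nodes), text) for text, nodes in groups.items()]


if __name__ == "__main__":
    # Hypothetical partial results: foo1 hung and never reported.
    results = {
        "foo2": "kernel-A",
        "foo3": "kernel-A",
        "foo5": "kernel-B",
        "foo4": "kernel-A",
    }
    for nodes, text in gather(results):
        print("-" * 15)
        print(",".join(nodes))
        print("-" * 15)
        print(text)
```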
ClusterShell improves the administrator experience with several new features, such as:
$ clush '-o -X' -w foo[1-5] xterm
/etc/motd content with the same file on a group of nodes
$ cat /etc/motd | clush -b -w foo[1-5] diff - /etc/motd
$ tar -cf - /tmp | clush -w foo[1-5] tar xfv -
ClusterShell was first intended to be an event-based, distributed command execution library written in Python. All command line tool features are accessible through the Python API, which makes it easy to write sequential or event-based programs.
Some of the possibilities are presented in the following topics:
Some might say that because ClusterShell is a Python library, it must be slow.
Here is a short benchmark comparing a
clush command and a
pdsh command, measuring the time each needs to run a simple command on a large number of nodes.
As you can see, ClusterShell outperforms
pdsh most of the time. As soon as more than 100 nodes are involved, ClusterShell is faster and scales better. The more nodes you add, the larger the difference becomes.
There is a very small overhead due to the Python interpreter, which becomes insignificant when you are running real commands. Moreover, the Python language makes ClusterShell much easier to develop, where raw C could be a real pain.