Mike Fisk edited this page May 30, 2014 · 11 revisions

No real “installation” is required.

  1. Download the fm command (a Python script) to the computer you will use to launch computations.
    • You can also download a tarball
    • You may want to pick an older release rather than the current devel version
  2. Create a filemap.conf file describing the nodes you are using.
  3. Run “fm init” once to prepare necessary directory structures on each node.
  4. Copy “fm” to each node in the computation and make sure it is in your PATH on each node.
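
Step 4 can be scripted. The following is a minimal sketch, not part of fm itself: the node names in `NODES` and the destination directory `REMOTE_DIR` are hypothetical placeholders, and it assumes `scp` (from OpenSSH) can already log in to each node without a password prompt.

```python
import subprocess

# Hypothetical values: replace with your own node names and with a
# directory that is already on PATH on every node.
NODES = ["node1", "node2"]
REMOTE_DIR = "bin"  # relative to the remote home directory


def copy_command(node, local_path="fm", remote_dir=REMOTE_DIR):
    """Build the scp argument list that copies the fm script to one node."""
    # -p preserves the file mode, so the script stays executable.
    return ["scp", "-p", local_path, "%s:%s/" % (node, remote_dir)]


def copy_to_all(nodes, local_path="fm"):
    """Copy fm to every node; raises CalledProcessError on the first failure."""
    for node in nodes:
        subprocess.check_call(copy_command(node, local_path))
```

Calling `copy_to_all(NODES)` from the directory that holds the downloaded `fm` script would run one `scp` per node.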


Requirements

  • Linux
  • Python >= 2.4 (requires subprocess module)
  • OpenSSH
  • rsync
  • bash
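
The items above can be checked on a node before going further. This is a rough sketch under some assumptions: the helper names are made up for illustration, the executable names `ssh`, `rsync`, and `bash` are assumed to be how the tools appear on PATH, and the checker itself needs Python 3.3 or later because it uses `shutil.which`.

```python
import shutil
import sys

# Assumed executable names for OpenSSH, rsync, and bash.
REQUIRED_TOOLS = ["ssh", "rsync", "bash"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_ok(version=sys.version_info, minimum=(2, 4)):
    """Return True if the (major, minor) version meets the stated minimum."""
    return tuple(version[:2]) >= minimum
```

An empty list from `missing_tools()` together with `python_ok()` returning `True` suggests the node meets the listed requirements, apart from the operating system, which this sketch does not check.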

SSH setup

If you’re using ssh (the default) to communicate between nodes, you will need to set up a form of authentication that doesn’t prompt for a password on each connection. There are several options:

  • Use an ssh key pair (a private key and a public key). Copy the private key to all worker hosts as well as the “master”. Add the public key to every host’s `authorized_keys` file.
  • As root, configure HostbasedAuthentication.
  • Use a single sign-on infrastructure like Kerberos.
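
Whichever option you choose, it helps to confirm that logins really are prompt-free. The sketch below is one way to do that, assuming the OpenSSH client: `BatchMode=yes` makes `ssh` fail rather than ask for a password or passphrase. The function names and the 5-second timeout are illustrative choices, not anything fm requires.

```python
import subprocess


def batch_ssh_command(host, remote_command="true"):
    """Build an ssh command line that never prompts for input."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # fail instead of prompting
        "-o", "ConnectTimeout=5",   # illustrative timeout, in seconds
        host,
        remote_command,
    ]


def can_login_without_prompt(host):
    """Return True if ssh to `host` succeeds with no interaction."""
    return subprocess.call(batch_ssh_command(host)) == 0
```

Running `can_login_without_prompt` against every node, from the master and from each worker, would cover the connections that the options above are meant to enable.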

Also, for best performance, set MaxStartups in each node’s sshd_config file to a number larger than the number of nodes in your cluster configuration. Otherwise, sshd’s rate limiting of unauthenticated connections may slow replication and shuffle operations.
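
As a rough illustration of that check, the sketch below reads sshd_config text and compares MaxStartups with a node count. It is a simplification with hypothetical helper names: it handles only whitespace-separated `MaxStartups value` lines, ignores any included files, and treats an unset value as too small. MaxStartups may be a single integer or the `start:rate:full` form; the sketch compares the `start` figure, which is where sshd begins refusing connections.

```python
def find_max_startups(config_text):
    """Return the first MaxStartups value in sshd_config text, or None."""
    for line in config_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(fields) >= 2 and fields[0].lower() == "maxstartups":
            return fields[1]
    return None


def max_startups_start(value):
    """Return the 'start' threshold from '200' or from '10:30:100'."""
    return int(value.split(":")[0])


def is_large_enough(config_text, node_count):
    """Return True if MaxStartups is set and exceeds node_count."""
    value = find_max_startups(config_text)
    if value is None:
        return False  # unset: check your sshd's built-in default instead
    return max_startups_start(value) > node_count
```

For example, `is_large_enough(open("/etc/ssh/sshd_config").read(), 64)` would check one node of a 64-node cluster, assuming the configuration lives at that conventional path.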