How to train a model using multiple machines? #59

Closed
yefeng-zheng opened this issue Jan 29, 2016 · 16 comments · 10 participants

yefeng-zheng commented Jan 29, 2016

The main selling point of CNTK (compared to other deep learning packages) is that it supports training a large model on a compute cluster. However, I couldn't find any information online or in the book on how to set up training across multiple computers. Can anybody help?


such87 commented Jan 29, 2016

Hi,
Can you try it like this?

mpiexec -np 2 cntk configFile=../Config/01_OneHidden.config parallelTrain=true deviceId=0


Contributor

amitaga commented Jan 29, 2016

Parallel training in CNTK needs to be launched using MPI. Refer to the example "Examples/Other/Simple2d/Config/Multigpu.config", which illustrates the CNTK config options needed for parallel training.

For example, to run parallel training using 2 workers on the same machine:

cd /Examples/Other/Simple2d/Data
mpiexec -np 2 cntk configFile=../Config/Multigpu.config

To run across multiple machines, an MPI hosts file needs to be passed to the mpiexec command to specify the hosts where the CNTK parallel training workers will be launched. Please refer to the documentation of the MPI implementation you are using for details on launching an MPI job spanning multiple machines.
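For illustration, a multi-machine launch might look like the sketch below. The host names are hypothetical, and the exact hosts-file flag varies by MPI implementation (e.g. -machinefile for MS-MPI/MPICH, --hostfile for Open MPI), so check your MPI documentation:

# hosts.txt -- one host per line (hypothetical machine names)
server01
server02

mpiexec -machinefile hosts.txt -np 2 cntk configFile=../Config/Multigpu.config parallelTrain=true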


Contributor

yqwangustc commented Jan 29, 2016

Just to add some comments on top of Amit's answer: to use multiple GPUs in training, it is better to set deviceId=auto; otherwise, e.g. if we set deviceId=0, two individual MPI workers will compete for the 0th GPU when they are launched on the same machine.

We may need to reset deviceId to auto once we detect parallelTrain=true.
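As a concrete sketch, a two-worker launch on one multi-GPU machine would then be:

mpiexec -np 2 cntk configFile=../Config/Multigpu.config parallelTrain=true deviceId=auto

With deviceId=auto, each worker locks a different free GPU instead of both fighting over GPU 0.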


yefeng-zheng commented Jan 30, 2016

Thanks for all the answers. I have tested it on my computer, which has only one GPU. I found that if I run with "mpiexec -np 2", one process takes the GPU and the other process runs on the CPU (using all available cores). This is very smart. Next week, I will test on our compute cloud. Hopefully everything will go smoothly.


rahulbhalerao001 commented Jan 31, 2016

Can a model be trained on multiple CPU-only machines? Or is it the case that, for the multi-machine examples, GPUs are required on all the machines?


Member

frankseide commented Jan 31, 2016

CPU and GPU are equivalent, with very few image-related exceptions where we rely on cuDNN and lack CPU implementations.

The CPU code already leverages multiple cores, so you may need to experiment a little with how many CPU threads vs. MPI processes you want to use. E.g. start with one MPI process per server, and then compare with using 2 while limiting numCPUThreads to half the number of cores.

Let us know if you run into problems (we normally do not run parallelized across CPU-only machines).
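As a sketch of that comparison (assuming a 16-core server, and assuming deviceId=-1 selects the CPU):

# one MPI process per server; OpenMP uses all cores
mpiexec -np 1 cntk configFile=../Config/Multigpu.config parallelTrain=true deviceId=-1

# two MPI processes per server, each limited to half the cores
mpiexec -np 2 cntk configFile=../Config/Multigpu.config parallelTrain=true deviceId=-1 numCPUThreads=8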



rahulbhalerao001 commented Jan 31, 2016

Thank you for the prompt response. Could you please let me know whether any of the provided examples can be run this way in a multiple CPU-only machine setting? I am new to MPI, so pointers on getting started would be very helpful.


Member

frankseide commented Feb 1, 2016

In reply to weixing.mei, who asked: "Hi all, do you have problems when setting deviceId=auto? I'm running CNTK on Linux; according to the code, setting deviceId=auto will create a lock file in /var/lock/."

Yes, it will. Is that causing a problem for you?

We are seeing that some environments do not have this directory, or do not have it write-enabled for users. It is on our TODO list to find a more universal solution.
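A quick way to check whether the lock directory is usable on your machine (generic Linux commands, not CNTK-specific):

ls -ld /var/lock                                      # often a symlink to /run/lock
touch /var/lock/cntk.test && rm /var/lock/cntk.test   # succeeds only if it is writable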


Sandy4321 commented Feb 1, 2016

Please help me understand how I can use this with several CPUs. For example, I have a PC with 4 CPUs; can I train on all 4 CPUs?



Contributor

dongyu888 commented Feb 1, 2016

The BLAS libraries will automatically use all the CPU cores you have on your computer. If you run on a single box, you can run CNTK directly to exploit them, without using MPI.
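In other words, on a single CPU-only box a plain, non-MPI launch is enough; a sketch, again assuming deviceId=-1 selects the CPU:

cntk configFile=../Config/Multigpu.config deviceId=-1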


Member

frankseide commented Feb 1, 2016

Yes, you can. First of all, by default it will already use all cores on your machine through OpenMP. If you do nothing, you should see a CPU utilization >> 1 core. If not, please let us know, and try setting the global parameter numCPUThreads to the number of cores in your system.

However, this may or may not be optimal, depending on your specific hardware configuration, model dimensions, and the BLAS library (which would be ACML unless you explicitly switched to MKL). The two options you have are:

· single-process, using OpenMP to parallelize matrix operations across multiple threads. You can set the parameter numCPUThreads to select how many CPU cores OpenMP may use. The default is all cores (although in some cases we artificially cap this for operations where we found more threads are actually slower).

· multi-process data parallelism (1-bit SGD or model averaging). If you choose this on a single machine, you probably need to set numCPUThreads to limit the number of cores each process can use. E.g. if you have 12 cores and use 3-way data parallelism, you probably need to set numCPUThreads=4.

I cannot predict which will work better. We have seen that some BLAS libraries perform worse once you span a NUMA "socket". E.g. if you have 3 CPU chips with 4 cores each, it may or may not be better to run 3-way data parallelism with 4-core OpenMP parallelism, compared to 12-core OpenMP parallelism. I would just try different combinations.
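A command-line sketch of the 12-core example above (the config file name is hypothetical):

# 3-way data parallelism, 4 OpenMP threads per worker, CPU only (assuming deviceId=-1 selects the CPU)
mpiexec -np 3 cntk configFile=my_model.config parallelTrain=true deviceId=-1 numCPUThreads=4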



yefeng-zheng commented Feb 1, 2016

I tested on a computer (Windows Server 2012 R2) with 4 Titan X GPUs. Unfortunately, I didn't see any speed-up in training time. When I ran with mpiexec -np 4, I confirmed that all 4 GPUs were used, at 20-30% utilization. If I ran with a single GPU, I also confirmed that only one GPU was used, but its utilization was higher (40-50%). However, in the end, training with 4 GPUs was actually slower than with a single GPU.

I evaluated on simple2d and MNIST, and they may not be good examples. Do you have any example that shows the benefit of training with multiple GPUs? Thank you very much!


Member

frankseide commented Feb 1, 2016

The minibatch size is too small. We are working on updating the documentation and the sample.

Thanks!
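For context, the minibatch size is set in the SGD section of the CNTK config; a sketch (the values are only illustrative, and the right minibatch size depends on the model and the number of workers):

SGD = [
    epochSize = 0             # 0 = one full pass over the data
    minibatchSize = 256       # larger minibatches amortize the per-sync overhead of parallel training
    learningRatesPerMB = 0.1
    maxEpochs = 10
]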



vikingMei commented Feb 1, 2016

Yeah, on my computer there is no write permission on /var/lock, which is a soft link to /run/lock. I have changed the lock directory used by CrossProcessMutex to the current directory; so far, everything seems OK.
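If patching the source is not an option, another possible workaround (generic Linux, requires root, and makes the directory world-writable, which may not be acceptable in your environment) would be:

sudo chmod 1777 /run/lock    # /var/lock is usually a symlink to this directory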



Member

frankseide commented Feb 1, 2016

Added Issue #62 on /var/lock and #73 on better documentation/samples for multi-GPU training. I will close this one instead.

