jimfcarroll edited this page Aug 10, 2012 · 2 revisions

Distributed Deployment

To distribute a Dempsy application across a set of nodes, start the application on every node using the com.nokia.dempsy.spring.RunNode wrapper with a few system properties. Some of the properties are required and some are optional.

java -classpath classpath [options] com.nokia.dempsy.spring.RunNode

The options include:

  1. -Dappdef - Required - This identifies the Spring application context containing the ApplicationDefinition that describes the application being started.
  2. -Dapplication - Required - This parameter identifies the application and cluster to start by name. The format is: application:cluster. Both application and cluster can be regular expressions, which makes it possible to start multiple clusters within the same node.
  3. -Dzk_connect - Required - As mentioned in the Prerequisites section, an Apache ZooKeeper install is required to run Dempsy in a distributed manner. The connect string for ZooKeeper needs to be supplied.
  4. -Dzk_session_timeout - Optional - Please refer to the ZooKeeper documentation for the description of the timeout parameter. The value is in milliseconds and the default is 5000.
  5. -Dmin_nodes_for_cluster - Optional - This parameter should be set to the minimum number of nodes required to run the application. If the number of nodes falls below this number, then some of the messages being sent to this cluster (or these clusters) will be dropped. The default is 1.
  6. -Dtotal_slots_for_cluster - Optional - This allows tuning of the number of shards used in the node management. Please see the section on Routing for more details. The default is 100.
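
As a rough illustration of how the options above fit together, the following shell sketch assembles a RunNode command line. Everything except the option names, the RunNode class name, and the documented defaults is an assumption made for the example: the classpath, the context file name, the application name, and the ZooKeeper hosts are all hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder.
APP_CLASSPATH="lib/*"                                   # assumed location of the application jars
APPDEF="WordCount-appdef.xml"                           # hypothetical Spring context file name
APPLICATION="word-count:.*"                             # application:cluster, cluster given as a regex
ZK_CONNECT="zk1.example.com:2181,zk2.example.com:2181"  # hypothetical ZooKeeper hosts

# Prints the command line rather than running it, so it can be reviewed first.
run_node_command() {
  echo java -classpath "$APP_CLASSPATH" \
    "-Dappdef=$APPDEF" \
    "-Dapplication=$APPLICATION" \
    "-Dzk_connect=$ZK_CONNECT" \
    "-Dzk_session_timeout=5000" \
    "-Dmin_nodes_for_cluster=1" \
    "-Dtotal_slots_for_cluster=100" \
    com.nokia.dempsy.spring.RunNode
}

run_node_command
```

The three optional properties are shown with their documented defaults and can be omitted. To launch a node instead of printing the command, replace `echo` with `exec` inside the function; the same script would then be run unchanged on each node.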

That's it. There is no configuration of what runs where, since this is discovered dynamically. There is no physical topology to configure, and even the logical topology (which messages from which processors go to which other processors) is discovered at runtime and can change dynamically (see the sections on Routing and Dynamic Topology for more details).

As a result of this flexibility, Dempsy cooperates well with applications that run in a cloud environment, where good IaaS tools provision and start nodes automatically and dynamically in response to current environmental conditions.

As an example, suppose we're running our WordCount example distributed and find that it is overloaded. The overload can be discovered in any number of ways, including: examining the Dempsy monitoring points (see the section on Monitoring); observing back pressure on the source (for example, if the words are drawn from a traditional message queuing system and Dempsy is configured not to discard messages when it falls behind, the queue of words will begin to lengthen, indicating the need for more processing power to keep up); or simply looking at the system metrics on the nodes (CPU and memory).

In response to any or all of these conditions, new nodes brought online will automatically join the processing.

Next section: Application Definition
