Distributed Deployment

Meni Vaitsi edited this page May 14, 2016 · 4 revisions

Introduction

Cloud administrators can manually specify which machines should host each of the critical services in AppScale. This document outlines how to use this placement support to explore alternative deployment schemes that may give your users' applications better performance or greater fault tolerance.

For a list of the deployments that we fully support and thus test as part of our continuous integration system, please see this page.

What are the critical services?

In order to host your applications, AppScale employs the following components:

  • App Engine: Our modified version of the non-scalable Google App Engine SDKs that adds in the ability to store data to and retrieve data from databases that support the Google Datastore API. Throughout this document, we use the phrases "App Engine" and "AppServer" interchangeably - this is because internally our modified version of the App Engine SDK is called the AppServer.
  • Database: Runs all the services needed to host the database.
  • Memcache: Provides caching support for App Engine applications.
  • Login: The primary machine that is used to route users to their Google App Engine applications.
  • ZooKeeper: Hosts metadata needed to implement database-agnostic transaction support and information about the deployment.
  • TaskQueue: Implements Task Queue API support via the RabbitMQ message bus service and Celery scheduling service.
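
Each of these services corresponds to a role name that can appear in your `ips_layout`. As an illustrative sketch of a fully explicit layout (role keys other than `master`, `appengine`, `database`, and `login` are assumptions here; check the supported-deployments page for the keys your AppScale version accepts):

```yaml
# Hypothetical fully explicit layout; the memcache, taskqueue, and
# zookeeper role keys are assumptions, not confirmed by this page.
master: 192.168.1.2
zookeeper:
- 192.168.1.2
login:
- 192.168.1.2
appengine:
- 192.168.1.3
memcache:
- 192.168.1.3
taskqueue:
- 192.168.1.3
database:
- 192.168.1.4
```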

How do I specify where they run?

Change your `ips_layout` to specify exactly which machines should run which services in your deployment. Here's an example:

master: 192.168.1.2
appengine:
- 192.168.1.3
- 192.168.1.4
database:
- 192.168.1.5

Since no machine has been specified as the login node, the master node automatically takes on that role. Therefore, one node (192.168.1.2) routes users to their App Engine applications, which are hosted at 192.168.1.3 and 192.168.1.4. Finally, one machine (192.168.1.5) hosts the database.

Let's look at another example:

master: 192.168.1.2
appengine: 
- 192.168.1.3
- 192.168.1.4
database: 
- 192.168.1.3
- 192.168.1.4
login:
- 192.168.1.5

In this example, one node (192.168.1.5) routes users to their App Engine applications and performs no other functions. Two nodes (192.168.1.3 and 192.168.1.4) host App Engine applications and host the chosen database. Finally, the master node (192.168.1.2) queries the other nodes in the system to ensure they are running properly and handles transactions via ZooKeeper.
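
Because the master depends on ZooKeeper for transaction handling, you may also want to place ZooKeeper explicitly. A sketch, assuming your AppScale version accepts a `zookeeper` role key in `ips_layout`:

```yaml
# Hypothetical layout; the zookeeper role key is an assumption.
master: 192.168.1.2
appengine:
- 192.168.1.3
- 192.168.1.4
database:
- 192.168.1.3
- 192.168.1.4
zookeeper:
- 192.168.1.5
- 192.168.1.6
- 192.168.1.7
```

An odd-sized ZooKeeper ensemble is the usual choice: a three-node ensemble keeps its quorum through the failure of any one member.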

Aggregate roles

controller: 192.168.1.2
servers: 
- 192.168.1.3
- 192.168.1.4
- 192.168.1.5

This deployment employs a controller and a number of servers. These "aggregate roles" each run a number of roles in the system:

  • Controller: database, login, and ZooKeeper
  • Servers: App Engine and database
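
Expanding the role lists above, the aggregate example is roughly equivalent to the following per-machine layout. This is a sketch, assuming the controller also acts as the master node:

```yaml
# Roughly equivalent expanded form of the controller/servers example,
# per the role lists above; the master assignment is an assumption.
master: 192.168.1.2
database:
- 192.168.1.2
- 192.168.1.3
- 192.168.1.4
- 192.168.1.5
login:
- 192.168.1.2
zookeeper:
- 192.168.1.2
appengine:
- 192.168.1.3
- 192.168.1.4
- 192.168.1.5
```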

Using Placement Support in Cloud Deployments

But how do you use this placement support in public cloud mode? It's simple: just replace each of the IPs in your `ips_layout` with node-X (where X is an integer). Here's an example:

master: node-1
appengine:
- node-2
- node-3
database:
- node-4

And for completeness, here's the second example once more:

master: node-1
appengine: 
- node-2
- node-3
database: 
- node-2
- node-3
login:
- node-4

Impact of Placement Support on Performance and Fault Tolerance

Component placement in AppScale deployments can improve (or worsen) performance and fault tolerance. Let's examine the trade-offs through a familiar example:

master: node-1
appengine:
- node-2
- node-3
database:
- node-4

Here, performance is likely to be better under low load because there is only one database node - many of the internal agreement protocols are vastly simplified when only one node is involved. However, that node is now a single point of failure in the system: if it goes down, users won't be able to read or write data.
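
If fault tolerance matters more than single-node performance, you can trade the other way by listing several database nodes. A sketch:

```yaml
# Hypothetical layout replicating the database role so that no single
# database node takes down reads and writes on its own.
master: node-1
appengine:
- node-2
- node-3
database:
- node-4
- node-5
- node-6
```

With a replicated datastore, losing one database node no longer makes data unavailable, at the cost of the coordination overhead described above.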

Let's look at another familiar example:

controller: node-1
servers: 
- node-2
- node-3
- node-4

This deployment gives us four database nodes (one on each node in the system) and three App Engine nodes, yielding vastly better fault tolerance than the previous deployment.

Conclusion

This document outlines a number of ways in which cloud administrators can manually specify where AppScale's critical services should run. Explore the various deployment options available, and let us know what you're using!