HA Docs: Document Controller Nodes in Active / Active setup
This is the first commit of active / active section for Controller nodes
in the OpenStack HA guide.

bug #1196098
Blueprint improve-high-availability-support

Change-Id: Iaabbf64b425ccff810cc6a6656b6d018254e865a
Émilien Macchi committed Jun 30, 2013
1 parent 7e0ce69 commit 89e6783
Showing 3 changed files with 216 additions and 1 deletion.
60 changes: 59 additions & 1 deletion doc/src/docbkx/openstack-ha/aa-controllers.txt
@@ -1,4 +1,62 @@
[[ha-aa-controllers]]
=== OpenStack Controller Nodes

OpenStack Controller Nodes contain:

* All OpenStack API services
* All OpenStack schedulers
* The Memcached service

==== Running OpenStack API & schedulers

===== API Services

All OpenStack projects provide an API service for controlling the resources in the Cloud.
In Active / Active mode, the most common setup is to scale out these services on at least two nodes
and use load balancing with a virtual IP (provided by HAProxy and Keepalived in this setup).


*Configuring API OpenStack services*

To make the Cloud's API services highly available and scalable, ensure that:

* Keystone endpoints are configured with the virtual IP.
* All OpenStack configuration files refer to the virtual IP.
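For instance, assuming the virtual IP is 192.168.1.100 (an illustrative address) and the Identity service ID is already known, the Keystone endpoints can be registered against the virtual IP rather than against any single controller:

----
keystone endpoint-create \
  --region RegionOne \
  --service-id=<identity-service-id> \
  --publicurl=http://192.168.1.100:5000/v2.0 \
  --internalurl=http://192.168.1.100:5000/v2.0 \
  --adminurl=http://192.168.1.100:35357/v2.0
----

Clients then always reach whichever controller node the load balancer selects.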

*In case of failure*

The monitoring check is quite simple, since it only establishes a TCP connection to the API port. Compared with
the Active / Passive mode using Corosync and resource agents, it does not verify that the service is actually running.
That is why all OpenStack API services should also be monitored by another tool (e.g. Nagios), with the goal of detecting
failures in the Cloud framework infrastructure.
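As a minimal sketch of such a check, assuming the Nagios plugins live in the usual location and the virtual IP is 192.168.1.100 (both assumptions), an HTTP-level probe of the Keystone public API could be defined like this:

----
# Verify that the Keystone public API actually answers HTTP requests,
# not merely that the TCP port accepts connections
define command {
    command_name  check_keystone_api
    command_line  /usr/lib/nagios/plugins/check_http -H 192.168.1.100 -p 5000
}
----

The same pattern applies to the other API ports listed in the HAProxy configuration below.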


===== Schedulers

OpenStack schedulers determine how to dispatch compute, network, and volume requests. The most
common setup is to use RabbitMQ as the messaging system, as already documented in this guide.
The following services connect to the messaging back end and can scale out:
* nova-scheduler
* nova-conductor
* cinder-scheduler
* quantum-server
* ceilometer-collector
* heat-engine

Please refer to the RabbitMQ section for instructions on configuring these services with multiple messaging servers.
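In short, each service is pointed at the list of brokers in its configuration file. For example, in nova.conf (the host names are assumptions; the same options apply to the other projects using the common messaging code):

----
rabbit_hosts = controller1:5672,controller2:5672
rabbit_ha_queues = true
----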


==== Memcached

Most OpenStack services use an in-memory cache to store ephemeral data (such as tokens).
Memcached fills this role and can scale out easily without any special configuration.

To install and configure it, you can read the http://code.google.com/p/memcached/wiki/NewStart[official documentation].

Memory caching is managed by Oslo-incubator, so the way to use multiple memcached servers is the same for all projects.

Example with two hosts:
----
memcached_servers = controller1:11211,controller2:11211
----

By default, controller1 handles the caching service; if that host goes down, controller2 takes over.
More information about installing memcached is available in the OpenStack Compute Manual.
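To quickly verify that each memcached instance responds, you can query its stats interface over TCP (the host names here are the same illustrative ones as above):

----
echo stats | nc controller1 11211
echo stats | nc controller2 11211
----

Each instance should print a block of `STAT` lines if it is healthy.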
156 changes: 156 additions & 0 deletions doc/src/docbkx/openstack-ha/aa-haproxy.txt
@@ -0,0 +1,156 @@
[[ha-aa-haproxy]]
=== HAProxy Nodes

HAProxy is a very fast and reliable solution offering high availability, load balancing, and proxying
for TCP and HTTP-based applications. It is particularly suited for web sites crawling under very high loads
while needing persistence or Layer 7 processing. Supporting tens of thousands of connections is clearly
realistic with today's hardware.

To install HAProxy on your nodes, refer to its http://haproxy.1wt.eu/#docs[official documentation].
Also, keep in mind that this service must not itself become a single point of failure, so you need at least two
nodes running HAProxy.
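The virtual IP itself can be moved between the two HAProxy nodes with Keepalived. A minimal keepalived.conf sketch for the master node follows (the interface name, router ID, priority, and address are assumptions; the backup node uses `state BACKUP` and a lower priority):

----
vrrp_instance haproxy_vip {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1
    virtual_ipaddress {
        192.168.1.100
    }
}
----

If the master node fails, VRRP moves the virtual IP to the backup node, so clients keep reaching a live HAProxy.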

Here is an example HAProxy configuration file:
----
global
chroot /var/lib/haproxy
daemon
group haproxy
maxconn 4000
pidfile /var/run/haproxy.pid
user haproxy

defaults
log global
maxconn 8000
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout check 10s

listen dashboard_cluster
bind <Virtual IP>:443
balance source
option tcpka
option httpchk
option tcplog
server controller1 10.0.0.1:443 check inter 2000 rise 2 fall 5
server controller2 10.0.0.2:443 check inter 2000 rise 2 fall 5

listen galera_cluster
bind <Virtual IP>:3306
balance source
option httpchk
server controller1 10.0.0.4:3306 check port 9200 inter 2000 rise 2 fall 5
server controller2 10.0.0.5:3306 check port 9200 inter 2000 rise 2 fall 5
server controller3 10.0.0.6:3306 check port 9200 inter 2000 rise 2 fall 5

listen glance_api_cluster
bind <Virtual IP>:9292
balance source
option tcpka
option httpchk
option tcplog
server controller1 10.0.0.1:9292 check inter 2000 rise 2 fall 5
server controller2 10.0.0.2:9292 check inter 2000 rise 2 fall 5

listen glance_registry_cluster
bind <Virtual IP>:9191
balance source
option tcpka
option tcplog
server controller1 10.0.0.1:9191 check inter 2000 rise 2 fall 5
server controller2 10.0.0.2:9191 check inter 2000 rise 2 fall 5

listen keystone_admin_cluster
bind <Virtual IP>:35357
balance source
option tcpka
option httpchk
option tcplog
server controller1 10.0.0.1:35357 check inter 2000 rise 2 fall 5
server controller2 10.0.0.2:35357 check inter 2000 rise 2 fall 5

listen keystone_public_internal_cluster
bind <Virtual IP>:5000
balance source
option tcpka
option httpchk
option tcplog
server controller1 10.0.0.1:5000 check inter 2000 rise 2 fall 5
server controller2 10.0.0.2:5000 check inter 2000 rise 2 fall 5

listen nova_ec2_api_cluster
bind <Virtual IP>:8773
balance source
option tcpka
option tcplog
server controller1 10.0.0.1:8773 check inter 2000 rise 2 fall 5
server controller2 10.0.0.2:8773 check inter 2000 rise 2 fall 5

listen nova_compute_api_cluster
bind <Virtual IP>:8774
balance source
option tcpka
option httpchk
option tcplog
server controller1 10.0.0.1:8774 check inter 2000 rise 2 fall 5
server controller2 10.0.0.2:8774 check inter 2000 rise 2 fall 5

listen nova_metadata_api_cluster
bind <Virtual IP>:8775
balance source
option tcpka
option tcplog
server controller1 10.0.0.1:8775 check inter 2000 rise 2 fall 5
server controller2 10.0.0.2:8775 check inter 2000 rise 2 fall 5

listen cinder_api_cluster
bind <Virtual IP>:8776
balance source
option tcpka
option httpchk
option tcplog
server controller1 10.0.0.1:8776 check inter 2000 rise 2 fall 5
server controller2 10.0.0.2:8776 check inter 2000 rise 2 fall 5

listen ceilometer_api_cluster
bind <Virtual IP>:8777
balance source
option tcpka
option httpchk
option tcplog
server controller1 10.0.0.1:8777 check inter 2000 rise 2 fall 5
server controller2 10.0.0.2:8777 check inter 2000 rise 2 fall 5

listen spice_cluster
bind <Virtual IP>:6082
balance source
option tcpka
option tcplog
server controller1 10.0.0.1:6082 check inter 2000 rise 2 fall 5
server controller2 10.0.0.2:6082 check inter 2000 rise 2 fall 5

listen quantum_api_cluster
bind <Virtual IP>:9696
balance source
option tcpka
option httpchk
option tcplog
server controller1 10.0.0.1:9696 check inter 2000 rise 2 fall 5
server controller2 10.0.0.2:9696 check inter 2000 rise 2 fall 5

listen swift_proxy_cluster
bind <Virtual IP>:8080
balance source
option tcplog
option tcpka
server controller1 10.0.0.1:8080 check inter 2000 rise 2 fall 5
server controller2 10.0.0.2:8080 check inter 2000 rise 2 fall 5
----

After each change to this file, you should restart HAProxy.
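Assuming the configuration lives at /etc/haproxy/haproxy.cfg and a sysvinit-style init system (both assumptions), a safe sequence is to validate the file before restarting:

----
# Check the configuration file for syntax errors first
haproxy -c -f /etc/haproxy/haproxy.cfg

# If the check passes, restart the service
service haproxy restart
----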
1 change: 1 addition & 0 deletions doc/src/docbkx/openstack-ha/ha-guide.txt
@@ -12,6 +12,7 @@ include::ap-api-node.txt[]
include::aa-overview.txt[]
include::aa-database.txt[]
include::aa-rabbitmq.txt[]
include::aa-haproxy.txt[]
include::aa-controllers.txt[]
include::aa-network.txt[]
include::aa-computes.txt[]
