A cloudpool is an abstract, cloud-neutral management interface for an elastic pool of servers.
Different cloudpool implementations are available, each handling the communication with a particular cloud provider and the API it offers. This shields the cloudpool client from provider-specific details and allows server groups to be managed identically on any cloud. No matter which cloud infrastructure the group is deployed on, increasing or decreasing the size of the group is as easy as setting a desired size for the group. If the desired size is increased, a new machine instance is provisioned. If the desired size is decreased, a machine instance is selected for termination (note that, depending on the cloud and cloudpool implementation, the machine may not be terminated immediately, but may be kept around for as long as possible until it is about to enter a new billing period). If a machine instance in the group is no longer operational, a replacement is provisioned.
A cloudpool offers a number of management primitives for the group of servers it manages, the most important ones being primitives for:
- Tracking the machine pool members and their states.
- Setting the desired size of the machine pool. The cloud pool continuously starts/stops machine instances so that the number of machines in the pool matches the desired size set for the pool.
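The desired-size behavior described above can be pictured as a small reconciliation step. The following is only a minimal sketch, not the actual cloudpool code: the `Machine` record, its fields, and the `plan_resize` function are hypothetical names introduced for illustration, and the choice of which surplus machines to stop is deliberately simplistic.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """Hypothetical pool member record; field names are illustrative."""
    machine_id: str
    operational: bool = True


def plan_resize(members, desired_size):
    """Return (number_to_start, machines_to_stop) for one reconciliation pass."""
    # Machines that are no longer operational do not count towards the pool
    # size, so each of them is implicitly replaced by a new instance.
    active = [m for m in members if m.operational]
    delta = desired_size - len(active)
    if delta >= 0:
        return delta, []
    # Too many machines: pick the surplus for termination. Which machines are
    # picked is a policy decision; here the last ones in the list are taken.
    return 0, active[desired_size:]


pool = [Machine("i-1"), Machine("i-2", operational=False), Machine("i-3")]
to_start, to_stop = plan_resize(pool, desired_size=3)
print(to_start, [m.machine_id for m in to_stop])
```

Running such a step periodically is what makes the pool converge on the desired size: a failed machine shows up as a shortfall and is replaced on the next pass.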
The cloudpool is managed via a REST API. For all the details, refer to http://cloudpoolrestapi.readthedocs.io/.
Typically, each cloudpool is used to manage a group of similar servers that fulfill a single role in the overall system. Examples include user-facing web server frontends, application servers, and database read-replicas. One can think of a single cloudpool as managing the number of replicas of a certain microservice.
With Elastisys cloudpools, managing large deployments of microservices, even across different clouds, becomes easier, more robust, and more cost-efficient.
The growing list of cloudpool implementations includes the following. For more details on
each implementation and its use, refer to its README.md:
- ec2pool: a cloudpool that manages a group of AWS EC2 instances.
- spotpool: a cloudpool that manages a group of AWS Spot instances.
- awsaspool: a cloudpool that manages an AWS Auto Scaling Group.
- citycloudpool: a cloudpool that manages a group of OpenStack servers in City Cloud.
- azurepool: a cloudpool that manages a group of Microsoft Azure VMs.
- gcepool: a cloudpool that manages a GCE instance group.
- gkepool: a cloudpool that manages a GKE container cluster.
- kubernetespool: a cloudpool that manages a group of Kubernetes pod replicas.
- openstackpool: a cloudpool that manages a group of OpenStack servers.
For implementers, it may be worth noting that the cloudpool.commons
module contains a generic CloudPool implementation (BaseCloudPool) intended
to be used as a basis for building cloud-specific cloudpools.
A MultiCloudPool, which allows a dynamic collection of cloudpool instances
to be published on a single server, is also available under the
multipool module. All of the above cloudpool
implementations can be run both as singleton cloudpools and
as multipools.
This project depends on scale.commons.
If you are building from master yourself (where the pom.xml file refers to
a SNAPSHOT version), you need to clone and build that code repository first.
Once that has been installed using Maven, this project can be built with:
mvn clean install
For each of the cloudpool implementation modules, the build produces an executable jar file that starts an HTTP(S) server that publishes the cloud pool REST API.
To start a server, simply execute the jar file (in the target directory of
the cloudpool implementation's module):
java -jar <artifact>.jar ...
This will start an HTTP/HTTPS server publishing the
cloudpool REST API
at the specified port. For a full list of options, run with the --help flag.
The behavior of the cloudpool is controlled through a JSON-formatted
configuration document, which can be set either at start-time with the
--config command-line flag or over the
POST /config
REST API method.
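As an illustration of the second option, the sketch below builds a POST /config request with Python's standard library. Only the POST /config method itself comes from this document; the server address, the Content-Type header, and the placeholder configuration values are assumptions, and a real deployment will likely also need TLS and authentication settings that are not shown here.

```python
import json
import urllib.request

# Hypothetical values: the address and the pool settings are placeholders.
BASE_URL = "https://localhost:8443"

config = {
    "name": "webserver-pool",
    "cloudApiSettings": {},      # cloud provider-specific, see the README.md
    "provisioningTemplate": {},  # cloud provider-specific, see the README.md
}


def build_config_request(base_url, config_doc):
    """Build (but do not send) a POST /config request carrying the JSON document."""
    body = json.dumps(config_doc).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/config",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


request = build_config_request(BASE_URL, config)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

The request is only constructed, not sent, so the snippet can be run without a cloudpool server being available.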
The JSON document is specific to the cloudpool implementation (refer to
its README.md for full details) but some common configuration options
are described below.
Most of the cloudpool implementations follow a similar schema for
the configuration document (refer to the individual cloudpool's README.md
for details). In the general schema, outlined below, two parts
of the configuration document carry cloud provider-specific settings:
- cloudApiSettings: typically declares API access credentials and settings.
- provisioningTemplate: contains cloud provider-specific server provisioning parameters.
The configuration parameters supported by a particular cloudpool implementation can be
found in its README.md file.
A common structure of a cloudpool configuration is illustrated below:
{
  "name": "webserver-pool",
  "cloudApiSettings": {
    ... cloud provider-specific API access credentials and settings ...
  },
  "provisioningTemplate": {
    ... cloud provider-specific provisioning parameters ...
  },
  "scaleInConfig": {
    "victimSelectionPolicy": "NEWEST"
  },
  "alerts": {
    "duplicateSuppression": { "time": 5, "unit": "minutes" },
    "smtp": [
      {
        "subject": "[elastisys:scale] cloud pool alert for MyScalingPool",
        "recipients": ["receiver@destination.com"],
        "sender": "noreply@elastisys.com",
        "smtpClientConfig": {
          "smtpHost": "mail.server.com",
          "smtpPort": 587,
          "authentication": {"userName": "john", "password": "secret"}
        }
      }
    ],
    "http": [
      {
        "destinationUrls": ["https://some.host1:443/"],
        "severityFilter": "ERROR|FATAL",
        "auth": {
          "basicCredentials": { "username": "user1", "password": "secret1" }
        }
      },
      {
        "destinationUrls": ["https://some.host2:443/"],
        "severityFilter": "INFO|WARN",
        "auth": {
          "certificateCredentials": { "keystorePath": "src/test/resources/security/client_keystore.p12", "keystorePassword": "secret" }
        }
      }
    ]
  },
  "poolFetch": {
    "retries": {
      "maxRetries": 3,
      "initialBackoffDelay": {"time": 3, "unit": "seconds"}
    },
    "refreshInterval": {"time": 30, "unit": "seconds"},
    "reachabilityTimeout": {"time": 5, "unit": "minutes"}
  },
  "poolUpdate": {
    "updateInterval": {"time": 1, "unit": "minutes"}
  }
}
The configuration document declares how the cloudpool:
- identifies pool members (the name key). As an example, the cloudpool implementation may choose to assign a metadata tag with the pool name to each started machine.
- configures its cloud-specific CloudPoolDriver so that it can communicate with its cloud API (the cloudApiSettings key).
- provisions new machines when the pool needs to grow (the provisioningTemplate key).
- decommissions machines when the pool needs to shrink (the scaleInConfig key).
- alerts system administrators (via email or HTTP(S) webhooks) of resize operations, error conditions, etc. (the alerts key).
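The two http alert senders in the sample configuration above use the severityFilter values ERROR|FATAL and INFO|WARN. The sketch below shows how such filters could route an alert to its destinations. It is an illustration only: the helper names are hypothetical, and it assumes that the regular expression must match the entire severity name, which this document does not state.

```python
import re


def should_send(severity, severity_filter=".*"):
    """Return True if an alert with the given severity passes the filter."""
    # Assumption: the expression must match the whole severity name.
    return re.fullmatch(severity_filter, severity) is not None


# Sender filters taken from the sample configuration above.
senders = {
    "https://some.host1:443/": "ERROR|FATAL",
    "https://some.host2:443/": "INFO|WARN",
}


def destinations_for(severity):
    """List the destination URLs whose filter accepts the severity."""
    return [url for url, flt in senders.items() if should_send(severity, flt)]


print(destinations_for("ERROR"))
print(destinations_for("DEBUG"))
```

With these two filters, a DEBUG alert matches neither sender and would therefore not be delivered anywhere.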
In a little more detail, the configuration keys have the following meanings:
- name (required): The logical name of the managed group of machines. The exact way of identifying pool members may differ between implementations, but machine tags could, for example, be used to mark pool membership.
- cloudApiSettings (required): API access credentials and settings required to communicate with the targeted cloud. The structure of this document is cloud-specific. Refer to the README.md of a particular cloud implementation for details.
- provisioningTemplate (required): Describes how to provision additional servers (on scale-out). The structure of this document is cloud-specific. Refer to the README.md of a particular cloud implementation for details.
- scaleInConfig (optional): Describes how to decommission servers by selecting a strategy for the order in which to consider machines for termination when the pool needs to shrink. Generally, the BaseCloudPool will first terminate machines in REQUESTED state (since they are unlikely to be useful yet). For the remaining machines (those not protected by a membership status with evictable: false), the victimSelectionPolicy guides the selection of scale-down candidates.
  - victimSelectionPolicy: Policy for selecting which machine to terminate. Allowed values: NEWEST, OLDEST.
- alerts (optional): Configuration that describes how to send alerts via email or HTTP(S) webhooks.
  - duplicateSuppression (optional): Duration of time to suppress duplicate alerts from being re-sent. Two alerts are considered equal if they share topic, message and metadata tags.
  - smtp: A list of email alert senders.
    - subject: The subject line to use in sent mails (Subject).
    - recipients: The receiver list (a list of recipient email addresses).
    - sender: The sender email address to use in sent mails (From).
    - severityFilter: A regular expression used to filter alerts. Alerts with a severity (one of DEBUG, INFO, NOTICE, WARN, ERROR, FATAL) that doesn't match the filter expression are suppressed and not sent. Default: .* (matches all severities).
    - smtpClientConfig: Connection settings for the SMTP client.
      - smtpHost: SMTP server host name/IP address.
      - smtpPort: SMTP server port. Default is 25.
      - authentication: Optional username/password to authenticate with the SMTP server. If left out, authentication is disabled.
      - useSsl: Enables/disables the use of SSL for SMTP connections. Default is false (disabled).
  - http: A list of HTTP(S) webhook alert senders, which will POST alerts to the specified endpoint using the (optional) configured authentication credentials.
    - destinationUrls: The list of destination endpoint URLs.
    - severityFilter: A regular expression used to filter alerts. Alerts with a severity (one of DEBUG, INFO, NOTICE, WARN, ERROR, FATAL) that doesn't match the filter expression are suppressed and not sent. Default: .* (matches all severities).
    - auth: Authentication credentials. Can specify basicCredentials, certificateCredentials, or both.
      - basicCredentials: username and password to use for BASIC-style authentication.
      - certificateCredentials: keystorePath and keystorePassword for client certificate-based authentication.
- poolFetch (optional): Controls how often to refresh the cloudpool member list and for how long to mask cloud API errors. Default: retries: 3 retries with 3 second initial exponential back-off delay, refreshInterval: 30 seconds, reachabilityTimeout: 5 minutes.
  - retries: Retry handling when fetching pool members from the cloud API fails.
    - maxRetries: Maximum number of retries to make on each attempt to fetch pool members.
    - initialBackoffDelay: Initial delay to use in exponential back-off on retries. May be zero, which results in no delay between retries.
  - refreshInterval: How often to refresh the cloudpool's view of the machine pool members.
  - reachabilityTimeout: How long to respond with cached machine pool observations before responding with a cloud reachability error. In other words, for how long failures to fetch the machine pool should be masked.
- poolUpdate (optional): Controls how often to attempt to update the size of the machine pool to match the desired size.
  - updateInterval: The time interval between periodic pool size updates. Default: 60 seconds.
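The scale-in order described for scaleInConfig can be sketched as follows. This is not the BaseCloudPool code: the `Machine` record, its field names, and `select_victims` are hypothetical, and the sketch assumes that a machine with evictable: false is never selected, regardless of its state.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Machine:
    """Hypothetical pool member; field names are illustrative only."""
    machine_id: str
    state: str            # e.g. "REQUESTED" or "RUNNING"
    launch_time: datetime
    evictable: bool = True


def select_victims(machines, count, policy="NEWEST"):
    """Pick up to `count` machines to terminate on scale-in."""
    if policy not in ("NEWEST", "OLDEST"):
        raise ValueError(f"unsupported victimSelectionPolicy: {policy}")
    # Assumption: machines marked as not evictable are never candidates.
    candidates = [m for m in machines if m.evictable]
    # Machines still in REQUESTED state go first, since they are not yet useful.
    requested = [m for m in candidates if m.state == "REQUESTED"]
    others = [m for m in candidates if m.state != "REQUESTED"]
    # The policy orders the remaining machines by launch time.
    others.sort(key=lambda m: m.launch_time, reverse=(policy == "NEWEST"))
    return (requested + others)[:count]


pool = [
    Machine("i-1", "RUNNING", datetime(2024, 1, 1)),
    Machine("i-2", "RUNNING", datetime(2024, 1, 3), evictable=False),
    Machine("i-3", "RUNNING", datetime(2024, 1, 2)),
    Machine("i-4", "REQUESTED", datetime(2024, 1, 4)),
]
print([m.machine_id for m in select_victims(pool, 2, "NEWEST")])
print([m.machine_id for m in select_victims(pool, 2, "OLDEST")])
```

In both calls the REQUESTED machine is chosen first and the protected machine i-2 is skipped; the policy only decides which of the remaining running machines follows.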
Elastisys has also developed a Splitter cloudpool implementation, which lets a single logical cloudpool span several clouds (and even cloud providers), complete with built-in fail-over functionality. It adheres to the exact same cloudpool API. Users of the Splitter cloudpool define a splitting policy, such as "90 percent AWS Spot instances, 10 percent AWS EC2 instances", and the Splitter cloudpool takes care of maintaining this ratio.
Should a cloud fail to provide an instance fast enough (for whatever reason), the Splitter cloudpool will obtain an equivalent instance from another of its configured cloud backends. Once the original cloud provider is operating as intended again, the Splitter will automatically decommission the replacement machine from the other cloud backend.
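To make the idea of a splitting policy concrete, the sketch below divides a desired size between backends according to percentages such as the 90/10 example above. Since the Splitter cloudpool is not open source, this is not its algorithm: the function name, the backend labels, and the largest-remainder rounding are all assumptions chosen for illustration, and fail-over is not modelled.

```python
def split_desired_size(desired_size, percentages):
    """Distribute desired_size over backends in proportion to percentages."""
    if sum(percentages.values()) != 100:
        raise ValueError("percentages must add up to 100")
    # Integer share for each backend, rounded down.
    shares = {name: desired_size * pct // 100 for name, pct in percentages.items()}
    leftover = desired_size - sum(shares.values())
    # Hand the remaining machines to the backends with the largest remainders.
    by_remainder = sorted(
        percentages,
        key=lambda name: (desired_size * percentages[name]) % 100,
        reverse=True,
    )
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


print(split_desired_size(20, {"spot": 90, "ec2": 10}))
print(split_desired_size(7, {"spot": 90, "ec2": 10}))
```

Whatever rounding rule is used, the shares should always add up to the desired size, so the logical pool as a whole still honors the size set by the client.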
Some of our customers use it to ensure that their services are highly available, even in the face of cloud provider failure. Others use it to run mostly Spot instances, and fall back to on-demand instances when Spot instances are scarce.
Read more about it on our website: https://elastisys.com/cloud-platform-features/multi-cloud/
The Splitter cloudpool is not open source at this time, so contact Elastisys if you would like to discuss how the Splitter cloudpool can help optimize your cloud deployment.
Copyright (C) 2013 Elastisys AB
Licensed under the Apache License, Version 2.0