willy (short for ecs-will-it-fit) is a CLI tool that helps you answer the question: "Will this ECS service fit on
my ECS cluster backed by EC2 instances?" It does so by mimicking[1] the selection process that
the ECS scheduler performs while selecting suitable container instances for your service.
willy is useful only if your cluster does not have auto-scaling using capacity providers enabled (it should).
Note
This is a work-in-progress, alpha version. It may be incorrect, unfinished and its usage may change over time.
Install from GitHub
pip install git+https://github.com/ivica-k/ecs-will-it-fit
willy supports the default authentication mechanism of boto3.
The read-only API calls it performs to AWS ECS require the following IAM permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:DescribeClusters",
        "ecs:ListContainerInstances",
        "ecs:DescribeContainerInstances",
        "ecs:DescribeServices",
        "ecs:DescribeTaskDefinition"
      ],
      "Resource": "*"
    }
  ]
}
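Each permission in the policy corresponds to one read-only ECS API operation. The sketch below shows how a script might gather the same data with a boto3-style client. `fetch_cluster_state` is a hypothetical helper written for illustration, not part of willy, and the client is passed in as an argument so the function can be exercised with a stub.

```python
def fetch_cluster_state(ecs, cluster, service):
    """Collect the data covered by the IAM policy above.

    `ecs` is expected to behave like boto3.client("ecs"). This helper is
    hypothetical and only illustrates which read-only calls are involved.
    """
    # ecs:DescribeClusters
    cluster_info = ecs.describe_clusters(clusters=[cluster])["clusters"][0]

    # ecs:ListContainerInstances (first page only, to keep the sketch short)
    arns = ecs.list_container_instances(cluster=cluster)["containerInstanceArns"]

    # ecs:DescribeContainerInstances (skipped when there is nothing to describe)
    instances = []
    if arns:
        instances = ecs.describe_container_instances(
            cluster=cluster, containerInstances=arns
        )["containerInstances"]

    # ecs:DescribeServices
    svc = ecs.describe_services(cluster=cluster, services=[service])["services"][0]

    # ecs:DescribeTaskDefinition
    task_def = ecs.describe_task_definition(
        taskDefinition=svc["taskDefinition"]
    )["taskDefinition"]

    return {
        "cluster": cluster_info,
        "instances": instances,
        "service": svc,
        "task_definition": task_def,
    }
```

With real credentials this could be called as `fetch_cluster_state(boto3.client("ecs"), "my-cluster", "my-service")`.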
General help:
$ willy -h
usage: willy [-h] -c CLUSTER -s SERVICE [--verbose | --no-verbose | -V]

Checks whether an ECS service can fit on an ECS (EC2) cluster.

optional arguments:
  -h, --help            show this help message and exit
  -c CLUSTER, --cluster CLUSTER
                        Name of the ECS cluster.
  -s SERVICE, --service SERVICE
                        Name of the ECS service.
  --verbose, --no-verbose, -V
                        Enable verbose output, with EC2 instance information and other details. (default: False)
Enough CPU units, short
$ willy --service my-service --cluster my-cluster
Cluster 'my-cluster' has enough CPU units to run containers from the 'my-service' service.
Enough CPU units, verbose
$ willy --service my-service --cluster my-cluster --verbose
Cluster 'my-cluster' has enough CPU units to run containers from the 'my-service' service.
The following container instances meet the hardware requirements of 512 CPU units.
Container instances capable of running the service:
Instance ID | CPU remaining | CPU total | Memory remaining | Memory total |
------------------- | --------------- | --------------- | ---------------- | --------------- |
i-abcdefgh123456789 | 1792 | 2048 | 15231 | 15743 |
i-hgfedcba987654321 | 1792 | 2048 | 15231 | 15743 |
i-hgfedcba123456789 | 512 | 2048 | 8575 | 15743 |
Not enough CPU units, short
$ willy -s my-service -c my-cluster
Service 'my-service' can not run on the 'my-cluster' cluster. Number of required CPU units is 3072 but the cluster
has 2048 CPU units available across 2 container instances.
Not enough CPU units, verbose
$ willy -s my-service -c my-cluster --verbose
Service 'my-service' can not run on the 'my-cluster' cluster. There are no container instances that meet the hardware
requirements of 3072 CPU units.
Container instances incapable of running the service:
Instance ID | CPU remaining | CPU total | Memory remaining | Memory total |
------------------- | --------------- | --------------- | ---------------- | --------------- |
i-abcdefgh123456789 | 1792 | 2048 | 15231 | 15743 |
i-hgfedcba987654321 | 1792 | 2048 | 15231 | 15743 |
i-hgfedcba123456789 | 512 | 2048 | 8575 | 15743 |
Not enough memory, short
$ willy -s my-service -c my-cluster
Service 'my-service' can not run on the 'my-cluster' cluster. Number of required memory units is 1024 but the
cluster has 256 memory units available across 2 container instance(s).
Not enough memory, verbose
$ willy -s my-service -c my-cluster --verbose
Service 'my-service' can not run on the 'my-cluster' cluster. There are no container instances that meet the
hardware requirements of 1024 memory units.
Container instances incapable of running the service:
Instance ID | CPU remaining | CPU total | Memory remaining | Memory total |
------------------- | --------------- | --------------- | ---------------- | --------------- |
i-abcdefgh123456789 | 1792 | 2048 | 512 | 15743 |
i-hgfedcba987654321 | 1792 | 2048 | 512 | 15743 |
i-hgfedcba123456789 | 512 | 2048 | 512 | 15743 |
Port(s) taken, short
$ willy -s my-service -c my-cluster
Service 'my-service' can not run on the 'my-cluster' cluster. The service requires ports [21, 22] that are used on
all container instances in the cluster.
Port(s) taken, verbose
$ willy -s my-service -c my-cluster --verbose
Service 'my-service' can not run on the 'my-cluster' cluster. The service requires ports [22, 53] that are used on all
container instances in the cluster.
Container instances incapable of running the service:
Instance ID | Used ports (TCP) |Used ports (UDP) |
------------------- | ---------------- | --------------- |
i-abcdefgh123456789 | 22, 53 | |
i-hgfedcba987654321 | 22, 53 | |
Wrong instance type placement constraint, short
$ willy -s my-service -c my-cluster
Service 'my-service' can not run on the 'my-cluster' cluster. There are no container instances that have the
attribute(s) required by the service.
Wrong instance type placement constraint, verbose
$ willy -s my-service -c my-cluster --verbose
Service 'my-service' can not run on the 'my-cluster' cluster. There are no container instances that have the
attribute(s) required by the service.
Missing attribute(s):
attribute:ecs.instance-type==t2.nano
Simply put, willy sacrifices being 100% correct five seconds from now in favor of providing a quick answer now.
The ECS scheduler tries to deploy your service in a robust and safe way. This can take time, depending on several configuration options. Check out Nathan's amazing article on "Speeding up Amazon ECS container deployments" for details.
willy performs its checks at a point in time, and its answer represents the possibility to fit all the tasks in your
service on the cluster at that time. Those tasks might fit on the cluster five seconds later, depending on the state
of the cluster, and willy can't predict that.
If an ECS deployment fails because of a lack of CPU resources, the deployment event is:
Service my-service was unable to place a task because no container instance met all of its requirements.
The closest matching container-instance 48fccf62981f4fc2b53e62233a586fe8 has insufficient CPU units available.
willy provides more details with its --verbose flag:
Service 'my-service' can not run on the 'my-cluster' cluster. There are no container instances that meet the hardware
requirements of 3072 CPU units.
Container instances incapable of running the service:
Instance ID | CPU remaining | CPU total | Memory remaining | Memory total |
------------------- | --------------- | --------------- | ---------------- | --------------- |
i-abcdefgh123456789 | 1792 | 2048 | 15231 | 15743 |
i-hgfedcba987654321 | 1792 | 2048 | 15231 | 15743 |
i-hgfedcba123456789 | 512 | 2048 | 8575 | 15743 |
If an ECS deployment fails because of a missing attribute, the deployment event will state something similar to:
The closest matching container-instance is missing an attribute required by your task
which lacks important details, such as the required attribute's name and value.
willy does it differently when reporting missing attributes:
Service 'my-service' can not run on the 'my-cluster' cluster. There are no container instances that have the attributes
required by the service. Attribute(s) missing or incorrect on the container instance:
'ecs.vpc-id' with value 'vpc-a1b2c3d4e5f6'
Task placement process on Amazon ECS (source: Amazon ECS documentation):
When Amazon ECS places tasks, it uses the following process to select container instances:
1. Identify the container instances that satisfy the CPU, GPU, memory, and port requirements in the service.
2. Identify the container instances that satisfy the task placement constraints.
3. Identify the container instances that satisfy the task placement strategies.
4. Select the container instances for task placement.
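The first two steps of this process can be read as successive filters over the list of container instances. The following is a minimal sketch under stated assumptions: `Instance`, `Requirements`, and the function names are hypothetical simplifications, GPU is ignored, constraints are reduced to string equality on attributes, and steps 3 and 4 are not modelled. It is not willy's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Instance:
    """Hypothetical, simplified view of a container instance."""
    instance_id: str
    cpu_remaining: int
    memory_remaining: int
    used_ports: Set[int] = field(default_factory=set)
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Requirements:
    """Hypothetical, simplified view of what one task needs."""
    cpu: int
    memory: int
    ports: Set[int] = field(default_factory=set)
    required_attributes: Dict[str, str] = field(default_factory=dict)


def has_resources(inst: Instance, req: Requirements) -> bool:
    # Step 1: CPU, memory and host ports (GPU is left out of this sketch).
    return (
        inst.cpu_remaining >= req.cpu
        and inst.memory_remaining >= req.memory
        and not (inst.used_ports & req.ports)
    )


def meets_constraints(inst: Instance, req: Requirements) -> bool:
    # Step 2: only string equality on attributes is modelled here.
    return all(
        inst.attributes.get(name) == value
        for name, value in req.required_attributes.items()
    )


def candidates(instances: List[Instance], req: Requirements) -> List[Instance]:
    """Apply steps 1 and 2 in order and return the surviving instances."""
    result = instances
    for step in (has_resources, meets_constraints):
        result = [inst for inst in result if step(inst, req)]
    return result
```

Using the CPU figures from the examples above (1792, 1792 and 512 units remaining), a requirement of 512 CPU units keeps all three instances, while a requirement of 3072 CPU units leaves an empty list.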
willy implements a validator for each of the steps listed above:
Identify the container instances that satisfy the... | willy feature | Implemented? | Has tests? |
---|---|---|---|
CPU requirements | CPU validator | ✅ | ✅ |
Memory requirements | Memory validator | ✅ | ✅ |
Port requirements | Network validator | ✅ | ❌ |
GPU requirements | Attributes validator | ❌ | ❌ |
Task placement constraints | Attributes validator | ✅ [2] | ✅ |
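To illustrate what the CPU, memory and network checks operate on, here is a sketch that reads the `remainingResources` list returned by DescribeContainerInstances. The response shape (`name`, `type`, `integerValue`, `stringSetValue`) follows the ECS API, where the `PORTS` entry lists ports that are already reserved. The helper names `remaining` and `unmet_requirements` are assumptions of this sketch and are not taken from willy's code.

```python
def remaining(instance):
    """Flatten the `remainingResources` list of one container instance.

    Returns e.g. {"CPU": 1792, "MEMORY": 512, "PORTS": {22, 53}}.
    The returned dict layout is an assumption of this sketch.
    """
    out = {}
    for res in instance.get("remainingResources", []):
        if res["type"] == "STRINGSET":
            # Port resources: the set holds ports that are already reserved.
            out[res["name"]] = {int(p) for p in res.get("stringSetValue", [])}
        else:
            out[res["name"]] = res.get("integerValue", 0)
    return out


def unmet_requirements(instance, cpu, memory, tcp_ports=()):
    """Return the names of the resources this instance cannot provide."""
    res = remaining(instance)
    problems = []
    if res.get("CPU", 0) < cpu:
        problems.append("CPU")
    if res.get("MEMORY", 0) < memory:
        problems.append("MEMORY")
    if res.get("PORTS", set()) & set(tcp_ports):
        problems.append("PORTS")
    return problems
```

An empty list means the instance passes all three checks; anything else names the resources that would block placement on that instance.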
[1] Exact technical details of the container instance selection process are not publicly available. willy approximates
the process from observations made while scheduling services on ECS.
willy sacrifices being 100% correct five seconds from now in favor of providing a quick answer now.
willy performs its checks at a point in time, and its answer represents the possibility to fit all the tasks in your
service on the cluster at that time. All tasks might fit on the cluster five seconds later, depending on the state of
the cluster, and willy can't predict that.
[2] Not all operators supported by ECS are implemented yet:
Operator | Description | Implemented? |
---|---|---|
==, equals | String equality | ✅ |
!=, not_equals | String inequality | ❌ |
>, greater_than | Greater than | ❌ |
>=, greater_than_equal | Greater than or equal to | ❌ |
<, less_than | Less than | ❌ |
<=, less_than_equal | Less than or equal to | ❌ |
exists | Subject exists | ✅ |
!exists, not_exists | Subject doesn't exist | ❌ |
in | Value in argument list | ✅ |
!in, not_in | Value not in argument list | ❌ |
=~, matches | Pattern match | ❌ |
!~, not_matches | Pattern mismatch | ❌ |
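As an illustration of the three operators marked as implemented in the table, the sketch below evaluates `==`, `exists` and `in` expressions against a dict of instance attributes. The regular expression and the `evaluate` function are assumptions made for this example, cover only the symbolic `==` form and the `attribute:` subject, and do not reflect how willy parses expressions.

```python
import re

# Matches: attribute:<name>==<value> | attribute:<name> exists |
#          attribute:<name> in [a, b, c]
_EXPR = re.compile(
    r"^attribute:(?P<name>[\w.\-/]+)"
    r"(?:\s*(?P<eq>==)\s*(?P<value>\S+)"
    r"|\s+(?P<exists>exists)"
    r"|\s+in\s+\[(?P<items>[^\]]*)\])$"
)


def evaluate(expression, attributes):
    """Evaluate one constraint expression against {name: value} attributes.

    Raises ValueError for anything other than ==, exists and in.
    """
    match = _EXPR.match(expression.strip())
    if match is None:
        raise ValueError(f"unsupported expression: {expression!r}")

    name = match["name"]
    if match["exists"]:
        return name in attributes
    if match["eq"]:
        return attributes.get(name) == match["value"]

    # Remaining case: the `in` operator with a comma-separated argument list.
    allowed = [item.strip() for item in match["items"].split(",")]
    return attributes.get(name) in allowed
```

For example, `evaluate("attribute:ecs.instance-type==t2.nano", {"ecs.instance-type": "t2.micro"})` returns `False`, which corresponds to the "Missing attribute(s)" case shown earlier.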