
A CLI tool that checks whether an ECS service can be scheduled on an ECS cluster. Shows verbose information on why it can or can't be scheduled.


ivica-k/ecs-will-it-fit


ecs-will-it-fit

Heeeeere's Willy!

willy (short for ecs-will-it-fit) is a CLI tool that helps you answer the question: "Will this ECS service fit on my ECS cluster backed by EC2 instances?". It does so by mimicking the selection process that the ECS scheduler performs while selecting suitable container instances for your service (see "Mimicking" under "Caveats and known limitations").

willy is useful only if your cluster does not have auto-scaling with capacity providers enabled (and it should).

Note

This is a work-in-progress, alpha version. It may be incorrect, unfinished and its usage may change over time.

Installation

Install from GitHub

pip install git+https://github.com/ivica-k/ecs-will-it-fit

Authentication and authorization

willy supports the default authentication mechanism of boto3. The read-only API calls it performs to AWS ECS require the following IAM permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ecs:DescribeClusters",
                "ecs:ListContainerInstances",
                "ecs:DescribeContainerInstances",
                "ecs:DescribeServices",
                "ecs:DescribeTaskDefinition"
            ],
            "Resource": "*"
        }
    ]
}
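
To check an existing policy document against the list above before running willy, something along these lines may help. This is a minimal sketch and not part of willy: `missing_actions` is a hypothetical helper, and it only looks at `Allow` statements and their `Action` entries (with `*` and `?` wildcards), ignoring `Deny` statements, `Resource` and `Condition`.

```python
import fnmatch

# Actions copied from the policy above.
REQUIRED_ACTIONS = {
    "ecs:DescribeClusters",
    "ecs:ListContainerInstances",
    "ecs:DescribeContainerInstances",
    "ecs:DescribeServices",
    "ecs:DescribeTaskDefinition",
}


def missing_actions(policy: dict) -> set:
    """Return the required actions that no Allow statement covers (hypothetical helper)."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]

    allowed_patterns = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # "Action" may be a string or a list
            actions = [actions]
        allowed_patterns.extend(actions)

    # IAM action names are case-insensitive, so compare in lower case.
    return {
        action
        for action in REQUIRED_ACTIONS
        if not any(
            fnmatch.fnmatchcase(action.lower(), pattern.lower())
            for pattern in allowed_patterns
        )
    }


policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": ["ecs:Describe*"], "Resource": "*"}],
}
print(missing_actions(policy))  # the List* action is not covered by "ecs:Describe*"
```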

Usage examples

General help:

$ willy -h
usage: willy [-h] -c CLUSTER -s SERVICE [--verbose | --no-verbose | -V]

Checks whether an ECS service can fit on an ECS (EC2) cluster.

optional arguments:
  -h, --help            show this help message and exit
  -c CLUSTER, --cluster CLUSTER
                        Name of the ECS cluster.
  -s SERVICE, --service SERVICE
                        Name of the ECS service.
  --verbose, --no-verbose, -V
                        Enable verbose output, with EC2 instance information and other details. (default: False)

CPU units

Enough CPU units, short
$ willy --service my-service --cluster my-cluster
Cluster 'my-cluster' has enough CPU units to run containers from the 'my-service' service.
Enough CPU units, verbose
$ willy --service my-service --cluster my-cluster --verbose
Cluster 'my-cluster' has enough CPU units to run containers from the 'my-service' service.
The following container instances meet the hardware requirements of 512 CPU units.

Container instances capable of running the service:

        Instance ID |   CPU remaining |       CPU total | Memory remaining |    Memory total |
------------------- | --------------- | --------------- | ---------------- | --------------- |
i-abcdefgh123456789 |            1792 |            2048 |           15231  |           15743 |
i-hgfedcba987654321 |            1792 |            2048 |           15231  |           15743 |
i-hgfedcba123456789 |             512 |            2048 |            8575  |           15743 |
Not enough CPU units, short
$ willy -s my-service -c my-cluster
Service 'my-service' can not run on the 'my-cluster' cluster. Number of required CPU units is 3072 but the cluster
has 2048 CPU units available across 2 container instances.
Not enough CPU units, verbose
$ willy -s my-service -c my-cluster --verbose
Service 'my-service' can not run on the 'my-cluster' cluster. There are no container instances that meet the hardware
requirements of 3072 CPU units.

Container instances incapable of running the service:

        Instance ID |   CPU remaining |       CPU total | Memory remaining |    Memory total |
------------------- | --------------- | --------------- | ---------------- | --------------- |
i-abcdefgh123456789 |            1792 |            2048 |           15231  |           15743 |
i-hgfedcba987654321 |            1792 |            2048 |           15231  |           15743 |
i-hgfedcba123456789 |             512 |            2048 |            8575  |           15743 |

Memory

Not enough memory, short
$ willy -s my-service -c my-cluster
Service 'my-service' can not run on the 'my-cluster' cluster. Number of required memory units is 1024 but the
cluster has 256 memory units available across 2 container instance(s).
Not enough memory, verbose
$ willy -s my-service -c my-cluster --verbose
Service 'my-service' can not run on the 'my-cluster' cluster. There are no container instances that meet the
hardware requirements of 1024 memory units.

Container instances incapable of running the service:

        Instance ID |   CPU remaining |       CPU total | Memory remaining |    Memory total |
------------------- | --------------- | --------------- | ---------------- | --------------- |
i-abcdefgh123456789 |            1792 |            2048 |             512  |           15743 |
i-hgfedcba987654321 |            1792 |            2048 |             512  |           15743 |
i-hgfedcba123456789 |             512 |            2048 |             512  |           15743 |

Ports

Port(s) taken, short
$ willy -s my-service -c my-cluster
Service 'my-service' can not run on the 'my-cluster' cluster. The service requires ports [21, 22] that are used on
all container instances in the cluster.
Port(s) taken, verbose
$ willy -s my-service -c my-cluster --verbose
Service 'my-service' can not run on the 'my-cluster' cluster. The service requires ports [22, 53] that are used on all
container instances in the cluster.

Container instances incapable of running the service:

        Instance ID | Used ports (TCP) |Used ports (UDP) |
------------------- | ---------------- | --------------- |
i-abcdefgh123456789 |           22, 53 |                 |
i-hgfedcba987654321 |           22, 53 |                 |
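
The port check shown above can be pictured as a set intersection per protocol. The sketch below is an illustration under stated assumptions, not willy's actual code: the `Ports` mapping and both function names are hypothetical, and any single overlapping port is assumed to rule an instance out.

```python
from typing import Dict, List, Set

Ports = Dict[str, Set[int]]  # protocol ("tcp" or "udp") -> port numbers


def conflicting_ports(required: Ports, used: Ports) -> Dict[str, List[int]]:
    """Return, per protocol, the required ports that are already in use."""
    conflicts = {}
    for protocol, ports in required.items():
        overlap = sorted(set(ports) & set(used.get(protocol, set())))
        if overlap:
            conflicts[protocol] = overlap
    return conflicts


def instances_with_free_ports(required: Ports, used_by_instance: Dict[str, Ports]) -> List[str]:
    """Return IDs of instances on which none of the required ports are taken."""
    return [
        instance_id
        for instance_id, used in used_by_instance.items()
        if not conflicting_ports(required, used)
    ]


# Data mirroring the verbose example above.
required = {"tcp": {22, 53}}
used_by_instance = {
    "i-abcdefgh123456789": {"tcp": {22, 53}, "udp": set()},
    "i-hgfedcba987654321": {"tcp": {22, 53}, "udp": set()},
}
print(instances_with_free_ports(required, used_by_instance))  # prints: []
```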

Task placement constraints (attributes)

Wrong instance type placement constraint, short
$ willy -s my-service -c my-cluster
Service 'my-service' can not run on the 'my-cluster' cluster. There are no container instances that have the
attribute(s) required by the service.
Wrong instance type placement constraint, verbose
$ willy -s my-service -c my-cluster --verbose
Service 'my-service' can not run on the 'my-cluster' cluster. There are no container instances that have the
attribute(s) required by the service.

Missing attribute(s):

attribute:ecs.instance-type==t2.nano

Why use willy?

It's fast

Simply put, willy sacrifices being 100% correct five seconds from now in favor of providing a quick answer now.

The ECS scheduler tries to deploy your service in a robust and safe way. This can take time, depending on several configuration options. Check out Nathan's amazing article, "Speeding up Amazon ECS container deployments", for details.

willy performs its checks at a point in time, and its answer represents the possibility of fitting all the tasks in your service on the cluster at that time. Those tasks might fit on the cluster five seconds later, depending on the state of the cluster, and willy can't predict that.
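
A point-in-time view of the cluster could be built from a single `describe_container_instances` response. The sketch below is not willy's implementation: `resource_value` and `snapshot` are hypothetical helpers that assume the response shape returned by boto3 (`containerInstances`, each with `ec2InstanceId`, `remainingResources` and `registeredResources`).

```python
def resource_value(resources: list, name: str) -> int:
    """Return the integer value of the named resource, or 0 if it is absent."""
    for resource in resources:
        if resource.get("name") == name:
            return resource.get("integerValue", 0)
    return 0


def snapshot(response: dict) -> list:
    """Flatten a describe_container_instances response into one row per instance."""
    rows = []
    for instance in response.get("containerInstances", []):
        remaining = instance.get("remainingResources", [])
        registered = instance.get("registeredResources", [])
        rows.append(
            {
                "instance_id": instance.get("ec2InstanceId"),
                "cpu_remaining": resource_value(remaining, "CPU"),
                "cpu_total": resource_value(registered, "CPU"),
                "memory_remaining": resource_value(remaining, "MEMORY"),
                "memory_total": resource_value(registered, "MEMORY"),
            }
        )
    return rows


# With real credentials the response would come from boto3, for example:
#   ecs = boto3.client("ecs")
#   arns = ecs.list_container_instances(cluster="my-cluster")["containerInstanceArns"]
#   response = ecs.describe_container_instances(cluster="my-cluster", containerInstances=arns)
# Here a hand-written response stands in for the API call.
response = {
    "containerInstances": [
        {
            "ec2InstanceId": "i-abcdefgh123456789",
            "remainingResources": [
                {"name": "CPU", "type": "INTEGER", "integerValue": 1792},
                {"name": "MEMORY", "type": "INTEGER", "integerValue": 15231},
            ],
            "registeredResources": [
                {"name": "CPU", "type": "INTEGER", "integerValue": 2048},
                {"name": "MEMORY", "type": "INTEGER", "integerValue": 15743},
            ],
        }
    ]
}
print(snapshot(response))
```

The snapshot is only as fresh as the API response it was built from, which is why the answer can be out of date a few seconds later.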

It has details

Missing hardware resources

If an ECS deployment fails because of a lack of CPU resources, the deployment event reads:

Service my-service was unable to place a task because no container instance met all of its requirements.
The closest matching container-instance 48fccf62981f4fc2b53e62233a586fe8 has insufficient CPU units available.

willy provides more details with its --verbose flag:

Service 'my-service' can not run on the 'my-cluster' cluster. There are no container instances that meet the hardware
requirements of 3072 CPU units.

Container instances incapable of running the service:

        Instance ID |   CPU remaining |       CPU total | Memory remaining |    Memory total |
------------------- | --------------- | --------------- | ---------------- | --------------- |
i-abcdefgh123456789 |            1792 |            2048 |           15231  |           15743 |
i-hgfedcba987654321 |            1792 |            2048 |           15231  |           15743 |
i-hgfedcba123456789 |             512 |            2048 |            8575  |           15743 |

Missing/incorrect attribute (VPC ID)

If an ECS deployment fails because of a missing attribute, the deployment event will state something similar to:

The closest matching container-instance is missing an attribute required by your task

which lacks important details, such as the required attribute's name and value.

willy does it differently when reporting missing attributes:

Service 'my-service' can not run on the 'my-cluster' cluster. There are no container instances that have the attributes
required by the service. Attribute(s) missing or incorrect on the container instance:

'ecs.vpc-id' with value 'vpc-a1b2c3d4e5f6'

Implemented features

Task placement process on Amazon ECS (source: AWS documentation):

When Amazon ECS places tasks, it uses the following process to select container instances:

1. Identify the container instances that satisfy the CPU, GPU, memory, and port requirements in the service.
2. Identify the container instances that satisfy the task placement constraints.
3. Identify the container instances that satisfy the task placement strategies.
4. Select the container instances for task placement.

willy implements a validator for each of the steps listed above.

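Identify the container instances that satisfy the... | willy feature                                                   | Implemented? | Has tests? |
---------------------------------------------------- | --------------------------------------------------------------- | ------------ | ---------- |
CPU requirements                                     | CPU validator                                                   |              |            |
Memory requirements                                  | Memory validator                                                |              |            |
Port requirements                                    | Network validator                                               |              |            |
GPU requirements                                     | Attributes validator                                            |              |            |
Task placement constraints                           | Attributes validator (see "Task placement constraints" below)   |              |            |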
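
One way to picture a chain of validators is as a series of filters, each narrowing the list of candidate container instances and reporting which step left nothing behind. The sketch below is an assumption-laden illustration, not willy's code: `filter_instances`, `hardware_validators` and the dictionary keys are hypothetical names.

```python
from typing import Callable, Dict, List, Optional, Tuple

Instance = Dict[str, int]
Validator = Tuple[str, Callable[[Instance], bool]]


def filter_instances(
    instances: List[Instance], validators: List[Validator]
) -> Tuple[List[Instance], Optional[str]]:
    """Apply validators in order.

    Returns the surviving instances and, if none survive, the name of the
    validator that eliminated the last candidates.
    """
    candidates = list(instances)
    for name, check in validators:
        candidates = [instance for instance in candidates if check(instance)]
        if not candidates:
            return [], name
    return candidates, None


def hardware_validators(cpu: int, memory: int) -> List[Validator]:
    """Build CPU and memory checks for the given task requirements."""
    return [
        ("CPU", lambda instance: instance["cpu_remaining"] >= cpu),
        ("Memory", lambda instance: instance["memory_remaining"] >= memory),
    ]


# Remaining resources taken from the "Not enough memory" example.
instances = [
    {"cpu_remaining": 1792, "memory_remaining": 512},
    {"cpu_remaining": 1792, "memory_remaining": 512},
    {"cpu_remaining": 512, "memory_remaining": 512},
]
survivors, failed_at = filter_instances(instances, hardware_validators(cpu=512, memory=1024))
print(survivors, failed_at)  # prints: [] Memory
```

Because every validator has the same shape, port and attribute checks could be appended to the same list without changing `filter_instances`.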

Caveats and known limitations

Mimicking

Exact technical details of the container instance selection process are not publicly available. willy approximates the process from observations made while scheduling services on ECS.

Speed vs. accuracy

willy sacrifices being 100% correct five seconds from now in favor of providing a quick answer now.

willy performs its checks at a point in time, and its answer represents the possibility of fitting all the tasks in your service on the cluster at that time. All tasks might fit on the cluster five seconds later, depending on the state of the cluster, and willy can't predict that.

Task placement constraints

Not all operators supported by ECS are implemented yet.

Operator               | Description              | Implemented? |
---------------------- | ------------------------ | ------------ |
==, equals             | String equality          |              |
!=, not_equals         | String inequality        |              |
>, greater_than        | Greater than             |              |
>=, greater_than_equal | Greater than or equal to |              |
<, less_than           | Less than                |              |
<=, less_than_equal    | Less than or equal to    |              |
exists                 | Subject exists           |              |
!exists, not_exists    | Subject doesn't exist    |              |
in                     | Value in argument list   |              |
!in, not_in            | Value not in argument list |            |
=~, matches            | Pattern match            |              |
!~, not_matches        | Pattern mismatch         |              |
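
A subset of these operators can be evaluated with a small dispatch table. The sketch below is hypothetical and says nothing about which operators willy actually supports: `OPERATORS` and `evaluate` are illustrative names, and the comparison and pattern operators are left out because their exact semantics would be a guess.

```python
# Maps an operator to a function of (actual attribute value, expected value).
# A missing attribute is represented by None.
OPERATORS = {
    "==": lambda actual, expected: actual == expected,
    "!=": lambda actual, expected: actual != expected,
    "exists": lambda actual, _expected: actual is not None,
    "!exists": lambda actual, _expected: actual is None,
    "in": lambda actual, expected: actual in expected,
    "!in": lambda actual, expected: actual not in expected,
}


def evaluate(attributes: dict, name: str, operator: str, expected=None) -> bool:
    """Evaluate one constraint against a container instance's attributes."""
    if operator not in OPERATORS:
        raise NotImplementedError(f"operator {operator!r} is not implemented")
    return OPERATORS[operator](attributes.get(name), expected)


attributes = {"ecs.instance-type": "t3.large", "ecs.vpc-id": "vpc-a1b2c3d4e5f6"}

# attribute:ecs.instance-type==t2.nano, as in the usage example
print(evaluate(attributes, "ecs.instance-type", "==", "t2.nano"))  # prints: False
print(evaluate(attributes, "ecs.instance-type", "in", ["t3.large", "t3.xlarge"]))  # prints: True
```

Raising `NotImplementedError` for unknown operators keeps an unsupported constraint from being silently treated as satisfied.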
