Merge Feature/task-resource-accounting to dev #3819

Merged
merged 2 commits into from Jul 20, 2023

prateekchaudhry (Contributor) commented Jul 20, 2023

Summary

This feature implements task resource accounting in the ECS Agent.

  • It removes task serialization, under which a task waited for previously stopping tasks to finish before progressing.
  • It implements a host resource manager that keeps account of the resources tasks have 'taken' on the host. A running task frees its resources in the host resource manager when it emits its change of status to STOPPED to the ECS backend. Tasks waiting for resources wait in a queue until enough resources become available for them. The resources currently tracked are CPU, memory, ports (TCP/UDP), and number of GPUs.
  • It fixes a known memory accounting bug in this feature: memory was accounted incorrectly when a task definition specified multiple containers, with some containers using the MemoryReservation field and others using the Memory field.
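
To make the accounting flow above concrete, here is a minimal sketch in Go. It is not the agent's code: `Resources`, `HostResourceManager`, `Engine`, `Submit`, and `OnStopped` are hypothetical names, and the strict first-in-first-out handling of the waiting queue is an assumption. It only illustrates the idea of consuming resources when a task starts, releasing them when it reports STOPPED, and letting queued tasks proceed once they fit.

```go
package main

import "fmt"

// Resources is a hypothetical, simplified description of what one task needs.
type Resources struct {
	CPU    int64    // CPU units
	Memory int64    // MiB
	Ports  []string // host ports such as "80/tcp"
	GPUs   int64
}

// HostResourceManager is a toy stand-in for the accounting described above.
type HostResourceManager struct {
	free      Resources            // remaining CPU, memory and GPUs
	usedPorts map[string]bool      // host ports currently taken
	taken     map[string]Resources // resources held, keyed by task ID
}

func NewHostResourceManager(cpu, memory, gpus int64) *HostResourceManager {
	return &HostResourceManager{
		free:      Resources{CPU: cpu, Memory: memory, GPUs: gpus},
		usedPorts: map[string]bool{},
		taken:     map[string]Resources{},
	}
}

// Consume reserves r for taskID and reports whether the reservation succeeded.
func (m *HostResourceManager) Consume(taskID string, r Resources) bool {
	if _, ok := m.taken[taskID]; ok {
		return true // already accounted for; do not double count
	}
	if r.CPU > m.free.CPU || r.Memory > m.free.Memory || r.GPUs > m.free.GPUs {
		return false
	}
	for _, p := range r.Ports {
		if m.usedPorts[p] {
			return false
		}
	}
	m.free.CPU -= r.CPU
	m.free.Memory -= r.Memory
	m.free.GPUs -= r.GPUs
	for _, p := range r.Ports {
		m.usedPorts[p] = true
	}
	m.taken[taskID] = r
	return true
}

// Release returns the resources held by taskID, if any.
func (m *HostResourceManager) Release(taskID string) {
	r, ok := m.taken[taskID]
	if !ok {
		return
	}
	m.free.CPU += r.CPU
	m.free.Memory += r.Memory
	m.free.GPUs += r.GPUs
	for _, p := range r.Ports {
		delete(m.usedPorts, p)
	}
	delete(m.taken, taskID)
}

type waitingTask struct {
	id   string
	need Resources
}

// Engine pairs the manager with a FIFO queue of waiting tasks (an assumption).
type Engine struct {
	hrm     *HostResourceManager
	waiting []waitingTask
}

// Submit starts the task if nothing is queued ahead of it and it fits;
// otherwise it queues the task. It reports whether the task started.
func (e *Engine) Submit(id string, need Resources) bool {
	if len(e.waiting) == 0 && e.hrm.Consume(id, need) {
		return true
	}
	e.waiting = append(e.waiting, waitingTask{id, need})
	return false
}

// OnStopped releases the stopped task's resources, then starts queued tasks
// from the head of the queue for as long as they fit. It returns their IDs.
func (e *Engine) OnStopped(id string) []string {
	e.hrm.Release(id)
	var started []string
	for len(e.waiting) > 0 && e.hrm.Consume(e.waiting[0].id, e.waiting[0].need) {
		started = append(started, e.waiting[0].id)
		e.waiting = e.waiting[1:]
	}
	return started
}

func main() {
	e := &Engine{hrm: NewHostResourceManager(1024, 2048, 0)}
	fmt.Println(e.Submit("a", Resources{CPU: 512, Memory: 1024, Ports: []string{"80/tcp"}}))
	fmt.Println(e.Submit("b", Resources{CPU: 256, Memory: 512, Ports: []string{"80/tcp"}}))
	fmt.Println(e.OnStopped("a"))
}
```

In this sketch task "b" has enough CPU and memory but has to wait because task "a" holds host port 80/tcp; it starts only once "a" is reported stopped. No task ever waits on another task merely because that task is stopping, which is the behavior the removed serialization imposed.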

Implementation details

#3684 Host Resource Manager initialization
#3706 Add method to get host resources reserved for a task
#3700 Add host resource manager methods
#3723 Remove task serialization and use host resource manager for task resources
#3741 Add integ tests for task accounting
#3747 Change reconcile/container update order on init and waitForHostResources/emitCurrentStatus order
#3750 Dont consume host resources for tasks getting STOPPED while waiting in waitingTasksQueue
#3782 fix memory resource accounting for multiple containers in single task
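
The memory fix in #3782 concerns tasks whose containers mix the MemoryReservation and Memory fields. The exact rule the agent applies is not spelled out here, so the sketch below rests on an assumption: each container is accounted individually, using its MemoryReservation when set and falling back to its Memory otherwise, and the task's total is the sum of those per-container values. The `Container` type and `TaskMemory` function are hypothetical names used only for illustration.

```go
package main

import "fmt"

// Container is a hypothetical, simplified container definition.
// A zero value means the field was not set in the task definition.
type Container struct {
	Name              string
	Memory            int64 // hard limit, MiB
	MemoryReservation int64 // soft limit, MiB
}

// containerMemory picks the value to account for a single container.
// Assumption: prefer the reservation, fall back to the hard limit.
func containerMemory(c Container) int64 {
	if c.MemoryReservation > 0 {
		return c.MemoryReservation
	}
	return c.Memory
}

// TaskMemory sums the per-container values, so containers that use
// different fields are all counted.
func TaskMemory(containers []Container) int64 {
	var total int64
	for _, c := range containers {
		total += containerMemory(c)
	}
	return total
}

func main() {
	containers := []Container{
		{Name: "web", Memory: 512},
		{Name: "sidecar", MemoryReservation: 128},
		{Name: "worker", Memory: 1024, MemoryReservation: 256},
	}
	fmt.Println(TaskMemory(containers)) // 512 + 128 + 256 = 896
}
```

Deciding per container, rather than summing one field across the whole task, is what keeps a mixed task definition from under- or over-counting in this sketch.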

Related Containers Roadmap Issue

aws/containers-roadmap#325

Testing

Manual verification along with unit tests, integ tests and functional tests in the mentioned PRs to check methods and functional behavior

New tests cover the changes:
Yes

Description for the changelog

Add task resource accounting in ECS Agent

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

* Revert "Revert "host resource manager initialization""

This reverts commit dafb967.

* Revert "Revert "Add method to get host resources reserved for a task (#3706)""

This reverts commit 8d824db.

* Revert "Revert "Add host resource manager methods (#3700)""

This reverts commit bec1303.

* Revert "Revert "Remove task serialization and use host resource manager for task resources (#3723)""

This reverts commit cb54139.

* Revert "Revert "add integ tests for task accounting (#3741)""

This reverts commit 61ad010.

* Revert "Revert "Change reconcile/container update order on init and waitForHostResources/emitCurrentStatus order (#3747)""

This reverts commit 60a3f42.

* Revert "Revert "Dont consume host resources for tasks getting STOPPED while waiting in waitingTasksQueue (#3750)""

This reverts commit 8943792.
…#3782)

* fix memory resource accounting for multiple containers

* change unit tests for multiple containers, add unit test for awsvpc
4 participants