Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase steady state and burst throttles for task endpoints #1240

Merged
merged 1 commit into from
Feb 19, 2018

Conversation

sharanyad
Copy link
Contributor

@sharanyad sharanyad commented Feb 17, 2018

Summary

This addresses #1231

Implementation details

default steady state throttle of 40 rps and burst of 60 rps.
values configurable through ECS_TASK_METADATA_RPS_LIMIT environment variable.

Testing

  • Builds on Linux (make release)
  • Builds on Windows (go build -out amazon-ecs-agent.exe ./agent)
  • Unit tests on Linux (make test) pass
  • Unit tests on Windows (go test -timeout=25s ./agent/...) pass
  • Integration tests on Linux (make run-integ-tests) pass
  • Integration tests on Windows (.\scripts\run-integ-tests.ps1) pass
  • Functional tests on Linux (make run-functional-tests) pass
  • Functional tests on Windows (.\scripts\run-functional-tests.ps1) pass

New tests cover the changes: yes

Description for the changelog

Licensing

This contribution is under the terms of the Apache 2.0 License: yes

@yhlee-aws
Copy link
Contributor

looks like gocyclo failed? if/else statements with || tends to push the cyclomatic complexity up, which we currently fail if a method has complexity > 15.

rpsLimitEnvVal := os.Getenv("ECS_TASK_METADATA_RPS_LIMIT")
rpsLimitSplits := strings.Split(rpsLimitEnvVal, ",")
if len(rpsLimitSplits) != 2 {
seelog.Warnf("Invalid format for \"ECS_TASK_METADATA_RPS_LIMIT\", expected: \"rateLimit,burst\"")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can also do something like this to avoid these escapes:

seelog.Warnf(`Invalid format for "ECS_TASK_METADATA_RPS_LIMIT", expected: "rateLimit,burst"`)

Also, you can just use seelog.Warn() here.

rpsLimitSplits := strings.Split(rpsLimitEnvVal, ",")
if len(rpsLimitSplits) != 2 {
seelog.Warnf("Invalid format for \"ECS_TASK_METADATA_RPS_LIMIT\", expected: \"rateLimit,burst\"")
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just return here so that you don't need an additional else block. Avoids one level of indentation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block got moved into this method from the original environmentConfig() and was never refactored. Will make this change.

} else {
rpsLimitVal, err := strconv.Atoi(strings.TrimSpace(rpsLimitSplits[0]))
rpsBurstLimitVal, err1 := strconv.Atoi(strings.TrimSpace(rpsLimitSplits[1]))
if err != nil || err1 != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of error handling makes it harder to know which field exactly caused the error. Doing something like this is much simpler imo:

rpsLimit, err := strconv.Atoi(strings.TrimSpace(rpsLimitSplits[0]))
if err != nil {
 ...
 return 0, 0
}

rpsBurstLimit, err := strconv.Atoi(strings.TrimSpace(rpsLimitSplits[1]))
if err != nil {
 ...
 return 0, 0
}

return rpsLimit, rpsBurstLimit

// DefaultRpsLimit is set as 40. This is arrived from our benchmarking
// results where task endpoint can handle 4000 rps effectively. Here, 100 containers
// will be able to send out 40 rps.
DefaultRpsLimit = 40
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DefaultRpsLimit is a bit vague. Similar to above. Can you please rename this to something like DefaultTaskMetadataSteadyStateRate?

DefaultRpsLimit = 40

// DefaultRpsBurstLimit is set to handle 60 burst requests at once
DefaultRpsBurstLimit = 60
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above. Can you please rename this to something like DefaultTaskMetadataBurstRate?

@@ -434,6 +445,25 @@ func getTaskCPUMemLimitEnabled() Conditional {
return taskCPUMemLimitEnabled
}

func getTaskMetatadataRpsLimits() (int, int) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getTaskMetadataThrottles() is a better name for this.

@@ -198,4 +198,10 @@ type Config struct {
// CgroupPath is the path expected by the agent, defaults to
// '/sys/fs/cgroup'
CgroupPath string

// RpsLimit specifies the steady state throttle for the task metadata endpoint
RpsLimit int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above. Can you please rename this to something like TaskMetadataSteadyStateRate?

RpsLimit int

// RpsBurstLimit specifies the burst rate throttle for the task metadata endpoint
RpsBurstLimit int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above. Can you please rename this to something like TaskMetadataBurstRate?

@@ -346,6 +354,7 @@ func environmentConfig() (Config, error) {
}
containerMetadataEnabled := utils.ParseBool(os.Getenv("ECS_ENABLE_CONTAINER_METADATA"), false)
dataDirOnHost := os.Getenv("ECS_HOST_DATA_DIR")
rpsLimit, rpsBurstLimit := getTaskMetatadataRpsLimits()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

steadyStateRate and burstRate are better names imo

@@ -434,6 +445,25 @@ func getTaskCPUMemLimitEnabled() Conditional {
return taskCPUMemLimitEnabled
}

func getTaskMetatadataRpsLimits() (int, int) {
var rpsLimit, rpsBurstLimit int
rpsLimitEnvVal := os.Getenv("ECS_TASK_METADATA_RPS_LIMIT")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add an entry in README and changelog for this?

@@ -434,6 +445,25 @@ func getTaskCPUMemLimitEnabled() Conditional {
return taskCPUMemLimitEnabled
}

func getTaskMetatadataRpsLimits() (int, int) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe i missed an earlier conversation. why don't we want two separate env variables for this? ECS_TASK_METADATA_RPS_LIMIT and ECS_TASK_METADATA_RPS_BURST_LIMIT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since steady state and burst throttles both are needed and related, it would be convenient for customers to have a single environment variable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm i don't feel strongly either way, my concern was mostly if it would add extra overhead of enforcing ordering of the throttle limits (for the customer) and extra validation code (for us). but what you said about "needed and related" also makes sense. okay thanks.

@sharanyad sharanyad force-pushed the fix1231 branch 2 times, most recently from dc61d25 to 91dc8fa Compare February 19, 2018 19:03
@sharanyad sharanyad added this to the 1.17.1 milestone Feb 19, 2018
Copy link
Contributor

@adnxn adnxn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes lgtm, minor nits only.

rpsLimitSplits := strings.Split(rpsLimitEnvVal, ",")
if len(rpsLimitSplits) != 2 {
seelog.Warn(`Invalid format for "ECS_TASK_METADATA_RPS_LIMIT", expected: "rateLimit,burst"`)
return steadyStateRate, burstRate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for consistency, should we explicitly return 0, 0 as you do for errors reading input in the two cases below?

README.md Outdated
@@ -177,6 +177,8 @@ additional details on each available environment variable.
| `ECS_HOST_DATA_DIR` | `/var/lib/ecs` | The source directory on the host from which ECS_DATADIR is mounted. We use this to determine the source mount path for container metadata files in the case the ECS Agent is running as a container. We do not use this value in Windows because the ECS Agent is not running as container in Windows. | `/var/lib/ecs` | `Not used` |
| `ECS_ENABLE_TASK_CPU_MEM_LIMIT` | `true` | Whether to enable task-level cpu and memory limits | `true` | `false` |
| `ECS_CGROUP_PATH` | `/sys/fs/cgroup` | The root cgroup path that is expected by the ECS agent. This is the path that accessible from the agent mount. | `/sys/fs/cgroup` | Not applicable |
| `ECS_TASK_METADATA_RPS_LIMIT` | `100,150` | Comma separated values for steady state and burst throttle limits for task metadata endpoint | `40,60` | `40,60` |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: specify integer value w/ "Comma separated integer values for..."

@sharanyad sharanyad merged commit ea8ea05 into aws:dev Feb 19, 2018
@richardpen richardpen mentioned this pull request Feb 19, 2018
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants