-
Notifications
You must be signed in to change notification settings - Fork 613
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase steady state and burst throttles for task endpoints #1240
Conversation
looks like gocyclo failed? if/else statements with || tends to push the cyclomatic complexity up, which we currently fail if a method has complexity > 15. |
agent/config/config.go
Outdated
rpsLimitEnvVal := os.Getenv("ECS_TASK_METADATA_RPS_LIMIT") | ||
rpsLimitSplits := strings.Split(rpsLimitEnvVal, ",") | ||
if len(rpsLimitSplits) != 2 { | ||
seelog.Warnf("Invalid format for \"ECS_TASK_METADATA_RPS_LIMIT\", expected: \"rateLimit,burst\"") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you can also do something like this to avoid these escapes:
seelog.Warnf(`Invalid format for "ECS_TASK_METADATA_RPS_LIMIT", expected: "rateLimit,burst"`)
Also, you can just use seelog.Warn()
here.
agent/config/config.go
Outdated
rpsLimitSplits := strings.Split(rpsLimitEnvVal, ",") | ||
if len(rpsLimitSplits) != 2 { | ||
seelog.Warnf("Invalid format for \"ECS_TASK_METADATA_RPS_LIMIT\", expected: \"rateLimit,burst\"") | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just return
here so that you don't need an additional else
block. Avoids one level of indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block got moved into this method from the original environmentConfig()
and was never refactored. Will make this change.
agent/config/config.go
Outdated
} else { | ||
rpsLimitVal, err := strconv.Atoi(strings.TrimSpace(rpsLimitSplits[0])) | ||
rpsBurstLimitVal, err1 := strconv.Atoi(strings.TrimSpace(rpsLimitSplits[1])) | ||
if err != nil || err1 != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kind of error handling makes it harder to know which field exactly caused the error. Doing something like this is much simpler imo:
rpsLimit, err := strconv.Atoi(strings.TrimSpace(rpsLimitSplits[0]))
if err != nil {
...
return 0, 0
}
rpsBurstLimit, err := strconv.Atoi(strings.TrimSpace(rpsLimitSplits[1]))
if err != nil {
...
return 0, 0
}
return rpsLimit, rpsBurstLimit
agent/config/config.go
Outdated
// DefaultRpsLimit is set as 40. This is arrived from our benchmarking | ||
// results where task endpoint can handle 4000 rps effectively. Here, 100 containers | ||
// will be able to send out 40 rps. | ||
DefaultRpsLimit = 40 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DefaultRpsLimit
is a bit vague. Similar to above. Can you please rename this to something like DefaultTaskMetadataSteadyStateRate
?
agent/config/config.go
Outdated
DefaultRpsLimit = 40 | ||
|
||
// DefaultRpsBurstLimit is set to handle 60 burst requests at once | ||
DefaultRpsBurstLimit = 60 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to above. Can you please rename this to something like DefaultTaskMetadataBurstRate
?
agent/config/config.go
Outdated
@@ -434,6 +445,25 @@ func getTaskCPUMemLimitEnabled() Conditional { | |||
return taskCPUMemLimitEnabled | |||
} | |||
|
|||
func getTaskMetatadataRpsLimits() (int, int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getTaskMetadataThrottles()
is a better name for this.
agent/config/types.go
Outdated
@@ -198,4 +198,10 @@ type Config struct { | |||
// CgroupPath is the path expected by the agent, defaults to | |||
// '/sys/fs/cgroup' | |||
CgroupPath string | |||
|
|||
// RpsLimit specifies the steady state throttle for the task metadata endpoint | |||
RpsLimit int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to above. Can you please rename this to something like TaskMetadataSteadyStateRate
?
agent/config/types.go
Outdated
RpsLimit int | ||
|
||
// RpsBurstLimit specifies the burst rate throttle for the task metadata endpoint | ||
RpsBurstLimit int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to above. Can you please rename this to something like TaskMetadataBurstRate
?
agent/config/config.go
Outdated
@@ -346,6 +354,7 @@ func environmentConfig() (Config, error) { | |||
} | |||
containerMetadataEnabled := utils.ParseBool(os.Getenv("ECS_ENABLE_CONTAINER_METADATA"), false) | |||
dataDirOnHost := os.Getenv("ECS_HOST_DATA_DIR") | |||
rpsLimit, rpsBurstLimit := getTaskMetatadataRpsLimits() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
steadyStateRate
and burstRate
are better names imo
agent/config/config.go
Outdated
@@ -434,6 +445,25 @@ func getTaskCPUMemLimitEnabled() Conditional { | |||
return taskCPUMemLimitEnabled | |||
} | |||
|
|||
func getTaskMetatadataRpsLimits() (int, int) { | |||
var rpsLimit, rpsBurstLimit int | |||
rpsLimitEnvVal := os.Getenv("ECS_TASK_METADATA_RPS_LIMIT") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add an entry in README and changelog for this?
agent/config/config.go
Outdated
@@ -434,6 +445,25 @@ func getTaskCPUMemLimitEnabled() Conditional { | |||
return taskCPUMemLimitEnabled | |||
} | |||
|
|||
func getTaskMetatadataRpsLimits() (int, int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe i missed an earlier conversation. why don't we want two separate env variables for this? ECS_TASK_METADATA_RPS_LIMIT
and ECS_TASK_METADATA_RPS_BURST_LIMIT
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since steady state and burst throttles both are needed and related, it would be convenient for customers to have a single environment variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm i don't feel strongly either way, my concern was mostly if it would add extra overhead of enforcing ordering of the throttle limits (for the customer) and extra validation code (for us). but what you said about "needed and related" also makes sense. okay thanks.
dc61d25
to
91dc8fa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changes lgtm, minor nits only.
agent/config/config.go
Outdated
rpsLimitSplits := strings.Split(rpsLimitEnvVal, ",") | ||
if len(rpsLimitSplits) != 2 { | ||
seelog.Warn(`Invalid format for "ECS_TASK_METADATA_RPS_LIMIT", expected: "rateLimit,burst"`) | ||
return steadyStateRate, burstRate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: for consistency, should we explicitly return 0, 0
as you do for errors reading input in the two cases below?
README.md
Outdated
@@ -177,6 +177,8 @@ additional details on each available environment variable. | |||
| `ECS_HOST_DATA_DIR` | `/var/lib/ecs` | The source directory on the host from which ECS_DATADIR is mounted. We use this to determine the source mount path for container metadata files in the case the ECS Agent is running as a container. We do not use this value in Windows because the ECS Agent is not running as container in Windows. | `/var/lib/ecs` | `Not used` | | |||
| `ECS_ENABLE_TASK_CPU_MEM_LIMIT` | `true` | Whether to enable task-level cpu and memory limits | `true` | `false` | | |||
| `ECS_CGROUP_PATH` | `/sys/fs/cgroup` | The root cgroup path that is expected by the ECS agent. This is the path that accessible from the agent mount. | `/sys/fs/cgroup` | Not applicable | | |||
| `ECS_TASK_METADATA_RPS_LIMIT` | `100,150` | Comma separated values for steady state and burst throttle limits for task metadata endpoint | `40,60` | `40,60` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: specify integer value w/ "Comma separated integer values for..."
…endpoints This fixes aws#1231
Summary
This addresses #1231
Implementation details
default steady state throttle of 40 rps and burst of 60 rps.
values configurable through
ECS_TASK_METADATA_RPS_LIMIT
environment variable.Testing
make release
)go build -out amazon-ecs-agent.exe ./agent
)make test
) passgo test -timeout=25s ./agent/...
) passmake run-integ-tests
) pass.\scripts\run-integ-tests.ps1
) passmake run-functional-tests
) pass.\scripts\run-functional-tests.ps1
) passNew tests cover the changes: yes
Description for the changelog
Licensing
This contribution is under the terms of the Apache 2.0 License: yes