[Serverless] Improve the adaptive flush strategy of the serverless extension #11166
// something reliable.
if len(d.lastInvocations) < 3 {
// with less than 20 invocations, we may switch to periodical flushing prematurely.
if len(d.lastInvocations) < 20 {
How about line 75, if freq.Seconds() < 60*5? Could/should we also tighten up that rule? Delaying metrics for 5 minutes does not seem like a good idea to me. The only reason we want periodic flushing instead of per-invocation flushing is to avoid hefty cost overhead, but according to my estimates, flushing once per minute, assuming each flush adds 1s of duration and the memory size is 1GB, the cost overhead is < $1/month.
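The estimate above can be reproduced with a small back-of-the-envelope sketch. The per-GB-second price below is an assumption (an on-demand AWS Lambda rate; check current pricing for your region and architecture), and `monthlyFlushOverhead` is a hypothetical helper, not part of the extension. It also assumes the worst case where a flush happens every minute of a 30-day month.

```go
package main

import "fmt"

// pricePerGBSecond is an ASSUMED on-demand Lambda compute price in USD per
// GB-second. It is not taken from the PR; verify it against current pricing.
const pricePerGBSecond = 0.0000166667

// monthlyFlushOverhead (hypothetical helper) estimates the extra monthly
// compute cost in USD caused by periodic flushing, assuming flushes happen
// around the clock for a 30-day month.
func monthlyFlushOverhead(flushesPerMinute, secondsPerFlush, memoryGB float64) float64 {
	const minutesPerMonth = 60 * 24 * 30
	billedSeconds := flushesPerMinute * minutesPerMonth * secondsPerFlush
	return billedSeconds * memoryGB * pricePerGBSecond
}

func main() {
	// One flush per minute, 1s added per flush, 1GB of memory.
	fmt.Printf("estimated overhead: $%.2f/month\n", monthlyFlushOverhead(1, 1, 1))
}
```

Under these assumptions the result is roughly $0.72/month, which is consistent with the "< $1/month" figure in the comment.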
assert.Equal((&flush.AtTheEnd{}).String(), d.AutoSelectStrategy().String(), "not the good strategy has been selected")
assert.True(d.StoreInvocationTime(now.Add(-time.Second * 70)))
assert.True(d.StoreInvocationTime(now.Add(time.Second * 19)))
assert.Equal((&flush.AtTheEnd{}).String(), d.AutoSelectStrategy().String(), "not the good strategy has been selected")

// add a third invocation, after this, we have enough data to decide to switch
Looks like this comment is out of date now, probably best to just remove it and let the code speak for itself.
@@ -87,15 +93,14 @@ func TestInvocationInterval(t *testing.T) {
// first scenario, validate that we're not computing the interval if we only have 2 invocations done
Likewise this comment is also wrong now
What does this PR do?
Increases the number of invocations required (from 3 to 20) for the extension to transition into a periodic flush strategy.
Motivation
When hand testing a Lambda function, the extension transitioned into a periodic flush strategy too quickly, since users often invoke the function more than 3 times. This could lead to confusion, since not all invocations would appear on the serverless page until the 20-second flush interval was reached.
Additional Notes
Possible Drawbacks / Trade-offs
Describe how to test/QA your changes
Tested by manually deploying the extension and hand testing.
Reviewer's Checklist
Triage milestone is set.
major_change label if your change either has a major impact on the code base, is impacting multiple teams or is changing important well-established internals of the Agent. This label will be used during QA to make sure each team pays extra attention to the changed behavior. For any customer-facing change use a releasenote.
changelog/no-changelog label has been applied.
qa/skip-qa label is not applied.
team/.. label has been applied, indicating the team(s) that should QA this change.
need-change/operator and need-change/helm labels have been applied.