[Serverless] Improve the adaptive flush strategy of the serverless extension #11166

Merged
merged 5 commits into from
Mar 8, 2022

Conversation

jcstorms1

@jcstorms1 jcstorms1 commented Mar 4, 2022

What does this PR do?

Increases the number of invocations required before the extension transitions to a periodic flush strategy.

  • Increased the number of invocation timestamps stored from 20 to 30.
  • Increased the periodic flush threshold from 3 to 20 invocations.

Motivation

When hand testing a Lambda function, the extension transitioned into a periodic flush strategy too quickly, since users would often invoke the function more than 3 times. This could cause confusion, because not all invocations would appear on the serverless page until the 20-second flush interval was reached.

Additional Notes

Possible Drawbacks / Trade-offs

Describe how to test/QA your changes

Tested by manually deploying the extension and hand testing.

Reviewer's Checklist

  • If known, an appropriate milestone has been selected; otherwise the Triage milestone is set.
  • Use the major_change label if your change either has a major impact on the code base, impacts multiple teams, or changes important well-established internals of the Agent. This label will be used during QA to make sure each team pays extra attention to the changed behavior. For any customer-facing change, use a releasenote.
  • A release note has been added or the changelog/no-changelog label has been applied.
  • Changed code has automated tests for its functionality.
  • Adequate QA/testing plan information is provided if the qa/skip-qa label is not applied.
  • At least one team/.. label has been applied, indicating the team(s) that should QA this change.
  • If applicable, docs team has been notified or an issue has been opened on the documentation repo.
  • If applicable, the need-change/operator and need-change/helm labels have been applied.
  • If applicable, the config template has been updated.

@jcstorms1 jcstorms1 added this to the Triage milestone Mar 4, 2022
@jcstorms1 jcstorms1 requested a review from a team as a code owner March 4, 2022 21:23
nhinsch
nhinsch previously approved these changes Mar 4, 2022
// something reliable.
if len(d.lastInvocations) < 3 {
// with less than 20 invocations, we may switch to periodical flushing prematurely.
if len(d.lastInvocations) < 20 {

How about line 75, if freq.Seconds() < 60*5? Could or should we also tighten that rule? Delaying metrics for 5 minutes doesn't seem like a good idea to me. The only reason we want periodic flushing instead of per-invocation flushing is to avoid hefty cost overhead, but by my estimate, flushing once per minute, assuming each flush adds 1s of duration at a memory size of 1 GB, keeps the cost overhead under $1/month.
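
The estimate above can be reproduced with a few lines of arithmetic. This is a rough sketch, not an official cost model: the function name is hypothetical, a 30-day month is assumed, and the price of $0.0000166667 per GB-second is an assumption based on AWS Lambda's published x86 on-demand rate, which varies by region and architecture.

```go
package main

import "fmt"

// monthlyFlushOverheadUSD estimates the extra Lambda compute cost caused by
// flushing. All inputs are assumptions supplied by the caller:
//   flushesPerMinute     - how often a flush happens
//   extraSecondsPerFlush - billed duration each flush adds
//   memoryGB             - configured function memory
//   pricePerGBSecond     - on-demand price (assumed, check current pricing)
func monthlyFlushOverheadUSD(flushesPerMinute, extraSecondsPerFlush, memoryGB, pricePerGBSecond float64) float64 {
	const minutesPerMonth = 60 * 24 * 30 // assumes a 30-day month
	gbSeconds := flushesPerMinute * minutesPerMonth * extraSecondsPerFlush * memoryGB
	return gbSeconds * pricePerGBSecond
}

func main() {
	// One flush per minute, 1s added per flush, 1 GB of memory.
	cost := monthlyFlushOverheadUSD(1, 1, 1, 0.0000166667)
	fmt.Printf("estimated overhead: $%.2f/month\n", cost)
}
```

Under these assumptions the overhead is 43,200 GB-seconds per month, which stays below the $1/month figure quoted in the comment.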

assert.Equal((&flush.AtTheEnd{}).String(), d.AutoSelectStrategy().String(), "not the good strategy has been selected")
assert.True(d.StoreInvocationTime(now.Add(-time.Second * 70)))
assert.True(d.StoreInvocationTime(now.Add(time.Second * 19)))
assert.Equal((&flush.AtTheEnd{}).String(), d.AutoSelectStrategy().String(), "not the good strategy has been selected")

// add a third invocation, after this, we have enough data to decide to switch

Looks like this comment is out of date now, probably best to just remove it and let the code speak for itself.

@@ -87,15 +93,14 @@ func TestInvocationInterval(t *testing.T) {
// first scenario, validate that we're not computing the interval if we only have 2 invocations done

Likewise this comment is also wrong now

@jcstorms1 jcstorms1 merged commit b42d391 into main Mar 8, 2022
@jcstorms1 jcstorms1 deleted the storms/improve-periodic-flushing branch March 8, 2022 17:51