Add backoff mechanism to recover from APM Server transport failures #148

Merged
merged 22 commits into from
Mar 22, 2022

Conversation

jlvoiseux

@jlvoiseux jlvoiseux commented Mar 15, 2022

Motivation / Summary

This PR implements a backoff mechanism to recover from APM Server transport failures that occur when the APM Server is unavailable. It follows the mechanism described in issue #131.

Changes

In apm_server.go, a simple finite state machine is implemented. The states are as follows:

  • nominal: the extension is able to send APM data without any issue.
  • failure: the extension enters this state when a transport error is detected. In this state, no data is sent to the APM server (the data is queued back into the dedicated channel).
  • waitingForNextAttempt: when a transport error is detected, a timer (with a duration defined by the APM Transport spec) is started. When the timer expires, the extension enters this state, which signals that it can try sending data to the APM server again.
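
The three states above can be sketched roughly as follows. This is a minimal illustration, not the code in apm_server.go: the `Transport` type, its methods, and `ErrTransport` are hypothetical names, and the grace period of `min(n, 6)^2` seconds is an assumption about what the APM Transport spec prescribes (the jitter is left out to keep the sketch deterministic).

```go
package main

import (
	"errors"
	"time"
)

// Status mirrors the three states named in the PR description.
type Status int

const (
	Nominal Status = iota
	Failure
	WaitingForNextAttempt
)

// ErrTransport is a hypothetical error standing in for any transport failure.
var ErrTransport = errors.New("apm server transport failure")

// Transport is a hypothetical holder for the state machine;
// it is not the extension's actual type.
type Transport struct {
	status         Status
	reconnectCount int
	retryAt        time.Time
}

// gracePeriod returns the wait before the next attempt.
// Assumption: min(reconnectCount, 6)^2 seconds, without jitter.
func gracePeriod(reconnectCount int) time.Duration {
	n := reconnectCount
	if n < 0 {
		n = 0
	}
	if n > 6 {
		n = 6
	}
	return time.Duration(n*n) * time.Second
}

// OnSendResult updates the state after an attempt to send data.
// A nil error resets the machine; any error moves it to Failure
// and schedules the next attempt.
func (t *Transport) OnSendResult(err error, now time.Time) {
	if err == nil {
		t.status = Nominal
		t.reconnectCount = 0
		return
	}
	t.status = Failure
	t.retryAt = now.Add(gracePeriod(t.reconnectCount))
	t.reconnectCount++
}

// CanSend reports whether the caller may try to send data now.
// Once the grace period has elapsed, Failure becomes WaitingForNextAttempt.
func (t *Transport) CanSend(now time.Time) bool {
	if t.status == Failure && !now.Before(t.retryAt) {
		t.status = WaitingForNextAttempt
	}
	return t.status != Failure
}
```

A caller would check `CanSend(time.Now())` before each send, requeue the data when it returns false, and report the outcome of every attempt through `OnSendResult`. Passing the current time in as a parameter, rather than starting a real timer, is a choice made here only to keep the sketch easy to test.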

Pending questions

  • When should we stop trying to reconnect to the APM server? How does this mechanism relate to the ELASTIC_APM_DATA_FORWARDER_TIMEOUT_SECONDS config option?

@github-actions github-actions bot added the aws-λ-extension AWS Lambda Extension label Mar 15, 2022
@elastic-apm-tech elastic-apm-tech added this to In Progress in APM-Agents (OLD) Mar 15, 2022
apmmachine commented Mar 15, 2022

💚 Build Succeeded

Build stats

  • Start Time: 2022-03-22T12:51:00.525+0000

  • Duration: 7 min 2 sec

Test stats 🧪

Test Results
Failed 0
Passed 240
Skipped 6
Total 246

@jlvoiseux jlvoiseux marked this pull request as ready for review March 18, 2022 09:41
@estolfo estolfo left a comment

Overall this looks good and I'm approving but I have just a few minor comments.

@jlvoiseux jlvoiseux requested a review from estolfo March 21, 2022 17:16
@estolfo estolfo left a comment

Just one minor request

Labels
aws-λ-extension AWS Lambda Extension
4 participants