[OTLP/GRPC] Ensure OTLP receiver handles consume errors correctly #8080

Merged
merged 17 commits into from
Jan 17, 2024

VihasMakwana
Contributor

@VihasMakwana VihasMakwana commented Jul 12, 2023

Description: Follow the receiver contract in the OTLP/gRPC receiver: return Unavailable for non-permanent errors and InvalidArgument for permanent errors.

Leave the "Retry-After" field blank and let the client implement an exponential backoff strategy.

Link to tracking Issue: #4335

Testing: Added relevant test cases.

@codecov

codecov bot commented Jul 12, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (9b5ef90) 90.74% compared to head (fec7f82) 90.76%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8080      +/-   ##
==========================================
+ Coverage   90.74%   90.76%   +0.01%     
==========================================
  Files         341      341              
  Lines       18344    18389      +45     
==========================================
+ Hits        16647    16691      +44     
- Misses       1360     1361       +1     
  Partials      337      337              


@VihasMakwana
Contributor Author

@bogdandrutu can you have a quick look at this and see if it's fine?

@github-actions
Contributor

github-actions bot commented Aug 9, 2023

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Aug 9, 2023
@github-actions github-actions bot removed the Stale label Aug 15, 2023
@VihasMakwana
Contributor Author

@bogdandrutu @evan-bradley ^^

Review thread on receiver/otlpreceiver/internal/metrics/otlp.go (outdated)
Review thread on receiver/otlpreceiver/internal/trace/otlp.go (outdated)
VihasMakwana and others added 2 commits August 31, 2023 18:34
Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
@VihasMakwana
Contributor Author

@bogdandrutu can you take a quick look and see if it's fine by you?

@VihasMakwana VihasMakwana changed the title from "Ensure OTLP receiver handles consume errors correctly [OTLP/GRPC]" to "[OTLP/GRPC] Ensure OTLP receiver handles consume errors correctly" Sep 5, 2023
@evan-bradley
Contributor

Bogdan will be out for a few more weeks. @open-telemetry/collector-approvers could one of you please take a look?

@github-actions
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Sep 22, 2023
@atoulme atoulme removed the Stale label Nov 15, 2023
@VihasMakwana
Contributor Author

@bogdandrutu can we get this merged? #8676 seems to be waiting on it.

Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added Stale and removed Stale labels Dec 12, 2023
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added Stale and removed Stale labels Dec 28, 2023
Member

@mx-psi mx-psi left a comment

LGTM.

@open-telemetry/collector-approvers I will merge this in 5 working days unless someone blocks it

Comment on lines +55 to +62
s, ok := status.FromError(err)
if !ok {
	code := codes.Unavailable
	if consumererror.IsPermanent(err) {
		code = codes.InvalidArgument
	}
	s = status.New(code, err.Error())
}
Member

Since this error handling is the same for all signals I do wonder if it could be extracted and re-used, but that is an optimization we could handle later.

Member

Filed #9300 for this

@mx-psi mx-psi merged commit bb1ae64 into open-telemetry:main Jan 17, 2024
32 checks passed
@github-actions github-actions bot added this to the next release milestone Jan 17, 2024
codeboten pushed a commit that referenced this pull request Feb 1, 2024
The otlp receiver was recently updated via
#8080 to
properly propagate consumer errors back to clients as either permanent
or retriable. The code we're using to indicate a non-retriable error is
`codes.InvalidArgument`, which is the equivalent of `400` in HTTP.

While `codes.InvalidArgument` is 100% correct according to the [OTLP
specification](https://github.com/open-telemetry/opentelemetry-proto/blob/main/docs/specification.md#failures)
as a way to indicate a non-retriable error, I think `codes.Internal` (which is
equivalent to HTTP `500`) better conveys the actual state of the
collector in these situations.

Related to
#9357 (comment)


---------

Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
mx-psi pushed a commit that referenced this pull request Mar 27, 2024
…rrors (#9357)

**Description:**
Updates the receiver's HTTP response to return a proper HTTP status
based on whether or not the pipeline returned a retryable error. Builds
upon the work done in
#8080 and
#9307

**Link to tracking Issue:**

Closes
#9337
Closes
#8132
Closes
#9636
Closes
#6725

**Testing:**

Updated lots of unit tests
6 participants