Add section on retry config, new OTLP Troubleshooting page
Showing 3 changed files with 61 additions and 0 deletions.
49 changes: 49 additions & 0 deletions
...ntegrations/opentelemetry/best-practices/opentelemetry-otlp-troubleshooting.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
--- | ||
title: New Relic OTLP Troubleshooting | ||
tags: | ||
- Integrations | ||
- Open source telemetry integrations | ||
- OpenTelemetry | ||
- OTLP | ||
- Troubleshoot | ||
metaDescription: Troubleshoot common OTLP ingest errors | ||
freshnessValidatedDate: never | ||
--- | ||
|
||
New Relic has supported [native OTLP ingest](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/) for several years. While working through support cases, we've learned which issues users commonly face. Some problems are easy to identify and fix. Others are deviously tricky: the internet is unreliable, and many components (software, networking, hardware, etc.) are involved, under the control of various parties (customers, New Relic, and public networking infrastructure outside the control of either). With so much complexity and configuration, and so many failure points, it can be difficult to determine which component is at fault and how best to address it. | ||
|
||
Filing and working through a support case can be time-consuming and at times frustrating for customers (and for New Relic!). We've therefore put together this troubleshooting guide to establish a shared understanding and to provide tools to self-diagnose and fix issues when possible. | ||
|
||
First, please review the New Relic [OTLP configuration requirements / recommendations](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#configuration). It contains essential advice and context that anyone looking to use OTLP with New Relic should be aware of. | ||
|
||
The [Issue Catalog](#issue-catalog) lists errors we've seen customers experience, along with mitigation steps, which often reference items from [OTLP configuration requirements / recommendations](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#configuration). | ||
|
||
# Issue Catalog [#issue-catalog] | ||
|
||
| OTLP Protocol | Type | Language / Ecosystem | Fingerprint | Known Resolution | Notes | | ||
|---|---|---|---|---|---| | ||
| HTTP | 401 - Unauthorized | Java | `io.opentelemetry.exporter.internal.http.HttpExporter - Failed to export spans. Server responded with HTTP status code 401.` | [Include API Key](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#api-key) | Missing `api-key` header | | ||
| HTTP | 401 - Unauthorized | Collector | `Exporting failed. The error is not retryable. Dropping data. {"kind": "exporter", "data_type": "traces", "name": "otlphttp", "error": "Permanent error: error exporting items, request to https://otlp.nr-data.net/v1/traces responded with HTTP Status Code 401, Message=, Details=[]", "dropped_items": 4}` | [Include API Key](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#api-key) | Missing `api-key` header | | ||
| HTTP | 401 - Unauthorized | Go | `failed to upload metrics: failed to send metrics to https://otlp.nr-data.net/v1/metrics: 401 Unauthorized` | [Include API Key](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#api-key) | Missing `api-key` header | | ||
| HTTP | 403 - Forbidden | Java | `io.opentelemetry.exporter.internal.http.HttpExporter - Failed to export spans. Server responded with HTTP status code 403.` | [Verify API Key](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#api-key) | Invalid `api-key` header | | ||
| HTTP | 403 - Forbidden | Collector | `Exporting failed. The error is not retryable. Dropping data. {"kind": "exporter", "data_type": "traces", "name": "otlphttp", "error": "Permanent error: error exporting items, request to https://otlp.nr-data.net/v1/traces responded with HTTP Status Code 403, Message=, Details=[]", "dropped_items": 14}` | [Verify API Key](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#api-key) | Invalid `api-key` header | | ||
| HTTP | 403 - Forbidden | Go | `traces export: failed to send to https://otlp.nr-data.net/v1/traces: 403 Forbidden` | [Verify API Key](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#api-key) | Invalid `api-key` header | | ||
| HTTP | 403 - Forbidden | .NET | `Exporter failed send data to collector to {0} endpoint. Data will not be sent. Exception: {1}{https://otlp.nr-data.net:4317/v1/traces}{System.Net.Http.HttpRequestException: Response status code does not indicate success: 403 (Forbidden).` | [Verify API Key](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#api-key) | Invalid `api-key` header | | ||
| HTTP | Timeout | Java | `io.opentelemetry.exporter.internal.http.HttpExporter - Failed to export spans. The request could not be executed. Full error message: timeout` | [Tune batching / timeout](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#payload) | Occurs after export times out. Check timeout settings and client network status.<br></br>If you've ruled out client-side network and configuration issues, open a support case. | | ||
| HTTP | Timeout | Collector | `max elapsed time expired failed to make an HTTP request: Post \"https://otlp.nr-data.net/v1/traces\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)` | [Tune batching / timeout](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#payload) | Typically occurs after retry attempts fail and export times out. Can be related to client network, client retry / timeout configuration, or New Relic network / servers.<br></br>If you've ruled out client-side network and configuration issues, open a support case. | | ||
| HTTP | Timeout | Go | `failed to upload metrics: context deadline exceeded: retry-able request failure` | [Tune batching / timeout](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#payload) | Occurs after export times out. Check timeout settings and client network status.<br></br>If you've ruled out client-side network and configuration issues, open a support case. | | ||
| HTTP | Rate limit | Collector | `Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "metrics", "name": "otlphttp", "error": "Throttle (29s), error: error exporting items, request to https://otlp.nr-data.net:443/v1/metrics responded with HTTP Status Code 429", "interval": "29s"}` | [Tune batching](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#payload) | Rate limit exceeded.<br></br>Adjust batching configuration to reduce request rate. | | ||
| gRPC | Code 2 - Unknown<br></br>Timeout | Java | `io.opentelemetry.exporter.internal.grpc.GrpcExporter - Failed to export spans. Server responded with gRPC status code 2. Error message: timeout` | [Tune batching / timeout](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#payload) | Occurs after export times out. Check timeout settings and client network status.<br></br>If you've ruled out client-side network and configuration issues, open a support case. | | ||
| gRPC | Code 2 - Unknown<br></br>HTTP 500 | Collector | `rpc error: code = Unknown desc = unexpected HTTP status code received from server: 500 (Internal Server Error); malformed header: missing HTTP content-type` | | New Relic's networking vendor produced a non-retriable status code for a transient error.<br></br>If this happens repeatedly, open a support case. | | ||
| gRPC | Code 2 - Unknown<br></br>HTTP 530 | Collector | `rpc error: code = Unknown desc = unexpected HTTP status code received from server: 530 (); transport: received unexpected content-type \"text/html; charset=UTF-8\"` | | New Relic's networking vendor produced a non-retriable status code for a transient error.<br></br>If this happens repeatedly, open a support case. | | ||
| gRPC | Code 4 - DeadlineExceeded | Collector | `rpc error: code = DeadlineExceeded desc = context deadline exceeded` | [Tune batching / timeout](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#payload) | Typically occurs after retry attempts fail and export times out. Can be related to client network, client retry / timeout configuration, or New Relic network / servers.<br></br>If you've ruled out client-side network and configuration issues, open a support case. | | ||
| gRPC | Code 16 - Unauthenticated | Java | `io.opentelemetry.exporter.internal.grpc.GrpcExporter - Failed to export spans. Server responded with gRPC status code 16.` | [Include API Key](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#api-key) | Missing `api-key` header | | ||
| gRPC | Code 16 - Unauthenticated | .NET | `Exporter failed send data to collector to {0} endpoint. Data will not be sent. Exception: {1}{https://otlp.nr-data.net:4317/}{Grpc.Core.RpcException: Status(StatusCode="Unauthenticated", Detail="")` | [Include API Key](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#api-key) | Missing `api-key` header | | ||
| gRPC | Code 8 - ResourceExhausted | Collector | `rpc error: code = ResourceExhausted desc = Too many requests", "dropped_items": 1024` | [Tune batching](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#payload) | Rate limit exceeded.<br></br>Adjust batching configuration to reduce request rate. | | ||
| gRPC | Code 13 - Internal | Java | `io.opentelemetry.exporter.internal.grpc.GrpcExporter - Failed to export spans. Server responded with gRPC status code 13.` | | Not enough information to diagnose. New Relic's networking vendor may have produced a non-retriable status code for a transient error.<br></br>If this happens repeatedly, open a support case. | | ||
| gRPC | Code 13 - Internal<br></br>HTTP 400 | Collector | `rpc error: code = Internal desc = unexpected HTTP status code received from server: 400 (Bad Request)` | | New Relic's networking vendor produced a non-retriable status code for a transient error.<br></br>If this happens repeatedly, open a support case. | | ||
| gRPC | Code 14 - Unavailable<br></br>Connection reset | Collector | `rpc error: code = Unavailable desc = error reading from server: read tcp 100.127.0.171:47470->162.247.241.110:4317: read: connection reset by peer` | [Tune retry](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#retry) | Should resolve with retry. Ensure the collector has sufficient resources to handle retry backpressure. | | ||
| gRPC | Code 14 - Unavailable<br></br>HTTP 502 | Collector | `rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 502 (Bad Gateway); transport: received unexpected content-type "text/html"` | [Tune retry](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#retry) | Should resolve with retry. Ensure the collector has sufficient resources to handle retry backpressure. | | ||
| gRPC | Code 14 - Unavailable<br></br>HTTP 503 | Collector | `rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 503 (Service Unavailable)` | [Tune retry](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#retry) | Should resolve with retry. Ensure the collector has sufficient resources to handle retry backpressure. | | ||
| gRPC | Code 7 - PermissionDenied | Java | `io.opentelemetry.exporter.internal.grpc.GrpcExporter - Failed to export spans. Server responded with gRPC status code 7.` | [Verify API Key](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#api-key) | Invalid `api-key` header | | ||
| gRPC | Code 7 - PermissionDenied | .NET | `Exporter failed send data to collector to {0} endpoint. Data will not be sent. Exception: {1}{https://otlp.nr-data.net:4317/}{Grpc.Core.RpcException: Status(StatusCode="PermissionDenied", Detail="")` | [Verify API Key](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-otlp/#api-key) | Invalid `api-key` header | |
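
To triage a large log file against the catalog above, the fingerprints can be matched mechanically. The following is a minimal sketch, not an official New Relic tool: the names `RULES` and `suggest_resolution` are hypothetical, the patterns are hand-derived from the fingerprints in the table, and the returned strings paraphrase the "Known Resolution" and "Notes" columns. It assumes the log line carries a status name or HTTP status code, so Java gRPC lines that report only a numeric gRPC code are not handled.

```python
import re
from typing import Optional

# (pattern, suggestion) pairs checked in order; the first match wins.
# Patterns are illustrative and derived from the catalog fingerprints only.
RULES = [
    (r"status code 401\b|\b401 Unauthorized|Unauthenticated",
     "Include API key (missing api-key header)"),
    (r"status code 403\b|\b403 \(?Forbidden|PermissionDenied",
     "Verify API key (invalid api-key header)"),
    (r"status code 429\b|ResourceExhausted",
     "Tune batching (rate limit exceeded)"),
    (r"code = Unavailable",
     "Tune retry"),
    (r"deadline ?exceeded|timeout",
     "Tune batching / timeout"),
    (r"code = (Unknown|Internal)",
     "Likely transient; open a support case if it repeats"),
]


def suggest_resolution(log_line: str) -> Optional[str]:
    """Return the catalog suggestion for a log line, or None if nothing matches."""
    for pattern, suggestion in RULES:
        if re.search(pattern, log_line, re.IGNORECASE):
            return suggestion
    return None


if __name__ == "__main__":
    sample = "traces export: failed to send to https://otlp.nr-data.net/v1/traces: 403 Forbidden"
    print(suggest_resolution(sample))
```

Rule order matters in this sketch: the authentication and rate-limit rules run before the generic timeout and `Unknown` / `Internal` rules, so a line that mentions both a gRPC code and an embedded HTTP status is classified by its most specific signal.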