chore(OTEL): move troubleshooting topics to separate pages #17055

Closed
wants to merge 7 commits into from

Conversation

ally-sassman
Contributor

@ally-sassman ally-sassman commented Apr 23, 2024

In this PR, I’m separating out troubleshooting sections into subpages within a new troubleshooting section (to be linked to from other places). Each page will follow the same format (traditionally two headers titled Problem/Solution).

@ally-sassman ally-sassman added the `content` (requests related to docs site content) and `from_tw` (identifies issues/PRs from Tech Docs writers) labels Apr 23, 2024
@ally-sassman ally-sassman self-assigned this Apr 23, 2024
@github-actions github-actions bot added this to Hero to triage in Docs PRs and Issues Apr 23, 2024

Hi @ally-sassman 👋

Thanks for your pull request! Your PR is in a queue, and a writer will take a look soon. We generally publish small edits within one business day, and larger edits within three days.

We will automatically generate a preview of your request, and will comment with a link when the preview is ready (usually 10 to 20 minutes). If you add any more commits, you can comment `netlify build` on this PR to update the preview.


netlify bot commented Apr 23, 2024

Deploy Preview for docs-website-netlify ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 153ad6d |
| 🔍 Latest deploy log | https://app.netlify.com/sites/docs-website-netlify/deploys/662ae73aec6fd40008085d9d |
| 😎 Deploy Preview | https://deploy-preview-17055--docs-website-netlify.netlify.app |

@rhetoric101 rhetoric101 moved this from Hero to triage to In progress in Docs PRs and Issues Apr 23, 2024
@@ -0,0 +1,29 @@
---
title: "Troubleshoot OpenTelemetry: Collector issues"
Contributor Author

Is there a better title for this page?

Contributor

This is another situation where I wonder if this really belongs as a "troubleshooting" document. I think it makes sense for us to have some documentation pertaining to the OpenTelemetry Collector, but this is one of those situations where we need to decide how much we should document ourselves versus directing people to the official OpenTelemetry documentation. For example https://opentelemetry.io/docs/collector/troubleshooting/


For solutions to common collector problems, see the [OpenTelemetry troubleshooting documentation](https://opentelemetry.io/docs/collector/troubleshooting/).

Within New Relic, we recommend using this NRQL query to show all collector metrics sent to New Relic:
Contributor Author

What are some examples of things they can find/troubleshoot with this data?

Contributor

Some of the metrics the query below pertains to are documented here: https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/monitoring.md


## Solution

OpenTelemetry entities are synthesized based on the public rules described for the [`EXT-SERVICE`](https://github.com/newrelic/entity-definitions/blob/b8e75d839eed7859044a9bb601a62e5323107350/definitions/ext-service/definition.yml) entity type. The standard rule to match relies on the presence of the `service.name` dimension (which follows the OpenTelemetry [semantic conventions](https://github.com/open-telemetry/opentelemetry-specification/blob/527206ea49e384de957eda105d9425aa8fefca25/specification/overview.md#semantic-conventions)).
Contributor Author

"The standard rule to match relies on the presence of the service.name dimension (which follows the OpenTelemetry semantic conventions)" is a bit too passive. How about this revision for clarity?

Suggested change
OpenTelemetry entities are synthesized based on the public rules described for the [`EXT-SERVICE`](https://github.com/newrelic/entity-definitions/blob/b8e75d839eed7859044a9bb601a62e5323107350/definitions/ext-service/definition.yml) entity type. The standard rule to match relies on the presence of the `service.name` dimension (which follows the OpenTelemetry [semantic conventions](https://github.com/open-telemetry/opentelemetry-specification/blob/527206ea49e384de957eda105d9425aa8fefca25/specification/overview.md#semantic-conventions)).
OpenTelemetry entities are synthesized based on the public rules described for the [`EXT-SERVICE`](https://github.com/newrelic/entity-definitions/blob/b8e75d839eed7859044a9bb601a62e5323107350/definitions/ext-service/definition.yml) entity type. How the entity name appears in New Relic is based on the `service.name` dimension, which follows the OpenTelemetry [semantic conventions](https://github.com/open-telemetry/opentelemetry-specification/blob/527206ea49e384de957eda105d9425aa8fefca25/specification/overview.md#semantic-conventions).


To set the `service.name` with the OpenTelemetry Java SDK, include it in your [resource](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/opentelemetry-concepts/#resources):
Contributor Author

Is this available with other language SDKs, or just Java?


* For all other SDK languages, set `service.name` by declaring it in the `OTEL_RESOURCE_ATTRIBUTES` or `OTEL_SERVICE_NAME` [environment variables](https://github.com/open-telemetry/opentelemetry-specification/blob/20c82de552d08428e8cadaaef3e6cb46812f7c00/specification/sdk-environment-variables.md#general-sdk-configuration).
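
As a rough illustration of the environment-variable route in the bullet above, the following Java sketch resolves an effective `service.name` from the two variables. It is not the SDK's implementation: `ServiceNameResolver` and its methods are hypothetical names, and the parsing is simplified (for example, it does not percent-decode values). It assumes the precedence described in the OpenTelemetry specification, where `OTEL_SERVICE_NAME` takes priority over a `service.name` entry in `OTEL_RESOURCE_ATTRIBUTES`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Illustrative only: a simplified model of resolving service.name from environment variables. */
final class ServiceNameResolver {

    /** Parses the comma-separated key=value pairs used by OTEL_RESOURCE_ATTRIBUTES. */
    static Map<String, String> parseResourceAttributes(String raw) {
        Map<String, String> attributes = new LinkedHashMap<>();
        if (raw == null || raw.isBlank()) {
            return attributes;
        }
        for (String pair : raw.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip malformed entries instead of failing
            }
            // Simplification: real SDKs also percent-decode the value.
            attributes.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return attributes;
    }

    /** Assumed precedence: OTEL_SERVICE_NAME first, then service.name in OTEL_RESOURCE_ATTRIBUTES. */
    static Optional<String> resolve(Map<String, String> env) {
        String explicit = env.get("OTEL_SERVICE_NAME");
        if (explicit != null && !explicit.isBlank()) {
            return Optional.of(explicit.trim());
        }
        return Optional.ofNullable(
            parseResourceAttributes(env.get("OTEL_RESOURCE_ATTRIBUTES")).get("service.name"));
    }

    public static void main(String[] args) {
        // Prints the name resolved from the current process environment, if any.
        System.out.println(resolve(System.getenv()).orElse("(service.name not set)"));
    }
}
```

Running this in the same environment as the instrumented service is one quick way to check whether either variable is actually visible to the process.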

For New Relic <InlinePopover type="logs" />, you can use a structured log template to inject the `service.name`. See [Logs in context with Log4j2](https://github.com/newrelic/newrelic-opentelemetry-examples/blob/e3f5ee85b4dcd8dd29f8f69d78d122b82a9638ba/other-examples/java/logs-in-context-log4j2/Log4j2EventLayout.json#L2) for an example.
Contributor Author

When would they inject this via logs vs. declaring it in the `OTEL_RESOURCE_ATTRIBUTES` or `OTEL_SERVICE_NAME` environment variables?

<Callout variant="tip">
For more OpenTelemetry examples with New Relic, visit the [newrelic-opentelemetry-examples](https://github.com/newrelic/newrelic-opentelemetry-examples) repository on GitHub.
Contributor Author

This needs a better home/context (maybe after the "For all other SDKs..." bullet).

@@ -13,7 +13,7 @@ redirects:
freshnessValidatedDate: never
---

- Setting up OpenTelemetry and getting the most from it can be a challenge. To help you optimize your experience, we've created these best practice guides. If you're trying to resolve a specific problem, see our [troubleshooting guide](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/opentelemetry-troubleshooting).
+ Setting up OpenTelemetry and getting the most from it can be a challenge. To help you optimize your experience, we've created these best practice guides.
Contributor Author

Suggested change
Setting up OpenTelemetry and getting the most from it can be a challenge. To help you optimize your experience, we've created these best practice guides.
Setting up OpenTelemetry and getting the most from it can be a challenge. To help you optimize your experience, we've created these best practice guides. If you're trying to resolve a specific problem, see our [troubleshooting topics](/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/troubleshooting/collector-issues).

ally-sassman and others added 2 commits April 25, 2024 16:13
…rations/opentelemetry/troubleshooting/missing-entities-relationships.mdx

## Solution

To correlate your logs with trace data, the logs need to include the trace context in `trace_id` and `span_id`. However, to ensure your logs show up in the New Relic UI, you'll need to configure rules in your log pipeline to translate `trace_id` and `span_id` to `trace.id` and `span.id`.
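
The key translation described above can be sketched in a few lines of Java. This is only an illustration of the renaming step, not a New Relic or OpenTelemetry API: `TraceContextKeyTranslator` is a hypothetical helper, and it assumes the log record is available as a flat attribute map at some point in your pipeline.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: renames trace context keys on a flat log-attribute map. */
final class TraceContextKeyTranslator {

    // Source key -> target key, as described in the paragraph above.
    private static final Map<String, String> RENAMES =
        Map.of("trace_id", "trace.id", "span_id", "span.id");

    /** Returns a copy of the record with trace_id/span_id renamed; other keys are left as-is. */
    static Map<String, Object> translate(Map<String, Object> logRecord) {
        Map<String, Object> translated = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : logRecord.entrySet()) {
            String key = RENAMES.getOrDefault(entry.getKey(), entry.getKey());
            // If both forms of a key are present, the later entry overwrites the earlier one.
            translated.put(key, entry.getValue());
        }
        return translated;
    }
}
```

In practice the same mapping is more likely to live in a log layout template or a pipeline rule than in application code; the sketch only shows which keys change and that the values pass through untouched.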
Contributor Author

How do they do this? Is there an example we can provide?

Contributor

This document lacks a lot of context. That is, this is not a general problem people will experience. It is a problem that may occur in specific use cases. We can elaborate and improve this doc in a separate PR.

That said, we have an example here. https://github.com/newrelic/newrelic-opentelemetry-examples/tree/main/other-examples/java/logs-in-context-log4j2#introduction.


If you've checked the above, try out these New Relic features:

* Check your data management hub to [facet data ingest](/docs/data-apis/manage-data/manage-data-coming-new-relic/#facet-data-ingest) and determine how much data is arriving from various sources.
Contributor Author

What are you faceting on in the data management hub? If the recommendation is `instrumentation.provider` or `newrelic.source`, then I should delete this bullet point (and rely on the second bullet point).

Comment on lines +116 to +123
path: /docs/more-integrations/open-source-telemetry-integrations/opentelemetry/troubleshooting/collector-issues
- title: Missing entities or relationships
path: /docs/more-integrations/open-source-telemetry-integrations/opentelemetry/troubleshooting/missing-entities-relationships
- title: Missing logs
path: /docs/more-integrations/open-source-telemetry-integrations/opentelemetry/troubleshooting/missing-logs
- title: Missing OTLP data
path: /docs/more-integrations/open-source-telemetry-integrations/opentelemetry/troubleshooting/missing-otlp-data
Contributor

I think we should consider if the pattern of a page per troubleshooting topic is what we want. I understand these pages were broken out from the original mega troubleshooting doc, but this list of pages strikes me as a strange/incomplete set of topics.

It might make sense to start with a high level organization of topics we believe we should cover, then decide if it makes sense in one doc or multiple.

@@ -0,0 +1,42 @@
---
title: "Troubleshoot OpenTelemetry: Missing entities or relationships"
Contributor

@alanwest alanwest Apr 26, 2024

This document is 1) limited to service entities and 2) barely addresses relationships in any meaningful way.

A document about entities and relationships is sorely needed, though it's strange as a "troubleshooting" doc. We really need a top-level document addressing this subject holistically, including service, host, and other entity types that can be synthesized from otel data, plus how relationships work between these entity types.

Contributor

@alanwest alanwest left a comment

I do not think it is worth reorganizing the existing troubleshooting document into separate documents.

@jack-berg and I discussed and we'd like to suggest a different plan of action.

  1. @jack-berg will submit a PR that better documents the "missing data" issue.
  2. Remove the "missing logs" document. Consider moving the content to the existing logs documentation. @jack-berg will make this assessment.
  3. Remove the "collector issues" document. Add a link (maybe here) to the official opentelemetry.io collector troubleshooting documentation.
  4. Remove the "entities/relationships" document. I will draft proper documentation about entities and relationships. It will not be a troubleshooting document. The existing content will be unnecessary.

In essence, I think all of this content just needs to be rewritten and the existing content just needs to be deleted.

@homelessbirds homelessbirds moved this from In progress to Drafts in Docs PRs and Issues May 2, 2024