To forward your Kubernetes logs to New Relic with our plugin:
2. Optionally, you can further tune your installation in [Step 4 from the guided install](/docs/kubernetes-pixie/kubernetes-integration/installation/kubernetes-integration-install-configure/#kubernetes-install-navigation) using the numerous configuration options available in the [newrelic-logging repository](https://github.com/newrelic/helm-charts/tree/master/charts/newrelic-logging#configuration). However, we recommend the standard setup, as it is valid for most users.

<Callout variant="important">

If you're [using a Kubernetes secret](https://github.com/newrelic/helm-charts/blob/master/charts/newrelic-logging/values.yaml#L8-L25) to store the New Relic license key, the `newrelic-logging` chart defaults to sending logs to the US API endpoint. If the license key belongs to an EU or FedRAMP account, and a secret is used for key storage, you must update the endpoint setting with the appropriate value from the [API reference docs](/docs/logs/log-api/introduction-log-api/#endpoint). Here's an example of how to set this for EU accounts:

```
newrelic-logging:
  # Example for EU accounts; confirm your region's endpoint in the API reference docs linked above
  endpoint: https://log-api.eu.newrelic.com/log/v1
```

</Callout>

<InstallFeedback />

## Troubleshoot your Kubernetes plugin for log forwarding installation [#troubleshoot-installation]

Even when the Kubernetes plugin for log forwarding (the `newrelic-logging` [Helm chart](https://github.com/newrelic/helm-charts/blob/master/charts/newrelic-logging)) is installed correctly, you may run into performance issues that prevent logs from being delivered correctly. In such cases, the log forwarder's internal metrics can help you understand the cause of a potential bottleneck.

The `newrelic-logging` Helm chart provides a configuration setting to enable the collection of such metrics for a given Kubernetes cluster. We also provide a JSON-formatted dashboard template to easily display all these metrics in New Relic.

To configure your Kubernetes cluster to send the log forwarder internal metrics and represent them in a dashboard, follow these steps:

1. Install the Helm chart with the following extra configuration setting:
```
newrelic-logging:
fluentBit:
sendMetrics: true
```
You only need to enable the `newrelic-logging.fluentBit.sendMetrics` setting while troubleshooting a Kubernetes cluster, and we recommend enabling it for a single cluster at a time to keep the troubleshooting focused (see the `helm` sketch after this list).
2. Download [this dashboard template file](https://raw.githubusercontent.com/newrelic/helm-charts/master/charts/newrelic-logging/fluent-bit-and-plugin-metrics-dashboard-template.json). Open it in a text editor and replace all occurrences of `YOUR_ACCOUNT_ID` (49 in total) with your [New Relic Account ID](/docs/accounts/accounts-billing/account-structure/account-id/), or use the `sed` one-liner after this list.
3. Import the resulting dashboard in JSON format by following [these instructions](/docs/query-your-data/explore-query-data/dashboards/dashboards-charts-import-export-data/#import-json).
4. The imported dashboard will be available on your [Dashboards page](https://one.newrelic.com/dashboards) as `Kubernetes Fluent Bit monitoring`.
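
For reference, step 1 can also be done from the command line. Here's a minimal sketch that assumes you install through the `nri-bundle` chart; the release name `newrelic-bundle` and the namespace `newrelic` are placeholders, so adapt them to your own installation:

```
# Sketch: enable Fluent Bit internal metrics via the nri-bundle chart.
# "newrelic-bundle" and "newrelic" are placeholder release/namespace names.
helm upgrade --install newrelic-bundle newrelic/nri-bundle \
  --namespace newrelic --create-namespace \
  --set newrelic-logging.enabled=true \
  --set newrelic-logging.fluentBit.sendMetrics=true
```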
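
Similarly, for step 2, you can substitute your account ID in one pass instead of editing the file by hand. Here `1234567` and the output file name are placeholders:

```
# Replace every YOUR_ACCOUNT_ID occurrence with your account ID (1234567 is a placeholder)
sed 's/YOUR_ACCOUNT_ID/1234567/g' fluent-bit-and-plugin-metrics-dashboard-template.json > fluent-bit-dashboard.json
```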

### Additional metric details
The `newrelic-logging` Helm chart uses [Fluent Bit](https://fluentbit.io/) together with New Relic's [newrelic-fluent-bit-output plugin](https://github.com/newrelic/newrelic-fluent-bit-output) to send logs to New Relic. The `fluentBit.sendMetrics` configuration option enables the collection of the internal metrics of both components:

* **[Fluent Bit internal metrics](https://docs.fluentbit.io/manual/administration/monitoring#for-v2-metrics)**: emitted by Fluent Bit in Prometheus format and delivered to New Relic's Prometheus Export endpoint. **These are the recommended metrics for troubleshooting a Kubernetes cluster**, because they can be faceted by `cluster_name`, `node_name`, and `hostname` (pod name).
* **[newrelic-fluent-bit-output's internal plugin metrics](https://github.com/newrelic/newrelic-fluent-bit-output?tab=readme-ov-file#troubleshooting-metrics)**: collected by the output plugin and sent to New Relic's Metric API. These metrics carry no dimensions, so they can't be narrowed down to a particular cluster or host. Even so, capturing them for a single cluster is useful for assessing the overall latency of log delivery to the New Relic Logs API and for spotting potential packaging problems.

We capture Fluent Bit's internal metrics with its [prometheus_scrape INPUT plugin](https://docs.fluentbit.io/manual/pipeline/inputs/prometheus-scrape-metrics) in conjunction with its [prometheus_remote_write OUTPUT plugin](https://docs.fluentbit.io/manual/pipeline/outputs/prometheus-remote-write). All the Prometheus `counter` metrics are actually _cumulative counters_, but we automatically perform a _delta conversion_ when they're ingested at New Relic, to make them easier to query with NRQL later. For more details, see [Cumulative metrics](/docs/data-apis/understand-data/metric-data/cumulative-metrics/).
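
Once the metrics are flowing, you can query them with NRQL. Here's a hypothetical example using `fluentbit_output_retries_total` (one of Fluent Bit's documented internal metrics) and `MY_CLUSTER` as a placeholder cluster name:

```
FROM Metric SELECT sum(fluentbit_output_retries_total)
WHERE cluster_name = 'MY_CLUSTER'
FACET node_name TIMESERIES
```

Because the counters are delta-converted at ingest, `sum()` returns the number of retries that occurred in each time window rather than an ever-growing cumulative value.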

## View log data [#find-data]

If everything is configured correctly and your data is being collected, you should see log data in both of these places: