Self-hosted: Allow collector to receive logs via OpenTelemetry #544
Conversation
This is mainly intended to support Kubernetes, by using fluentbit with the opentelemetry output plugin to send pod logs to the collector. The logs received can optionally be filtered by pod name, or via label selectors. Note this requires the log output to be jsonlog (Postgres 15+), with optional K8s context added, or plain logs without any additional context.

An example fluentbit output configuration:

```
[OUTPUT]
    name  opentelemetry
    match kube.*postgres*
    host  127.0.0.1
    port  4318
```

With a corresponding collector configuration:

```
db_log_otel_server = 127.0.0.1:4318
```

When utilizing CloudNativePG, a filter like the following can be used to restrict collection to the primary's logs for a given cluster:

```
db_log_otel_k8s_labels = cnpg.io/cluster=cluster-example,cnpg.io/instanceRole=primary
```

I also created a draft of the docs: https://pganalyze-we-add-selfho-wbo8ie.herokuapp.com/docs/log-insights/setup/opentelemetry
```
pganalyze-collector-setup
/pganalyze-collector
/pganalyze-collector-helper
/pganalyze-collector-setup
```
This was causing the `contrib/helm/pganalyze-collector` dir to be ignored. Adding `/` ensures that only the root-level one is ignored.
Some comments, but looks good (with the caveat that I am not familiar with otel logs). Does this work with `--test` and `--test-logs`? If not, what's our plan there?
```go
			return true
		}
	} else {
		prefixedLogger.PrintWarning("Pod specification for OTel server not valid (need zero or one / separator): \"%s\", skipping log record\n", server.Config.LogOtelK8SPod)
```
Maybe we should put this in `config/read.go`? It seems silly to warn about this for every log line. I think we could also consider erroring out on this at startup, no?
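A minimal sketch of what that startup-time parsing might look like (the function name `parsePodSpec` and its placement are hypothetical, not the collector's actual API):

```go
package main

import (
	"fmt"
	"strings"
)

// parsePodSpec splits an optional "namespace/pod" specification into its
// parts once, at config-read time, so log handling never sees the raw
// string. A bare "pod" (no separator) leaves the namespace empty; more
// than one "/" is rejected up front instead of warning on every log line.
func parsePodSpec(spec string) (namespace string, pod string, err error) {
	if spec == "" {
		return "", "", nil
	}
	parts := strings.SplitN(spec, "/", 3)
	switch len(parts) {
	case 1:
		return "", parts[0], nil
	case 2:
		return parts[0], parts[1], nil
	default:
		return "", "", fmt.Errorf("pod specification not valid (need zero or one / separator): %q", spec)
	}
}

func main() {
	ns, pod, _ := parsePodSpec("default/cluster-example-1")
	fmt.Println(ns, pod) // default cluster-example-1
}
```

With this shape, an invalid specification becomes a startup error rather than a per-line warning.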
Maybe we should also `SplitN` this while parsing, and set values on Config that don't correspond to config settings directly (one for name, one for namespace)? I doubt it's a bottleneck here, but that might be easier to follow. Lines 102-114 could become:

```go
if server.Config.LogOtelK8SPod != "" {
	if server.Config.LogOtelK8SPodNamespace != "" && server.Config.LogOtelK8SPodNamespace != k8sNamespaceName {
		return true
	}
	if server.Config.LogOtelK8SPod != k8sPodName {
		return true
	}
}
```
Thanks for the idea! I refactored this, and also added the test for the label/selector matching part, as it was actually in the TODO of the original PR. Ideally we'd want to use the k8s code for this, but that would involve pulling in a lot of deps, so I decided not to do that, at least in this PR.
Did you push up those changes? I don't see them in this branch.
Wow, I swear I did update this (well, I did everything to prepare it, which is already pushed), but I don't see it either 🙃 Maybe I did in fact forget to update this part 🤦 Thanks for noticing! Update pushed 👍
```go
			return true
		}
	}
}
```
Similar to the above, I think it'd be better to parse this into a structured form when reading the config, and make it easier to work with for consumers.
```go
	if detailLine != nil {
		parsedLogStream <- state.ParsedLogStreamItem{Identifier: server.Config.Identifier, LogLine: *detailLine}
	}
} else if logger == "" && hasErrorSeverity {
```
Why are we requiring `error_severity`? It might be good to add a comment here or above.
Good question. This part was written by Lukas and I actually don't have good context. This is about the following part:

> Note this requires the log output to be jsonlog (Postgres 15+), with optional K8s context added, or plain logs without any additional context.

This covers the "plain logs without any additional context" case. I actually haven't tested that case. I think `error_severity` is a key that must always be present in jsonlog-format logs, so if that key exists, we can assume it's a Postgres log. @lfittl does that sound right?

https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-JSONLOG
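As an illustration of that assumption (a sketch only, not the collector's actual code): every Postgres jsonlog record carries an `error_severity` field, so its presence in a parsed JSON line is a reasonable heuristic that the record came from Postgres:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// looksLikePostgresJSONLog reports whether a raw log line parses as JSON
// and carries the error_severity key that Postgres jsonlog output always
// emits (see the jsonlog key table in the Postgres docs).
func looksLikePostgresJSONLog(line []byte) bool {
	var record map[string]json.RawMessage
	if err := json.Unmarshal(line, &record); err != nil {
		return false
	}
	_, ok := record["error_severity"]
	return ok
}

func main() {
	pg := []byte(`{"timestamp":"2024-01-01 00:00:00 UTC","error_severity":"LOG","message":"checkpoint starting"}`)
	other := []byte(`{"level":"info","msg":"not postgres"}`)
	fmt.Println(looksLikePostgresJSONLog(pg), looksLikePostgresJSONLog(other)) // true false
}
```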
(I updated the comment assuming that it's correct, but happy to update)
Thanks for the review, @msakrejda !
No, it does not work yet; the plan is to support it with a branch that I have on my local machine 😬 (aka planning to do that as a follow-up)
```go
var K8sSelectorRegexp = regexp.MustCompile(`\s*([^!=\s]+)\s*([!=]+)\s*([^\s]+)\s*`)

// CheckLabelSelectorMismatch checks if selectors do not match the given labels
func CheckLabelSelectorMismatch(labels map[string]string, selectors []string) bool {
```
Is this actually called anywhere (other than the tests)? I didn't see any call sites.

It might also be clearer to invert the condition here (i.e., `CheckLabelSelectorMatch` or `CheckLabelSelectorMatches` or maybe just `SelectorsMatchLabels`) and then negate it in the caller.

Separately, since we're dealing with multiple selectors and multiple labels, it might be good to include the full expected behavior (i.e., if I'm reading this correctly, that every selector must match its corresponding label, if any) in the comment.
> Is this actually called anywhere (other than the tests)? I didn't see any call sites.

No, `CheckLabelSelectorMismatch` is not called (though `K8sSelectorRegexp` is used, and that's the reason behind making this util). I'd like to use the k8s code for this and refactor, so that I can retire this whole util. As this was a good chunk, I decided to keep them together (it also makes it easier to write tests focusing on this).

> It might also be clearer to invert the condition here (i.e., `CheckLabelSelectorMatch` or `CheckLabelSelectorMatches` or maybe just `SelectorsMatchLabels`) and then negate it in the caller.

I thought about it, but the nice part of the current logic is that it can return early when something doesn't match (and the logic is a bit simpler). That said, given that neither the labels nor the selectors will be that long, maybe I don't need to worry too much about performance?

> Separately, since we're dealing with multiple selectors and multiple labels, it might be good to include the full expected behavior (i.e., if I'm reading this correctly, that every selector must match its corresponding label, if any) in the comment.

I'll point to the k8s label selector docs (that I linked in the other comment) 👍 FYI, the short answer to this is the following:

> In the case of multiple requirements, all must be satisfied so the comma separator acts as a logical AND (&&) operator.
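Under that AND semantics, equality-based selector matching can be sketched as follows (a simplification using the `K8sSelectorRegexp` pattern above; the real Kubernetes library also handles set-based selectors, which this does not):

```go
package main

import (
	"fmt"
	"regexp"
)

// K8sSelectorRegexp splits a selector like "app=web" or "tier!=db" into
// key, operator, and value.
var K8sSelectorRegexp = regexp.MustCompile(`\s*([^!=\s]+)\s*([!=]+)\s*([^\s]+)\s*`)

// selectorsMatchLabels returns true only if every selector is satisfied by
// the labels map: a logical AND across selectors, matching Kubernetes
// equality-based label selector semantics. An unparsable selector or an
// unknown operator never matches.
func selectorsMatchLabels(labels map[string]string, selectors []string) bool {
	for _, selector := range selectors {
		m := K8sSelectorRegexp.FindStringSubmatch(selector)
		if m == nil {
			return false
		}
		key, op, value := m[1], m[2], m[3]
		switch op {
		case "=", "==":
			if labels[key] != value {
				return false
			}
		case "!=":
			if labels[key] == value {
				return false
			}
		default:
			return false
		}
	}
	return true
}

func main() {
	labels := map[string]string{"cnpg.io/cluster": "cluster-example", "cnpg.io/instanceRole": "primary"}
	fmt.Println(selectorsMatchLabels(labels, []string{"cnpg.io/cluster=cluster-example", "cnpg.io/instanceRole=primary"})) // true
}
```

The inverted name makes the AND-across-selectors contract read naturally at the call site, at the cost of negating it where a mismatch check is wanted.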
Successor of #503
When the collector is deployed to Kubernetes using the Helm chart, set `create: true` under `service` in the values.yaml so that the service for the OTel server will be created.