
Add more attributes extracted for traces #438

Merged · 10 commits · Mar 3, 2020
54 changes: 30 additions & 24 deletions deploy/helm/sumologic/conf/traces/traces.otelcol.conf.yaml
@@ -4,41 +4,47 @@ receivers:
     protocols:
       grpc:
   zipkin:
-    endpoint: "0.0.0.0:9411"
+    endpoint: {{ .Values.otelcol.receivers.zipkin.endpoint }}
 processors:
   k8s_tagger:
-    passthrough: false
+    passthrough: {{ .Values.otelcol.processors.k8s_tagger.passthrough }}
     extract:
       metadata:
         # extract the following well-known metadata fields
-        - podName
-        - deployment
-        - cluster
-        - namespace
-        - node
-        - startTime
+        {{- range .Values.otelcol.processors.k8s_tagger.extract.metadata }}
+        - {{ . | quote }}
+        {{- end }}
+      tags:
+        {{- range $key, $val := .Values.otelcol.processors.k8s_tagger.extract.tags }}
+        {{ $key }}: {{ $val | quote }}
+        {{- end }}
+      annotations:
+        {{- range .Values.otelcol.processors.k8s_tagger.extract.annotations }}
+        - tag_name: {{ .tag_name | quote }}
+          key: {{ .key | quote }}
+        {{- end }}
+      labels:
+        {{- range .Values.otelcol.processors.k8s_tagger.extract.labels }}
+        - tag_name: {{ .tag_name | quote }}
+          key: {{ .key | quote }}
+        {{- end }}
   queued_retry:
-    num_workers: 16
-    queue_size: 10000
-    retry_on_failure: true
+    num_workers: {{ .Values.otelcol.processors.queued_retry.num_workers }}
+    queue_size: {{ .Values.otelcol.processors.queued_retry.queue_size }}
+    retry_on_failure: {{ .Values.otelcol.processors.queued_retry.retry_on_failure }}
   batch:
-    send_batch_size: 1024
-    timeout: 5s
+    send_batch_size: {{ .Values.otelcol.processors.batch.send_batch_size }}
+    timeout: {{ .Values.otelcol.processors.batch.timeout }}
Contributor Author:
@perk-sumo The challenge here is that there are a number of processors: https://github.com/open-telemetry/opentelemetry-collector/tree/master/processor - the list is actually growing, and their properties may change over time.

An ideal solution would be to provide a template with the full config file, but we are not sure how to tackle it. Do you have any suggestions?

Contributor:
As I think about it more and more, we should just take the full config and put it into a ConfigMap. If users have their own processors, they are going to change the values or config file anyway. Additionally, we want to support --set flags.

@perk-sumo any thoughts?
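A minimal sketch of that idea, with assumed file and key names (`templates/otelcol-configmap.yaml` and `otelcol.rawConfig` are inventions for illustration, not part of this PR): render the whole otelcol config into a ConfigMap, falling back to the templated default when the user supplies nothing.

```yaml
# Hypothetical templates/otelcol-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ template "sumologic.fullname" . }}-otelcol
data:
  traces.otelcol.conf.yaml: |
{{- if .Values.otelcol.rawConfig }}
{{ .Values.otelcol.rawConfig | indent 4 }}
{{- else }}
{{ tpl (.Files.Get "conf/traces/traces.otelcol.conf.yaml") . | indent 4 }}
{{- end }}
```

A user could then override the full config via `--set-file otelcol.rawConfig=my-config.yaml` while everyone else keeps the templated defaults.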

Contributor:

Not the nicest proposal, but I had to support a dynamic value for the URL :D

bb9c9b0

Contributor:

I'd like to keep some defaults for clients who don't want to deal with the config that much.
We could do it like fluent-bit: provide sane defaults in a config.yaml file with some flags in values.yaml, and if someone wants a totally different config, provide a special key in values.yaml for that.
We could also provide the template config, but I'm not sure whether that should go into values.yaml or into a new file like values.otelcol.yaml or something. With that, clients would need to copy and paste something into values.yaml by hand.
I don't like the idea of the whole otelcol config inside the default values.yaml, because it would grow really huge, and that doesn't seem right.

Please take a look at:
https://github.com/helm/charts/blob/master/stable/fluent-bit/values.yaml#L146
https://github.com/helm/charts/blob/master/stable/fluent-bit/templates/config.yaml#L1

It's similar to what @sumo-drosiek did.
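For concreteness, the fluent-bit pattern reduces to something like the sketch below (the `rawConfig` key is hypothetical, not part of this PR): sane defaults driven by a few flags, plus one escape-hatch key that replaces the generated config wholesale.

```yaml
# Hypothetical values.yaml fragment mirroring the fluent-bit approach:
otelcol:
  # Flags consumed by the default config template:
  processors:
    batch:
      send_batch_size: 1024
      timeout: 5s
  # If non-empty, this string replaces the entire generated otelcol config:
  rawConfig: ""
```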

Contributor:

> We could do it like fluent-bit

I think our case differs: we are able to use YAML directly, while fluent-bit was obligated to convert the YAML configuration into the fluent-bit-specific format.
I would like to stick to values.yaml. It's clearer, and in both cases values.yaml will keep growing.

 extensions:
   health_check: {}
 exporters:
   zipkin:
-    url: http://{{ template "sumologic.fullname" . }}:9411/api/v2/spans
+    url: {{ if .Values.otelcol.exporters.zipkin.url }} {{ .Values.otelcol.exporters.zipkin.url }} {{ else }} http://{{ template "sumologic.fullname" . }}:9411/api/v2/spans {{ end }}
   # logging:
   #   loglevel: debug
 service:
-  extensions: [health_check]
+  extensions: {{ include "otelcol.generate_list" .Values.otelcol.extensions | nindent 5 }}
   pipelines:
-    traces/1:
-      receivers: [jaeger, zipkin, opencensus]
-      processors: [batch, queued_retry, k8s_tagger]
-      exporters: [zipkin]
-    traces/2:
-      receivers: [jaeger, zipkin, opencensus]
-      processors: [batch, queued_retry, k8s_tagger]
-      exporters: [zipkin]
+    traces:
+      receivers: {{ include "otelcol.generate_list" .Values.otelcol.receivers | nindent 10 }}
+      processors: {{ include "otelcol.generate_list" .Values.otelcol.processors | nindent 10 }}
+      exporters: {{ include "otelcol.generate_list" .Values.otelcol.exporters | nindent 10 }}
32 changes: 32 additions & 0 deletions deploy/helm/sumologic/templates/_helpers.tpl
@@ -35,3 +35,35 @@ release: "{{ .Release.Name }}"
 heritage: "{{ .Release.Service }}"
 {{- end -}}
 {{- end -}}
+
+{{/*
+Generate a list of extensions/receivers/etc. from a dictionary.
+
+Example input:
+```
+extensions:
+  extension_a:
+    enabled: true
+  extension_b:
+    enabled: false
+```
+
+Usage: include "otelcol.generate_list" .Values.otelcol.extensions
+
+Expected output:
+```
+- "extension_a"
+```
+*/}}
+{{- define "otelcol.generate_list" -}}
+{{- $empty_list := true }}
+{{- range $key, $val := . }}
+{{- if $val.enabled }}
+{{- if $empty_list }}
+{{- $empty_list = false }}
+{{- end }}
+- {{ $key | quote }}
+{{- end }}
+{{- end }}
+{{- if $empty_list }} [] {{ end }}
+{{- end -}}
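As a quick illustration (values invented for this example), the helper emits a quoted YAML list of the enabled keys, in key order, or `[]` when nothing is enabled:

```yaml
# Given these hypothetical values:
# otelcol:
#   receivers:
#     jaeger:
#       enabled: true
#     opencensus:
#       enabled: false
#     zipkin:
#       enabled: true
#
# include "otelcol.generate_list" .Values.otelcol.receivers renders to:
- "jaeger"
- "zipkin"
```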
77 changes: 75 additions & 2 deletions deploy/helm/sumologic/values.yaml
@@ -42,8 +42,8 @@ otelcolDeployment:
   # Memory Ballast size should be max 1/3 to 1/2 of memory.
   memBallastSizeMib: "683"
   image:
-    name: "omnition/opentelemetry-collector-contrib"
-    tag: "0.0.5"
+    name: "sumologic/opentelemetry-collector"
+    tag: "0.0.0"
Contributor:
@pmaciolek could you update?

Contributor:
That's a really good point - please use a tagged otelcol release.

     pullPolicy: IfNotPresent
 
 sumologic:
@@ -508,3 +508,76 @@ falco:
   # enabled: true
   falco:
     jsonOutput: true
+
+otelcol:
+  processors:
+    k8s_tagger:
+      enabled: true
+      passthrough: false
+      extract:
+        metadata:
+          # extract the following well-known metadata fields
+          - containerId
+          - containerName
+          - containerImage
+          - cluster
+          - daemonSetName
+          - deployment
+          - hostName
+          - namespace
+          - namespaceId
+          - node
+          - owners
+          - podId
+          - podName
+          - replicaSetName
+          - serviceName
+          - startTime
+          - statefulSetName
+        tags:
+          containerId: container_id
+          containerName: container_name
+          containerImage: container_image
+          cluster: cluster
+          daemonSetName: daemonset
+          deployment: deployment
+          hostName: hostname
+          namespace: namespace
+          namespaceId: namespace_id
+          node: node
+          podId: pod_id
+          podName: pod
+          replicaSetName: replicaset
+          serviceName: service_name
+          startTime: start_time
+          statefulSetName: statefulset
+        annotations:
+          - tag_name: pod_annotation_%s
+            key: "*"
+        labels:
+          - tag_name: pod_label_%s
+            key: "*"
+    batch:
+      enabled: true
+      send_batch_size: 1024
+      timeout: 5s
+    queued_retry:
+      enabled: true
+      num_workers: 16
+      queue_size: 10000
+      retry_on_failure: true
+  receivers:
+    jaeger:
+      enabled: true
+    zipkin:
+      enabled: true
+      endpoint: "0.0.0.0:9411"
+    opencensus:
+      enabled: true
+  extensions:
+    health_check:
+      enabled: true
+  exporters:
+    zipkin:
+      enabled: true
+      url: null
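With these defaults in place, a user only overrides what differs. For example (values invented for illustration), disabling the opencensus receiver and pointing the zipkin exporter at an external endpoint:

```yaml
# Hypothetical user override (my-values.yaml), applied with
# `helm upgrade --install <release> <chart> -f my-values.yaml`
# or the equivalent --set flags:
otelcol:
  receivers:
    opencensus:
      enabled: false
  exporters:
    zipkin:
      url: http://my-zipkin.example.com:9411/api/v2/spans
```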