
Add support for data persistence #351

Merged — 13 commits, Jan 29, 2020
2 changes: 1 addition & 1 deletion deploy/helm/sumologic/conf/buffer.output.conf
@@ -2,4 +2,4 @@ compress gzip
flush_interval "#{ENV['FLUSH_INTERVAL']}"
flush_thread_count "#{ENV['NUM_THREADS']}"
chunk_limit_size "#{ENV['CHUNK_LIMIT_SIZE']}"
Contributor: Should we increase `chunk_limit_size` to 1 MB, knowing that the chunks will no longer be compressed?

Contributor Author: We still compress for the memory buffer, and for the file buffer when persistence is disabled; it might be really tricky to have two defaults depending on the buffer type.

Contributor: OK, I'm fine with keeping this for now. Ideally we figure out how to continue using gzip, since these metrics will compress heavily (repeated dimensions + metadata).
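For reference, re-enabling gzip on the file buffer would be a one-line change in the shared buffer config. A sketch (the surrounding keys come from the existing buffer.output.conf; the `compress gzip` line is the addition under discussion, and the `path` is illustrative):

```
<buffer>
  @type file
  path /fluentd/buffer/metrics
  compress gzip
  flush_interval "#{ENV['FLUSH_INTERVAL']}"
  flush_thread_count "#{ENV['NUM_THREADS']}"
  chunk_limit_size "#{ENV['CHUNK_LIMIT_SIZE']}"
</buffer>
```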

Contributor: Could this be the issue we're seeing? fluent/fluentd#2619

Have we tried upgrading our fluentd image from v1.6.3 to v1.8.1 to see if it resolves the issue?

Contributor Author: Will try that.

Contributor Author: Yep, v1.8.1 fixed the issue! Will revert the compression changes.

total_limit_size "#{ENV['TOTAL_LIMIT_SIZE']}"
overflow_action drop_oldest_chunk
3 changes: 2 additions & 1 deletion deploy/helm/sumologic/conf/events/events.conf
@@ -35,11 +35,12 @@
verify_ssl "#{ENV['VERIFY_SSL']}"
proxy_uri "#{ENV['PROXY_URI']}"
<buffer>
{{- if eq .Values.sumologic.fluentd.buffer "file" }}
{{- if or .Values.sumologic.fluentd.persistence.enabled (eq .Values.sumologic.fluentd.buffer "file") }}
@type file
path /fluentd/buffer/events
{{- else }}
@type memory
total_limit_size "#{ENV['TOTAL_LIMIT_SIZE']}"
{{- end }}
@include buffer.output.conf
</buffer>
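With `sumologic.fluentd.persistence.enabled=true`, the conditional above always renders a file buffer regardless of the `sumologic.fluentd.buffer` setting. Roughly, after Helm templating (a sketch):

```
<buffer>
  @type file
  path /fluentd/buffer/events
  @include buffer.output.conf
</buffer>
```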
3 changes: 2 additions & 1 deletion deploy/helm/sumologic/conf/logs/logs.source.containers.conf
@@ -56,11 +56,12 @@
@id sumologic.endpoint.logs
@include logs.output.conf
<buffer>
{{- if eq .Values.sumologic.fluentd.buffer "file" }}
{{- if or .Values.sumologic.fluentd.persistence.enabled (eq .Values.sumologic.fluentd.buffer "file") }}
@type file
path /fluentd/buffer/logs.containers
{{- else }}
@type memory
total_limit_size "#{ENV['TOTAL_LIMIT_SIZE']}"
{{- end }}
@include buffer.output.conf
</buffer>
6 changes: 4 additions & 2 deletions deploy/helm/sumologic/conf/logs/logs.source.systemd.conf
@@ -18,11 +18,12 @@
@id sumologic.endpoint.logs.kubelet
@include logs.output.conf
<buffer>
{{- if eq .Values.sumologic.fluentd.buffer "file" }}
{{- if or .Values.sumologic.fluentd.persistence.enabled (eq .Values.sumologic.fluentd.buffer "file") }}
@type file
path /fluentd/buffer/logs.kubelet
{{- else }}
@type memory
total_limit_size "#{ENV['TOTAL_LIMIT_SIZE']}"
{{- end }}
@include buffer.output.conf
</buffer>
@@ -53,11 +54,12 @@
@id sumologic.endpoint.logs.systemd
@include logs.output.conf
<buffer>
{{- if eq .Values.sumologic.fluentd.buffer "file" }}
{{- if or .Values.sumologic.fluentd.persistence.enabled (eq .Values.sumologic.fluentd.buffer "file") }}
@type file
path /fluentd/buffer/logs.systemd
{{- else }}
@type memory
total_limit_size "#{ENV['TOTAL_LIMIT_SIZE']}"
{{- end }}
@include buffer.output.conf
</buffer>
24 changes: 16 additions & 8 deletions deploy/helm/sumologic/conf/metrics/metrics.conf
@@ -23,11 +23,12 @@
endpoint "#{ENV['SUMO_ENDPOINT_METRICS_APISERVER']}"
@include metrics.output.conf
<buffer>
{{- if eq .Values.sumologic.fluentd.buffer "file" }}
{{- if or .Values.sumologic.fluentd.persistence.enabled (eq .Values.sumologic.fluentd.buffer "file") }}
@type file
path /fluentd/buffer/metrics.apiserver
{{- else }}
@type memory
total_limit_size "#{ENV['TOTAL_LIMIT_SIZE']}"
{{- end }}
@include buffer.output.conf
</buffer>
@@ -39,11 +40,12 @@
endpoint "#{ENV['SUMO_ENDPOINT_METRICS_KUBELET']}"
@include metrics.output.conf
<buffer>
{{- if eq .Values.sumologic.fluentd.buffer "file" }}
{{- if or .Values.sumologic.fluentd.persistence.enabled (eq .Values.sumologic.fluentd.buffer "file") }}
@type file
path /fluentd/buffer/metrics.kubelet
{{- else }}
@type memory
total_limit_size "#{ENV['TOTAL_LIMIT_SIZE']}"
{{- end }}
@include buffer.output.conf
</buffer>
@@ -54,11 +56,12 @@
endpoint "#{ENV['SUMO_ENDPOINT_METRICS_KUBELET']}"
@include metrics.output.conf
<buffer>
{{- if eq .Values.sumologic.fluentd.buffer "file" }}
{{- if or .Values.sumologic.fluentd.persistence.enabled (eq .Values.sumologic.fluentd.buffer "file") }}
@type file
path /fluentd/buffer/metrics.container
{{- else }}
@type memory
total_limit_size "#{ENV['TOTAL_LIMIT_SIZE']}"
{{- end }}
@include buffer.output.conf
</buffer>
@@ -69,11 +72,12 @@
endpoint "#{ENV['SUMO_ENDPOINT_METRICS_KUBE_CONTROLLER_MANAGER']}"
@include metrics.output.conf
<buffer>
{{- if eq .Values.sumologic.fluentd.buffer "file" }}
{{- if or .Values.sumologic.fluentd.persistence.enabled (eq .Values.sumologic.fluentd.buffer "file") }}
@type file
path /fluentd/buffer/metrics.controller
{{- else }}
@type memory
total_limit_size "#{ENV['TOTAL_LIMIT_SIZE']}"
{{- end }}
@include buffer.output.conf
</buffer>
@@ -84,11 +88,12 @@
endpoint "#{ENV['SUMO_ENDPOINT_METRICS_KUBE_SCHEDULER']}"
@include metrics.output.conf
<buffer>
{{- if eq .Values.sumologic.fluentd.buffer "file" }}
{{- if or .Values.sumologic.fluentd.persistence.enabled (eq .Values.sumologic.fluentd.buffer "file") }}
@type file
path /fluentd/buffer/metrics.scheduler
{{- else }}
@type memory
total_limit_size "#{ENV['TOTAL_LIMIT_SIZE']}"
{{- end }}
@include buffer.output.conf
</buffer>
@@ -99,11 +104,12 @@
endpoint "#{ENV['SUMO_ENDPOINT_METRICS_KUBE_STATE']}"
@include metrics.output.conf
<buffer>
{{- if eq .Values.sumologic.fluentd.buffer "file" }}
{{- if or .Values.sumologic.fluentd.persistence.enabled (eq .Values.sumologic.fluentd.buffer "file") }}
@type file
path /fluentd/buffer/metrics.state
{{- else }}
@type memory
total_limit_size "#{ENV['TOTAL_LIMIT_SIZE']}"
{{- end }}
@include buffer.output.conf
</buffer>
@@ -114,11 +120,12 @@
endpoint "#{ENV['SUMO_ENDPOINT_METRICS_NODE_EXPORTER']}"
@include metrics.output.conf
<buffer>
{{- if eq .Values.sumologic.fluentd.buffer "file" }}
{{- if or .Values.sumologic.fluentd.persistence.enabled (eq .Values.sumologic.fluentd.buffer "file") }}
@type file
path /fluentd/buffer/metrics.node
{{- else }}
@type memory
total_limit_size "#{ENV['TOTAL_LIMIT_SIZE']}"
{{- end }}
@include buffer.output.conf
</buffer>
@@ -129,11 +136,12 @@
endpoint "#{ENV['SUMO_ENDPOINT_METRICS']}"
@include metrics.output.conf
<buffer>
{{- if eq .Values.sumologic.fluentd.buffer "file" }}
{{- if or .Values.sumologic.fluentd.persistence.enabled (eq .Values.sumologic.fluentd.buffer "file") }}
@type file
path /fluentd/buffer/metrics.all
{{- else }}
@type memory
total_limit_size "#{ENV['TOTAL_LIMIT_SIZE']}"
{{- end }}
@include buffer.output.conf
</buffer>
@@ -1,6 +1,6 @@
{{- if eq .Values.sumologic.eventCollectionEnabled true }}
apiVersion: apps/v1
kind: Deployment
kind: StatefulSet
metadata:
name: {{ printf "%s-events" (include "sumologic.fullname" .) }}
labels:
@@ -17,13 +17,13 @@ spec:
{{- include "sumologic.labels.common" . | nindent 8 }}
spec:
serviceAccountName: {{ template "sumologic.fullname" . }}
{{- if .Values.eventsDeployment.nodeSelector }}
{{- if .Values.eventsStatefulset.nodeSelector }}
nodeSelector:
{{ toYaml .Values.eventsDeployment.nodeSelector | indent 8 }}
{{ toYaml .Values.eventsStatefulset.nodeSelector | indent 8 }}
{{- end }}
{{- if .Values.eventsDeployment.tolerations }}
{{- if .Values.eventsStatefulset.tolerations }}
tolerations:
{{ toYaml .Values.eventsDeployment.tolerations | indent 8 }}
{{ toYaml .Values.eventsStatefulset.tolerations | indent 8 }}
{{- end }}
volumes:
- name: pos-files
@@ -33,17 +33,23 @@ spec:
- name: config-volume
configMap:
name: {{ printf "%s-events" (include "sumologic.fullname" .) }}
securityContext:
fsGroup: {{ .Values.sumologic.fluentd.securityContext.fsGroup }}
containers:
- name: fluentd-events
image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
resources:
{{- toYaml .Values.eventsDeployment.resources | nindent 10 }}
{{- toYaml .Values.eventsStatefulset.resources | nindent 10 }}
volumeMounts:
- name: config-volume
mountPath: /fluentd/etc/
- name: pos-files
mountPath: /mnt/pos/
{{- if .Values.sumologic.fluentd.persistence.enabled }}
- name: buffer
mountPath: "/fluentd/buffer"
{{- end}}
livenessProbe:
httpGet:
path: /fluentd.pod.healthcheck?json=%7B%22log%22%3A+%22health+check%22%7D
@@ -73,4 +79,15 @@ spec:
value: {{ .Values.sumologic.chunkLimitSize | quote }}
- name: TOTAL_LIMIT_SIZE
value: {{ .Values.sumologic.totalLimitSize | quote }}
{{- end }}
{{- if .Values.sumologic.fluentd.persistence.enabled }}
volumeClaimTemplates:
- metadata:
name: buffer
spec:
accessModes: [{{ .Values.sumologic.fluentd.persistence.accessMode }}]
storageClassName: {{ .Values.sumologic.fluentd.persistence.storageClass }}
resources:
requests:
storage: {{ .Values.sumologic.fluentd.persistence.size }}
{{- end }}
{{- end }}
@@ -1,5 +1,5 @@
apiVersion: apps/v1
kind: Deployment
kind: StatefulSet
metadata:
name: {{ template "sumologic.fullname" . }}
labels:
@@ -9,21 +9,21 @@ spec:
selector:
matchLabels:
app: {{ template "sumologic.labels.app" . }}
replicas: {{ .Values.deployment.replicaCount }}
replicas: {{ .Values.statefulset.replicaCount }}
template:
metadata:
labels:
app: {{ template "sumologic.labels.app" . }}
{{- include "sumologic.labels.common" . | nindent 8 }}
spec:
serviceAccountName: {{ template "sumologic.fullname" . }}
{{- if .Values.deployment.nodeSelector }}
{{- if .Values.statefulset.nodeSelector }}
nodeSelector:
{{ toYaml .Values.deployment.nodeSelector | indent 8 }}
{{ toYaml .Values.statefulset.nodeSelector | indent 8 }}
{{- end }}
{{- if .Values.deployment.tolerations }}
{{- if .Values.statefulset.tolerations }}
tolerations:
{{ toYaml .Values.deployment.tolerations | indent 8 }}
{{ toYaml .Values.statefulset.tolerations | indent 8 }}
{{- end }}
volumes:
- name: pos-files
@@ -33,12 +33,14 @@ spec:
- name: config-volume
configMap:
name: {{ template "sumologic.fullname" . }}
securityContext:
fsGroup: {{ .Values.sumologic.fluentd.securityContext.fsGroup }}
containers:
- name: fluentd
image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
resources:
{{- toYaml .Values.deployment.resources | nindent 10 }}
{{- toYaml .Values.statefulset.resources | nindent 10 }}
ports:
- name: prom-write
containerPort: 9888
@@ -64,6 +66,10 @@ spec:
mountPath: /fluentd/etc/
- name: pos-files
mountPath: /mnt/pos/
{{- if .Values.sumologic.fluentd.persistence.enabled }}
- name: buffer
mountPath: "/fluentd/buffer"
{{- end}}
env:
- name: SUMO_ENDPOINT_METRICS
valueFrom:
@@ -152,4 +158,15 @@ spec:
- name: EXCLUDE_HOST_REGEX
value: {{ .Values.sumologic.excludeHostRegex | quote }}
- name: EXCLUDE_POD_REGEX
value: {{ .Values.sumologic.excludePodRegex | quote }}
value: {{ .Values.sumologic.excludePodRegex | quote }}
{{- if .Values.sumologic.fluentd.persistence.enabled }}
volumeClaimTemplates:
- metadata:
name: buffer
spec:
accessModes: [{{ .Values.sumologic.fluentd.persistence.accessMode }}]
storageClassName: {{ .Values.sumologic.fluentd.persistence.storageClass }}
resources:
requests:
storage: {{ .Values.sumologic.fluentd.persistence.size }}
{{- end }}
23 changes: 21 additions & 2 deletions deploy/helm/sumologic/values.yaml
@@ -5,7 +5,7 @@ image:

nameOverride: ""

deployment:
statefulset:
nodeSelector: {}
tolerations: {}
replicaCount: 3
@@ -17,7 +17,7 @@ deployment:
memory: 768Mi
cpu: 0.5

eventsDeployment:
eventsStatefulset:
nodeSelector: {}
tolerations: {}
resources:
@@ -173,6 +173,25 @@ sumologic:
maxReplicas: 10
targetCPUUtilizationPercentage: 50

securityContext:
## The group ID of all processes in the statefulset containers. By default this needs to be fluent(999).
fsGroup: 999

## Persist data to a persistent volume; When enabled, fluentd uses the file buffer instead of memory buffer.
persistence:
enabled: false

## If defined, storageClassName: <storageClass>
## If set to "-", storageClassName: "", which disables dynamic provisioning
## If undefined (the default) or set to null, no storageClassName spec is
## set, choosing the default provisioner. (gp2 on AWS, standard on
## GKE, Azure & OpenStack)
##
# storageClass: "-"
# annotations: {}
accessMode: ReadWriteOnce
size: 10Gi
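To opt in, a values override along these lines could be used (a sketch based on the defaults above; `gp2` is an assumed AWS storage class — adjust for your cluster, or omit `storageClass` to use the default provisioner):

```
sumologic:
  fluentd:
    persistence:
      enabled: true
      # storageClass: gp2   # set "-" to disable dynamic provisioning
      accessMode: ReadWriteOnce
      size: 10Gi
```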

k8sMetadataFilter:
## Option to control the enabling of metadata filter plugin watch.
## ref: https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter#configuration