New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add fluentd support #189
Add fluentd support #189
Conversation
fb4506c
to
353d18d
Compare
// Tolerations | ||
Tolerations []corev1.Toleration `json:"tolerations,omitempty"` | ||
// RuntimeClassName represents the container runtime configuration. | ||
RuntimeClassName string `json:"runtimeClassName,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do fluentd
images have different configurations when different container runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to TODO list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it's a log aggregator, not a log agent. So it only receives logs via network.
|
||
// FluentdStatus defines the observed state of Fluentd | ||
type FluentdStatus struct { | ||
Errors string `json:"errs,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe errors
is better
bfb0ac1
to
b51bda2
Compare
PROJECT
Outdated
version: v1alpha2 | ||
- api: | ||
crdVersion: v1 | ||
namespaced: true | ||
domain: kubesphere.io | ||
group: logging | ||
kind: Parser | ||
path: kubesphere.io/fluentbit-operator/api/v1alpha2 | ||
path: kkubesphere.io/fluentbit-operator/api/fluentbitoperator/v1alpha2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modified to kubesphere.io
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is kubesphere.io
? Shouldn't it be fluent.io
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see @wenchajun make this change in another pr :) /cc @wenchajun
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is fluentd's domain also kubesphere.io, if so, it may need to be changed as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will fix
// Select output plugins | ||
OutputSelector metav1.LabelSelector `json:"outputSelector,omitempty"` | ||
// // The generated configuration style can be label/tag, default will use tag | ||
// // +kubebuilder:validation:Enum:=label;tag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clean up these
@@ -0,0 +1,60 @@ | |||
/* | |||
Copyright 2021. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2022
The crd version of the new fluent bit is |
I think no need to. Since the two operators can be deployed by themselves, there is no strong attachment. And this is the first version it published. What do you think @benjaminhuo @wenchajun |
Please change the director |
OK |
We'd better combine operator of fluentbit and fluentd together into one single deployment/image with fluentbit/fluentd crds/controllers in it and then rename the project to fluent-operator |
I will combine the two operators into one |
// Sticky tags will match only one record from an event stream. The same tag will be treated the same way. | ||
// will make no effect if EnableFilterKubernetes is set false. | ||
StickyTags string `json:"stickyTags,omitempty"` | ||
// Comma separated list of hosts. Ignored if left empty. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hosts => namespaces
// will make no effect if EnableFilterKubernetes is set false. | ||
StickyTags string `json:"stickyTags,omitempty"` | ||
// Comma separated list of hosts. Ignored if left empty. | ||
WatchedNamespaces string `json:"watchedNamespaces,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it better to use []string for WatchedNamespaces, WatchedHosts, WatchedContainers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah.
// Fluentd Watcher command line arguments. | ||
Args []string `json:"args,omitempty"` | ||
// MatchCfgSelector defines the selectors to select the fluentd config CRs. | ||
MatchCfgSelector metav1.LabelSelector `json:"matchCfgSelector,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call MatchCfgSelector
FluentdConfigSelector
is better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether FluentdConfigSelector is a bit confusing since there also has a ClusterFluentdConfig.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FluentdCfgSelector?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems better.
} | ||
|
||
func (pgr *PluginResources) CombineGlobalInputsPlugins(sl plugins.SecretLoader, inputs []input.Input) []string { | ||
errs := make([]string, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the []input are sorted to make sure the config generated are the same if no changes are made for every reconcile?
The same concerns applies to outputs/filters/configs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will fix it.
// PatchAndFilterClusterLevelResources will combine and patch all the cluster CRs that the fluentdconfig selected, | ||
// convert the related filter/output pluginstores to the global pluginresources. | ||
func (pgr *PluginResources) PatchAndFilterClusterLevelResources(sl plugins.SecretLoader, cfgId string, | ||
clusterfilters []ClusterFilter, clusteroutputs []ClusterOutput) (*CfgResources, []string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clusterfilters/clusteroutputs/filtersList/outputsList all should be sorted using the same standard to make sure the config generated are the same for every reconcile.
filtersList/outputsList => filters/outputs ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah.
|
||
# Use distroless as minimal base image to package the manager binary | ||
# Refer to https://github.com/GoogleContainerTools/distroless for more details | ||
FROM gcr.io/distroless/static:nonroot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd better sync gcr distroless to https://hub.docker.com/repository/docker/kubesphere/distroless-static and use that instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's replace it when the dockerhub address supports both arm/amd arches in one manifest. Because we use buildx to push image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
This is a great enhancement to the upcoming fluent-operator, thanks @zhu733756 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think pretty good initial commit but I think documentation and testing need a fair bit of work. However, this could be done with future PRs.
We also need to consider how to integrate into Github actions well but also allow local development.
My concern is particularly the CRDs are not well spec'd so hard to use - testing will both stabilise them but also highlight issues and usage.
CRD upgrade is hard so we want to avoid it as much as possible and get them good to start with rather than evolve when people have them deployed.
kubectl explain ...
should give a user enough information to fill in their CRD values, having to refer to somewhere else for simple questions is not good. It is also something that shows up in editors, etc. so really very helpful to have good context-sensitive documentation.
@@ -1,7 +1,9 @@ | |||
VERSION?=$(shell cat VERSION | tr -d " \t\n\r") | |||
# Image URL to use all building/pushing image targets | |||
FB_IMG ?= kubesphere/fluent-bit:v1.8.3 | |||
OP_IMG ?= kubesphere/fluentbit-operator:$(VERSION) | |||
FD_IMG ?= kubesphere/fluentd:v1.14.4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably need to sort these to use the fluent organisation ones and likely latest too for dev but with tagged versions for release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we'll change this eventually maybe after we finish adjusting the CI
go run cmd/manager/main.go | ||
go run cmd/fluent-manager/main.go | ||
|
||
# Build amd64/arm64 Fluent Operator container image |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't remember if there was a specific reason we could not support arm32 or just that it is uncommon for K8S? That may change though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have few user requirements for arm32 version actually, but we can provide that if there is such requirement in the fluent community
|
||
# Build amd64 Fluent Operator container image | ||
build-op-amd64: | ||
docker build -f cmd/fluent-manager/Dockerfile . -t ${FO_IMG}${AMD64} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bit of a mix of BuildKit and classic docker, although this depends on whether DOCKER_BUILDKIT is set as well. Is the intention just to do a local build of the operator images then push later?
Why do we build & push the others rather than offer the same? Just because they're more stable or would we want to use locally too during dev?
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN! | ||
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized. | ||
|
||
// FilterSpec defines the desired state of Filter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably a general comment but I think this needs a lot more documentation in the comments with examples, defaults, etc. well covered for every field and structure. It should explain usage rather than just being a simple data dictionary, if there are any rules or exclusions to consider they should be covered. Linkage to in-depth documentation and examples as well.
I'm assuming this all feeds into the CRD as well so we definitely want that being very well documented. The goal should be that kubectl explain ...
gives the user enough information to actually use the CRD rather than having to reference out to other documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, we need better comments
// +kubebuilder:object:root=true | ||
// +genclient | ||
|
||
// Filter defines a Filter configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the kind of comment that should have a linkage to Fluent documentation explaining what filters are but also expanding here to provide a bit more detail to help users of kubectl explain ...
. Are there any things to watch out for, etc.?
} | ||
|
||
type FilterItem struct { | ||
// Grep defines Grep Filter configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments are pretty self-explanatory, they should link out at the very least to reference documentation I think. Currently they do not provide much help to the user, the name tells as much as the comment.
FilterItems []FilterItem `json:"filters,omitempty"` | ||
} | ||
|
||
type FilterItem struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially you can define a FilterItem with multiple fields in it, what happens then? I think the intent is this is similar to a union in that you should only have one of the internal fields set - is that right? Having a single item of multiple filters seems strange if not. Either way I think it needs more documentation explaining what it is and how to create/use it.
Match string `json:"match,omitempty"` | ||
// A regular expression to match against the tags of incoming records. | ||
// Use this option if you want to use the full regex syntax. | ||
MatchRegex string `json:"matchRegex,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Match
and MatchRegex
are mutually exclusive but also required - you must have one and only one? Again, needs a bit more documentation here I think. Is this the right type structure to show that? They're both strings so you could have a regex as well in either so would it be better to have a specific type that is one or the other?
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" | ||
) | ||
|
||
var expected = `[Service] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably want to expand this a bit to some data driven tests with the config provided via files or similar. We need to exercise a few different combinations and failure cases I think.
@@ -0,0 +1,76 @@ | |||
# Fluentd watcher agent | |||
FROM golang:1.13.6-alpine3.11 as buildergo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably need to update Golang version here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will fix
@patrick-stephens Good suggestions! We would add multi images/binaries build, enhance docs explanations, and e2e tests to the schedule marked by an issue. Someone would like to contribute it. |
4df73e1
to
3451282
Compare
@@ -1,94 +0,0 @@ | |||
name: Build-PullRequest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems useless, removed. Will add it later.
Hi folks, come to #190 to implement the to-do list. |
That's good. Please also take a look at @patrick-stephens‘ comments and add todo items into #190 if necessary |
metadata: | ||
annotations: | ||
cert-manager.io/inject-ca-from: $(CERTIFICATE_NAMESPACE)/$(CERTIFICATE_NAME) | ||
name: clusterfluentdconfigs.kubesphere.io |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clusterfluentdconfigs.kubesphere.io
should be clusterfluentdconfigs.fluentd.fluent.io
?
metadata: | ||
annotations: | ||
cert-manager.io/inject-ca-from: $(CERTIFICATE_NAMESPACE)/$(CERTIFICATE_NAME) | ||
name: clusterfluentds.kubesphere.io |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have clusterfluentds
crd?
metadata: | ||
annotations: | ||
cert-manager.io/inject-ca-from: $(CERTIFICATE_NAMESPACE)/$(CERTIFICATE_NAME) | ||
name: clusterinputs.kubesphere.io |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have clusterinputs
crd?
apiVersion: apiextensions.k8s.io/v1 | ||
kind: CustomResourceDefinition | ||
metadata: | ||
name: clusterfilters.kubesphere.io |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check all crd/patches file if the domain is correct: clusterfilters.kubesphere.io
?
These files are auto-generated?
name: clusterfilter-editor-role | ||
rules: | ||
- apiGroups: | ||
- fluentbit.fluent.io |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fluentbit.fluent.io
doesn't have clusterfilters
now
@@ -0,0 +1,7 @@ | |||
apiVersion: kubesphere.io/v1alpha1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the apiVersion in config/samples are not correct
@@ -0,0 +1,7 @@ | |||
apiVersion: kubesphere.io/v1alpha1 | |||
kind: ClusterInput |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ClusterInput doesn't exist
labels: | ||
output.fluentd.fluent.io/enabled: "true" | ||
spec: | ||
outputs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we might need to consider if the outputs:
layer is necessary and if the type
field could be removed。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part will fix after I come back from the holidays. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I think is that the outputs
field is somehow duplicated with the CRD name, maybe we can change it to items
?
The same applies to filters
.
Your original idea of using an array to store outputs or filters is good, we can keep using the array.
Let's see if it's possible to deal with the type part, it's also ok to keep current implementation if we cannot find a better way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the type field combined into the plugin is reasonable, I will try to fix it.
labels: | ||
app.kubernetes.io/component: controller | ||
app.kubernetes.io/name: fluent-operator | ||
name: kubesphere:operator:fluent-operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kubesphere:operator:fluent-operator
=> fluent-operator
is ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix these done
labels: | ||
app.kubernetes.io/component: controller | ||
app.kubernetes.io/name: fluent-operator | ||
name: kubesphere:operator:fluent-operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kubesphere:operator:fluent-operator => fluent-operator
I think we can check out a branch And then the master branch is used for the development of fluent-operator under the new domain This way, we'll have fewer works to do on each branch and the directory structure will be simplified. |
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN! | ||
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized. | ||
|
||
// FilterSpec defines the desired state of Filter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, we need better comments
Signed-off-by: zhu733756 <zhu733756@kubesphere.io>
Signed-off-by: zhu733756 <zhu733756@kubesphere.io>
Signed-off-by: zhu733756 <zhu733756@kubesphere.io>
2a6ae15
to
6871e64
Compare
Yeah that seems a good idea to me, I would say nothing can go past beta until docs and tests are done. Ideally probably not past alpha as people tend to adopt even beta. |
Signed-off-by: zhu733756 <zhu733756@kubesphere.io>
@@ -211,6 +212,11 @@ spec: | |||
messages | |||
format: int64 | |||
type: integer | |||
multilineParser: | |||
description: This will help to reassembly multiline messages originally |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These CRDs copies from the generated CRDs with make manifest
Agree with the comments, a test fluentd config has been added there apis/fluentd/v1alpha1/tests/expected-main-cfgs.cfg |
After changing the domain from logging.kubesphere.io to fluent.io, The new directory structure looks better and simpler, thanks @zhu733756! |
Signed-off-by: zhu733756 <zhu733756@kubesphere.io>
d70f6f5
to
f44774c
Compare
Docs updated. Please help to review the new commit. If no other requested changes, we can push the official images at first, then continue working for the todos #190 |
The new commit looks good to me, thanks @zhu733756 |
OK. |
Merge this first, we'll continue to work on items in #190 in separate PRs |
Signed-off-by: zhu733756 zhu733756@kubesphere.io
This pr add Fluentd-operator to implement the former proposal #138.
Backgrouds
See the former proposal mentioned in #138.
This proposal brought the basic concept to answer how to add fluentd-operator as a forward log layer to collect logs from fluentbit or other apps.
See below:
During the log Aggregation & Forwarding Phase, the Fluentd Operator defines five custom resources using CustomResourceDefinition (CRD):
But we made some changes during actual development.
Namespaced fluentdconfig CR can select the namespaced CRs and the cluster CRs. Cluster fluentdconfig CR only selects clusterfilters or clusteroutputs.
Besides, you will see inputs sections combined into the fluentd CR named as
globalInputs
. Since theglobalInputs
is manily used for collecting the logs from fluentbit through forward plugin or other apps through http plugin. And we will dynamicallyadd the related service ports(forward/http, see demo below) according to this configuration.
Now two reconcilers are brought to finish this work.
Fluentd reconciler watches the related resources like statefulset/secrect/service/buffer PVC etc, it setups fluentd instances once a secret is generated by FluendConfig reconciler. FluendConfig reconciler watches the created fluentd CRs, picks out the fluentdCfgSelector to select the related resources, and combines to a configuration that would be mounted as a secret for this fluentd.
Another useful idea activated by the former work is that we support a hot config reloader for fluentbit-operator. So if the related CRs changed, the fluentd-watcher agent will gracefully reload the fluentd instance.
That's the whole change, the other idea is welcome and if I lost the ideas from the proposal, pls remind me.
The practice part
Since the output plugins are huge, I will use es for example to test the whole logic. And for the input layer, I directly use the forward from fluentbit to test the workflow.
1 Setup elasticsearch cluster
If you use Kubesphere Container platform, you can install it through ks-installer.
2 Setup fluent operator
3 Setup fluentbit logging-stack, deleting the fluentbit output plugin
4 Test the forward demo
Namespaced fluentdconfig example see follows, it only collects the namespaced logs.
CluserFluentdConfig exmaple see follows, it collects logs from the watched namespaces.
I will use the cluster scope CRs to test it. Manifests located at
manifests/forward
.After a while, a default buffer pvc(1G) bound to this statefulset, and a service named fluentd-forward created. You may disable pvc/service, change the volume type according the definition in fluentd CR.
5 apply fluentbit output forward
Forward example:
6 Check results
Todo
See #190
/cc @benjaminhuo @wanjunlei @wenchajun