Feedback wanted: deprecate our Loki FluentBit plugin in favor of native FluentBit output. #4648

Open
owen-d opened this issue Nov 4, 2021 · 24 comments
Labels
help wanted We would love help on these issues. Please come help us! keepalive An issue or PR that will be kept alive and never marked as stale. need-investigation type/question

Comments

@owen-d
Member

owen-d commented Nov 4, 2021

Hello! A long time ago we wrote a plugin for ingesting logs in Loki from fluentbit. These days, there's a native output option for Loki available in fluentbit itself courtesy of @edsiper. I'm opening this issue to solicit feedback from Loki users currently using either of these in hopeful preparation for deprecating our plugin in favor of theirs.

We hope the introduction of out of order support in Loki has helped make this feasible :)

cc @cyriltovena

@owen-d owen-d added type/question help wanted We would love help on these issues. Please come help us! keepalive An issue or PR that will be kept alive and never marked as stale. need-investigation labels Nov 4, 2021
@edsiper

edsiper commented Nov 20, 2021

Hi Loki users,

As part of the Fluent Bit team, we want to offer a first-class experience with Loki, and we would like to know which specific features are missing from our built-in connector.

Since the Golang connector will be deprecated, please let us know what is needed to prioritize on our side.

thanks.

@stevehipwell
Contributor

@owen-d I swapped over from the native Fluentd implementation to the native Fluent Bit implementation as soon as Loki v2.4 was released and I've been very happy with both parts.

@edsiper I think there are a few outstanding Loki issues on the Fluent Bit repo that need triaging against the latest versions? Off the top of my head the following areas need looking into, but I've not seen any of them since upgrading to the latest versions:

  • Missing logs in Loki
  • Loki push failures causing cascading errors

@cyriltovena
Contributor

@edsiper We have out of order now available, so we should be able to change the implementation of fluentbit to send batches in parallel.

@ScarletTanager

I would very much like to see this plugin continue to be supported. We prefer having a golang output plugin available, as our team is significantly deeper in Go than C skills, and in addition to supporting the code, we intend to make a couple of small, local modifications.

I understand (correct me if I'm mistaken @edsiper ) that there are a couple of areas in which the native plugin needs to be brought up to par (e.g. support for batch compression), and while those are definitely good things to have, their addition to the C plugin doesn't help our specific case.

So I would ask the Loki team to put off deprecating the golang plugin for now, if possible.

@patrick-stephens

I personally agree with deprecation: there is a fair bit of confusion with mismatches in configuration across the two plugins and so people follow a blog post/etc. for the Grafana one but use the Fluent one and then get failures. There is also the duplication of effort required: implement a feature in one then in the other (maybe slightly differently) plus the fragmentation of features. Having a single official plugin is much preferable for support, documentation, development and testing.

@patrick-stephens

@owen-d what was the outcome of this? Just curious if the plan is to deprecate or not - and when if so?

@sbocahu

sbocahu commented May 5, 2022

We first used the native Fluent Bit output and then had to switch to Loki's plugin, as sending to Loki started failing after a while (maybe after a disconnection or a small network interruption).

@krafcima

@edsiper Where is the custom labels functionality? The grafana/fluent-bit Loki output provides LabelMapPath.
It would be cool to use custom_label_map.json.

@krafcima

@edsiper Where is the custom labels functionality? The grafana/fluent-bit Loki output provides LabelMapPath.
It would be cool to use custom_label_map.json.

So, is it possible to implement it?

@edsiper

edsiper commented Aug 31, 2022

@nokute78 can you implement the LabelMapPath feature, please? Ref:

https://grafana.com/docs/loki/latest/clients/fluentbit/#labelmappath
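
For readers unfamiliar with the feature, the idea behind a label map can be sketched in a few lines of Python. This is only an illustration of the concept, not the code of either plugin: the function name `extract_labels`, the sample map, and the sample record are all hypothetical.

```python
import json


def extract_labels(record, label_map):
    """Walk the label map and the record together.

    A string leaf in the map names the Loki label that receives
    the record value found at the same position.
    """
    labels = {}
    for key, target in label_map.items():
        if key not in record:
            continue
        value = record[key]
        if isinstance(target, dict):
            # Nested map: descend only when the record is nested too.
            if isinstance(value, dict):
                labels.update(extract_labels(value, target))
        else:
            labels[target] = str(value)
    return labels


# Hypothetical contents of a custom_label_map.json file.
label_map = json.loads(
    '{"kubernetes": {"container_name": "container", "host": "node"},'
    ' "stream": "stream"}'
)

# Hypothetical log record as it might leave a Kubernetes filter.
record = {
    "log": "hello",
    "stream": "stdout",
    "kubernetes": {"container_name": "api", "host": "node-1", "pod_id": "abc"},
}

labels = extract_labels(record, label_map)
print(labels)
```

Keys that are absent from the map (here `log` and `pod_id`) are simply not turned into labels, which is the main appeal of a map over promoting every key.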

@nokute78

@edsiper I created a patch to support label_map_path fluent/fluent-bit#6040

@edsiper

edsiper commented Sep 15, 2022

awesome! thanks @nokute78 !

@krafcima

krafcima commented Sep 16, 2022

> @edsiper I created a patch to support label_map_path fluent/fluent-bit#6040

Awesome! Many thanks @nokute78 @edsiper !

@aleonsan

aleonsan commented Jul 14, 2023

Hello! It's been almost 2 years since the creation of this issue.
What is the situation now?
Is there a feature roadmap to fill the gaps, if any, between grafana-loki plugin and the Fluentbit's builtin Loki output?

AFAIK, these parameters are not supported by the builtin output:

| Parameter | Description | Default |
|---|---|---|
| BatchWait | Time to wait before send a log batch to Loki, full or not. | 1s |
| BatchSize | Log batch size to send a log batch to Loki (unit: Bytes). | 10 KiB (10 * 1024 Bytes) |
| Timeout | Maximum time to wait for loki server to respond to a request. | 10s |
| MinBackoff | Initial backoff time between retries. | 500ms |
| MaxBackoff | Maximum backoff time between retries. | 5m |
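
To make the MinBackoff and MaxBackoff rows concrete, here is a minimal sketch of a capped exponential backoff schedule. It assumes plain doubling between attempts with no jitter, which is a simplification chosen for illustration rather than a description of what the grafana-loki plugin does internally; the function name `backoff_schedule` is hypothetical.

```python
def backoff_schedule(min_backoff, max_backoff, max_retries):
    """Return the wait (in seconds) before each retry attempt.

    Assumption: the delay doubles after every attempt and is
    capped at max_backoff. Real clients often add random jitter.
    """
    delays = []
    delay = min_backoff
    for _ in range(max_retries):
        delays.append(min(delay, max_backoff))
        delay = min(delay * 2, max_backoff)
    return delays


# Defaults quoted in this comment: 500ms, 5m, and 10 retries.
schedule = backoff_schedule(0.5, 300.0, 10)
print(schedule)
print(sum(schedule))
```

Under these assumptions the cap only starts to matter from the eleventh attempt onward, since the tenth delay is still below five minutes.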

And some others could be achieved using other non loki output specific FBit output parameters:

e.g. #1

| Parameter | Description | Default |
|---|---|---|
| MaxRetries | Maximum number of retries when sending batches. Setting it to 0 will retry indefinitely. | 10 |

This could be more or less achieved using the Retry_Limit parameter.

e.g. #2

| Parameter | Description | Default |
|---|---|---|
| Buffer | Enable buffering mechanism | false |
| BufferType | Specify the buffering mechanism to use (currently only dque is implemented). | dque |
| DqueDir | Path to the directory for queued logs | /tmp/flb-storage/loki |
| DqueSegmentSize | Segment size in terms of number of records per segment | 500 |
| DqueSync | Whether to fsync each queue change. Specify no fsync with “normal”, and fsync with “full”. | “normal” |
| DqueName | Queue name, must be unique per output | dque |

Buffering could be implemented using Fluent Bit's own storage parameters and limiting the size.

Am I right?
Are there any other important differences between the 2 implementations?
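
As a rough illustration of what a BatchSize limit implies on the client side, the sketch below groups log lines into batches that stay under a byte budget. It deliberately ignores BatchWait (the time-based flush) and is not taken from either plugin; `split_into_batches` is a hypothetical helper name.

```python
def split_into_batches(lines, batch_size):
    """Group lines so each batch stays within batch_size bytes.

    A single line larger than batch_size still gets a batch of
    its own, because it cannot be split here.
    """
    batches = []
    current = []
    current_bytes = 0
    for line in lines:
        size = len(line.encode("utf-8"))
        # Flush the running batch before it would exceed the budget.
        if current and current_bytes + size > batch_size:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(line)
        current_bytes += size
    if current:
        batches.append(current)
    return batches


# Default BatchSize quoted above: 10 KiB.
lines = ["x" * 4000, "y" * 4000, "z" * 4000, "w" * 100]
batches = split_into_batches(lines, 10 * 1024)
print([len(batch) for batch in batches])
```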

@patrick-stephens

@aleonsan thanks for the good analysis - any chance you can raise an issue on the OSS repo to track the new features we may need?
https://github.com/fluent/fluent-bit

This is a Grafana repo so I cannot comment on their roadmap but from an OSS perspective I'd really like to make sure we have feature parity and a migration approach. We regularly get issues raised due to using the Grafana docs but the OSS image, plus the current Grafana image is now based on an unsupported 1.9 version of OSS - we're up to 2.1.7 as of today with a load of new features including OTEL compliance.

@ksauzz

ksauzz commented Sep 13, 2023

Hello,
First of all, thank you for maintaining cool OSS products.
I have 2 feedback items on the migration from grafana-loki plugin to loki plugin

Compression

After the migration we observed 6-8x higher traffic on loki-gateway than before. According to my investigation, the grafana-loki plugin (actually the promtail client) uses application/x-protobuf with snappy compression, while the loki plugin uses application/json with no compression. It would be nice if the loki plugin also supported compression to reduce network traffic.

https://github.com/grafana/loki/blob/v2.9.0/clients/pkg/promtail/client/client.go#L442-L453
https://github.com/fluent/fluent-bit/blob/v2.1.8/plugins/out_loki/loki.c#L1566-L1569
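
The effect described above can be reproduced in miniature. The sketch below builds a JSON body in the shape accepted by Loki's HTTP push endpoint and compares its raw size with a compressed copy. Two assumptions apply: gzip from the Python standard library stands in for snappy, which is not in the standard library, and the log lines are synthetic, so the actual ratio will differ from any real workload.

```python
import gzip
import json


def build_push_payload(labels, entries):
    """Build a JSON push body: one stream with (timestamp, line) values."""
    return {
        "streams": [
            {
                "stream": labels,
                # Timestamps are nanoseconds since the epoch, sent as strings.
                "values": [[str(ts), line] for ts, line in entries],
            }
        ]
    }


# Synthetic, repetitive log lines (illustrative only).
entries = [
    (1700000000000000000 + i,
     f'level=info msg="request handled" path=/api/v1/items status=200 id={i}')
    for i in range(1000)
]

raw = json.dumps(build_push_payload({"job": "demo"}, entries)).encode("utf-8")
compressed = gzip.compress(raw)
print(len(raw), len(compressed))
```

Log lines tend to be repetitive, which is why an uncompressed JSON body can be several times larger on the wire than a compressed one.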

413 Request Entity Too Large by loki-gateway

The loki plugin sometimes sends a payload larger than 1MB, which loki-gateway rejects by default. To accept such requests, we had to change client_max_body_size of loki-gateway to 3m. According to Fluent Bit's docs, a chunk is usually about 2MB, so we chose a 3MB client_max_body_size. Thus, it would be nice if the helm chart set client_max_body_size to 3MB on loki-gateway by default.

https://nginx.org/en/docs/http/ngx_http_core_module.html#client_max_body_size
https://github.com/grafana/helm-charts/blob/loki-distributed-0.74.1/charts/loki-distributed/values.yaml#L1146

@patrick-stephens

@ksauzz any feedback on the OSS side needs to be fed back to the OSS repo rather than this Grafana one otherwise it won't be seen.
https://github.com/fluent/fluent-bit

@Turkish

Turkish commented Oct 13, 2023

I started with native and had to switch to grafana-loki for this reason:

The native Fluent Bit loki plugin does not support a custom URI: you can only set the Host and the Port, but you have no control over the URI (the path). With the grafana-loki plugin, you can set a full Url.

Now I'm struggling to configure TLS with the grafana-loki plugin. I don't see in the documentation that it's possible. If anyone has an idea, please help.

@edsiper

edsiper commented Oct 13, 2023

@Turkish thanks for your feedback. I have submitted a PR to implement that feature in Fluent Bit:

fluent/fluent-bit#8040

@edsiper

edsiper commented Nov 5, 2023

Hey folks, just wanted to check what else is needed to complete the transition. The last two missing pieces, compression and a configurable URI, have been addressed. Please report anything still missing here.

@ptr1120

ptr1120 commented Dec 9, 2023

> hey folks, just wanted to check what else is needed to complete the transition, last two missing pieces around compression and configurable URI has been addressed. Please report any missing thing here.

It would be interesting to find a solution for how to push structured metadata to Loki using the fluent-bit Loki output.

@patrick-stephens

OSS Fluent Bit does include an additional optional metadata section in every record now, primarily to support some of the OTEL requirements I believe. This potentially could be used.
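
For context on what "pushing structured metadata" means at the wire level, here is a minimal sketch of a JSON push body. It assumes the shape I understand Loki's HTTP push API to accept in versions with structured metadata support, where an optional third element of each value carries the metadata as string key/value pairs. It says nothing about how a Fluent Bit option for this might look; `build_entry` and the sample values are hypothetical.

```python
import json


def build_entry(ts_ns, line, metadata=None):
    """Build one log entry: [timestamp, line] plus optional metadata."""
    entry = [str(ts_ns), line]
    if metadata:
        # Assumption: metadata values are sent as strings.
        entry.append({key: str(value) for key, value in metadata.items()})
    return entry


payload = {
    "streams": [
        {
            "stream": {"job": "demo"},
            "values": [
                build_entry(1700000000000000000, "user logged in",
                            {"trace_id": "0242ac120002"}),
                build_entry(1700000000000000001, "plain line"),
            ],
        }
    ]
}

body = json.dumps(payload)
print(body)
```

Unlike labels in `stream`, such metadata is attached per log line, which is why it suits high-cardinality values like trace IDs.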

@bgarcial

Hey guys,
I just saw on this website that the Grafana Fluent Bit helm chart is now deprecated and that the official helm chart is recommended instead.
Is only the helm chart deprecated, or is the Grafana Fluent Bit implementation deprecated as well?
I ask because the official Grafana documentation for their Fluent Bit implementation is still there.

@edsiper

edsiper commented Jun 14, 2024

Hi folks, regarding the initial requirements around batching: is this still highly necessary? I would like to learn how urgent this is.
