Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fluent-plugin: Improve escaping in key_value format #2434

Merged
merged 1 commit into from
Oct 27, 2020

Conversation

andsens
Copy link
Contributor

@andsens andsens commented Jul 28, 2020

The key_value format does not seem to be well specified.
This change assumes "logfmt" as the format and escapes values accordingly.
The code itself is lifted from here:
https://github.com/csquared/node-logfmt/blob/e279d43cde019acfcca0abe99ea6b7ce8327e1f7/lib/stringify.js

Grafana seems to handle values escaped in this manner much better than before.
e.g.: query="{program="loki"}") was shown as (IIRC) query="{program= and is now displayed correctly.

As an added bonus only values that require quoting will actually be quoted, improving readability and derived field regexes.

@codecov-commenter
Copy link

codecov-commenter commented Jul 28, 2020

Codecov Report

Merging #2434 into master will increase coverage by 0.09%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2434      +/-   ##
==========================================
+ Coverage   61.55%   61.65%   +0.09%     
==========================================
  Files         160      160              
  Lines       13612    13612              
==========================================
+ Hits         8379     8392      +13     
+ Misses       4607     4592      -15     
- Partials      626      628       +2     
Impacted Files Coverage Δ
pkg/querier/queryrange/downstreamer.go 95.87% <0.00%> (-2.07%) ⬇️
pkg/promtail/positions/positions.go 59.64% <0.00%> (+13.15%) ⬆️

The key_value format does not seem to be well specified.
This change assumes "logfmt" as the format and escapes
values accordingly:
The code itself is lifted from here:
https://github.com/csquared/node-logfmt/blob/e279d43cde019acfcca0abe99ea6b7ce8327e1f7/lib/stringify.js
Grafana seems to handle values escaped in this manner
much better than before.
e.g.:
query="{program="loki"}")
was shown as (IIRC)
query="{program=
and is now displayed correctly.

As an added bonus only values that require quoting will
actually be quoted, improving readability and derived
fields regexes.
@andsens andsens force-pushed the fix-fluent-plugin-kv-escaping branch from c5f3c48 to e164396 Compare July 29, 2020 08:36
@CLAassistant
Copy link

CLAassistant commented Jul 29, 2020

CLA assistant check
All committers have signed the CLA.

@andsens
Copy link
Contributor Author

andsens commented Aug 5, 2020

@cyriltovena ping. I've added a comment to the line in question.

@slim-bean
Copy link
Collaborator

Thanks @andsens! Cyril is on leave for a bit and we haven't had a chance to circle back around to this PR. I'm less familiar with the fluentd plugin, is there much risk in this PR to breaking anyone's existing installations?

@andsens
Copy link
Contributor Author

andsens commented Aug 11, 2020

hmm, I doubt it. I have only tested this in a fluentd->loki->grafana setup. The main issue this change fixes is Grafana being able to actually display values with quotes and spaces in them. It seems unlikely that anyone would have processing between fluentd->loki or loki->Grafana.

p.s.: The fix is not complete, e.g. Grafana still has issues showing keys with dots in them. I'll need to examine which issues are on the loki side and which ones on the Grafana side before I do anything about that though.

@andsens
Copy link
Contributor Author

andsens commented Aug 31, 2020

ping

Copy link

@komapa komapa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use the same setup with fluentd -> loki and this showed as immediate problelm with nested json and generally logfmt inside logfmt which is very common with a layers of logging collection that is Kubernetes :)

Comment on lines +262 to +268
# Escape double quotes and backslashes by prefixing them with a backslash
v = v.to_s.gsub(%r{(["\\])}, '\\\\\1')
if v.include?(' ') || v.include?('=')
formatted_labels.push(%(#{k}="#{v}"))
else
formatted_labels.push(%(#{k}=#{v}))
end
Copy link

@komapa komapa Sep 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked some ruby sources for the best way to do this and I came up with this solution instead:

Suggested change
# Escape double quotes and backslashes by prefixing them with a backslash
v = v.to_s.gsub(%r{(["\\])}, '\\\\\1')
if v.include?(' ') || v.include?('=')
formatted_labels.push(%(#{k}="#{v}"))
else
formatted_labels.push(%(#{k}=#{v}))
end
# This regular expression will validate true if the string doesn't need escaping.
v = v.dump unless v.to_s =~ /\A[\w\.\-\+\%\_\,\:\;\/]*\z/i
formatted_labels.push(%(#{k}=#{v}))

Source: https://github.com/polyfox/moon-logfmt/blob/master/lib/moon-logfmt/utils.rb#L25 and https://ruby-doc.org/core-2.5.1/String.html#method-i-dump

@andsens
Copy link
Contributor Author

andsens commented Sep 27, 2020

@kirooshu so in this case ruby's internal string escaping is used?
I must say that in general I feel like logfmt could use a proper formal specification, it seems like every library maintainer comes up with their own escaping rules due to the lack of a canonical reference. I can't even find a BNF anywhere!
So, because of that, and with the fact in mind that the consumer of loki is usually Grafana, I think we should take a very pragmatic approach here: Do whatever makes things look nice in Grafana. If Grafana itself has some parsing bugs, we should probably fix them in Grafana rather than try work around them here, but other than that? 🤷‍♂️

@komapa
Copy link

komapa commented Sep 27, 2020

Yeah, I agree this is why I approved your PR as is. I tried it first and it does work as intended, I just felt like this should be a solved problem like you said and looked for more canonical way. To be honest, fluentd should have some methods for key/value format and I know there are many plugins for key/value and we should probably do what they do already to stay consistent within the fluentd ecosystem.

@komapa
Copy link

komapa commented Sep 30, 2020

For us, if we cannot have this merged, we will need to just "fork" the output by putting it in our version control with these same change. Can we expect to have this fix in the next version release or should we go with the forking for the time being?

@andsens
Copy link
Contributor Author

andsens commented Sep 30, 2020

@kirooshu I think the loki team has quite a bit on their plate, so you should probably just go ahead and build the plugin yourself.

I'm not currently at my work computer, so I don't have the commands to build the plugin itself, but my Dockerfile for fluentd itself is tracked, so maybe this can save you some time :-)

fluentd Dockerfile
ARG FLUENTD_VERSION
FROM fluent/fluentd:${FLUENTD_VERSION} as build
USER root

RUN apt-get update \
	&& apt-get install -y --no-install-recommends \
	sudo make gcc g++ libc-dev automake autoconf libtool ruby-dev libffi-dev

ENV GEM_PATH /fluentd/vendor/bundle/ruby/2.6.0
ENV GEM_HOME /fluentd/vendor/bundle/ruby/2.6.0
RUN gem install bundler --version 2.1.4
RUN bundle config silence_root_warning true
COPY Gemfile /fluentd/Gemfile
RUN bundle install --gemfile=/fluentd/Gemfile
RUN gem install /fluentd/plugins/fluent-plugin-grafana-loki-1.2.14.gem

FROM fluent/fluentd:${FLUENTD_VERSION}
USER root

COPY --from=build /fluentd/vendor/ /fluentd/vendor/
COPY fluent-plugin-grafana-loki-1.2.14.gem /fluentd/plugins

Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cyriltovena cyriltovena merged commit 081cc02 into grafana:master Oct 27, 2020
fpob added a commit to fpob/loki that referenced this pull request Feb 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants