Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add write_syslog output plugin #3019

Merged
merged 1 commit into from May 7, 2019
Merged

Add write_syslog output plugin #3019

merged 1 commit into from May 7, 2019

Conversation

sradco
Copy link
Contributor

@sradco sradco commented Dec 10, 2018

This patch adds an output plugin to send metrics in CEE-enhanced Syslog message format by TCP .
The syslog message includes the metrics data in human readable, structured data or json format.

The plugin is based on the write_tsdb and the write_graphite plugins.

It allows adding additional metadata.

I plan to add to it the option to send the data also as unix socket.

An example of the record after it was received by Rsyslog:

Dec 20 12:11:22 sradco.tlv.csb collectd [time=1545300683][collectd memory.if_dropped.tx=445906944 plugin="memory" plugin_instance="" type_instance="buffered" type="memory" interval=10][hostname="sradco.tlv.csb"][ovirt engine="test" vm="test2"] memory.if_dropped.tx=445906944
Dec 20 12:11:22 sradco.tlv.csb collectd {"time":1545300683, "collectd":{ "memory":{ "if_dropped.tx":445906944 }, "plugin":"memory", "plugin_instance":"", "type_instance":"buffered", "type":"memory", "interval":10 }, "hostname":"sradco.tlv.csb" , "ovirt": {"engine":"test","vm":"test2"}}

Where "ovirt": {"engine":"test","vm":"test2"} is metadata.
Configuration for this output plugin:

LoadPlugin write_rsyslog

<Plugin write_rsyslog>
   <Node "example">
     Host "localhost"
     Port "44514"
     Prefix "collectd"
     AlwaysAppendDS false
     MessageFormat "human"
     HostTags [tag1="value"][tag2="value2"]
   </Node>
 </Plugin>

ChangeLog: Write Syslog plugin: "write_syslog" plugin writes values lists as syslog messages.

Signed-off-by: Shirly Radco sradco@redhat.com

@faxm0dem
Copy link
Contributor

is there anything specific to rsyslog in your plugin? if not, write_tcp_json or similar would make more sense for a name.

@sradco sradco force-pushed the master branch 2 times, most recently from a127ef2 to 03f7463 Compare December 20, 2018 10:14
@rgolangh
Copy link

@faxm0dem If I plan to add the option to output to unixsocket with the same json format. Should I add a different plugin? or should I use this one and give it a more general name(that is the reason I called it write_rsyslog)?

perhaps just "write_json" for a name?

@sradco
Copy link
Contributor Author

sradco commented Dec 31, 2018

@faxm0dem If I plan to add the option to output to unixsocket with the same json format. Should I add a different plugin? or should I use this one and give it a more general name(that is the reason I called it write_rsyslog)?

perhaps just "write_json" for a name?
The format sent is of syslog message. The metric itself is in the msg part in json format.
I plan to have this for rsyslog with tcp/udp/unixsocket. @rgolangh

@rgolangh
Copy link

rgolangh commented Jan 2, 2019

Taking back my comment about naming, it is pretty rsyslog specific as far as I understand.

@faxm0dem general question - I see that across all plugins, every invocation with pthread_mutex_lock is ignoring the error that may return. Why is that?

@sradco sradco force-pushed the master branch 2 times, most recently from 4bcf823 to a1aa19d Compare January 6, 2019 14:02
@sradco
Copy link
Contributor Author

sradco commented Jan 8, 2019

Hi, @octo, @tokkee, @pyr I would really appreciate your review for this patch.
I believe this an important plugin that send the metrics in syslog format like log messages.
The msg contant includes the metrics in json format that is very easy to parse and also allows adding any metadata the user needs in a very simple way.

@faxm0dem
Copy link
Contributor

First of all sorry for not answering your requests earlier.
Could you please explain what your plugin does specifically with regards to rsyslog? Would it also work for syslog? syslog-ng? To me it looks like it's simply a TCP (or UDP) output that writes JSON formatted output.

@sradco
Copy link
Contributor Author

sradco commented Jan 10, 2019

First of all sorry for not answering your requests earlier.
Could you please explain what your plugin does specifically with regards to rsyslog? Would it also work for syslog? syslog-ng? To me it looks like it's simply a TCP (or UDP) output that writes JSON formatted output.

@faxm0dem Thank you for your review. The plugin send the metrics in CEE-enhanced Syslog message format. The syslog message part includes the metrics data in a json format.
This allows a very simple way to get the metrics into Rsyslog and parse the "msg" part of the syslog messages, using the Json parser.

Rsyslog expects syslog messages. Having a simple Json or key-value format does is not handled well in Rsyslog.
I verified the patch and it works really well. This is a very efficient way for us to ship the metrics to elasticsearch, using the rsyslog omelasticsearch output plugin, for example.

I used the following for structuring the data before the "msg":
https://tools.ietf.org/html/rfc5424

This is an example of a similar way we are using the messages that we get from Collectd:
https://www.rsyslog.com/tag/cee-enhanced/

I believe this should work for syslog and syslog-ng as well. But I did not test.

@faxm0dem
Copy link
Contributor

I can confirm syslog-ng can parse JSON and filter using the marker option.

I therefore still think the plugin in its current form shouldn't be called write_rsyslog, but rather write_tcp or write_tcp_json.
Moreover, I believe (without reading the code in detail) that the code:

  • would benefit from deduplication with write_graphite, which it is based on
  • could be improved later by adding the possibility to share the formatters of write_http
  • could be further made more generic by using either tcp or udp transport

So IMHO in its current form it should be renamed to remove the rsyslog specifics, and for the rest I'm kindly asking for a second opinion :-)

@sradco
Copy link
Contributor Author

sradco commented Jan 13, 2019

I can confirm syslog-ng can parse JSON and filter using the marker option.

@faxm0dem First I really appreciate your review. Thank you.

I need to emphasis that Only the msg part of the syslog message is Json.
There is no input plugin that accept Json records in rsyslog. Only syslog messages.
There is a specific syslog message format before the msg that must be used it order for rsyslog to treat the message coming from Collectd correctly. This is what makes this message specific to rsyslog. I did not test with syslog-ng and syslog but I guess it should work the same there.
https://tools.ietf.org/html/rfc5424#section-6

The first example here https://asylum.madhouse-project.org/blog/2013/07/29/json-howto/ for processing plain json records is not available in rsyslog. But having syslog message with the json as the "message" part of the syslog message will work for both rsyslog and syslog-ng.

It needs a HEADER, PRI(combination of Facility and Severity), HOSTNAME, APP-NAME(collectd), PROCID(empty for now), MSGID, STRUCTURED-DATA(no need for now so left empty), MSG. These are the minimal values required for rsyslog to treat the message coming from Collectd as a syslog message.

I therefore still think the plugin in its current form shouldn't be called write_rsyslog, but rather write_tcp or write_tcp_json.

Simple json will not work for the above. Rsyslog/syslog-ng/syslog expect a syslog message format.
I have tested this with Rsyslog.

Moreover, I believe (without reading the code in detail) that the code:

  • would benefit from deduplication with write_graphite, which it is based on
  • could be improved later by adding the possibility to share the formatters of write_http
  • could be further made more generic by using either tcp or udp transport

If we have a generic tcp output with different messages to different outputs it can potentially work, but the code will be very complicated. The way that we build the message for rsyslog is not like for tsdb or graphite. Why do each of them have a different plugin? They both ship with tcp.
Like TSDB expects certain message structure and graphite expects a certain metric name(concatenated with dots), so does Rsyslog.

So IMHO in its current form it should be renamed to remove the rsyslog specifics, and for the rest I'm kindly asking for a second opinion :-)

@faxm0dem
Copy link
Contributor

Thanks for clarifying. I agree on your analysis with respect to factorizing code.
I guess we should make a new pull request that would factor out the common code between all those plugins doing TCP.

Now, that it's clear this plugin targets the syslog format, why bother using JSON as the MSG part? You could use the STRUCTURED-DATA to store collectd plugin, type, instances and metric. Then any RFC5424 capable software would decode the parts automatically. Moreover, we could still use the MSG part to add the data in human-readable (aka unixsock) form:

mockup
<13>1 2019-01-15T14:06:51+01:00 node42.example.com collectd 21525 - [collectd plugin="load"][collectd type="load"][collectd type_instance="relative"][collectd ds_name="shortterm"][collectd metric="6.5e-02"] load/load-relative shortterm=6.500000e-02

In this case the plugin's name should be write_syslog or write_ietf or write_rfc5424

@sradco
Copy link
Contributor Author

sradco commented Jan 15, 2019

Thanks for clarifying. I agree on your analysis with respect to factorizing code.
I guess we should make a new pull request that would factor out the common code between all those plugins doing TCP.

Now, that it's clear this plugin targets the syslog format, why bother using JSON as the MSG part? You could use the STRUCTURED-DATA to store collectd plugin, type, instances and metric. Then any RFC5424 capable software would decode the parts automatically. Moreover, we could still use the MSG part to add the data in human-readable (aka unixsock) form:

mockup
<13>1 2019-01-15T14:06:51+01:00 node42.example.com collectd 21525 - [collectd plugin="load"][collectd type="load"][collectd type_instance="relative"][collectd ds_name="shortterm"][collectd metric="6.5e-02"] load/load-relative shortterm=6.500000e-02

In this case the plugin's name should be write_syslog or write_ietf or write_rfc5424

Json allows nesting and other abilities that structured data does not and for us it really simplified sending the data to elasticsearch.

But, I can definitely add a variable that user can choose between json and structured data formats.
About the name, Perhaps write_syslog would be exceptable, since I think its clearer. We can state the RFC5424 in the description.

What do you think @faxm0dem ?

Thanks again for your comments and review.

@faxm0dem
Copy link
Contributor

write_syslog sounds right, and it would be awesome if you could implement the structured data bit, thanks!

I guess three options would make it flexible:

  • MessageFormat = "human"|"json" (human by default)
  • EnableStructuredData = true|false (true by default)
  • StructuredDataPrefix = "collectd" would affect both RFC5424 SDATA and the JSON object

FWIW when using syslog-ng for sending data to ES, in my example above format-json will create structured JSON automatically.

@sradco
Copy link
Contributor Author

sradco commented Jan 17, 2019

write_syslog sounds right, and it would be awesome if you could implement the structured data bit, thanks!

I guess three options would make it flexible:

  • MessageFormat = "human"|"json" (human by default)
  • EnableStructuredData = true|false (true by default)

Can you please explain what is he difference between the MessageFormat = "human" and EnableStructuredData= true ?
Why do we need the EnableStructuredData ?

Thanks!

  • StructuredDataPrefix = "collectd" would affect both RFC5424 SDATA and the JSON object

FWIW when using syslog-ng for sending data to ES, in my example above format-json will create structured JSON automatically.

I don't believe it is possible in Rsyslog...

@sradco sradco changed the title Add write_rsyslog output plugin Add write_ssyslog output plugin Jan 20, 2019
@sradco sradco changed the title Add write_ssyslog output plugin Add write_syslog output plugin Jan 20, 2019
@sradco sradco force-pushed the master branch 2 times, most recently from 1d2a65b to 9f12b91 Compare January 20, 2019 17:48
@sradco
Copy link
Contributor Author

sradco commented Jan 20, 2019

@faxm0dem Please review my patch.
I update the name to write_rsyslog.
Added:

  • MessageFormat = "human"|"json" (human by default)
  • StructuredDataPrefix = "collectd" -> I already have a "Prefix" variable, and it is used for both structured data and Json.

Not sure why we need the flag:
EnableStructuredData = true|false (true by default)

@sradco
Copy link
Contributor Author

sradco commented Jan 20, 2019

I run contrib/format.sh src/write_syslog.c but there are no changes...

@faxm0dem
Copy link
Contributor

faxm0dem commented Jan 20, 2019

my idea is that the user could choose to disable RFC5425 SDATA

@sradco
Copy link
Contributor Author

sradco commented Jan 20, 2019

I'll be happy to implement but I don't understand hat do you mean by disabling RFC5425 SDATA? How will the message look like?
Can you please share an example? @faxm0dem

@rgolangh
Copy link

rgolangh commented Jan 21, 2019 via email

@sradco
Copy link
Contributor Author

sradco commented Jan 22, 2019

@faxm0dem Hi Fabien, Can we agree that this can later be added for other use cases if needed and approve this PR for now?

src/collectd.conf.pod Outdated Show resolved Hide resolved
src/collectd.conf.pod Outdated Show resolved Hide resolved
src/collectd.conf.pod Outdated Show resolved Hide resolved
It implements the basic syslog protocol, extends it with
content-based filtering, rich filtering capabilities,
flexible configuration options and adds features such as using TCP for transport.
The plugin can connect to a I<Syslog> daemon that will ingest metrics, transform and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps give an example of such a daemon. Which daemons support this input? rsyslog, syslog-ng?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can provide a simple syslog-ng config

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please...

src/write_syslog.c Outdated Show resolved Hide resolved
src/write_syslog.c Outdated Show resolved Hide resolved
@rubenk
Copy link
Contributor

rubenk commented Mar 5, 2019

I did another pass, and saw that you missed a few of my earlier comments.
Reviewing this is really hard if all the issues are dismissed because they also exists in other plugins.

I'd rather you spend an hour refactoring this plugin before it hits the tree, there should never be a rush to get things in.
Since @octo promised you it's fine to clean up afterwards, I'll leave the decision to merge up to him, but please address the few new comments I left.

@sradco sradco force-pushed the master branch 3 times, most recently from ed517b2 to 1cd7e38 Compare March 7, 2019 13:42
@sradco
Copy link
Contributor Author

sradco commented Mar 7, 2019

@rubenk @faxm0dem Hi, I made the fixes for the issues you have raised. Please review.
If you can help me with syslog-ng base configuration it will be great. Thanks.

This patch adds an output plugin to send metrics
as CEE-enhanced Syslog log messages by TCP .

The syslog message includes the metrics data in
human readable structured data format and in json
format.

It allows adding additional metedata.

This plugin is based on the write_tsdb plugin.

Signed-off-by: Shirly Radco <sradco@redhat.com>
@faxm0dem
Copy link
Contributor

faxm0dem commented Mar 9, 2019

sure thing I'll do that on Monday

@faxm0dem
Copy link
Contributor

Sorry for the delay. Here's an example source config for syslog-ng.
It configures collectd-write_syslog in human/rfc5424/SDATA mode.

collectd config

MessageFormat human

syslog-ng config


destination d_collectd {
    file(
      "/var/log/collectd.log"
      template("$(format-json -s rfc5424 -s all-nv-pairs)\n")
    );
  };

source s_collectd {
   network(
      ip("127.0.0.1")
      port(44514)
      transport(tcp)
      flags(syslog-protocol)
    );
}

log {
  source(s_collectd);
  destination(d_collectd);
};

@sradco
Copy link
Contributor Author

sradco commented Mar 21, 2019

To add this to the src/collectd.conf.pod ? @faxm0dem

@faxm0dem
Copy link
Contributor

@rubenk what do you think?

@sradco
Copy link
Contributor Author

sradco commented Mar 25, 2019

Hi, Please update me about the place to add the example.
I don't believe I saw these kind of examples in the collectd documentation before.
If there is, I'll be happy if you can point me to it for reference.
@faxm0dem @rubenk @octo

@sradco
Copy link
Contributor Author

sradco commented Apr 3, 2019

Ping @faxm0dem @rubenk @octo

1 similar comment
@sradco
Copy link
Contributor Author

sradco commented May 5, 2019

Ping @faxm0dem @rubenk @octo

@mrunge mrunge requested a review from rubenk May 6, 2019 08:06
@mrunge
Copy link
Member

mrunge commented May 6, 2019

Ruben, I believe, Shirly did the changes you requested? Is there anything else missing?

@faxm0dem
Copy link
Contributor

faxm0dem commented May 7, 2019

LGTM

  • the following can be added to the wiki: syslog-ng and rsyslog examples (see above comment)
  • the json model isn't perfect from a normalizing perspective, but users will be able to change it in the future (see next point), or they can use the SDATA alternative right away
  • some code deduplication will be needed at some point, with all the tcp/udp write plugins (graphite, tsdb, etc.) but this can be done in separate PR as stated by @octo

So again, LGTM as-is and can be merged for now as far as I'm humbly concerned :)

@rubenk rubenk merged commit dfb1782 into collectd:master May 7, 2019
@mrunge mrunge modified the milestones: Features, 5.9 May 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants