Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syslog sink #133

Merged
merged 7 commits into from
Apr 3, 2023
Merged

Syslog sink #133

merged 7 commits into from
Apr 3, 2023

Conversation

syedriko
Copy link

@syedriko syedriko commented Mar 22, 2023

Initial implementation of the syslog sink

JIRA: https://issues.redhat.com/browse/LOG-2288

@openshift-ci
Copy link

openshift-ci bot commented Mar 22, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: syedriko
Once this PR has been reviewed and has the lgtm label, please assign alanconway for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Comment on lines 264 to +270
pub enum Encoding {
Text,
Json,
Syslog(SyslogConf),
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extending the enum and adding a new encoding scheme will perhaps break any code which does a match on this enum.
This enum is used in papertrail sink and it does a match on the enum, perhaps this sink should be changed to handle the new encoding type.

This will also force any new sink to consider syslog encoding, is this desired?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vimalk78 do you mean 'consider' or 'require'. Does this enum represent the encodings which a sink "may" support or "must" support?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since it is a part of enum, and if a match is done on the enum the enum MUST be matched on all variants.

Copy link
Author

@syedriko syedriko Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler would've flagged a mismatch between an enum and a non-exhaustive match but it didn't because we don't compile the papertrail sink.
But please consider the larger context. This PR will not become an upstream submission. In the current version of the upstream encodings are handled differently (https://github.com/vectordotdev/vector/tree/master/lib/codecs/src/encoding/format) and the syslog code would live there.

Copy link

@vimalk78 vimalk78 Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this implementation is specifically for us, just to be safe, we should put a feature flag to isolate these changes

#[cfg(feature = "sinks-syslog-rh")]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in more recent releases, (e.g. master which now is 0.27) , this enum does not exist. What is there is more sophisticated way of specifying EncodingWithFraming , and encoding can be chosen from a much larger set of serializers in SerilizerConfig

One of these is LogFmt , and this code seems to be suggesting that LogFmt can be used as serializer for syslog. So perhaps in future releases, we can reuse what already exists to realize a SyslogEncoder.

@@ -372,7 +372,8 @@ ocp-logging = [
"sinks-prometheus",
"sinks-gcp",
"sinks-splunk_hec",
"sinks-http"
"sinks-http",
"sinks-socket",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, we are not introducing any new sink for syslog, we are using socket sink with syslog encoding.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a thought, perhaps a dedicated sink will give us better control over having parity with fluentd syslog output.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, technically speaking I did not add a new sink, just a new encoding.


#[derive(Clone, Debug, Deserialize, Eq, PartialEq, Serialize)]
#[serde(rename_all = "snake_case")]
pub struct SyslogConf {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are some Syslog structures defined in tests/syslog.rs
It already seems to contain same structures as added here, perhaps chance of reuse.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably should consider flipping it around so the tests depend upon code that does not live in "tests"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our case the SyslogConf struct reflects the syslog output definition in the CLF and maps directly to the syslog encoding configuration in vector config file, with input validation tied to it. I don't think it's worth the effort trying to combine the two.

Encoding::Syslog(config) => {
let mut buf = String::from("<");
let pri = get_num_facility(&config.facility, &log) * 8 + get_num_severity(&config.severity, &log);
buf.push_str(&pri.to_string());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doing push_str so many times does not seem very efficient. We can do some pre-allocation, or use concat_string macro.
https://github.com/hoodie/concatenation_benchmarks-rs#concat_string_macro

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an interesting find, there's a benchmark for everything ;)
Agreed, this is naive and I hoped push_str() reallocates exponentially.

Copy link
Member

@jcantrill jcantrill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we missing some unit test?

_ => 24,
};
if num == 24 {
return Err(de::Error::invalid_value(de::Unexpected::Unsigned(num as u64), &"unknown facility"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this captured by the earlier check?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The earlier check was for a facility specified as a number, facility = "12", this one is for a name, facility = "user". There's probably space for more elegance here.

Comment on lines 264 to +270
pub enum Encoding {
Text,
Json,
Syslog(SyslogConf),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vimalk78 do you mean 'consider' or 'require'. Does this enum represent the encodings which a sink "may" support or "must" support?

Syslog(SyslogConf),
}

fn add_log_source(log: &LogEvent, buf: &mut String) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks kubernetes specific. That seems like a concern for a general syslog sink. Do we need to abstract it someplace else or at least we should add TODO to do something with it when we do an upstream pull

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is not going upstream as is, including for this ^^^ reason.

}
}
Err(_) => {
let num = match field_value_string.to_uppercase().as_str() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there some duplication here to deserialization that can be reused?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this and deserialization, for both severity and facility are asking for refactoring.

}

fn get_timestamp(log: &LogEvent) -> DateTime::<Local> {
match log.get("@timestamp") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this specific to our implementation needs for openshift?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vector defines a timestamp key as part of its log schema in src/config/log_schema.rs we can use that.

    pub fn timestamp_key(&self) -> &str {
        &self.timestamp_key
    }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is not going upstream as it is, there are things in it that don't make sense outside of our use case.

@syedriko
Copy link
Author

/test unit

@syedriko
Copy link
Author

/test cluster-logging-operator-e2e-5-6

1 similar comment
@syedriko
Copy link
Author

/test cluster-logging-operator-e2e-5-6

@syedriko syedriko merged commit 2929973 into ViaQ:v0.21-rh Apr 3, 2023
25 of 42 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants