Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(new sink): Initial loki sink #1783

Merged
merged 8 commits into from
Feb 17, 2020
Merged

feat(new sink): Initial loki sink #1783

merged 8 commits into from
Feb 17, 2020

Conversation

LucioFranco
Copy link
Contributor

This adds our initial implementation of the loki sink. This sink is a bit unique in that we need to do some sorting before we send events to ensure that loki will accept them. We do this in the build_request phase. This sink also doesn't use our partition batching techniques but instead internally multiplexes events since a single request can contain multiple "streams". Stream labels are also generated via a HashMap that allows templatable values to allow users to route logs to specific streams based on event attributes.

The docs are mostly complete but they could use a good review. Mostly, I am not sure if I did the labels section correctly.

To test this sink ensure that you have started the loki docker container and then run cargo test --features docker loki.

Closes #557

Signed-off-by: Lucio Franco <luciofranco14@gmail.com>
Signed-off-by: Lucio Franco <luciofranco14@gmail.com>
Signed-off-by: Lucio Franco <luciofranco14@gmail.com>
Copy link
Member

@lukesteensen lukesteensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments, but once those are addressed this looks good to me!

src/sinks/loki.rs Outdated Show resolved Hide resolved
src/sinks/loki.rs Outdated Show resolved Hide resolved
src/sinks/loki.rs Outdated Show resolved Hide resolved
src/sinks/loki.rs Outdated Show resolved Hide resolved
src/sinks/loki.rs Outdated Show resolved Hide resolved
src/sinks/loki.rs Outdated Show resolved Hide resolved
let events = events
.into_iter()
// The final json output should be: `[ts, line]`
.map(|e| json!([format!("{}", e.0), e.1]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to explicitly format the timestamp, e.g. to_rfc3339()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already an i64 but Loki wants it as a string instead of a json number.

}))
.unwrap();

let uri = format!("{}loki/api/v1/push", self.host);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we checking that the configured value has a trailing slash anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A valid Uri should always provide the / is what I've found in my experience. Since, our config uses UriSerde then this should be safe to assume.

}

async fn healthcheck(config: LokiConfig, resolver: Resolver) -> Result<(), crate::Error> {
let uri = format!("{}ready", config.host);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here with the trailing slash.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as #1783 (comment)

@lukesteensen
Copy link
Member

And obviously once CI is happy as well 😄

@binarylogic
Copy link
Contributor

binarylogic commented Feb 13, 2020

@tomwilkie @slim-bean @cyriltovena @rfratto good news! We're coming in for a landing with our Loki integration 🎉 . We'd love it if someone at Grafana could review/QA this.

For context, Vector is an observability data router, that is oh-so-perfect for the Prometheus/Loki/Grafana stack. Check it out and let me know what you think. I'd love to chat more about this integration. Feel free to reach out to me at ben@timber.io .

@binarylogic binarylogic changed the title chore(new sink): Initial loki sink feat(new sink): Initial loki sink Feb 13, 2020
required = true
examples = [{label = "value"}]
description = """\
A set of labels that will be attached to each batch of events. These values\

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should remind users to avoid to use high cardinality labels like they would in Prometheus. I mean may be this is obvious.

Do you guys have some sort of label/tag discovery from k8s ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're about to. @ktff is working on a kubernetes_metadata transform that should be completed in the next week or 2, and we currently have these enrichment transforms.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool this should make it to Loki sink !

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I will add a note about high cardinality labels.

So we don't really have a concept of a schema for Kubernetes labels but we are about to provide a transform that can enrich your events with kube labels. It may make sense to add this to our docs to explain how you can combine that transform with this sink.

The idea is that you can configure loki like so:

[sinks.my_loki]
# ~~ snip other config items ~~
labels = {pod_id = "{{ labels.pod_id }}"}

Related k8 issue #1072

cc @binarylogic this is kinda in the realm of guide vs sink doc page, I think it makes sense to add a note about kube possibly within the loki sink docs? What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it is. I've got it and thanks for the heads up.

}));
}

let body = serde_json::to_vec(&json!({

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: We also support snappy compressed proto.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 that is good to know, do you know if any of the current clients support this? I didn't see anything for this in fluentd/fluentbit but I did not dive super deep into promtail.

We currently don't support snappy within any of our sinks but it should be fairly easy to support.

Copy link

@cyriltovena cyriltovena Feb 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not saying you should use it, although some clients can't and that could be an advantage.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only promtail/docker and fluentbit are using it, it's part of our go client.


### Decentralized deployments

Loki currently does not support out-of-order inserts. If Vector is deployed in a decentralized setup

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call !

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: we're trying to improve this. but it's hard cause it's the foundation of the db behind. One option we're looking into is being able to let loki rework/create timestamp server side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit off topic, but I am a huge fan of how facebooks logdevice handles multiple sequencers (ingestors) by giving up on a global ordering of timestamps but using a combination of (seq_id, ts) which would help with this problem.

https://logdevice.io/docs/Writepath.html#sequencer

Excited to see what you all come up with! :) For now I think the best is to just run things through a central vector instance, there might be some funky hacky solutions around this but I think centralized is good enough for now.

LucioFranco and others added 3 commits February 13, 2020 11:27
Signed-off-by: Lucio Franco <luciofranco14@gmail.com>
Signed-off-by: binarylogic <bjohnson@binarylogic.com>
Signed-off-by: binarylogic <bjohnson@binarylogic.com>
Copy link
Contributor

@binarylogic binarylogic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs look great! Once you rebase feel free to merge.

binarylogic and others added 2 commits February 17, 2020 17:18
Signed-off-by: binarylogic <bjohnson@binarylogic.com>
Signed-off-by: Lucio Franco <luciofranco14@gmail.com>
@binarylogic binarylogic merged commit 19f06eb into master Feb 17, 2020
@binarylogic binarylogic deleted the lucio/loki branch February 17, 2020 22:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New loki sink
4 participants