
# Proposal: Promtail Push API #1627

Merged · 5 commits · Mar 23, 2020

Conversation

**rfratto (Member, Author)** commented Feb 4, 2020:

This PR adds a proposal for a Promtail push API.

Feedback is welcome; please leave feedback as a comment on this PR or as a code review against the PR's added doc.

@rfratto added the `proposal` label (a design document to propose a large change to Promtail) on Feb 4, 2020.
Commit trailer: Co-Authored-By: Owen Diehl <ow.diehl@gmail.com>

> ### Considerations
>
> Users will be able to define multiple `http` scrape configs, but the base URL
> value must be different for each instance. This allows pipelines to be cleanly
> separated through different push endpoints.

**Member:**

Would it make more sense to have pipelines available via a suffix instead of the base URL?
`loki/api/v1/push/{A,B,C}` instead of `{A,B,C}/loki/api/v1/push`?

**rfratto (Member, Author):**

Hmm, could we still consider that Loki-compatible if it's a suffix?

**Member:**

I'm not sure this endpoint difference would actually be a problem, but I may be missing something here.

**rfratto (Member, Author):**

I'm just thinking of the Grafana use case, where Grafana appends the Loki API paths as suffixes to a base URL with whatever prefix you want. If other endpoints had custom suffixes, Grafana wouldn't be able to function properly.

OTOH, Grafana doesn't push logs to Loki, so this isn't a real concern; I'm just assuming there might be other tooling that works this way.

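As an illustration of the multiple `http` scrape configs described above, separation by base URL might look like the sketch below. All field names here are assumptions; the proposal does not pin down a concrete schema:

```yaml
scrape_configs:
  - job_name: push_a
    http:                    # hypothetical stanza name
      base_url: /push-a      # would expose /push-a/loki/api/v1/push
    pipeline_stages:
      - json:
          expressions:
            level: level
  - job_name: push_b
    http:
      base_url: /push-b      # would expose /push-b/loki/api/v1/push
    pipeline_stages:
      - regex:
          expression: 'level=(?P<level>\w+)'
```

Each push endpoint gets its own pipeline, which is what the separation buys.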

> Users must also be aware of problems with running Promtail with an HTTP …

**Member:**

The load balancing issue is very annoying and makes scaling difficult (users have to implement sharding or something else themselves).

I'm not sure I like it, but we could always support a simple sharding config where you can specify multiple Promtail endpoints and shard between them via labels. Changing this hash ring in such a primitive mode would require service restarts, though.

**rfratto (Member, Author):**
Yeah, that's a good point. I'm tempted to call this out of scope and handle it later on. IMO, the most flexible approach would be a DNS lookup to find all of our Promtails and use that for sharding, since we want to support non-Kubernetes environments too.

**Member:**

I'd be afraid to support DNS in sharding because of how ephemeral it may be. You'd still likely run into the same problems as you would with traditional round robin load balancing, albeit less often. That's why I was considering a simple inline config that hardcodes a number of endpoints, like:

```yaml
promtails:
  - prom1.<namespace>.svc.cluster.local
  - prom2.<namespace>.svc.cluster.local
  - prom3.<namespace>.svc.cluster.local
```

Note: I'm definitely not in love with this, but I'm trying to think of an expedient solution here.

**rfratto (Member, Author):**

Owen and I had a quick chat offline and agreed that we shouldn't try to deal with load balancing ourselves right now. There are a few points to be made here:

  1. If you hardcode endpoints, you can't dynamically scale without rolling all your instances.
  2. If you use DNS, the sharding is eventually consistent and might cause improper writes.
  3. If instances rotate, the old instance might still be buffering some streams. If a new instance sends new values for those streams before the old one finishes flushing its buffer, you'll get out-of-order errors.

This is definitely a problem, but it's really complicated and not worth solving in Promtail right now. The (unfortunate) recommendation will be for people to write their own sharding mechanism in front of Promtail if they need to scale, and just use one Promtail instance otherwise.
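For reference, a minimal Go sketch of the label-hash sharding idea discussed above. All names and endpoints are hypothetical, and the sketch deliberately exhibits the weakness from point 1: changing the endpoint list reshuffles streams.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickEndpoint hashes a stream's label set onto a hardcoded endpoint
// list. A given stream always maps to the same Promtail (preserving
// per-stream ordering) until the list itself changes.
func pickEndpoint(endpoints []string, streamLabels string) string {
	h := fnv.New32a()
	h.Write([]byte(streamLabels))
	return endpoints[h.Sum32()%uint32(len(endpoints))]
}

func main() {
	endpoints := []string{
		"http://prom1.ns.svc.cluster.local:3500",
		"http://prom2.ns.svc.cluster.local:3500",
		"http://prom3.ns.svc.cluster.local:3500",
	}
	fmt.Println(pickEndpoint(endpoints, `{app="loki", env="dev"}`))
}
```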

**Reviewer:**

I think considering it out of scope until you have feedback on your first version is the right way to go, but since we are discussing possibilities, @owen-d has a good idea. Round-robin over a configured list is a solid option: it's flexible enough to work, and it's simple.

That said, I always like it when things seamlessly integrate with Consul :)

**Contributor:**

I'm not happy with the consensus here. Scaling is already a problem, and we are going to make it worse.

**Member:**

We may want to revisit the assumption that we need to be able to forward logs to multiple promtails.

**rfratto (Member, Author):**

Technically with the proposed API, you could chain together Promtails by having one Promtail write to another.

What use case are you thinking about?
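As a sketch of that chaining, the upstream Promtail would simply use the downstream Promtail's push endpoint as its client URL (host and port here are hypothetical):

```yaml
# Upstream Promtail: forward logs to a downstream Promtail's
# push endpoint instead of directly to Loki.
clients:
  - url: http://promtail-downstream:3500/loki/api/v1/push
```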

> … to be parsed, but this will generally be faster than the reflection requirements
> imposed by JSON.

> However, note that this API limits Promtail to accepting one line at a time and …

**owen-d (Member)** commented Feb 4, 2020:

I think we could support batching easily via multiline HTTP bodies, like:

```
<ts>
<log-line>
<ts>
<log-line>
...
```

This is similar to the Elasticsearch bulk API, which uses NDJSON:
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
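Concretely, a batched body under that scheme might look like the following (timestamps and log lines are purely illustrative):

```
1580854560000000000
level=info msg="request handled" status=200
1580854561000000000
level=warn msg="slow query" duration=2.3s
```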

**Member:**

@rfratto does ^ seem reasonable to you?

**rfratto (Member, Author):**

I'm not sure. I'm still partial to just having Promtail expose the same push API as Loki and leaving decisions about text-based APIs to external tooling.

For me, I really want to be able to run something like this when doing local testing:

```
loki -log.level=debug -config.file=/my/config/file | \
  promtail-cat -lapp=loki http://localhost:1234/loki/api/v1/push
```

I think your batching idea makes sense; it just drifts a little from what I'm hoping to eventually be able to do with this API. (Although I suppose my promtail-cat tool could still exist on top of what you're suggesting here.)
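Exposing the same push API as Loki means accepting Loki's standard payload, so pushing a single line to the endpoint above could look like this (timestamp illustrative):

```
curl -s -X POST http://localhost:1234/loki/api/v1/push \
  -H 'Content-Type: application/json' \
  --data-raw '{
    "streams": [
      {
        "stream": { "app": "loki" },
        "values": [ [ "1580854560000000000", "level=debug msg=\"hello\"" ] ]
      }
    ]
  }'
```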

**owen-d (Member)** commented Feb 13, 2020:

Hmm, that's a good point. It seems weird to expose an unbatched endpoint, but I guess it's also weird/redundant to do double buffering both here and in Promtail. The nice part is that it's easy to scale Promtails linearly across infrastructure/streams, so the cost of unbuffered pushes is amortized over that. I think you're taking the right approach here, and we probably don't want the multiline approach.

**Contributor:**

You're taking a big bet based on preference and on the possibility of cat-ing logs to Promtail.

I would like a benchmark here: batching vs. non-batching at high throughput. We don't have a good solution for scaling, yet we're saying it's fine because we will scale Promtail easily.

Also, you can now pipe data to Promtail.

**rfratto (Member, Author)** commented Feb 14, 2020:

Maybe I'm miscommunicating? I do want the solution we build here to support high throughput, and I do want it to be able to support batching.

The promtail-cat idea is a completely separate project that would live outside of the Loki repo and build on top of an existing high-throughput solution. That separate tool isn't meant for high throughput; it's out of scope for the design here, and I probably shouldn't have brought it up.

> ### Option 3: Plaintext Payload
>
> Prometheus' [Push Gateway API](https://github.com/prometheus/pushgateway#command-line)
> is cleverly designed and we should consider implementing our API in the same …

**Contributor:**

The Push Gateway was designed only for metrics that are difficult to collect by scraping, not to support high metric throughput. Batching is and will always be more performant than this.

This is probably why Cortex doesn't use this API to push metrics.

**rfratto (Member, Author):**

Yeah, I agree it's a bad primary approach, given the lack of throughput.
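For context, the Push Gateway design referenced above encodes labels in the URL path and carries the payload in the request body:

```
# Labels (job="backup", instance="db1") live in the URL path;
# the body is Prometheus text exposition format.
echo 'backup_duration_seconds 42' | \
  curl --data-binary @- http://pushgateway.example.com:9091/metrics/job/backup/instance/db1
```

A log analogue would put stream labels in the path and raw log lines in the body.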

**cyriltovena (Contributor)** left a review:

Good document overall! I think you laid out the problem really well.

But I'm not satisfied with the solution. There is another alternative: we could run pipelines in Loki, configured per tenant/stream. That would at least have the benefit of not affecting round-trip latency.

**rfratto (Member, Author)** commented Feb 14, 2020:

> we could run pipelines in Loki, configured per tenant/stream; that would at least have the benefit of not affecting round-trip latency

That's a good point. My concern with that is that users would not be able to easily use metrics stages. Unless, I guess, Loki could act as a Prometheus remote_read or remote_write target?
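For reference, the metrics stages in question are Promtail pipeline stages like the minimal counter below. The metric and regex are hypothetical, but the stage layout follows Promtail's pipeline configuration:

```yaml
pipeline_stages:
  - regex:
      expression: 'level=(?P<level>\w+)'
  - metrics:
      log_lines_with_level_total:   # hypothetical metric name
        type: Counter
        description: "log lines that carried a level field"
        source: level               # increments only when `level` was extracted
        config:
          action: inc
```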

**cyriltovena (Contributor)** commented Feb 21, 2020:

I think the next step is for @slim-bean to take action on the feedback; we should not wait for consensus. All feedback has been given.

**slim-bean (Collaborator):**

I will take a look at this next week!

@cyriltovena requested reviews from slim-bean and owen-d and removed the request for owen-d on Feb 21, 2020.
@cyriltovena removed their assignment on Feb 21, 2020.
@rfratto self-assigned and then unassigned this on Feb 24, 2020.
@mattmendick self-assigned this on Feb 24, 2020.
@mattmendick removed their assignment on Mar 4, 2020.
Commit note: "The preferred implementation choice is copying the Loki API, but that wasn't very clear in the document."
**slim-bean (Collaborator):**

We've had a few discussions about this, and I would like to move forward with Option 1 for the purposes of the original requirements (mainly being able to send from the Docker logging driver to Promtail).

I believe Option 4 is also valuable but should be submitted as a separate proposal as it solves a different use case.

**slim-bean (Collaborator)** left a review:

LGTM, thanks @rfratto!

@rfratto merged commit 18e828c into grafana:master on Mar 23, 2020.
@rfratto deleted the rfc-promtail-push-api branch on Mar 23, 2020.
@slim-bean mentioned this pull request on Jul 3, 2020.
Labels: proposal (a design document to propose a large change to Promtail), size/L

6 participants