[Discussion] Plugin Model #2037

Closed
brancz opened this Issue Sep 27, 2016 · 14 comments

@brancz
Member

brancz commented Sep 27, 2016

As we keep seeing more and more PRs for additional service discovery mechanisms (in Prometheus) and notification integrations (in Alertmanager), it seems like supporting all the providers out there in a single code base would get out of hand quickly.

I would like to start a discussion about how to manage this in a sane way. One solution could be the plugin model that recently landed in Go tip.

This is an open discussion, so any solutions are welcome; I'm just giving an example.

Thoughts?

@juliusv

Member

juliusv commented Sep 27, 2016

I had similar thoughts for node exporter collector modules, Alertmanager notifiers, or other interfaces. Personally I'd wait and see a bit where the Go community goes with plugins and how they use them. It'll still be separate files you'll have to bundle up with Prometheus, but at least you don't need to manage a full exec-ed binary anymore. I don't have any real opinion either way yet, but it's going to be exciting to see what happens.

@brian-brazil

Member

brian-brazil commented Sep 27, 2016

This is a very new feature, so I'd be holding off.

I'm generally wary of this from a support and reliability standpoint, as we know that SDs are hard to write correctly and having random code in a Prometheus process isn't going to help with that. Files on disk, on the other hand, are quite easy to debug, with a clear demarcation point.

@brancz

Member Author

brancz commented Sep 27, 2016

Sorry if I wasn't clear enough, I didn't mean to go and do it right now with the method mentioned, I just used it as a starter for the discussion. :)

@fabxc

Member

fabxc commented Sep 28, 2016

Yes, this is not to propose jumping at it right now. But it generally seems people really like having it directly integrated and are not too happy to write open source tools using file SD.

I also find the decision procedure rather blurry and am not happy at all with the code quality in SD and AM integrations. To the point where I think it's not much worse than executing dynamically linked code. In the end people have to decide what kind of code they load into their binary.

Of course it complicates distribution. If we have "supported SDs" living as plugins directly in our repo, we can just as well directly compile them in.

The question is why people prefer a native integration, and what we actually expect from contributions in terms of testing and code quality.

Maybe dynamic loading would make it easier for people to fork and customize existing SDs if they run into issues.

Just listing a few points to consider here. Curious where the Go ecosystem will move in terms of using plugins.

@dominikschulz

Contributor

dominikschulz commented Oct 20, 2016

While we understand the problems these integrated SD plugins bring (dependencies, maintenance, etc.), from a user's PoV (as a company using Prometheus) we still prefer compiled-in SD plugins for multiple reasons:

  • ease of use
  • good documentation
  • easy discoverability
  • high level of trust in the code quality ensured by the Prometheus core team

Ease of use

We primarily deploy Prometheus in containers (Docker and others). This makes file SD a little harder for us to use. Of course it's possible, but we prefer to concentrate our monitoring efforts on instrumenting our code and writing exporters rather than on complex deployment setups.

The compiled-in SD methods are super easy to use in this case, as we can just use the official Docker images and run them (either mounting the prometheus.yml or building our own image with a one-layer Dockerfile).

Having to build Prometheus completely on our own and orchestrate multi-container deployments would multiply the effort needed to run it.

Documentation / Discoverability

The reference docs on prometheus.io are a little brief at times, but offer a very consistent (high) level of documentation. This would probably vary very much with lots of external SD plugins.

Also selecting the right discovery plugins, if there are many for the same source, could be as hard as selecting e.g. the right Go library among competing ones. Using official plugins makes this step unnecessary.

High level of trust

We are very confident in the code quality produced by the Prometheus team. We follow the code and development very closely and are very happy with what this project is producing and how. This would be a lot harder with multiple third-party plugin authors.

Alternatives

We hope that Prometheus can retain these qualities however you decide to proceed. Most of these could probably also be achieved by other means. Some I can think of right now include:

  • providing official builds with endorsed third-party plugins, as Caddy does
  • supporting exec-style plugins, which integrate very easily into container (Docker) deployments, as kube-cert-manager does
  • providing official file SD plugins, e.g. an AWS to file SD generator
  • using the aforementioned Go plugin mechanism

@brancz

Member Author

brancz commented Oct 24, 2016

> We are very confident in the code quality produced by the Prometheus team. We follow the code and development very closely and are very happy with what this project is producing and how. This would be a lot harder with multiple third-party plugin authors.

I agree that this is probably one of the reasons Prometheus got very popular. I'm not arguing that all plugins should be developed by a third party. However, I have seen multiple PRs that people put considerable effort into and that were not accepted because the proposed SD mechanism is relatively unknown or unused. The reason people want to get it directly into Prometheus is to avoid another moving part (exec style, file SD) and tie it directly into Prometheus instead.

If we can't agree on something, then using the file sd as a mechanism to hook into is the next best option in my opinion.

> providing official builds with endorsed third-party plugins, as Caddy does

I was not aware of their build system - very interesting, but also lots of work to maintain.

I can imagine it going something like this: there is a set of plugins maintained by the Prometheus maintainers, and third-party plugins can be plugged in the same way as the official ones. Whether they are dynamically linked or compiled directly into the binary doesn't really matter.

@fabxc

Member

fabxc commented Oct 24, 2016

It would certainly help with the binary size bloat we get from native SD client libraries.

@msiebuhr

Contributor

msiebuhr commented Nov 22, 2016

After having played around with a half-baked mDNS SD client, I've been experimenting with doing it outside the Prometheus tree to keep my sanity.

(Aside: I don't use Prometheus for "serious" stuff right now; it is spinning on and off on a small server at home. I mention this just so you have a better understanding of the use case I'm trying to solve.)

I thought about using Go's plugin mechanism, but ultimately ended up thinking about just having Prometheus run an external process every N seconds and pick up whatever it dumps on stdout as configuration input. Conceptually, that means running an external program and dumping its output into file_sd_config...
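A minimal sketch of that idea as it could be done today outside Prometheus, assuming a small wrapper that re-runs a discovery command and publishes its stdout as a file SD file. The names `checkOutput`, `writeAtomically`, `refresh`, the `sdrun` usage string, and the 30-second interval are all hypothetical and not part of Prometheus; the only thing taken from Prometheus is the file SD JSON shape (a list of objects with `targets` and optional `labels`).

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"time"
)

// targetGroup is one entry of a file SD JSON file.
type targetGroup struct {
	Targets []string          `json:"targets"`
	Labels  map[string]string `json:"labels,omitempty"`
}

// checkOutput rejects output that does not parse as a list of target groups.
func checkOutput(out []byte) error {
	var groups []targetGroup
	return json.Unmarshal(out, &groups)
}

// writeAtomically writes data to a temporary file next to path and renames it
// into place, so a reader never observes a half-written file.
func writeAtomically(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".sd-*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly once the rename has succeeded
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	// CreateTemp uses mode 0600; widen it so another user can read the file.
	if err := tmp.Chmod(0o644); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// refresh runs the discovery command once and publishes its stdout.
func refresh(outPath, name string, args ...string) error {
	out, err := exec.Command(name, args...).Output()
	if err != nil {
		return fmt.Errorf("running %s: %w", name, err)
	}
	if err := checkOutput(out); err != nil {
		return fmt.Errorf("%s printed invalid target groups: %w", name, err)
	}
	return writeAtomically(outPath, out)
}

func main() {
	if len(os.Args) < 3 {
		fmt.Println("usage: sdrun OUTPUT.json COMMAND [ARGS...]")
		return
	}
	const interval = 30 * time.Second // hypothetical refresh interval
	for {
		if err := refresh(os.Args[1], os.Args[2], os.Args[3:]...); err != nil {
			fmt.Fprintln(os.Stderr, "refresh failed, keeping previous file:", err)
		}
		time.Sleep(interval)
	}
}
```

On failure the previous file is left untouched, so a flaky discovery command does not wipe the target list.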

In particular, I don't really like having to deal with the complexity of running a "sidecar" daemon that has to know where Prometheus is and of maintaining both configurations so they agree on where file_sd_config should find its files, etc. Case in point: none of the stand-alone SD clients I've looked at so far do this by themselves; they just write a file and leave configuration and reloading Prometheus as "an exercise for the reader".

There are some questions wrt. expecting a long-running process (that continually sends things to stdout) or one-off processes that are re-run for every scrape-interval (or whatever other configurable timespan). But I think those are solvable.

@fabxc

Member

fabxc commented Nov 22, 2016

Mh, Prometheus auto-detects file changes to files included via file SD config – there should be no need to trigger a reload from the sidecar.

@msiebuhr

Contributor

msiebuhr commented Nov 22, 2016

Ah. Sorry for that oversight. But I believe my points about setup/config and operating an external program stand.

One point I forgot: It'll be language-agnostic. Dunno if that's important, tho.

@brian-brazil

Member

brian-brazil commented Nov 22, 2016

We already had a discussion of this nature for the node exporter. The outcome was not to reinvent cron, and instead read files off disk.

@supershabam

supershabam commented Jul 15, 2018

I spent some time investigating making a discovery plugin while still having prometheus be a single binary (plugins included in the binary). I'm not going to pursue it any further, but wanted to document some of my findings.

Building a Plugin Requires CGO

Because plugins use C dynamic linking under the hood, there's no way to avoid a dependency on cgo. This makes the Linux binaries depend on glibc and makes it harder to simply put the binary into a container and run it.

It Seems Plugins Are Not As Supported As They're Supposed To Be

Though the Go documentation says that macOS is supported, I couldn't get plugins to work with Go v1.10.3 on macOS, but was successful on Linux. After googling around, it seems to be a common struggle to get plugins working on macOS. Here's one other example: golang/go#23369

Plugin.Open MUST Read From Disk

I was hoping to use something like statik to build the plugin into the golang binary, but the method to open a Plugin MUST read the content off disk. https://golang.org/src/plugin/plugin_dlopen.go

I could work around this by building the .so plugin into the binary and writing it to a temporary file at startup, but that's quite hacky.
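A rough sketch of that temporary-file workaround, assuming the plugin bytes have already been bundled into the binary by some means (statik, `//go:embed`, or similar). The helper name `openEmbedded` and the `sd-plugin-*.so` file pattern are hypothetical; only `plugin.Open` and `os.CreateTemp` are real standard library calls.

```go
package main

import (
	"fmt"
	"os"
	"plugin"
)

// openEmbedded writes plugin bytes to a temporary file and opens that file,
// because plugin.Open only accepts a filesystem path.
func openEmbedded(so []byte) (*plugin.Plugin, error) {
	tmp, err := os.CreateTemp("", "sd-plugin-*.so")
	if err != nil {
		return nil, err
	}
	// Assumption: removing the file after loading is fine on Linux, where the
	// mapping outlives the directory entry. Other platforms may behave differently.
	defer os.Remove(tmp.Name())

	if _, err := tmp.Write(so); err != nil {
		tmp.Close()
		return nil, err
	}
	if err := tmp.Close(); err != nil {
		return nil, err
	}
	return plugin.Open(tmp.Name())
}

func main() {
	// With placeholder bytes this fails; a real build would pass the bundled .so.
	if _, err := openEmbedded([]byte("not a shared object")); err != nil {
		fmt.Println("could not load plugin:", err)
	}
}
```

This keeps distribution to a single file, but the cgo requirement and the version-matching constraints described in this comment still apply.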

Plugin Interface MUST Be Exact Same Code And Golang Version As Consumer

The interface that the plugin exposes can use complex types, but those types MUST come from the exact same code on both sides (I believe the plugin hashes the code and stores that hash to verify that it is the same code...).

The plugin and consumer must also be built with the exact same golang version. So, in the case of prometheus adopting this idea, if the prometheus binary started being built with golang v1.11.0, each plugin would then need to be rebuilt with this exact golang version in order to be compatible.
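To make the constraint concrete, here is a hedged sketch of what the consumer side could look like. The `Discoverer` interface, the exported symbol name `SD`, and `loadDiscoverer` are hypothetical and are not Prometheus APIs; `plugin.Open` and `(*plugin.Plugin).Lookup` are the real standard library calls. The sketch assumes the plugin declares `var SD someType` where `*someType` implements the interface.

```go
package main

import (
	"fmt"
	"plugin"
)

// Discoverer is a hypothetical contract between host and plugin. It uses only
// built-in types here; once it refers to types from a shared package, both
// sides must be built against identical copies of that package.
type Discoverer interface {
	Targets() ([]string, error)
}

// loadDiscoverer opens the plugin at path and returns its exported SD symbol.
func loadDiscoverer(path string) (Discoverer, error) {
	// Open is where toolchain and package version mismatches surface as errors.
	p, err := plugin.Open(path)
	if err != nil {
		return nil, fmt.Errorf("open %s: %w", path, err)
	}
	// Lookup returns a pointer to the exported variable named SD (assumed name).
	sym, err := p.Lookup("SD")
	if err != nil {
		return nil, err
	}
	d, ok := sym.(Discoverer)
	if !ok {
		return nil, fmt.Errorf("symbol SD has type %T, which is not a Discoverer", sym)
	}
	return d, nil
}

func main() {
	d, err := loadDiscoverer("./example-sd.so") // hypothetical plugin path
	if err != nil {
		fmt.Println("plugin unavailable:", err)
		return
	}
	targets, err := d.Targets()
	fmt.Println(targets, err)
}
```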

@krasi-georgiev

Member

krasi-georgiev commented Jul 16, 2018

@supershabam thanks for the amazing summary. After some discussion, the current agreement is to use a custom SD adapter as per this example:
https://github.com/prometheus/prometheus/tree/master/documentation/examples/custom-sd
It simply queries for the targets and converts the result into a JSON file, which Prometheus consumes via file SD.
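As a rough illustration of the conversion step only (this is not the code from the linked example), the sketch below turns records from a hypothetical inventory source into the JSON shape that file SD reads: a list of objects with `targets` and optional `labels`. The `instance` type, its fields, the `role` label, and `toTargetGroups` are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"sort"
	"strconv"
)

// targetGroup mirrors one entry of a file SD JSON file.
type targetGroup struct {
	Targets []string          `json:"targets"`
	Labels  map[string]string `json:"labels,omitempty"`
}

// instance is a hypothetical record returned by some inventory API.
type instance struct {
	Addr string
	Port int
	Role string
}

// toTargetGroups builds one target group per role, in sorted role order so
// that repeated runs produce identical output.
func toTargetGroups(instances []instance) []targetGroup {
	byRole := map[string][]string{}
	for _, in := range instances {
		target := net.JoinHostPort(in.Addr, strconv.Itoa(in.Port))
		byRole[in.Role] = append(byRole[in.Role], target)
	}
	roles := make([]string, 0, len(byRole))
	for role := range byRole {
		roles = append(roles, role)
	}
	sort.Strings(roles)

	groups := make([]targetGroup, 0, len(roles))
	for _, role := range roles {
		groups = append(groups, targetGroup{
			Targets: byRole[role],
			Labels:  map[string]string{"role": role},
		})
	}
	return groups
}

func main() {
	groups := toTargetGroups([]instance{
		{Addr: "10.0.0.1", Port: 9100, Role: "db"},
		{Addr: "10.0.0.2", Port: 9100, Role: "web"},
		{Addr: "10.0.0.3", Port: 9100, Role: "db"},
	})
	out, err := json.MarshalIndent(groups, "", "  ")
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(out)) // an adapter would write this to the file SD path
}
```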

Closing this one for now, but if anyone thinks we should discuss more, feel free to reopen.

@lock

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 22, 2019
