-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Start prometheus and pushgatway on local docker #573
Conversation
License: MIT Signed-off-by: Cory Schwartz <cory@protocol.ai>
License: MIT Signed-off-by: Cory Schwartz <cory@protocol.ai>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe these ensure_xxx containers should be simplified, especially if we do add a lot more containers in the future, but this works for now I think.
@@ -337,7 +349,7 @@ func ensureControlNetwork(ctx context.Context, cli *client.Client, log *zap.Suga | |||
ctx, | |||
log, cli, | |||
"testground-control", | |||
true, | |||
false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with PublishAllPorts set true, ports are not exposed through the NAT when the network is internal. I switched this to enable us to view these containers in our local web browser.
}, | ||
HostConfig: &container.HostConfig{ | ||
NetworkMode: container.NetworkMode(controlNetworkID), | ||
PublishAllPorts: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This exposes an ephemeral port accessible by the host.
License: MIT Signed-off-by: Cory Schwartz <cory@protocol.ai>
I would say this works but the experience is less than ideal. Unlike the prometheus operator on kubernetes, which can perform auto-discovery for nodes to scrape, the prometheus container running by itself, such as basic docker incantation, does not discover the pushgateway. I am getting around this by bind mounting the config into the container, but really what should happen is just replace the config by adding another layer. I think I'm going to back out the bind mount change and instead just build another local container with the Makefile so it's properly configured. Don't approve this yet, is what I'm saying. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, @coryschwartz, I think it's totally fine this way. I rather prefer not having yet another ad-hoc container that users need to build with the Makefile. The way you've implemented it is self-contained: we modify the Prometheus config, and it'll just be picked up both if the container is running, or if it isn't.
What I'm not finding in this PR is the prometheus.yml
file you're using. I suggest the following:
- add the
prometheus.yml
file to thepkg/runner
package. - add a healthcheck that verifies if the
<WorkDir>/prometheus.yml
file exists, and if it's equal in content to the one in our source tree. If so, the healthcheck succeeds. If not, the fix would be to overwrite this file. - if the Prometheus container is running, it should pick up the change immediately (Prometheus hot-reloads, right?)
- if it's not running, we'd execute the fix action, which is to start the Prometheus container -- which would now be using the config file we copied over above.
BTW -- more generally, this needs to be adapted to healthchecks!
For the kubernetes prometheus operator, the configuration is not required since it can learn about scrape endpoints from the ServiceMonitor CRD. Of course, we do not have such concepts available in a local docker environment. I'm going to close this one and open a nearly-identical PR so I don't have to deal with the merge conflicts that have built up since this was initially opened, but I'll continue with your suggestions there. Thanks @raulk. |
Thanks @coryschwartz, I’ve tagged this as Discarded. So when you open the new ticket, please remember to link to this one so that we can track the progress from when the issue was originally targeted for dev. Good job recognizing not to bang your head against the wall! |
For feature parity with the the k8s runner.
Prometheus and pushgateway start and attach to the control network.
Right now I haven't considered how to make it resolvable, but we should make it the same as the k8s runner as much as possible.