Scrape external service with FQDN #3204
I think ExternalName services are not supported; there are a few issues about it, e.g. #218
Thanks @sebarys for pointing me to the similar issue #218. But it looks like that is a very long thread without a resolution: the latest summaries of that thread are #218 (comment) and prometheus/prometheus#2791 (comment), but again no formal solution. If I understood correctly, the way to scrape an external service is to use the endpoint IP, as mentioned in #834 (comment) (or in the blog post), but that doesn't help if you need to use the service FQDN. The other way to solve it is to fall back to the old way (not the k8s way) and define additional scrape configs with regular Prometheus configuration.
@gouthamve @brancz @sebarys
@gouthamve @brancz @sebarys - can anyone help on this please?
In our project we've added this using …
You cannot monitor external services in a meaningful way, as Prometheus needs to scrape each instance/process individually. That's why you need to use a separate discovery mechanism that actually does discover all processes.
OK, so based on your feedback it looks like there is no plan to support scraping the ExternalName k8s Service type, and your recommendation is to use … Should I close this ticket?
The way the issue is phrased, it won't happen. That said, we have thought about making more generic scrape configs available through some new CRD in the Prometheus Operator. That could be something usable for this. As far as I know, no one is working on this currently though.
One use case I see is federation, where I want to configure another Prometheus instance as a target to be scraped. It'd be great if that were possible by means of a ServiceMonitor.
@elsbrock for a Prometheus in the same cluster this is perfectly possible. The federation endpoint on Prometheus is no different from any other metrics endpoint.
Right, but in our case the Prometheus instance is running in an entirely different network segment, so we would need to use the global config (which I don't find nice from a dependency point of view).
Yes, for those cases an additionalScrapeConfig is best.
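For readers landing here: the additionalScrapeConfigs mechanism referenced above works by storing raw Prometheus scrape configuration in a Secret and pointing the Prometheus custom resource at it. A minimal sketch; the Secret name, job name, namespace, and target FQDN below are illustrative assumptions, not values from this thread:

```yaml
# Sketch: raw Prometheus scrape config for an external FQDN,
# stored in a Secret. All names here are illustrative assumptions.
apiVersion: v1
kind: Secret
metadata:
  name: additional-scrape-configs
  namespace: monitoring
stringData:
  additional-scrape-configs.yaml: |
    - job_name: external-service
      metrics_path: /metrics
      static_configs:
      - targets:
        - my-external-service.example.com:9100
---
# Referenced from the Prometheus custom resource:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  additionalScrapeConfigs:
    name: additional-scrape-configs
    key: additional-scrape-configs.yaml
```

The trade-off, as discussed in this thread, is that the Secret's content is plain Prometheus configuration that the operator passes through without validating.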
@brancz I wonder if there's a way to provide additionalScrapeConfigs as a Kubernetes CRD object?
This is not possible today, but I would like to get there one day. I would like to essentially introduce a lower-level CRD, "ScrapeConfig", which all the other config-generating CRs are ultimately converted to. The difficulty is maintaining the types for such a CR; this would need to be automated by inspecting the types from Prometheus and converting them. All of this is not impossible, but it will need a non-trivial amount of work, which I currently don't have time for. If anyone from the community would like to invest time into this, though, I'd be happy to discuss possible designs and caveats.
@brancz I would be interested in looking into implementing this CRD. How should we proceed?
I think a design doc would be in order, as what I'm imagining would involve synchronizing types from the Prometheus repo.
Any news on this feature? (or the design doc?)
Running into this trying to scrape AWS MSK (see: https://docs.aws.amazon.com/msk/latest/developerguide/open-monitoring.html). MSK provides a FQDN for each broker, and we also have them aliased to consistent in-cluster names using ExternalName services. The underlying IP addresses might be stable, but I don't see sufficient documentation to rely on that, and in any case they would have to be hardcoded per cluster. So now the options are (a) bypass the CRD setup and use config files (aka additionalScrapeConfigs) or (b) set up reverse proxies just to scrape existing endpoints that are available to scrape.
This is an example of a case in which you can (as there is an FQDN provided per instance).
@jasonstitt have you tried … ?
It worked for us.
If anyone is willing to do a design doc for this, they are more than welcome to create a PR! 🎉
Hi. I came across this issue in our OpenShift clusters. Here's how I solved it:

```yaml
...
spec:
  endpoints:
  - path: /metrics
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
    relabelings:
    - sourceLabels: [__address__]
      targetLabel: __address__
      regex: (.*)
      replacement: "$FQDN:$PORT"
      action: replace
...
```

I was then able to scrape the FQDN!
@alexisph that's great. Can you share a little bit more of your configuration? I'm trying to configure a Service with the ExternalName property to reach an FQDN outside Kubernetes following your recommendations, but had no luck.

```yaml
kind: Service
apiVersion: v1
metadata:
  namespace: workload
  name: nfs-centralus-001
  labels:
    workload.stateful: nfs-centralus-001
spec:
  type: ExternalName
  externalName: nfs-centralus-001.c.saas-workload-io.internal
  selector:
    workload.stateful: nfs-centralus-001
```

and the ServiceMonitor looks like

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    ops.workload.io/component: nfs-centralus-001
    ops.workload.io/category: infrastructure
  name: nfs-centralus-001
  namespace: workload
spec:
  endpoints:
  - path: /metrics
    interval: 15s
    targetPort: 9100
    scheme: http
    relabelings:
    - sourceLabels: [__address__]
      targetLabel: __address__
      regex: (.*)
      replacement: "$FQDN:$PORT"
      action: replace
  jobLabel: ops.workload.io/nfs-centralus-001
  namespaceSelector:
    matchNames:
    - workload
  selector:
    matchExpressions:
    - key: workload.stateful
      operator: In
      values: ["nfs-centralus-001"]
```
@miguel-callejas-coderoad-com, you're missing the Endpoints resource. Based on your example:

```yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: nfs-centralus-001
  namespace: workload
  labels:
    workload.stateful: nfs-centralus-001
subsets:
- addresses:
  - ip: 1.2.3.4
  - ip: 1.2.3.5
  ports:
  - name: metrics
    port: 9100
    protocol: TCP
```
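For context (not part of the original comment): in stock Kubernetes, a manually managed Endpoints object is normally paired with a selector-less ClusterIP Service of the same name, since an ExternalName Service is resolved purely via a DNS CNAME and carries no endpoints. A sketch of such a companion Service under that assumption, reusing the names from the example above:

```yaml
# Sketch: a selector-less ClusterIP Service that adopts the manually
# created Endpoints object of the same name in the same namespace.
kind: Service
apiVersion: v1
metadata:
  name: nfs-centralus-001
  namespace: workload
  labels:
    workload.stateful: nfs-centralus-001
spec:
  ports:
  - name: metrics
    port: 9100
    targetPort: 9100
```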
A quick note on this, as I was struggling to find the same config: you don't even need to specify the real IP of your destination FQDN in the Endpoints object; it can be any IP, because the relabeling overwrites `__address__`, which would normally be populated with the IP address defined in the Endpoints object.
Can you give an example?
@alexisph @miguel-callejas-coderoad-com do you literally use … ? It works for me if I put real values there (i.e. …).
@hryamzik, just use the FQDN and port of the service you'd want to scrape, like in your example.
That's what I already do; I hoped for a more elegant solution, as …
This issue became more important after k8s 1.22, in which write access to Endpoints was disabled by default in admin roles due to CVE-2021-25740:
The main solution for this would be implementing the generic ScrapeConfig CRD described in #2787. Contributions welcome.
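For later readers, a hedged sketch of what the generic scrape config idea from #2787 could look like. The apiVersion, kind, field names, namespace, and target below are assumptions for illustration; this comment does not confirm such an API exists:

```yaml
# Illustrative sketch only: a generic scrape config expressed as a CRD,
# per the idea in #2787. Field names and target are assumptions.
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: external-fqdn
  namespace: monitoring
spec:
  metricsPath: /metrics
  staticConfigs:
  - targets:
    - my-external-service.example.com:9100  # hypothetical external FQDN
```

The appeal over additionalScrapeConfigs is that the target definition would live in a typed, validated Kubernetes object instead of an opaque Secret.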
@hryamzik I found a better solution that does not require duplicating domains.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
...
spec:
  endpoints:
  - interval: 30s
    path: /_prometheus/metrics/
    port: web
    relabelings:
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __meta_kubernetes_endpoint_node_name
      targetLabel: __address__
  selector:
    ...
```

```yaml
apiVersion: v1
kind: Endpoints
metadata:
  ...
subsets:
- addresses:
  - ip: 1.2.3.5
    nodeName: www.example.com
  ports:
  ...
```

or you can use …
Hi.
This solved the issue and everything was perfect. Unfortunately, we noticed that after a short (a few hours) but random time the targets disappear, effectively disabling this monitoring setup. Here's an illustration of the event: we have 10 endpoints from the Prometheus Operator, such as node exporters, Alertmanager and kube-prometheus-stack, and we add 8 custom endpoints as described above. You can see that all our custom external endpoints are gone at the same time. Our setup: … What we verified: …
Has anyone seen anything similar? Can anyone give some hints on what else we can check? Thanks, I appreciate any feedback.
Closing this issue in favor of #2787 (generic scrape config CRD) which should resolve the original request eventually. |
What happened?
Cannot scrape a service by its FQDN when the service is outside the k8s cluster. It only works if you set the service IP, but I prefer not to use IPs, which may change.
See the Prometheus UI, which shows just the ServiceMonitor name but nothing inside the endpoints list:
Here is the YAML that defines the Prometheus CR + ServiceMonitor + external Service with the SERVICE-FQDN. But when you open Prometheus, it will not scrape the external service.
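(The original YAML attachment is not reproduced here; the following is a minimal hedged sketch of such a setup, with all names illustrative.)

```yaml
# Illustrative sketch only - not the original attachment.
# An ExternalName Service pointing at an external FQDN...
kind: Service
apiVersion: v1
metadata:
  name: external-svc
  namespace: default
  labels:
    app: external-svc
spec:
  type: ExternalName
  externalName: my-service.example.com  # hypothetical external FQDN
---
# ...and a ServiceMonitor selecting it. As described here, Prometheus
# shows no targets for it, because an ExternalName Service has no
# Endpoints for the endpoint discovery to pick up.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: external-svc
  namespace: default
spec:
  endpoints:
  - port: metrics
    path: /metrics
  selector:
    matchLabels:
      app: external-svc
```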
The only way to scrape the SERVICE-FQDN is by also adding an Endpoints object that points to the SERVICE-FQDN's specific IP(s). Only then can you see the target working in Prometheus. But the whole point is to use only the SERVICE-FQDN and not specific IPs.
Did you expect to see something different?
I would expect an option to scrape by SERVICE-FQDN as well, not only by IPs.
Here are some blog posts that explain how to scrape an external service, but again only via Endpoints with specific IPs:
But again, none of them uses the FQDN, and I would expect to have such a way.
How to reproduce it (as minimally and precisely as possible):
Just use the YAML above and you will see that the Service (ExternalName) is not visible as a target in Prometheus.
Environment
quay.io/coreos/prometheus-operator:v0.37.0
```
Name:                   chart1-prometheus-operator-operator
Namespace:              default
CreationTimestamp:      Wed, 06 May 2020 21:31:34 +0300
Labels:                 app=prometheus-operator-operator
                        app.kubernetes.io/managed-by=Helm
                        chart=prometheus-operator-8.12.12
                        heritage=Helm
                        release=chart1
Annotations:            deployment.kubernetes.io/revision: 2
                        meta.helm.sh/release-name: chart1
                        meta.helm.sh/release-namespace: default
Selector:               app=prometheus-operator-operator,release=chart1
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app=prometheus-operator-operator
                    chart=prometheus-operator-8.12.12
                    heritage=Helm
                    release=chart1
  Service Account:  chart1-prometheus-operator-operator
  Containers:
   prometheus-operator:
    Image:      quay.io/coreos/prometheus-operator:v0.37.0
    Port:       8080/TCP
    Host Port:  0/TCP
    Args:
      --manage-crds=true
      --kubelet-service=kube-system/chart1-prometheus-operator-kubelet
      --logtostderr=true
      --localhost=127.0.0.1
      --prometheus-config-reloader=quay.io/coreos/prometheus-config-reloader:v0.37.0
      --config-reloader-image=quay.io/coreos/configmap-reload:v0.0.1
      --config-reloader-cpu=100m
      --config-reloader-memory=25Mi
      --log-level=debug
    Environment:
    Mounts:
   tls-proxy:
    Image:      squareup/ghostunnel:v1.5.2
    Port:       8443/TCP
    Host Port:  0/TCP
    Args:
      server
      --listen=:8443
      --target=127.0.0.1:8080
      --key=cert/key
      --cert=cert/cert
      --disable-authentication
    Environment:
    Mounts:
      /cert from tls-proxy-secret (ro)
  Volumes:
   tls-proxy-secret:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  chart1-prometheus-operator-admission
    Optional:    false
Conditions:
  Type           Status  Reason
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:
NewReplicaSet:   chart1-prometheus-operator-operator-746d86bbb7 (1/1 replicas created)
Events:
```
Kubernetes version information:
```
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:20:10Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.10-gke.27", GitCommit:"145f9e21a4515947d6fb10819e5a336aff1b6959", GitTreeState:"clean", BuildDate:"2020-02-21T18:01:40Z", GoVersion:"go1.12.12b4", Compiler:"gc", Platform:"linux/amd64"}
```
Kubernetes cluster kind:
GKE
Anything else we need to know?:
This will work, but again, I want to do it the k8s way, by setting an ExternalName Service with its FQDN.