New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Kubernetes discovery for Gateway Prometheus scrapping #363

Merged
merged 8 commits into from Feb 15, 2019

Conversation

Projects
None yet
2 participants
@stefanprodan
Copy link
Member

stefanprodan commented Jan 27, 2019

This PR does the following:

  • upgrade Prometheus to v2.6.1
  • add Prometheus Kubernetes service account, role and RBAC
  • enable pod scraping for OpenFaaS Gateway on port 8080
  • disable Istio sidecar injection except for Gateway and QueueWorker

Fix: #354

Motivation and Context

  • I have raised an issue to propose this change (required)

How Has This Been Tested?

Helm install tested on GKE with Istio 1.0.3 and two Gateway pods

Prometheus:

screenshot 2019-01-27 at 12 02 22

Gateway promql:

screenshot 2019-02-03 at 12 19 37

Grafana:

screenshot 2019-02-03 at 12 20 38

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I've read the CONTRIBUTION guide
  • I have signed-off my commits with git commit -s
  • I have added tests to cover my changes.
  • All new and existing tests passed.

stefanprodan added some commits Jan 27, 2019

Upgrade Prometheus to v2.6.1
- fix for kubernetes_sd_configs namespace restriction

Signed-off-by: stefanprodan <stefan.prodan@gmail.com>
Use Kubernetes discovery for Gateway scrapping
- add Prometheus Kubernetes service account, role and RBAC
- enable pod scraping for OpenFaaS main namespace

Signed-off-by: stefanprodan <stefan.prodan@gmail.com>
Enable Gateway Prometheus scrapping
- restrict scraping to port 8080

Signed-off-by: stefanprodan <stefan.prodan@gmail.com>
Disable Istio sidecar except for Gateway and QueueWorker
- fix for OF async: NATS protocol is not supported by Envoy
- tested on GKE with Istio 1.0.3

Signed-off-by: stefanprodan <stefan.prodan@gmail.com>
Add Prometheus Kubernetes SD to yamls
Signed-off-by: stefanprodan <stefan.prodan@gmail.com>
Upgrade Prometheus to v2.6.1 in yamls
Signed-off-by: stefanprodan <stefan.prodan@gmail.com>
@alexellis

This comment has been minimized.

Copy link
Member

alexellis commented Jan 29, 2019

disable Istio sidecar injection except for Gateway and QueueWorker

Should it not be on for Prometheus and AlertManager?

@alexellis

This comment has been minimized.

Copy link
Member

alexellis commented Jan 29, 2019

What about faas-idler?

@alexellis

This comment has been minimized.

Copy link
Member

alexellis commented Jan 29, 2019

Please can you confirm if the name or labels of metrics has changed by switching to this mode of discovery?

Maybe showing the Grafana dashboard from the labs would be sufficient?

https://github.com/openfaas/workshop/blob/kubernetes-preview/lab2.md

@alexellis

This comment has been minimized.

Copy link
Member

alexellis commented Jan 29, 2019

@stefanprodan can you raise an issue to track the problems you had with Istio? I don't think those changes belong in this PR, but it might be because I don't have all the context.

Upgrade Prometheus to v2.7.1
- tested on GKE 1.11 and Istio 1.0.3

Signed-off-by: stefanprodan <stefan.prodan@gmail.com>
@alexellis

This comment has been minimized.

Copy link
Member

alexellis commented Feb 2, 2019

Hi Stefan thank you for updating and testing the newer version. Please could you look at my comments about the istio changes? ideally I would prefer to take these in a separate PR and have an issue explaining more about the context.

Thanks again for your contribution

Alex

@stefanprodan

This comment has been minimized.

Copy link
Member Author

stefanprodan commented Feb 3, 2019

I've added an issue for Istio see #369

@stefanprodan

This comment has been minimized.

Copy link
Member Author

stefanprodan commented Feb 3, 2019

@alexellis I've updated the PR with the Gateway and Grafana print screens.

@@ -16,6 +16,7 @@ spec:
labels:
app: alertmanager
annotations:
sidecar.istio.io/inject: "false"

This comment has been minimized.

@alexellis

alexellis Feb 5, 2019

Member

Just curious.. why doesn't this work for AlertManager? Because of its HA discovery mechanism?

type: A
refresh_interval: 5s
honor_labels: false
kubernetes_sd_configs:

This comment has been minimized.

@alexellis

alexellis Feb 5, 2019

Member

Thanks this looks pretty complicated compared to before. Do you have a source of where this comes from?

@@ -16,8 +16,10 @@ spec:
labels:
app: prometheus
annotations:
sidecar.istio.io/inject: "false"

This comment has been minimized.

@alexellis

alexellis Feb 5, 2019

Member

How come?

This comment has been minimized.

@stefanprodan

stefanprodan Feb 7, 2019

Author Member

The K8s SD returns pod IPs and those don't work inside the mesh

This comment has been minimized.

@stefanprodan

stefanprodan Feb 7, 2019

Author Member

There is a way to make it work but it needs the Istio cert, here is an example https://github.com/stefanprodan/flagger/blob/master/artifacts/gke/istio-prometheus.yaml#L330

@@ -51,7 +51,7 @@ operator:

# monitoring and auto-scaling components
prometheus:
image: prom/prometheus:v2.3.1
image: prom/prometheus:v2.7.1

This comment has been minimized.

@alexellis

alexellis Feb 5, 2019

Member

@rdimitrov looks like we should use v2.7.1 for Docker Swarm, we used 2.7.0 which may be my fault.

Enable Istio sidecar for faas-idler
Signed-off-by: stefanprodan <stefan.prodan@gmail.com>

@alexellis alexellis merged commit d4d7237 into openfaas:master Feb 15, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@alexellis

This comment has been minimized.

Copy link
Member

alexellis commented Feb 15, 2019

Merged, thank you Stefan.

@alexellis

This comment has been minimized.

Copy link
Member

alexellis commented Feb 15, 2019

Tested on GKE with 4x GWs and hey - stats reported correctly 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment