New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Export Prometheus metrics to StackDriver #996

Merged
merged 12 commits into from Jun 26, 2018

Conversation

Projects
None yet
2 participants
@gdbelvin
Collaborator

gdbelvin commented Jun 20, 2018

Closes #942
Closes #385

@gdbelvin gdbelvin added this to the Productionize milestone Jun 20, 2018

@gdbelvin gdbelvin requested a review from AlCutter Jun 20, 2018

gdbelvin added some commits Jun 20, 2018

Prometheus to Stackdriver
Add stackdriver monitoring for

- logserver
- logsigner
- keyserver
- sequencer
@codecov

This comment has been minimized.

codecov bot commented Jun 20, 2018

Codecov Report

Merging #996 into master will increase coverage by 0.04%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #996      +/-   ##
==========================================
+ Coverage      50%   50.04%   +0.04%     
==========================================
  Files          29       29              
  Lines        2038     2038              
==========================================
+ Hits         1019     1020       +1     
+ Misses        837      836       -1     
  Partials      182      182
Impacted Files Coverage Δ
core/client/client.go 30% <0%> (+0.62%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4b029f5...d408795. Read the comment docs.

@gdbelvin

This comment has been minimized.

Collaborator

gdbelvin commented Jun 20, 2018

Screenshot of the new stack driver dashboard :-)

screen shot 2018-06-20 at 18 15 19

gdbelvin added some commits Jun 20, 2018

- command:
- /monitor
- --stackdriver-prefix=custom.googleapis.com
- --source=logserver:http://log-server:8091/metrics

This comment has been minimized.

@AlCutter

AlCutter Jun 22, 2018

How will this cope if you run multiple replicas of some of these monitored services?
I believe that accessing deployment names like this is going to "load-balance" across the actual instances, and I'm wondering if you won't then end up with a bunch of metrics coming from different machines being lumped together in a single timeseries?

e.g. if one of two logservers is getting all the load, but the metrics are being collected in a round-robin fashion you'll get 100%-0%-100%-0% spikes.

This comment has been minimized.

@gdbelvin

gdbelvin Jun 22, 2018

Collaborator

Thanks for spotting this. You make an excellent point.

docker-compose can't support the "sidecar" / multiple containers-per-pod model that kubernets supports (and prometheus-to-sd) assumes. As long as there's only one replica (the current state) it should be fine.

I'll make a note of it in the docker-compose file, and do the proper sidecar setup in kubernetes.

This comment has been minimized.

@AlCutter

AlCutter Jun 22, 2018

oke doke, LGTM otherwise - I'll approve now.

@@ -27,9 +27,19 @@ spec:
- --alsologtostderr
- --v=5
image: us.gcr.io/key-transparency/keytransparency-sequencer:latest
livenessProbe:

This comment has been minimized.

@AlCutter

AlCutter Jun 22, 2018

Would this capture what you're trying to do a bit more concisely?

livenessProbe:
         httpGet:
           path: /metrics
           port: 8081

This comment has been minimized.

@gdbelvin

gdbelvin Jun 22, 2018

Collaborator

Yes, that's probably the more efficient, kubernets specific, way of doing it.
However, in an effort to only maintain one set of deployment configs, I'm auto generating the kubernetes configs with kompose from the docker-compose.yml file, and docker-compose uses a cmd format for liveness checks.

This comment has been minimized.

@AlCutter

AlCutter Jun 22, 2018

Makes sense, though I suspect you're going to have to diverge at some point for a more nuanced kubernetes config, but no need to burden yourself with that now!

This comment has been minimized.

@gdbelvin

gdbelvin Jun 22, 2018

Collaborator

Yep. I'll be diverging now.

docker-compose doesn't support the side-car strategy that prometheus-to-sd requires.

This comment has been minimized.

@AlCutter

This comment has been minimized.

@AlCutter

AlCutter Jun 22, 2018

Will you be changing the livenessProbe too then?

This comment has been minimized.

@gdbelvin

gdbelvin Jun 26, 2018

Collaborator

done

@@ -0,0 +1,19 @@
apiVersion: v1

This comment has been minimized.

@AlCutter

AlCutter Jun 22, 2018

Could you help me understand the reason for the service (and corresponding ports: on the deploy?
My understanding is the prom-to-sd is active (i.e. scrapes metrics and then pushes to stackdriver APIs), is that not the case, or is there another reason you need the service?

This comment has been minimized.

@gdbelvin

gdbelvin Jun 22, 2018

Collaborator

Woops, prometheus-to-sd job doesn't need a port.

gdbelvin added some commits Jun 22, 2018

Merge branch 'f/stackdriver' of github.com:gdbelvin/keytransparency i…
…nto f/stackdriver

* 'f/stackdriver' of github.com:gdbelvin/keytransparency:
  Add batch-size flag and metric (#999)
  Serve metrics on http addr (#997)

gdbelvin added some commits Jun 22, 2018

Use sidecar strategy for prometheus-to-sd in kubernets
Because docker-compose doesn't support side-cars, we also can't use
kompose to auto translate between the two.
@@ -27,9 +27,19 @@ spec:
- --alsologtostderr
- --v=5
image: us.gcr.io/key-transparency/keytransparency-sequencer:latest
livenessProbe:

This comment has been minimized.

@AlCutter

AlCutter Jun 22, 2018

Will you be changing the livenessProbe too then?

@@ -13,6 +13,9 @@ spec:
- name: "8080"
port: 8080
targetPort: 8080
- name: "8081"

This comment has been minimized.

@AlCutter

AlCutter Jun 22, 2018

Do you need to export the metrics port on the service too?

This comment has been minimized.

@gdbelvin

gdbelvin Jun 26, 2018

Collaborator

I think I did? sequencer-deployment.yaml has a new - containerPort: 8081 line.

@gdbelvin gdbelvin merged commit c4046a3 into google:master Jun 26, 2018

5 checks passed

GolangCI No issues found!
Details
cla/google All necessary CLAs are signed
codecov/patch Coverage not affected when comparing 4b029f5...d408795
Details
codecov/project 50.04% (+0.04%) compared to 4b029f5
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@gdbelvin gdbelvin deleted the gdbelvin:f/stackdriver branch Jun 26, 2018

gdbelvin added a commit to gdbelvin/keytransparency that referenced this pull request Jun 26, 2018

Merge branch 'master' into test/sequencer_dup
* master:
  Export Prometheus metrics to StackDriver (google#996)
  Write glog.Errorf for errors (google#1001)
  Add code coverage badge (google#1000)
  Add batch-size flag and metric (google#999)

gdbelvin added a commit to gdbelvin/keytransparency that referenced this pull request Jun 26, 2018

Merge branch 'master' into f/invalid_mutation_cnt
* master:
  Export Prometheus metrics to StackDriver (google#996)
  Write glog.Errorf for errors (google#1001)
  Add code coverage badge (google#1000)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment