This repository has been archived by the owner on Feb 9, 2022. It is now read-only.

Add basic monitoring server to Mixer. #708

Merged
merged 2 commits into from
May 12, 2017

Conversation

@douglas-reid douglas-reid commented May 12, 2017

This PR establishes a server within Mixer for providing some basic
monitoring and version information for debugging. This solution is
meant ONLY to be a short term patch while we iterate on a more
thorough and complete design for health and status reporting for Mixer
as well as for self-monitoring of Istio components.

It provides two basic endpoints:

  • /metrics, which exposes data from the default collector for
    Prometheus. This should include stats on the number of goroutines, CPU
    usage, etc.
  • /version, which provides a quick way to get a snapshot of the version
    info for Mixer.

Example snippet from /metrics:

# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 9.9604e-05
go_gc_duration_seconds{quantile="0.25"} 0.00012713
go_gc_duration_seconds{quantile="0.5"} 0.000288486
go_gc_duration_seconds{quantile="0.75"} 0.00030185
go_gc_duration_seconds{quantile="1"} 0.000349945
go_gc_duration_seconds_sum 0.001295973
go_gc_duration_seconds_count 6
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 2061
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 4.922944e+06
...

Example output from /version:

version: 0.1.1-2-gb35ce11 (build: 2017-05-12-b35ce11, status: Clean)

@douglas-reid douglas-reid added this to the mixer alpha milestone May 12, 2017
@istio-testing

Jenkins job mixer/presubmit passed

@istio-testing

Jenkins job mixer/manager-regression passed

@@ -89,6 +94,9 @@ func serverCmd(printf, fatalf shared.FormatFn) *cobra.Command {
}
serverCmd.PersistentFlags().Uint16VarP(&sa.port, "port", "p", 9091, "TCP port to use for Mixer's gRPC API")
serverCmd.PersistentFlags().Uint16VarP(&sa.configAPIPort, "configAPIPort", "", 9094, "HTTP port to use for Mixer's Configuration API")
serverCmd.PersistentFlags().Uint16Var(&sa.monitoringPort, "monitoringPort", 1337, "HTTP port to use for the exposing mixer self-monitoring information")
serverCmd.PersistentFlags().StringVar(&sa.metricsPath, "metricsPath", "/metrics", "Request path for metrics data for mixer self-monitoring")
serverCmd.PersistentFlags().StringVar(&sa.versionPath, "versionPath", "/version", "Request path for version info for mixer self-monitoring")
Contributor

I don't think all these paths should be configurable.
We may want the context root to be configurable, so "/health" or "/status",
and then we can have
/status/metrics and /status/version underneath it.

we should just use the statusz/healthz endpoint paths that we will eventually support.

Contributor Author

I don't think we know what paths we will eventually support. These don't have to have a common root. I can hardcode them for now, if that is preferable.

Contributor

+1 on hardcoding (also /metrics).

Contributor

hardcoded is good.

// is coming. that design will include proper coverage of statusz/healthz type
// functionality, in addition to how mixer reports its own metrics.
srvMux := http.NewServeMux()
srvMux.Handle(sa.metricsPath, promhttp.Handler())
Contributor

Does the default promhttp handler give us all the goodies we want?

Contributor Author

It uses the default gatherer, which includes two basic collectors for free:

NewProcessCollector returns a collector which exports the current state of process metrics including cpu, memory and file descriptor usage as well as the process start time for the given process id under the given namespace.

NewGoCollector returns a collector which exports metrics about the current go process.

I think this is good enough to start (especially for a bridge solution to a more full-fledged solution).

monitoring := &http.Server{Addr: fmt.Sprintf(":%d", sa.monitoringPort), Handler: srvMux}
printf("Starting self-monitoring on port %d", sa.monitoringPort)
go func() {
if monErr := monitoring.Serve(monitoringListener.(*net.TCPListener)); monErr != nil {
Contributor

Any reason why this cannot be glog.Fatal(monitoring.Serve(...)) or similar?

Contributor Author

no. but, do we want to make a failure to start the monitoring service fatal?

@@ -89,6 +94,9 @@ func serverCmd(printf, fatalf shared.FormatFn) *cobra.Command {
}
serverCmd.PersistentFlags().Uint16VarP(&sa.port, "port", "p", 9091, "TCP port to use for Mixer's gRPC API")
serverCmd.PersistentFlags().Uint16VarP(&sa.configAPIPort, "configAPIPort", "", 9094, "HTTP port to use for Mixer's Configuration API")
serverCmd.PersistentFlags().Uint16Var(&sa.monitoringPort, "monitoringPort", 1337, "HTTP port to use for the exposing mixer self-monitoring information")
serverCmd.PersistentFlags().StringVar(&sa.metricsPath, "metricsPath", "/metrics", "Request path for metrics data for mixer self-monitoring")
Contributor

can we use consecutive ports by default?

Contributor Author

we can use whatever ports we think are reasonable. is your suggestion:

  • grpc: 9091
  • config: 9092
  • monitoring: 9093

?

Contributor

Can we avoid adding a new port instead ? AFAIK grpc allows adding http handlers.

( I assume the config API will move/migrate to manager or run as separate service at some point)

Contributor Author

grpc-go reusing the same port is a mess. perhaps most troubling: grpc/grpc-go#586.

@istio-testing

Jenkins job mixer/e2e-smoketest passed

@costinm costinm left a comment

LGTM - few cosmetic comments.

"os"
"strings"
"time"

bt "github.com/opentracing/basictracer-go"
"github.com/prometheus/client_golang/prometheus/promhttp"
Contributor

can we also add _ "expvar" ? It doesn't hurt, and mostly free. Just needs the default http handler to be exposed.

Contributor Author

I've added it for now, though I am skeptical it provides more detail than the newly-exposed /metrics endpoint.
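
As a rough illustration of what the expvar suggestion buys, the sketch below relies on the standard library behavior that importing `expvar` registers a /debug/vars handler on `http.DefaultServeMux` and publishes `cmdline` and `memstats`. The `monitoring_requests` counter and the `debugVars` helper are hypothetical and only there to show a custom variable; a blank import (`_ "expvar"`) is enough when no custom variables are published.

```go
package main

import (
	"encoding/json"
	"expvar"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requests is a hypothetical counter published alongside the built-in variables.
var requests = expvar.NewInt("monitoring_requests")

// debugVars fetches /debug/vars from the default mux in-process and decodes
// the top-level JSON object into raw per-variable values.
func debugVars() (map[string]json.RawMessage, error) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/debug/vars", nil)
	http.DefaultServeMux.ServeHTTP(rec, req)

	vars := map[string]json.RawMessage{}
	err := json.Unmarshal(rec.Body.Bytes(), &vars)
	return vars, err
}

func main() {
	requests.Add(3)

	vars, err := debugVars()
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("monitoring_requests =", string(vars["monitoring_requests"]))
	_, ok := vars["memstats"]
	fmt.Println("memstats present:", ok)
}
```

Because the handler is attached to the default mux, it is only reachable on a server that serves `http.DefaultServeMux`, which is the point of "just needs the default http handler to be exposed".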

// is coming. that design will include proper coverage of statusz/healthz type
// functionality, in addition to how mixer reports its own metrics.
srvMux := http.NewServeMux()
srvMux.Handle(sa.metricsPath, promhttp.Handler())
Contributor

Curious - any reason we're not using the default mux ?

Contributor Author

not particularly. updating to use default.

@douglas-reid

PTAL. I believe I have addressed most of the review comments.

Changes:

  • use port 9093
  • expose /debug/vars via expvar
  • use default mux
  • hard-coded paths for /metrics and /version

@istio-testing

Jenkins job mixer/presubmit passed

@mandarjog mandarjog left a comment

LGTM.

@istio-testing

Jenkins job mixer/manager-regression passed

@istio-testing

Jenkins job mixer/e2e-smoketest passed

@douglas-reid douglas-reid merged commit 4dfa637 into istio:master May 12, 2017