[ADDED] Making monitoring endpoints available via system services. #1362

Merged
merged 1 commit into from
May 7, 2020


matthiashanel
Contributor

I am exposing these so that there are not two different code paths and so that enhancements to one are available in the other as well.

I'm also establishing this as the pattern so that we can inspect server state via HTTP as well as NATS.

Available via $SYS.REQ.SERVER.%s.%s and $SYS.REQ.SERVER.PING.%s
Last token is the endpoint name.
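
As a minimal sketch of how a client might build these subjects: the two format strings are the ones quoted above, while the constant and function names are made up for illustration. Upper-casing the endpoint is an assumption based on the sample requests in this PR, which all use upper-case names.

```go
package main

import (
	"fmt"
	"strings"
)

// Subject templates quoted in the PR description.
// The constant names are hypothetical, not taken from the server code.
const (
	serverDirectReqSubj = "$SYS.REQ.SERVER.%s.%s"
	serverPingReqSubj   = "$SYS.REQ.SERVER.PING.%s"
)

// directSubject addresses one server by its ID; the last token is the endpoint name.
func directSubject(serverID, endpoint string) string {
	return fmt.Sprintf(serverDirectReqSubj, serverID, strings.ToUpper(endpoint))
}

// pingSubject builds the PING form, which carries no server ID.
func pingSubject(endpoint string) string {
	return fmt.Sprintf(serverPingReqSubj, strings.ToUpper(endpoint))
}

func main() {
	fmt.Println(pingSubject("varz"))
	fmt.Println(directSubject("NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2", "connz"))
}
```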

Since this lives in system events, I favored less code and therefore used closures rather than avoiding them.
This can change if you want it to.
Input can be empty or a JSON object of the corresponding options.
The reply always includes server; error and data are mutually exclusive.
I can rename data to something more specific (connz/subsz/...) as well.
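
A client-side view of that reply shape could look roughly like the sketch below. The type and function names are hypothetical; the JSON keys are taken from the sample replies in this PR. Keeping data and error as raw JSON is an illustrative choice so the same envelope works for every endpoint.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// serverInfo mirrors part of the "server" object seen in the sample replies.
type serverInfo struct {
	Name string `json:"name"`
	Host string `json:"host"`
	ID   string `json:"id"`
	Ver  string `json:"ver"`
	Seq  uint64 `json:"seq"`
}

// envelope is a hypothetical client-side type, not the server's own.
type envelope struct {
	Server serverInfo      `json:"server"`
	Data   json.RawMessage `json:"data,omitempty"`
	Error  json.RawMessage `json:"error,omitempty"`
}

// decodeReply parses a reply and enforces that exactly one of data/error is present.
func decodeReply(raw []byte) (*envelope, error) {
	var e envelope
	if err := json.Unmarshal(raw, &e); err != nil {
		return nil, err
	}
	hasData, hasErr := len(e.Data) > 0, len(e.Error) > 0
	if hasData == hasErr {
		return nil, errors.New("reply must contain exactly one of data or error")
	}
	return &e, nil
}

func main() {
	e, err := decodeReply([]byte(`{"data":{"num_routes":0},"server":{"name":"test","seq":42}}`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("server=%s seq=%d data=%s\n", e.Server.Name, e.Server.Seq, e.Data)
}
```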

connz and subsz have a limit option. Limit and offset are honored by this.
We can consider adding code to return a chunked response,
but I would only do that in addition to, not instead of, the current behavior.
Existing tooling such as nats top should be easily adaptable.
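
Until a chunked response exists, a caller could page through results itself. The sketch below only computes the follow-up request payloads. It assumes the request options use the same `offset` and `limit` keys that appear in the replies, and that `total` counts all matching entries rather than just the ones returned; neither is confirmed by this PR. `pageRequest` and `nextPage` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pageRequest holds only the two paging options named above.
// The JSON keys are assumed to match the reply fields.
type pageRequest struct {
	Offset int `json:"offset"`
	Limit  int `json:"limit"`
}

// nextPage returns the request for the page after (offset, limit),
// or false when the current page already reaches total.
func nextPage(total, offset, limit int) (pageRequest, bool) {
	next := offset + limit
	if limit <= 0 || next >= total {
		return pageRequest{}, false
	}
	return pageRequest{Offset: next, Limit: limit}, true
}

func main() {
	// Pretend a CONNZ reply reported total=2500, offset=0, limit=1024.
	total, offset, limit := 2500, 0, 1024
	for {
		req, more := nextPage(total, offset, limit)
		if !more {
			break
		}
		payload, _ := json.Marshal(req)
		fmt.Println(string(payload)) // payload for the next request
		offset = req.Offset
	}
}
```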

We should also consider adding cluster subjects, for example:
fmt.Sprintf("$SYS.REQ.CLUSTER.%s.PING.%s", s.info.Cluster, name)
That way we can query one particular cluster by name.
The cluster name is only set when using gateways.
Maybe we should allow it to be specified even when gateways are not used, so that a request can refer to the same group of servers. Opinions?

Signed-off-by: Matthias Hanel <mh@synadia.com>

Sending a request for each type. (ping and by server id)

> for OP in VARZ SUBSZ CONNZ ROUTEZ GATEWAYZ LEAFZ ; do nats -s nats://admin:changeit@localhost:4222 req "\$SYS.REQ.SERVER.PING.$OP" {} ; done
18:33:10 Sending request on [$SYS.REQ.SERVER.PING.VARZ]
18:33:10 Received on [_INBOX.SS5OqtBx55e2d2iexa8a9l]: '{
  "data": {
    "server_id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "server_name": "test",
    "version": "2.1.6",
    "proto": 1,
    "go": "go1.13.6",
    "host": "localhost",
    "port": 4222,
    "auth_required": true,
    "tls_required": true,
    "max_connections": 65536,
    "ping_interval": 120000000000,
    "ping_max": 2,
    "http_host": "localhost",
    "http_port": 8000,
    "https_port": 0,
    "auth_timeout": 1,
    "max_control_line": 4096,
    "max_payload": 1048576,
    "max_pending": 67108864,
    "cluster": {},
    "gateway": {},
    "leaf": {},
    "tls_timeout": 0.5,
    "write_deadline": 2000000000,
    "start": "2020-04-29T18:31:59.406943-04:00",
    "now": "2020-04-29T18:33:10.878842-04:00",
    "uptime": "1m11s",
    "mem": 27062272,
    "cores": 16,
    "gomaxprocs": 16,
    "cpu": 1.7,
    "connections": 1,
    "total_connections": 3,
    "routes": 0,
    "remotes": 0,
    "leafnodes": 0,
    "in_msgs": 2,
    "out_msgs": 5,
    "in_bytes": 4,
    "out_bytes": 2087,
    "slow_consumers": 0,
    "subscriptions": 0,
    "http_req_stats": {
      "/": 0,
      "/connz": 0,
      "/gatewayz": 0,
      "/routez": 0,
      "/subsz": 0,
      "/varz": 0
    },
    "config_load_time": "2020-04-29T18:31:59.406943-04:00"
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "ver": "2.1.6",
    "seq": 24,
    "time": "2020-04-29T18:33:10.879006-04:00"
  }
}'
18:33:11 Sending request on [$SYS.REQ.SERVER.PING.SUBSZ]
18:33:11 Received on [_INBOX.yL7oax9OxmozjgVMJU6Zn8]: '{
  "data": {
    "num_subscriptions": 0,
    "num_cache": 0,
    "num_inserts": 0,
    "num_removes": 0,
    "num_matches": 0,
    "cache_hit_rate": 0,
    "max_fanout": 0,
    "avg_fanout": 0,
    "total": 0,
    "offset": 0,
    "limit": 1024
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "ver": "2.1.6",
    "seq": 30,
    "time": "2020-04-29T18:33:11.031423-04:00"
  }
}'
18:33:11 Sending request on [$SYS.REQ.SERVER.PING.CONNZ]
18:33:11 Received on [_INBOX.D57nEcf86rSQ4rwwyYjAHl]: '{
  "data": {
    "server_id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "now": "2020-04-29T18:33:11.179929-04:00",
    "num_connections": 1,
    "total": 1,
    "offset": 0,
    "limit": 1024,
    "connections": [
      {
        "cid": 6,
        "ip": "127.0.0.1",
        "port": 62922,
        "start": "2020-04-29T18:33:11.045627-04:00",
        "last_activity": "2020-04-29T18:33:11.179122-04:00",
        "rtt": "133.491852ms",
        "uptime": "0s",
        "idle": "0s",
        "pending_bytes": 0,
        "in_msgs": 0,
        "out_msgs": 0,
        "in_bytes": 0,
        "out_bytes": 0,
        "subscriptions": 1,
        "name": "NATS CLI",
        "lang": "go",
        "version": "1.9.2",
        "tls_version": "1.3",
        "tls_cipher_suite": "TLS_AES_128_GCM_SHA256"
      }
    ]
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "ver": "2.1.6",
    "seq": 36,
    "time": "2020-04-29T18:33:11.179998-04:00"
  }
}'
18:33:11 Sending request on [$SYS.REQ.SERVER.PING.ROUTEZ]
18:33:11 Received on [_INBOX.RVcYYixEAUwgoJ9TLV0gWt]: '{
  "data": {
    "server_id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "now": "2020-04-29T18:33:11.335843-04:00",
    "num_routes": 0,
    "routes": []
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "ver": "2.1.6",
    "seq": 42,
    "time": "2020-04-29T18:33:11.33592-04:00"
  }
}'
18:33:11 Sending request on [$SYS.REQ.SERVER.PING.GATEWAYZ]
18:33:11 Received on [_INBOX.QykAECl3RrYA4ee4lS06yO]: '{
  "data": {
    "server_id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "now": "2020-04-29T18:33:11.481966-04:00",
    "outbound_gateways": {},
    "inbound_gateways": {}
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "ver": "2.1.6",
    "seq": 48,
    "time": "2020-04-29T18:33:11.482009-04:00"
  }
}'
18:33:11 Sending request on [$SYS.REQ.SERVER.PING.LEAFZ]
18:33:11 Received on [_INBOX.h2WyvixEei2vs7L4Ufkedb]: '{
  "data": {
    "server_id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "now": "2020-04-29T18:33:11.625699-04:00",
    "leafnodes": 0,
    "leafs": null
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "ver": "2.1.6",
    "seq": 54,
    "time": "2020-04-29T18:33:11.625765-04:00"
  }
}'
> for OP in VARZ SUBSZ CONNZ ROUTEZ GATEWAYZ LEAFZ ; do nats -s nats://admin:changeit@localhost:4222 req "\$SYS.REQ.SERVER.NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2.$OP" {} ; done
18:33:19 Sending request on [$SYS.REQ.SERVER.NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2.VARZ]
18:33:19 Received on [_INBOX.SPWCWm4ADhYCtlIc5LUZxZ]: '{
  "data": {
    "server_id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "server_name": "test",
    "version": "2.1.6",
    "proto": 1,
    "go": "go1.13.6",
    "host": "localhost",
    "port": 4222,
    "auth_required": true,
    "tls_required": true,
    "max_connections": 65536,
    "ping_interval": 120000000000,
    "ping_max": 2,
    "http_host": "localhost",
    "http_port": 8000,
    "https_port": 0,
    "auth_timeout": 1,
    "max_control_line": 4096,
    "max_payload": 1048576,
    "max_pending": 67108864,
    "cluster": {},
    "gateway": {},
    "leaf": {},
    "tls_timeout": 0.5,
    "write_deadline": 2000000000,
    "start": "2020-04-29T18:31:59.406943-04:00",
    "now": "2020-04-29T18:33:19.619454-04:00",
    "uptime": "1m20s",
    "mem": 38457344,
    "cores": 16,
    "gomaxprocs": 16,
    "cpu": 1.6,
    "connections": 1,
    "total_connections": 9,
    "routes": 0,
    "remotes": 0,
    "leafnodes": 0,
    "in_msgs": 8,
    "out_msgs": 17,
    "in_bytes": 16,
    "out_bytes": 6328,
    "slow_consumers": 0,
    "subscriptions": 0,
    "http_req_stats": {
      "/": 0,
      "/connz": 0,
      "/gatewayz": 0,
      "/routez": 0,
      "/subsz": 0,
      "/varz": 0
    },
    "config_load_time": "2020-04-29T18:31:59.406943-04:00"
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "ver": "2.1.6",
    "seq": 60,
    "time": "2020-04-29T18:33:19.619564-04:00"
  }
}'
18:33:19 Sending request on [$SYS.REQ.SERVER.NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2.SUBSZ]
18:33:19 Received on [_INBOX.ewvigF6uqdzoKkYgLxnmPB]: '{
  "data": {
    "num_subscriptions": 0,
    "num_cache": 0,
    "num_inserts": 0,
    "num_removes": 0,
    "num_matches": 0,
    "cache_hit_rate": 0,
    "max_fanout": 0,
    "avg_fanout": 0,
    "total": 0,
    "offset": 0,
    "limit": 1024
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "ver": "2.1.6",
    "seq": 66,
    "time": "2020-04-29T18:33:19.765476-04:00"
  }
}'
18:33:19 Sending request on [$SYS.REQ.SERVER.NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2.CONNZ]
18:33:19 Received on [_INBOX.Ip0YeCZpRZg7aEmpAMYaZo]: '{
  "data": {
    "server_id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "now": "2020-04-29T18:33:19.906531-04:00",
    "num_connections": 1,
    "total": 1,
    "offset": 0,
    "limit": 1024,
    "connections": [
      {
        "cid": 12,
        "ip": "127.0.0.1",
        "port": 62944,
        "start": "2020-04-29T18:33:19.778554-04:00",
        "last_activity": "2020-04-29T18:33:19.905604-04:00",
        "rtt": "127.046273ms",
        "uptime": "0s",
        "idle": "0s",
        "pending_bytes": 0,
        "in_msgs": 0,
        "out_msgs": 0,
        "in_bytes": 0,
        "out_bytes": 0,
        "subscriptions": 1,
        "name": "NATS CLI",
        "lang": "go",
        "version": "1.9.2",
        "tls_version": "1.3",
        "tls_cipher_suite": "TLS_AES_128_GCM_SHA256"
      }
    ]
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "ver": "2.1.6",
    "seq": 72,
    "time": "2020-04-29T18:33:19.906663-04:00"
  }
}'
18:33:20 Sending request on [$SYS.REQ.SERVER.NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2.ROUTEZ]
18:33:20 Received on [_INBOX.11pWACw1iUgZlT8boszdVA]: '{
  "data": {
    "server_id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "now": "2020-04-29T18:33:20.047677-04:00",
    "num_routes": 0,
    "routes": []
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "ver": "2.1.6",
    "seq": 78,
    "time": "2020-04-29T18:33:20.047707-04:00"
  }
}'
18:33:20 Sending request on [$SYS.REQ.SERVER.NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2.GATEWAYZ]
18:33:20 Received on [_INBOX.739iyqVQuWx46buCNfl64s]: '{
  "data": {
    "server_id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "now": "2020-04-29T18:33:20.193264-04:00",
    "outbound_gateways": {},
    "inbound_gateways": {}
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "ver": "2.1.6",
    "seq": 84,
    "time": "2020-04-29T18:33:20.1933-04:00"
  }
}'
18:33:20 Sending request on [$SYS.REQ.SERVER.NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2.LEAFZ]
18:33:20 Received on [_INBOX.rtcOj3T20U1M0aPDcoTp8y]: '{
  "data": {
    "server_id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "now": "2020-04-29T18:33:20.337973-04:00",
    "leafnodes": 0,
    "leafs": null
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NAYGWE6TJPSR3U4YYR3G6IILZKDP6CA2J4SLKSC3YD6XDVULFKEJLUO2",
    "ver": "2.1.6",
    "seq": 90,
    "time": "2020-04-29T18:33:20.337999-04:00"
  }
}'
>

@ripienaar
Contributor

This is nice, but these endpoints over HTTP have many more options. Should we consider accepting a JSON blob of options?

@derekcollison
Member

We should support the options for sure. Could it be a generic string that is the URL args?

REQ .CONNZ subs=1&sort=id

@matthiashanel
Contributor Author

matthiashanel commented Apr 30, 2020

@ripienaar @derekcollison are there options not covered by the options struct?
As far as my code goes, all options covered by an options struct are supported.
Take this code, where I pass a pointer to VarzOptions as well as a function that calls Varz with that pointer.

"VARZ": func(sub *subscription, _ *client, subject, reply string, msg []byte) {
	optz := &VarzOptions{}
	s.zReq(reply, msg, optz, func() (interface{}, error) { return s.Varz(optz) })
},

zReq parses the message into optz and calls the closure, which is also aware of optz.
This is also the reason why I modified GatewayzOptions to include json tags.
If you want me to move away from the closure, I'll probably go down the framework route and use a type cast to get to a similar level of compactness.
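
To make the closure pattern concrete outside the server, here is a stripped-down sketch. It is not the PR's code: `connzOptions`, `handlers`, and this `zReq` signature are simplified stand-ins with no reply subject or server state, and the closure echoes the parsed option where a real handler would call something like `s.Connz(optz)`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// connzOptions is a stand-in for a real options struct; only one field is modelled.
type connzOptions struct {
	SubscriptionsDetail bool `json:"subscriptions_detail"`
}

// zReq is a simplified, hypothetical version of the helper described above:
// it fills opts from msg (when msg is non-empty) and then runs respf,
// which closes over the same opts pointer.
func zReq(msg []byte, opts interface{}, respf func() (interface{}, error)) (interface{}, error) {
	if len(msg) != 0 {
		if err := json.Unmarshal(msg, opts); err != nil {
			return nil, err
		}
	}
	return respf()
}

// handlers maps endpoint names to closures, similar in spirit to the "VARZ" entry above.
func handlers() map[string]func(msg []byte) (interface{}, error) {
	return map[string]func(msg []byte) (interface{}, error){
		"CONNZ": func(msg []byte) (interface{}, error) {
			optz := &connzOptions{}
			return zReq(msg, optz, func() (interface{}, error) {
				// By the time this runs, optz has been populated by zReq.
				return fmt.Sprintf("detail=%v", optz.SubscriptionsDetail), nil
			})
		},
	}
}

func main() {
	out, err := handlers()["CONNZ"]([]byte(`{"subscriptions_detail":true}`))
	fmt.Println(out, err)
}
```

The point of the pattern is that the parsing helper never needs to know the concrete options type: it only sees an `interface{}` to unmarshal into, while the closure keeps the typed pointer.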

Example with content:

nats -s nats://admin:changeit@localhost:4222 req "\$SYS.REQ.SERVER.PING.CONNZ" '{"subscriptions_detail":true}'
11:34:23 Sending request on [$SYS.REQ.SERVER.PING.CONNZ]
11:34:23 Received on [_INBOX.ZIvqhEeeivfaEORD73mg1h]: '{
  "data": {
    "server_id": "NAWUCH2EHBI3FEECN65TA2TYNDIMMFTDB2S5WKCRLLEWWEXEE4ZSI5CL",
    "now": "2020-04-30T11:34:23.132473-04:00",
    "num_connections": 1,
    "total": 1,
    "offset": 0,
    "limit": 1024,
    "connections": [
      {
        "cid": 4,
        "ip": "127.0.0.1",
        "port": 51569,
        "start": "2020-04-30T11:34:23.010664-04:00",
        "last_activity": "2020-04-30T11:34:23.131953-04:00",
        "rtt": "121.288349ms",
        "uptime": "0s",
        "idle": "0s",
        "pending_bytes": 0,
        "in_msgs": 0,
        "out_msgs": 0,
        "in_bytes": 0,
        "out_bytes": 0,
        "subscriptions": 1,
        "name": "NATS CLI",
        "lang": "go",
        "version": "1.9.2",
        "tls_version": "1.3",
        "tls_cipher_suite": "TLS_AES_128_GCM_SHA256",
        "subscriptions_list_detail": [
          {
            "subject": "_INBOX.ZIvqhEeeivfaEORD73mg1h",
            "sid": "1",
            "msgs": 0,
            "max": 1,
            "cid": 4
          }
        ]
      }
    ]
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NAWUCH2EHBI3FEECN65TA2TYNDIMMFTDB2S5WKCRLLEWWEXEE4ZSI5CL",
    "cluster": "name",
    "ver": "2.1.6",
    "seq": 28,
    "time": "2020-04-30T11:34:23.132527-04:00"
  }
}'

@ripienaar
Contributor

Ah, I missed that you already have a JSON payload there. Then it's good. Sorry.

Member

@kozlovic kozlovic left a comment

LGTM!

server/events.go Outdated Show resolved Hide resolved
@matthiashanel
Contributor Author

matthiashanel commented May 2, 2020

Changed the error return to include an HTTP status and thus match the JetStream API ADR.
Will address the JSON type in a separate change covering all of system events.

nats -s nats://admin:changeit@localhost:4222 req "\$SYS.REQ.SERVER.PING.CONNZ" '{'
16:57:33 Sending request on [$SYS.REQ.SERVER.PING.CONNZ]
16:57:33 Received on [_INBOX.ah3hEoN9G7tRlX5u6MQJSq]: '{
  "error": {
    "code": 400,
    "description": "unexpected end of JSON input"
  },
  "server": {
    "name": "test",
    "host": "localhost",
    "id": "NBROAWUWMFTEYDK6M4FNFZUKRGWX4JSLF7DAXVVJPHICAPMASEHM3RPL",
    "cluster": "name",
    "ver": "2.1.6",
    "seq": 9,
    "time": "2020-05-02T16:57:33.828658-04:00"
  }
}'

Member

@kozlovic kozlovic left a comment

Seems you set the status to internal error regardless of the err value.

}
if err == nil {
response["data"], err = respf()
status = http.StatusInternalServerError
Member

This seems weird to me. You set it regardless of the err result.

Contributor Author

Because status is only returned when there is an error, I can basically set it all the time, essentially keeping track of how far I got. But I will change it.

Member

Right, hence comment above when doing the unmarshal.
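
One way the status handling discussed in this thread could end up is sketched below: the status is assigned only on the branch where an error actually occurred. This is a hypothetical reconstruction, not the merged code. `buildResponse`, `apiError`, and `errBoom` are illustrative names, and the "server" field is left out for brevity.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// apiError mirrors the "error" object shown in the sample error reply.
type apiError struct {
	Code        int    `json:"code"`
	Description string `json:"description"`
}

// errBoom is a demo error used to simulate a failing endpoint.
var errBoom = errors.New("boom")

// buildResponse sets a status only when the corresponding step failed.
func buildResponse(msg []byte, opts interface{}, respf func() (interface{}, error)) map[string]interface{} {
	response := map[string]interface{}{}
	status := 0
	var err error
	if len(msg) != 0 {
		if err = json.Unmarshal(msg, opts); err != nil {
			status = http.StatusBadRequest // bad input from the requester
		}
	}
	if err == nil {
		var data interface{}
		if data, err = respf(); err != nil {
			status = http.StatusInternalServerError // endpoint itself failed
		} else {
			response["data"] = data
		}
	}
	if err != nil {
		response["error"] = apiError{Code: status, Description: err.Error()}
	}
	return response
}

func main() {
	ok := func() (interface{}, error) { return "fine", nil }
	fail := func() (interface{}, error) { return nil, errBoom }
	fmt.Println(buildResponse([]byte("{"), &struct{}{}, ok))
	fmt.Println(buildResponse(nil, &struct{}{}, fail))
	fmt.Println(buildResponse(nil, &struct{}{}, ok))
}
```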

Member

@kozlovic kozlovic left a comment

LGTM

Available via $SYS.REQ.SERVER.%s.%s and $SYS.REQ.SERVER.PING.%s
Last token is the endpoint name.

Signed-off-by: Matthias Hanel <mh@synadia.com>
@matthiashanel
Contributor Author

@derekcollison, go / no-go / defer to Ivan?

@derekcollison
Member

I will defer to @kozlovic. Thanks for the ping.


4 participants