Support partial TLS negotiation probing (server cert only) #223

Closed
marcan opened this Issue Sep 11, 2017 · 5 comments

marcan commented Sep 11, 2017

I have a service provided over TLS that requires client cert authentication to connect. I want to monitor that the TCP port is up and that the service is serving a valid, unexpired server certificate (and how much time is left until expiry). However, my prober has no business actually having a valid client cert to complete the key exchange and be able to issue application layer requests.

Since the server cert is transferred before the client cert, and, in fact, blackbox_exporter can already tell the difference between an invalid server cert (Error dialing TCP: x509: certificate signed by unknown authority) and just the missing client cert later in the exchange (Error dialing TCP: remote error: tls: handshake failure), it should be possible to implement an option to consider the probe successful as long as the server cert has been received and validated, regardless of whether the full handshake completes or not.

Member

brian-brazil commented Sep 11, 2017

If your application has conflated authentication and authorisation, that's the thing to look at. I'd suggest providing a valid cert that doesn't have permission to send requests.

marcan commented Sep 13, 2017

That involves actually having an authorization system. Which is great and all, but adding one just for the sake of prometheus seems somewhat overkill.

FWIW, this is ElasticSearch using search-guard-ssl. Adding the full-blown search-guard on top would let me do what you suggest by configuring a DN check (though I'm not even sure whether the authorization check happens during TLS negotiation on the node port or after; if it happens during the negotiation, that doesn't buy me anything). Right now the premise is simply that the cluster nodes need to trust each other, and they do that by trusting certs from a given (sub)CA. There are no worthwhile authorization levels to consider; cluster nodes are equal peers and can do everything, and regular clients go through a frontend with an unrelated auth mechanism.

So while you have a point for more complex systems, there's an argument for supporting simpler setups where you either have a client cert or you don't.

The obvious workaround here would be to run an instance of the exporter on the actual ES machine(s) and monitor that, so the client cert never leaves those servers, but that feels like a bit of a hack.

Member

brian-brazil commented Sep 13, 2017

The presumption is that the ssl expiry is a mere side effect of the actual blackbox test you're performing, and you're going to need working client auth to perform that test.

marcan commented Sep 13, 2017

In that case I guess what I'm really asking for is an SSL-only test, only for the sake of checking the cert expiry. The actual service monitoring in this scenario would involve blackbox monitoring of the client interface (which is separate) plus whitebox monitoring of cluster status. What I'm trying to do here is just make sure the internal cluster certs don't expire, without otherwise interacting with that side of the service.

Addendum: I guess another alternative here would be to use a push gateway to export the last successful cert renewal as part of the renewal process, instead of trying to probe the result, but that doesn't catch situations where the cert isn't reloaded properly.

Member

brian-brazil commented Sep 13, 2017

Looking at the API, we can't do this with Go. Once the TLS connection has errored we don't get back a connection object from which we could pull this information. Particular errors are also opaque to us, and I wouldn't want to depend on undocumented implementation details.
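
For anyone landing on this thread later: one avenue that might sidestep the missing connection object is the `VerifyPeerCertificate` hook on Go's `tls.Config` (available since Go 1.8), which runs once the server chain has been verified and before the handshake finishes. What follows is only a sketch under that assumption, not something blackbox_exporter is known to implement. `probeServerCert` and `startMTLSServer` are hypothetical names, and the local listener with a throwaway self-signed cert merely stands in for a service that demands client certs.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// probeServerCert (hypothetical helper) returns the server leaf cert if it was
// verified against roots, plus whatever error the handshake itself produced.
func probeServerCert(addr string, roots *x509.CertPool) (*x509.Certificate, error) {
	var leaf *x509.Certificate
	cfg := &tls.Config{
		RootCAs: roots,
		// Invoked only after Go's normal chain and hostname checks have passed.
		VerifyPeerCertificate: func(_ [][]byte, chains [][]*x509.Certificate) error {
			if len(chains) > 0 && len(chains[0]) > 0 {
				leaf = chains[0][0]
			}
			return nil
		},
	}
	// With TLS 1.3 a client-cert rejection may only surface on a later read,
	// so err can be nil here even though the server refuses the client.
	conn, err := tls.Dial("tcp", addr, cfg)
	if err == nil {
		conn.Close()
	}
	return leaf, err
}

// startMTLSServer (test fixture, an assumption of this sketch) listens on
// localhost with a throwaway self-signed cert and demands a client cert.
func startMTLSServer() (string, *x509.CertPool, func(), error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return "", nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "test-node"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
		IsCA:                  true,
		IPAddresses:           []net.IP{net.ParseIP("127.0.0.1")},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return "", nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return "", nil, nil, err
	}
	roots := x509.NewCertPool()
	roots.AddCert(cert)
	ln, err := tls.Listen("tcp", "127.0.0.1:0", &tls.Config{
		Certificates: []tls.Certificate{{Certificate: [][]byte{der}, PrivateKey: key}},
		ClientAuth:   tls.RequireAnyClientCert,
	})
	if err != nil {
		return "", nil, nil, err
	}
	go func() {
		for {
			c, err := ln.Accept()
			if err != nil {
				return // listener closed
			}
			c.(*tls.Conn).Handshake() // fails: the prober sends no client cert
			c.Close()
		}
	}()
	return ln.Addr().String(), roots, func() { ln.Close() }, nil
}

func main() {
	addr, roots, stop, err := startMTLSServer()
	if err != nil {
		panic(err)
	}
	defer stop()
	leaf, hsErr := probeServerCert(addr, roots)
	if leaf == nil {
		panic(fmt.Sprint("server cert not validated: ", hsErr))
	}
	fmt.Println("valid for:", time.Until(leaf.NotAfter).Round(time.Hour), "handshake error:", hsErr)
}
```

A non-nil `leaf` means the server cert was received and validated, so `time.Until(leaf.NotAfter)` gives the remaining validity whether or not the returned error is nil. The hook is not invoked on resumed sessions, which is why the sketch leaves `ClientSessionCache` unset.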
