Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow COSAgentProvider to scrape public TLS metrics endpoint #211

Closed
dragomirp opened this issue Jun 19, 2023 · 6 comments · Fixed by #246
Closed

Allow COSAgentProvider to scrape public TLS metrics endpoint #211

dragomirp opened this issue Jun 19, 2023 · 6 comments · Fixed by #246

Comments

@dragomirp
Copy link
Contributor

Enhancement Proposal

At the moment the machine grafana-agent subordinate charm assumes metrics endpoints would be served on localhost and does not expose scheme or tls_config for the endpoint. This can make it difficult to scrape apps that provide their own endpoints.

In the Postgresql VM charm we use Patroni to manage the Postgresql cluster, which provides metrics as part of its REST API, however it is all served as a single endpoint. Exposing the API with TLS means also encrypting the metrics endpoint. The certificate used can also be self-signed.

To be able to consume that endpoint, we would need to be able to set the scheme for connection and either be able to use the host for which the certificate was signed or configure the agent to ignore the validation errors.

@sed-i
Copy link
Contributor

sed-i commented Jun 21, 2023

Hi @dragomirp,
Do you mean that your app has multiple units, and only one of the units can be scraped?
A diagram might help :)

@dragomirp
Copy link
Contributor Author

Hi, there's just a single unit:

graph TD
subgraph "Machine 0"
subgraph "Subordinates 0"
grafana-agent/0
end
postgresql/0 ---|cos-agent:1| grafana-agent/0
end

but it has multiple endpoints inside:

graph TD

subgraph postgresql/0
    pg["prometheus-postgres-exporter
    :9187/metrics"]
    patroni["patroni 
    :8008/metrics"]
end
  • prometheus-postgres-exporter is an external exporter for Postgresql, which we can serve locally for the agent
  • patroni serves a REST API including a metrics endpoint, which we need to expose and which should be encrypted, if certificates are available

If the patroni API is unencrypted, we can serve it on localhost and scrape it with the agent. If we enable TLS, we need means to communicate that to the agent.

@sed-i
Copy link
Contributor

sed-i commented Jun 22, 2023

Let's take zookeeper as reference:

        self._grafana_agent = COSAgentProvider(
            self,
            metrics_endpoints=[
                {"path": "/metrics", "port": JMX_PORT},
                {"path": "/metrics", "port": METRICS_PROVIDER_PORT},
            ],
            metrics_rules_dir="./src/alert_rules/prometheus",
            logs_rules_dir="./src/alert_rules/loki",
            log_slots=["charmed-zookeeper:logs"],
        )

Are you asking for a functionality along the lines of:

            metrics_endpoints=[
-               {"path": "/metrics", "port": JMX_PORT},
+               {"path": "/metrics", "port": JMX_PORT, "scheme": "http"},
-               {"path": "/metrics", "port": METRICS_PROVIDER_PORT},
+               {"path": "/metrics", "port": METRICS_PROVIDER_PORT, "scheme": "https"},
            ],

?

@dragomirp
Copy link
Contributor Author

Hi, we'll also need to handle the verification of the certificate. I was thinking something in the lines of:

            metrics_endpoints=[
                {"path": "/metrics", "port": JMX_PORT},
-               {"path": "/metrics", "port": METRICS_PROVIDER_PORT},
+               {"path": "/metrics", "port": METRICS_PROVIDER_PORT, "scheme": "https", "skip_verify": True},
            ],

That way we can expose the endpoint on localhost and ignore the certificate validation or:

            metrics_endpoints=[
                {"path": "/metrics", "port": JMX_PORT},
-               {"path": "/metrics", "port": METRICS_PROVIDER_PORT},
+               {"path": "/metrics", "port": METRICS_PROVIDER_PORT, "scheme": "https", "host": "host-with-cert.com"},
            ],

So that we can scrape the host that the certificate was issued for. Being able to set tls_config.ca_file here would also be helpful, but it should be a viable path even without it (adding the CA cert to the system store should work).

@sed-i
Copy link
Contributor

sed-i commented Jun 22, 2023

Got it! Added to the tracker.

@simskij
Copy link
Member

simskij commented Jul 4, 2023

Skip verify sounds like the more reasonable option for localhost connections.

I'd rather see we do that than open up for the possibility of someone registering arbitrary scrape jobs for whatever remote server using a FQDN/url - which would be bad.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants