Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Added PostgreSQL RFC #3612

Merged
merged 11 commits into from
Aug 31, 2020
Merged

chore: Added PostgreSQL RFC #3612

merged 11 commits into from
Aug 31, 2020

Conversation

jamtur01
Copy link
Contributor

@jamtur01 jamtur01 commented Aug 28, 2020

Closes #3603

Signed-off-by: James Turnbull <james@lovedthanlost.net>
@jamtur01 jamtur01 added domain: metrics Anything related to Vector's metrics events domain: rfcs labels Aug 28, 2020
Signed-off-by: James Turnbull <james@lovedthanlost.net>
Signed-off-by: James Turnbull <james@lovedthanlost.net>
Signed-off-by: James Turnbull <james@lovedthanlost.net>
@jamtur01 jamtur01 requested a review from bruceg August 28, 2020 01:18
Signed-off-by: James Turnbull <james@lovedthanlost.net>
Signed-off-by: James Turnbull <james@lovedthanlost.net>
Copy link
Member

@jszwedko jszwedko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! just some questions around metric naming mostly.

```toml
[sources.my_source_id]
type = "postgresql_metrics" # required
address = "postgres://postgres@localhost" # required - address of the PG server.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noting we'll probably want to call this endpoint per #3590

We've also leaned towards allowing multiple for sources (#3519 (comment)) so this would be endpoints.

This does make the databases config a bit weird when used with multiple dbs. You could just have the database in the URL ala JDBC to allow for things like postgres://postgres@localhost/database

If we do that, we'll probably also want to label metrics with the endpoint and host to allow differentiation downstream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a bit messy as a UI. :( And adding the DB on the URL only solves for one DB per connection, what if I want two?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it does feel messier. You could just specify the URL twice if you wanted two databases (or maybe do commas?), but I can see this would be weird for the common case of just wanting to monitor all databases on a given server. To that end, we should probably make it easy to just monitor all database (or provide a list of exclusions even). I think it's probably pretty common that I'd want to monitor all of them aside from the template* ones.

I think we should come up with a rule of thumb for when we should do single vs. multiple endpoints for a given source though; to ease the decision in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm I've always assumed you'd specific the source twice for separate servers... Specifying the connection URL twice will double (or more for other DBs) the connections and mean multiple collections on a single DB server. I don't think that's a good performance experience for folks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I definitely agree in this case.

I'm wondering what you think about the apache_metrics case though. I did change it to endpoints in response to #3519 (comment) . The prometheus scraper also allows multiple endpoints.

Copy link
Contributor Author

@jamtur01 jamtur01 Aug 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like Apache and PG are differently architected and hence monitored differently. Apache is a single server you're monitoring. Monitoring PG is monitoring one or more/all databases in a PG server. That additional level of granularity makes endpoints odd to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reverted this to endpoint for now. We can discuss.

rfcs/2020-08-27-3603-postgres-metrics.md Outdated Show resolved Hide resolved
rfcs/2020-08-27-3603-postgres-metrics.md Outdated Show resolved Hide resolved
rfcs/2020-08-27-3603-postgres-metrics.md Outdated Show resolved Hide resolved
rfcs/2020-08-27-3603-postgres-metrics.md Outdated Show resolved Hide resolved
rfcs/2020-08-27-3603-postgres-metrics.md Show resolved Hide resolved
rfcs/2020-08-27-3603-postgres-metrics.md Show resolved Hide resolved
Signed-off-by: James Turnbull <james@lovedthanlost.net>
Copy link
Member

@lukesteensen lukesteensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! There are definitely fancier things we can do with postgres down the line but this would be a great start.

rfcs/2020-08-27-3603-postgres-metrics.md Show resolved Hide resolved
Copy link
Member

@jszwedko jszwedko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question about namespacing / metric prefix, but otherwise this looks good to go.

Similar to #3581 (comment), I'm curious about letting people exclude databases rather than listing them all out, but we can always figure that out in a second pass as I think we can do it in a backwards compatible manner.

rfcs/2020-08-27-3603-postgres-metrics.md Outdated Show resolved Hide resolved
rfcs/2020-08-27-3603-postgres-metrics.md Outdated Show resolved Hide resolved
rfcs/2020-08-27-3603-postgres-metrics.md Show resolved Hide resolved
Signed-off-by: James Turnbull <james@lovedthanlost.net>
Signed-off-by: James Turnbull <james@lovedthanlost.net>
Signed-off-by: James Turnbull <james@lovedthanlost.net>
Signed-off-by: James Turnbull <james@lovedthanlost.net>
@jamtur01
Copy link
Contributor Author

Just one question about namespacing / metric prefix, but otherwise this looks good to go.

Similar to #3581 (comment), I'm curious about letting people exclude databases rather than listing them all out, but we can always figure that out in a second pass as I think we can do it in a backwards compatible manner.

I've added a potential excluded_databases option too - we can explore during implementation.

@jamtur01 jamtur01 changed the title chore: Added draft postgresql RFC chore: Added PostgreSQL RFC Aug 31, 2020
@jamtur01
Copy link
Contributor Author

I'm going to merge this and we can discuss questions as implementation unfolds.

@jamtur01 jamtur01 merged commit 57b90b4 into master Aug 31, 2020
@jamtur01 jamtur01 deleted the postgres_metrics_rfc branch August 31, 2020 15:53
mengesb pushed a commit to jacobbraaten/vector that referenced this pull request Dec 9, 2020
Signed-off-by: Brian Menges <brian.menges@anaplan.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: metrics Anything related to Vector's metrics events
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New postgresql_metrics source RFC
5 participants