PostgreSQL plugin does not close connections properly #2410

Closed
hellow554 opened this issue Feb 15, 2017 · 20 comments · Fixed by #2611
Labels
bug unexpected problem or unintended behavior

Comments

@hellow554

hellow554 commented Feb 15, 2017

Bug report

When the postgresql plugin runs for some time (1 hour should be enough), the PostgreSQL server reports that there are no more free connections.
A look at netstat confirms this (my current limit is 500 connections, and all of them are in use):

Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 dbserver.:postgresql grafana.:60496 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:54826 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:33832 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:48662 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:34264 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:34470 TIME_WAIT
tcp        0      0 dbserver.:postgresql grafana.:34382 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:51706 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:38296 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:37974 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:53828 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:35974 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:38496 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:43290 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:36564 ESTABLISHED
.........
tcp        0      0 dbserver.:postgresql grafana.:35250 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:58990 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:60576 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:44946 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:41526 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:34516 TIME_WAIT
tcp        0      0 dbserver.:postgresql grafana.:36100 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:44844 ESTABLISHED
udp        0      0 localhost:43693         localhost:43693         ESTABLISHED
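
To summarize a long listing like the one above, the connection states can be tallied. The following is a minimal sketch in Go, not part of the original report; the function name `countStates` is hypothetical, and it assumes the six-column netstat layout shown above with the state in the last column.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// countStates tallies the State column of netstat-style TCP lines.
// Lines that do not start with "tcp" (headers, udp, separators) are skipped.
// Assumption: the state is the last whitespace-separated field.
func countStates(output string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Expected columns: Proto Recv-Q Send-Q Local Foreign State
		if len(fields) < 6 || !strings.HasPrefix(fields[0], "tcp") {
			continue
		}
		counts[fields[len(fields)-1]]++
	}
	return counts
}

func main() {
	sample := `Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 dbserver.:postgresql grafana.:60496 ESTABLISHED
tcp        0      0 dbserver.:postgresql grafana.:34470 TIME_WAIT
tcp        0      0 dbserver.:postgresql grafana.:33832 ESTABLISHED
udp        0      0 localhost:43693         localhost:43693         ESTABLISHED`

	counts := countStates(sample)

	// Sort the state names so the output order is stable.
	states := make([]string, 0, len(counts))
	for s := range counts {
		states = append(states, s)
	}
	sort.Strings(states)
	for _, s := range states {
		fmt.Printf("%s %d\n", s, counts[s])
	}
}
```

A steadily growing ESTABLISHED count between collection intervals would be consistent with the leak described here.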

Relevant telegraf.conf:

[[inputs.postgresql]]
        address = "host=dbserver.my.domain user=user password=passwd"

Versions

$ telegraf --version
Telegraf vdev-49-gc8cc01b (git: master c8cc01b)

$ postgres --version
postgres (PostgreSQL) 9.4.6

This is, at least for me, always reproducible, even after I restart postgres, telegraf, or influx. None of that helped.

@sparrc
Contributor

sparrc commented Feb 15, 2017

I believe this is a dupe of #1977

@sparrc sparrc closed this as completed Feb 15, 2017
@sparrc
Contributor

sparrc commented Feb 15, 2017

actually looks slightly different on second glance....

Could you try testing with 1.2.1 (git checkout 1.2.1)? It could be related to #1750.

@sparrc sparrc reopened this Feb 15, 2017
@hellow554
Author

I updated telegraf and restarted it. I will report the status in a few hours.

@hellow554
Author

That did not solve the problem. There are still a lot of established connections.
A persistent connection would solve this problem, but I think this is not a dupe but a bug, because it should work even in non-persistent mode. It should open and close the connection properly!

@sparrc
Contributor

sparrc commented Feb 15, 2017

Yes, but from a development perspective I'm not sure what we can do besides call db.Close() (which we're doing already): https://github.com/influxdata/telegraf/blob/master/plugins/inputs/postgresql/postgresql.go#L76

Are you sure it's telegraf causing this? From your netstat output it looks like it's grafana..?
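
As a minimal sketch of the open, use, then `db.Close()` pattern referred to above (this is not the plugin's actual code), the example below uses Go's standard `database/sql` package with a hypothetical in-process driver, `fakeDriver`, that only counts open connections. The driver name `fakepg` and the helper `gatherOnce` are assumptions made for illustration; a real setup would use an actual PostgreSQL driver.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"sync/atomic"
)

// fakeDriver is a hypothetical stand-in for a real PostgreSQL driver.
// It only tracks how many connections are currently open.
type fakeDriver struct{ open int64 }

type fakeConn struct{ d *fakeDriver }

func (d *fakeDriver) Open(dsn string) (driver.Conn, error) {
	atomic.AddInt64(&d.open, 1)
	return &fakeConn{d: d}, nil
}

func (c *fakeConn) Prepare(string) (driver.Stmt, error) {
	return nil, errors.New("unsupported in this sketch")
}

func (c *fakeConn) Begin() (driver.Tx, error) {
	return nil, errors.New("unsupported in this sketch")
}

func (c *fakeConn) Close() error {
	atomic.AddInt64(&c.d.open, -1)
	return nil
}

var drv = &fakeDriver{}

func init() { sql.Register("fakepg", drv) }

// gatherOnce mimics one collection interval: open a pool, touch the
// database, then close the pool. It returns the number of open driver
// connections before and after db.Close().
func gatherOnce() (during, after int64, err error) {
	db, err := sql.Open("fakepg", "host=example user=example")
	if err != nil {
		return 0, 0, err
	}
	// Ping forces the pool to actually open a connection.
	if err = db.Ping(); err != nil {
		db.Close()
		return 0, 0, err
	}
	during = atomic.LoadInt64(&drv.open)
	// Close shuts down the pool, including its idle connections.
	err = db.Close()
	after = atomic.LoadInt64(&drv.open)
	return during, after, err
}

func main() {
	during, after, err := gatherOnce()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("open while in use: %d, open after Close: %d\n", during, after)
}
```

With this stub, the count returns to zero after `db.Close()`. If a real setup behaves differently, the difference would have to come from somewhere other than this basic pattern.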

@sparrc
Contributor

sparrc commented Feb 15, 2017

@punkkeks could you try another thing? Try specifying your connection URL as:

address = "postgres://user:passwd@dbserver.my.domain"

The reason is that we skip creating a "connection pool" in that case.
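
To make the relationship between the two address formats concrete, here is a rough Go sketch that rewrites a simple `key=value` connection string into the `postgres://` URL form suggested above. The helper `dsnToURL` is hypothetical and is not part of telegraf or any driver. It assumes values contain no spaces or quotes and does not handle Unix socket paths.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
	"strings"
)

// dsnToURL converts a simple "key=value key=value" connection string into
// the postgres:// URL form.
// Assumptions: values contain no spaces or quotes, and host is a hostname
// or IP address (Unix socket paths are not handled).
func dsnToURL(dsn string) (string, error) {
	u := url.URL{Scheme: "postgres"}
	var host, port, user, password string
	hasPassword := false
	query := url.Values{}

	for _, pair := range strings.Fields(dsn) {
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 {
			return "", fmt.Errorf("malformed pair %q", pair)
		}
		switch kv[0] {
		case "host":
			host = kv[1]
		case "port":
			port = kv[1]
		case "user":
			user = kv[1]
		case "password":
			password, hasPassword = kv[1], true
		case "dbname":
			u.Path = "/" + kv[1]
		default:
			// Anything else (sslmode, connect_timeout, ...) becomes a query parameter.
			query.Set(kv[0], kv[1])
		}
	}
	if host == "" {
		return "", errors.New("this sketch requires a host")
	}

	u.Host = host
	if port != "" {
		u.Host = net.JoinHostPort(host, port)
	}
	if user != "" {
		if hasPassword {
			u.User = url.UserPassword(user, password)
		} else {
			u.User = url.User(user)
		}
	}
	u.RawQuery = query.Encode()
	return u.String(), nil
}

func main() {
	out, err := dsnToURL("host=dbserver.my.domain user=user password=passwd")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

For the configuration in this report, the sketch yields the same URL as the one suggested above.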

@hellow554
Author

grafana is just the hostname. Grafana also uses a PostgreSQL connection... hm. I could stop telegraf, wait for an hour, and then look at the connections again. But first I will try the other connection URI.

@hellow554
Author

It seems that the URI string has fixed the problem. What?! I will switch back to the old one and check whether the bug occurs again.

@hellow554
Author

Yep. That was it. I switched back and forth multiple times, and whenever the config had the URI instead of the "old" settings, the connections were properly closed.
Interesting.. does anybody have an idea for that?!

I would suggest removing the old config format from the sample config and using only the new one. (By the way, could anyone update the README for postgresql? There is no example config; you have to extract it from the Go source.)

@sparrc
Contributor

sparrc commented Feb 15, 2017

@punkkeks could you do that? 🙏 , you can edit a readme from within the github web interface, it's quite easy.

@sparrc
Contributor

sparrc commented Feb 15, 2017

Never mind, I'm going to edit the plugin anyway, so I'll do it.

@sparrc
Contributor

sparrc commented Feb 15, 2017

Interesting.. does anybody have an idea for that?!

It seems to be a limitation of the client library. For some reason it forces you to create a "connection pool" if the address is in that format, whereas with the other format you can simply open a connection.

I have no idea why, and I don't know enough about postgres connection pools to figure out a proper configuration for that. If you happen to have time to fiddle around with this code and find a solution, I'd appreciate it immensely :) https://github.com/influxdata/telegraf/blob/master/plugins/inputs/postgresql/connect.go#L93-L98

@phemmer might have better ideas than me
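
For readers unfamiliar with pool settings, Go's standard `database/sql` package exposes `SetMaxOpenConns`, `SetMaxIdleConns`, and `SetConnMaxLifetime` on a `*sql.DB`. The sketch below shows their effect using a hypothetical stub driver (`stubpg`) and a hypothetical helper (`openAfterPing`). Whether the client library linked above exposes equivalent settings, or whether such settings would address the leak reported here, is not established in this thread.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"time"
)

// stubDriver and stubConn are hypothetical placeholders for a real driver.
type stubDriver struct{}
type stubConn struct{}

func (stubDriver) Open(string) (driver.Conn, error) { return &stubConn{}, nil }

func (*stubConn) Prepare(string) (driver.Stmt, error) {
	return nil, errors.New("unsupported in this sketch")
}

func (*stubConn) Begin() (driver.Tx, error) {
	return nil, errors.New("unsupported in this sketch")
}

func (*stubConn) Close() error { return nil }

func init() { sql.Register("stubpg", stubDriver{}) }

// openAfterPing opens a pool, optionally applies limits, pings once, and
// reports how many connections the pool still holds afterwards.
func openAfterPing(limit bool) (int, error) {
	db, err := sql.Open("stubpg", "ignored by the stub")
	if err != nil {
		return 0, err
	}
	defer db.Close()

	if limit {
		db.SetMaxOpenConns(1)              // at most one connection at a time
		db.SetMaxIdleConns(0)              // do not keep idle connections around
		db.SetConnMaxLifetime(time.Minute) // recycle long-lived connections
	}
	if err := db.Ping(); err != nil {
		return 0, err
	}
	return db.Stats().OpenConnections, nil
}

func main() {
	for _, limit := range []bool{false, true} {
		n, err := openAfterPing(limit)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("limits applied: %v, connections held after Ping: %d\n", limit, n)
	}
}
```

With the default settings the pool keeps the connection idle for reuse; with idle connections disabled it is closed as soon as it is released.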

@phemmer
Contributor

phemmer commented Feb 15, 2017

I haven't used the postgresql plugin. I just use postgresql_extensible. But there's nothing special with postgres about the way you have to do pools. Smells like a bug in the library if closing the connection pool leaves connections open.

@sparrc
Contributor

sparrc commented Feb 15, 2017

@phemmer postgresql_extensible calls the same function to make a connection to db as the postgres plugin

@hellow554
Author

Thanks guys :)
Thanks for the support and the final solution. I think filing a bug against the Go postgres binding/library would not hurt either.
I'll close this as resolved.

@phemmer
Contributor

phemmer commented Feb 15, 2017

Oh, I didn't notice that it's including the postgresql plugin as a dependency. Huh. I'm using it with a standard postgres DSN (host=/tmp sslmode=disable user=postgres dbname=postgres) and have no issues with connection leaks :-/

@phemmer
Contributor

phemmer commented Feb 15, 2017

Also, I would probably leave this ticket open. While the bug may be in an external library, telegraf would need to take action to get it fixed, such as updating the library once the upstream issue is resolved. Other people may also come here looking for the issue before opening a new one.

However, since I don't experience the issue on my systems, it appears that the format of the address parameter is not what causes it. So reproducing the issue might be difficult for a developer.

@sparrc sparrc reopened this Feb 15, 2017
@sparrc
Contributor

sparrc commented Feb 15, 2017

yes, we need to figure out why this is leaking connections, @punkkeks I'm going to reopen

@hellow554
Author

Okay. I'll try to provide some tcpdumps tomorrow (GMT+1) to see what's going wrong.

@sparrc sparrc added the bug unexpected problem or unintended behavior label Feb 16, 2017
@sparrc sparrc added this to the Future Milestone milestone Feb 16, 2017
sparrc pushed a commit that referenced this issue Mar 9, 2017
* Add configuration docs to Postgresql input plugin

Add configuration docs to PostgreSQL input plugin README (mostly from the source code) though I've not included the configuration example that seems to use all the connections on the database[1].

[1] #2410

* Fix typo in readme and sampleConfig string.
@james-lawrence
Contributor

@sparrc I'll happily look into this.
