
Support read timeout for queries #450

Open
rajeshucsb opened this issue Mar 28, 2016 · 22 comments

Comments

@rajeshucsb

Currently there is no way to specify a read timeout for queries.
The SQL client has an option for statement_timeout:
http://www.postgresql.org/docs/9.3/static/runtime-config-client.html

And JDBC has socketTimeout:
https://jdbc.postgresql.org/documentation/head/connect.html

Something similar for the golang pq driver would be very useful.

@johto
Contributor

johto commented May 5, 2016

> The sql client has an option for statement_timeout
> http://www.postgresql.org/docs/9.3/static/runtime-config-client.html

That's a server-side option and requires no support in the client. You can use statement_timeout with pq just fine.

> And JDBC has socketTimeout
> https://jdbc.postgresql.org/documentation/head/connect.html

Which is an absolute abomination, and only causes problems in my opinion.

> Something similar for golang pq driver will be very useful.

What's the problem you're trying to solve?

@bobbytables

So I'm looking at using statement_timeout, and a go-nuts post says:

> Controlling it at pool level is trivial; just set statement_timeout in the second argument to sql.Open().

See: https://groups.google.com/forum/#!topic/golang-nuts/l9vga1aAsHI

You can easily Exec the query to set the statement timeout option right when you connect, but how would this behave with connection pooling?

@johto
Contributor

johto commented Aug 10, 2016

> You can easily Exec the query to set the statement timeout option right when you connect but how would this behave with connection pooling?

It doesn't work. The connection pool doesn't tell you when it's connecting or re-connecting, and you're not even guaranteed to get the same connection on two subsequent executions.

However, you could use a transaction and SET LOCAL.

@bobbytables

@johto Yeah, that's what I figured. I took a gander at the connection options for postgres, and it doesn't even seem possible at connection time to set this parameter.

@johto
Contributor

johto commented Aug 10, 2016

> I took a gander at the connection options for postgres and it doesn't even seem possible at connection time to set this parameter.

Sure it is. But because database/sql only exposes a pool of connections, you have to have it for all connections in the pool. Just set it in the connection string like I explained in the post you linked to.

@jmscott

jmscott commented Aug 10, 2016

With regard to timing out all queries in a connection pool, I am not
convinced that adding the option "statement_timeout=" works. For example,
statement_timeout=1 is not timing out my pooled queries. My version of pq
was built yesterday, pulled from GitHub via "go get ...". The database is
pg9.6beta3.

-j


@peter-mogensen

Just a (possibly off-topic) note ... server-side statement_timeout doesn't help much if the issue is the network. I have problems making queries timeout on dead network connections - like when a NAT router somewhere decides to forget my connection.

@pneisen

pneisen commented Sep 12, 2016

We were having the same issue where statement_timeout was not working for us if our RDS instance failed over or if the network went down. We would have connections hang indefinitely. After a failed pull request to add read and write timeouts to lib/pq, I wrote this driver wrapper: https://github.com/Kount/pq-timeouts

It is working well for us, and might work well for you.

@chrispassas

I've been running into this issue several times a day. I'm using statement_timeout, but a few times a day the message never gets back to the client. The result is that Postgres stops interacting with the client and I get hung on rows.Next() indefinitely.

Even if I use a select / timer to stop waiting, the goroutine that ran the query is hung forever. This is happening in an API, so I can't just restart it every time this happens.

@peter-mogensen

It's possible to write a wrapper around the postgres driver that overrides the Dial and DialTimeout functions to create net.Conn values which wrap another net.Conn and set timeouts on read and write.

@kafan1986

I wanted to have different timeouts for different queries using statement_timeout. But parallel calls to these APIs were affecting the timeout of all the connections between my server and the db. It seems you cannot have different timeouts using statement_timeout alone; one way is to put the entire query inside a transaction and use "SET LOCAL statement_timeout".

@clark-archer

@kafan1986 Or you can have multiple connection pools if you only have a few different timeout requirements.

@u007

u007 commented Mar 6, 2018

How about a timeout when the connection pool is full? In my application, since the connection is initiated on server start, there is no way to time out if the db pool is full for a long period of time.

@bandesz

bandesz commented Jan 10, 2019

I'm surprised that this is still an open issue; we've just been bitten by this. When you use an AWS RDS Postgres server in Multi-AZ and you force a fail-over (the IP behind the server host will change), it takes a really long time for lib/pq to time out (it seems a little over 15 minutes).

There is a wrapper library to workaround this: https://github.com/Kount/pq-timeouts (I haven't tested it)

And there is also a fairly new issue/PR in this repo:

Could you please address this issue?

Or is there another library which properly supports these timeouts?

@chrispassas

@bandesz I've used the pq-timeouts library you mention to better manage this type of problem. It would be nice to have some solution without an external wrapper like pq-timeouts.

@bandesz

bandesz commented Jan 10, 2019

@chrispassas thanks for the advice, I just did the same and it works nicely - also it fixes my problems.

For anyone finding this issue and trying to replace lib/pq with jackc/pgx: I had a problem with it when queries time out, as the library wasn't returning any errors.

@chrispassas

What I think happens is the following.

If you use a context to time out or cancel a query, pq sends a cancel signal to Postgres. The problem is that if Postgres isn't responding, it will hang there for a very long time. It may not be 100% correct, but I think we need an option to set a max time to wait for Postgres to accept the cancel signal and, failing that, cancel from the Go side and continue execution.

pq-timeouts works because if no data is read or written in the given time, it times out the network connection even if Postgres hasn't accepted the cancel signal yet.

Basically, pq is assuming Postgres will always respond in a timely manner, but it can't/won't always. This makes using context cancellation unreliable with this library.

I still think pq is awesome and use it every day. This was probably my biggest pain point using it daily over 2 years.

@zkirill

zkirill commented May 21, 2019

@bandesz:

> I'm surprised that this is still an open issue, we've been just bitten by this. When you use an AWS RDS Postgres server in Multi-AZ and you force a fail-over (the IP will change behind the server host) it takes a really long time for lib/pq to timeout (it seems a little over 15 minutes).

Does anyone know what dictates how long lib/pq takes to timeout? Some predictability would be useful with setting the time to wait for the service to stop during graceful server shutdown.

@bandesz

bandesz commented May 21, 2019

@zkirill just a note, that we've successfully used the pq-timeouts lib I linked in a previous comment.

@paco0x

paco0x commented May 21, 2019

> I'm surprised that this is still an open issue, we've been just bitten by this. When you use an AWS RDS Postgres server in Multi-AZ and you force a fail-over (the IP will change behind the server host) it takes a really long time for lib/pq to timeout (it seems a little over 15 minutes).
>
> Does anyone know what dictates how long lib/pq takes to timeout? Some predictability would be useful with setting the time to wait for the service to stop during graceful server shutdown.

I've encountered lib/pq hanging forever when the Postgres server hangs because of a Linux kernel hang. lib/pq is never going to time out in this situation. See the code:

pq/conn.go

Line 932 in 2ff3cb3

_, err := io.ReadFull(cn.buf, x)

It uses io.ReadFull to try to read the response from the Postgres connection. But io.ReadFull is never going to time out while the connection stays open and the Postgres server is unable to respond because the system is hung.

Even ExecContext() with a timeout context will not work in this situation, because lib/pq relies on the Postgres server to kill the current connection when the context ends.

I've read the pq-timeouts lib code mentioned above. It seems to be a solution for this situation. It wraps the TCP connection to the Postgres server and sets a read/write timeout on every Read or Write call to the connection.

@georgysavva

Hi. It looks like this issue was opened 4 years ago and still hasn't gotten any attention. I found a PR that tries to solve this problem:
#785
It was opened a year ago and still hasn't gotten any reaction from the maintainers.
I think that read/write timeouts are a major feature for a database driver, and other database libraries (Postgres/Mongo) support them.
Please fix this!

@xamix

xamix commented Sep 22, 2022

Hi,

Just found this thread because I got stuck in the same situation, where the poll never ends. You can find here the GDB stack trace of the thread stuck inside libpq:

[Switching to thread 34 (Thread 0x7f1ff27fc700 (LWP 24871))]
#0  0x00007f20399a5def in __GI___poll (fds=0x7f1ff27f92e8, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
29      ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) bt full
#0  0x00007f20399a5def in __GI___poll (fds=0x7f1ff27f92e8, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
        resultvar = 18446744073709551100
        sc_cancel_oldtype = 0
        sc_ret = <optimized out>
#1  0x00005575e3fb57a1 in pqSocketCheck.part.0 ()
No symbol table info available.
#2  0x00005575e3fb6184 in pqWaitTimed ()
No symbol table info available.
#3  0x00005575e3fb40d0 in PQgetResult ()
No symbol table info available.
#4  0x00005575e3fb442c in PQexecFinish ()

This issue is very annoying to us because it arrived for no specific reason; maybe a packet was lost, but all other connections to the database from other threads worked correctly at the same time...

Has the situation not evolved in more than 6 years?
I saw workarounds that recommend using the async API to time out, but since a lot of users seem to discover this problem only when it's too late, why not implement it on blocking connections, or WARN somewhere about the possibility of getting stuck indefinitely?

EDIT:
Hmm, it seems I posted in the Go library instead of the C libpq library. My bad!

Regards
