connection.reset don't check again DNS #558
Comments
libpq resolves the host through DNS during PQreset, but we don't. This is because we explicitly set the `hostaddr` connection parameter when the connection is first established. This prevents a new DNS resolution when running PQresetStart.

This patch adds DNS resolution to `conn.reset`. Since we cannot change the connection parameters after connection start, the underlying PGconn pointer is exchanged in reset_start2. This is done by a PQfinish() + PQconnectStart() sequence. That way the `hostaddr` parameter is updated and a new connection is established with it.

Unfortunately there's no simple way to test the new behavior, but I verified that it works with the following code:

```ruby
require "pg"

puts "pg version: #{PG::VERSION}"
system "sudo sed -i 's/.* abcd/::1 abcd/g' /etc/hosts"
conn = PG.connect host: "abcd", password: "l"
conn.exec("select 1")
p conn.conninfo_hash.slice(:host, :hostaddr, :port)

system "sudo sed -i 's/.* abcd/127.0.0.1 abcd/g' /etc/hosts"
conn.reset
conn.exec("select 1")
p conn.conninfo_hash.slice(:host, :hostaddr, :port)

system "sudo sed -i 's/.* abcd/::2 abcd/g' /etc/hosts"
conn.reset
conn.exec("select 1")
p conn.conninfo_hash.slice(:host, :hostaddr, :port)
```

This gives the following output, showing that the IP address is updated:

```
pg version: 1.5.5
{:host=>"abcd", :hostaddr=>"::1", :port=>"5432"}
{:host=>"abcd", :hostaddr=>"127.0.0.1", :port=>"5432"}
ruby-pg/lib/pg/connection.rb:573:in `reset_start2': connection to server at "::2", port 5432 failed: Network is unreachable (PG::ConnectionBad)
	Is the server running on that host and accepting TCP/IP connections?
```

Whereas libpq resolves similarly with `async_api=false`:

```
pg version: 1.5.5
{:host=>"abcd", :hostaddr=>nil, :port=>"5432"}
{:host=>"abcd", :hostaddr=>nil, :port=>"5432"}
test-reset-dns.rb:18:in `sync_exec': no connection to the server (PG::UnableToSend)
```

Fixes ged#558
…his is because we explicitly set the `hostaddr` connection parameter when the connection is first established. This prevents a new DNS resolution when running PQresetStart.

This patch adds DNS resolution to `conn.reset`. Since we cannot change the connection parameters after connection start, the underlying PGconn pointer is exchanged in reset_start2. This is done by a PQfinish() + PQconnectStart() sequence. That way the `hostaddr` parameter is updated and a new connection is established with it.

There is a `/etc/hosts` and `sudo` based test in the specs.

The behavior of libpq is slightly different from that of ruby-pg. It can be verified with the following code:

```ruby
require "pg"

puts "pg version: #{PG::VERSION}"
system "sudo sed -i 's/.* abcd/::1 abcd/g' /etc/hosts"
conn = PG.connect host: "abcd", password: "l"
conn.exec("select 1")
p conn.conninfo_hash.slice(:host, :hostaddr, :port)

system "sudo sed -i 's/.* abcd/127.0.0.1 abcd/g' /etc/hosts"
conn.reset
conn.exec("select 1")
p conn.conninfo_hash.slice(:host, :hostaddr, :port)

system "sudo sed -i 's/.* abcd/::2 abcd/g' /etc/hosts"
conn.reset
conn.exec("select 1")
p conn.conninfo_hash.slice(:host, :hostaddr, :port)
```

This gives the following output, showing that the IP address is updated:

```
pg version: 1.5.5
{:host=>"abcd", :hostaddr=>"::1", :port=>"5432"}
{:host=>"abcd", :hostaddr=>"127.0.0.1", :port=>"5432"}
ruby-pg/lib/pg/connection.rb:573:in `reset_start2': connection to server at "::2", port 5432 failed: Network is unreachable (PG::ConnectionBad)
	Is the server running on that host and accepting TCP/IP connections?
```

Whereas libpq resolves similarly with `async_api=false`, but raises the error in the subsequent `conn.exec` rather than in `conn.reset`:

```
pg version: 1.5.5
{:host=>"abcd", :hostaddr=>nil, :port=>"5432"}
{:host=>"abcd", :hostaddr=>nil, :port=>"5432"}
test-reset-dns.rb:18:in `sync_exec': no connection to the server (PG::UnableToSend)
```

Fixes ged#558
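The core idea of the commit message, resolving the host name again and replacing the pinned `hostaddr` before reconnecting, can be sketched in plain Ruby without a database. This is only an illustration and not the code from the patch: `resolve_ips` and `with_fresh_hostaddr` are hypothetical helper names, and keeping only the first resolved address is a simplification.

```ruby
require "socket"

# Hypothetical helper (not part of ruby-pg): look up the IP addresses
# a host name currently resolves to.
def resolve_ips(host, port)
  Addrinfo.getaddrinfo(host, port, nil, :STREAM).map(&:ip_address).uniq
end

# Hypothetical helper: return a copy of the connection parameters in which
# `hostaddr` is replaced by a freshly resolved address.
# Simplification: only the first resolved address is kept.
def with_fresh_hostaddr(params)
  port = (params[:port] || 5432).to_i
  ips = resolve_ips(params[:host], port)
  params.merge(hostaddr: ips.first)
end
```

Calling `with_fresh_hostaddr(host: "abcd", port: "5432", hostaddr: "::1")` after `/etc/hosts` has changed would return the same parameters with `hostaddr` pointing at the new address, which is the value a reconnect should use.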
The blog post was almost a year old before this issue was raised here in the issue tracker. 😄
Oh, how timely is this?! I've been seeing a similar, related issue with a few setups (e.g. Nutanix, Aurora, EDB): a failover where the server doesn't go away (it just disconnects clients), or goes away and comes back, causes problems because clients reconnect to the old IP. This has caused a few outages, as all other services (mostly Golang) seem to pick up on the change while the Ruby-based ones stay stuck on the old IP. I had a patch that's basically a simplified version of what that blog post lays out, but while looking into upstream fixes I found this issue and the PR! 🎉 @larskanis I tested your PR with the test setup I'd been using to dig into this problem, and it fixes the issue completely. I'll comment over on the PR with more test details!
pg-1.5.6 is released, fixing this issue. 🎉
Oh wow! That was so fast! Even got in before my release deadline today! 😂 Thank you so much!
Doesn't happen that often, but in this case I was just waiting for a confirmation in order to make a release.
thank you all and sorry for the late reply |
When using AWS Aurora, the database can keep the same host name while the IP address behind it changes. In that case it would be useful if `reset` on the connection checked again whether to connect to the same IP as during `new`, or to a new IP for the same host as provided initially. This article explains it in more detail and shows how it affects a Rails application on top of AWS Aurora: https://blog.50projects.com/2023/04/fixing-rails-stickiness.html
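To make the failure mode concrete, here is a minimal sketch of a check for whether the address a connection was pinned to still matches what the host name resolves to. It assumes the pinned address is known to the caller, for example from `conn.conninfo_hash[:hostaddr]`; `stale_address?` is a hypothetical helper, not a ruby-pg API. It compares textual addresses, so differently written IPv6 forms would need normalizing first.

```ruby
require "socket"

# Hypothetical helper: true when the pinned IP is no longer among the
# addresses the host name currently resolves to.
def stale_address?(host, pinned_ip, port = 5432)
  current = Addrinfo.getaddrinfo(host, port, nil, :STREAM).map(&:ip_address)
  !current.include?(pinned_ip)
end
```

After an Aurora failover of the kind described, such a check would be expected to return `true` for the old address, which is the situation where a plain `reset` against the pinned `hostaddr` keeps reconnecting to the wrong server.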