Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pg 8.2.0 seems not works with pgpool(4.1.1) #2218

Closed
thynson opened this issue May 14, 2020 · 26 comments · Fixed by #2220
Closed

pg 8.2.0 seems not works with pgpool(4.1.1) #2218

thynson opened this issue May 14, 2020 · 26 comments · Fixed by #2220

Comments

@thynson
Copy link

thynson commented May 14, 2020

After upgrading pg version from 8.1.0 to 8.2.0, we get stuck with query database via pgpool (deployed with docker image dockerhub.azk8s.cn/bitnami/pgpool:4.1.1-debian-10-r90).

Expected Behavior

Query result fetched normally.

Current Behavior

No result can be fetched after several queries (may related to max connections count in the pool), and pgpool will run into a state with very high cpu utilization until being killed, even no query is performed any more. It could also be a bug of pg-pool.

Possible Solution

Downgrade to 8.1.0

@brandonlehmann
Copy link

Ran into another weird issue with 8.2.0 whereby I started receiving error messages such as SimpleQuery not allowed while in extended protocol mode out of the blue when using pooled connection(s). Attempting to swap between parameterized queries and non-parameterized queries on a single Client or PoolClient throws this error.

Downgrading back to 8.1.0 resolved the issue.

@brianc
Copy link
Owner

brianc commented May 14, 2020 via email

@BenSjoberg
Copy link

I'm experiencing the same issue using pg 8.2.0 and pgpool 4.1.1. Reproducible with this sample script:

(async () => {
  const pg = require('pg')

  // Connection parameters set through env vars
  const pool = new pg.Pool()

  try {
    // Simple table created with CREATE TABLE foo(bar TEXT);
    await pool.query('INSERT INTO foo (bar) VALUES ($1)', [ Date.now().toString() ])
    console.log('Inserted!')
  } catch(e) {
    console.error('Error:', e)
  }
  pool.end()
})()

Roughly 20% of the time the script just hangs and doesn't print "Inserted!" or any error. When this happens, the pgpool process handing the connection gets stuck at 100%. Oddly, even after I kill the node.js script the pgpool process stays at 100% CPU until I kill it.

This doesn't happen when using pg <= 8.1.0 or when connecting to the master DB server directly.

Hard to say if this is an issue with pgpool or with pg, but we haven't had issues with any other postgres clients.

@brianc
Copy link
Owner

brianc commented May 14, 2020 via email

@brianc
Copy link
Owner

brianc commented May 14, 2020

what version of node are you using btw?

@BenSjoberg
Copy link

My example script was tested on 12.16.3, but I've also experienced this with 14.2.0.

@brianc
Copy link
Owner

brianc commented May 14, 2020

Tried this in all versions of node (8, 10, 12, 13, 14) w/ no issue so far:

const Pool = require('../')

describe.only('pool under load', () => {
  it('completes many lifecycles without hanging', async () => {
    const run = async () => {
      // Connection parameters set through env vars
      const pool = new Pool()

      try {
        // Simple table created with CREATE TABLE foo(bar TEXT);
        await pool.query('INSERT INTO foo (bar) VALUES ($1)', [Date.now().toString()])
        console.log('Inserted!')
      } catch (e) {
        console.error('Error:', e)
      }
      pool.end()
    }
    let i = 0
    while (i++ < 100) {
      await run()
    }
  })

  it('completes many queries without hanging', async () => {
    const run = async () => {
      // Connection parameters set through env vars
      const pool = new Pool()

      try {
        // Simple table created with CREATE TABLE foo(bar TEXT);
        let i = 0
        while (i++ < 100) {
          await pool.query('INSERT INTO foo (bar) VALUES ($1)', [Date.now().toString()])
        }
        console.log('Inserted!')
      } catch (e) {
        console.error('Error:', e)
      }
      pool.end()
    }
    await Promise.all(new Array(20).fill(0).map(run))
  })
})

I just made a file called load-test.js under pg-pool/test and ran it with mocha in the pg-pool directory.

@brianc
Copy link
Owner

brianc commented May 14, 2020

perhaps its a particular database version...what version of postgres are you using?

@BenSjoberg
Copy link

Postgres 10.12, but it doesn't happen when connecting directly, just when going through pgpool.

@brianc
Copy link
Owner

brianc commented May 14, 2020

hmm weird - yeah in the script above you can see I'm using the pool there. Anything else you recommend trying?

@brianc
Copy link
Owner

brianc commented May 14, 2020

ah okay I missed the fact that pgpool isn't the same thing as pg-pool which is the pooling module included w/ the library. I've never used pgpool before. It's likely they're doing something weird to the protocol that's throwing off the parser. Not sure what yet, I'll need to install it and see.

@brianc
Copy link
Owner

brianc commented May 14, 2020

okay I don't consider this as much of an emergency as I did before! There might be a bug in pgpool as well. I'll look into this today, but I gotta do some work at my job for a couple hours.

@BenSjoberg
Copy link

Ah, sorry about the confusion! :) Yep, it's basically a load balancer. We use it to send all write queries to the master DB and load balance read-only queries between read replicas.

Like I said, it's tough to know (without being a developer of either) if this is an issue with pg or pgpool. My hunch is that pgpool isn't handling some edge case correctly, and that somehow most clients don't trigger it. But that's just a vague guess.

Totally understandable that it's not an emergency, since I'm guessing most users of pg don't use pgpool. Pinning to pg 8.1.x works fine for now, so no worries. I'd tinker around with this myself, but I have normal work to get back to as well. :)

Thanks for your help! (And for the great library.)

@brianc
Copy link
Owner

brianc commented May 14, 2020

Yeah totally - there's gotta be something related to the length byte of the packet not being sent or the packet being split on a weird boundary that's in the middle of the length byte or something...and that's causing the one while loop to get into an infinite state.

https://github.com/brianc/node-postgres/blob/master/packages/pg-protocol/src/parser.ts#L103

My hunch is some zero length buffers or something getting in there...it should be relatively easy to diagnose once I get pgpool-II installed locally & set up....but it'll take me some time to get that working.

@brianc
Copy link
Owner

brianc commented May 14, 2020

when I heard "cpu goes to 100%, process becomes unresponsive" I thought immediately "yeah probably that while loop not exiting". I should probably put some reasonable bound in there and if it loops more than like 100 times or something throw an error there's probably something wrong versus just hanging the process. That wouldn't really fix anything though. Anyways...I'll take a look a bit l8r.

@BenSjoberg
Copy link

To be clear, it's the pgpool process that hits 100%, not the node.js process. The node process just sits idle. Also, the event loop isn't blocked - if I add setInterval(() => console.log('test'), 1000) to that test script, "test" is still printed once per second while the query hangs. So that makes me think it's not an infinite loop in pg.

@brianc
Copy link
Owner

brianc commented May 14, 2020

oh jeeze okay nevermind...hunch incorrect, but also that's good news in a way - looking at that loop nothing immediately comes to mind why it could get 'stuck' looping...its just always a little tickling thought in the back of my mind when I write a while loop: "Are you sure this can't get stuck?". the plot thickens. I'll def need to look at to what's going on w/ this...some weird interaction with pgpool-II to be sure.

@BenSjoberg
Copy link

Something else I just discovered, it seems to only affect parameterized queries. If I change

await pool.query('INSERT INTO foo (bar) VALUES ($1)', [Date.now().toString()])

to

await pool.query(`INSERT INTO foo (bar) VALUES ('${Date.now()}')`)

then the issue doesn't occur. Hope this helps narrow it down!

@tstm
Copy link

tstm commented May 15, 2020

pgpool is fairly widely used, and often even quite transparent. Some cloud providers use it in between clients and postgres for loadbalancing / detecting primary nodes and it's not even very visible to the client.

We have run into the same issue and had to revert into the previous version.

@boromisp
Copy link
Contributor

Try to switch the order flush and sync in Connection.prototype.sync.
This is the main difference I could find, and after this change I could not reproduce the issue.

@brianc
Copy link
Owner

brianc commented May 15, 2020

This is the main difference I could find, and after this change I could not reproduce the issue.

awesome! I'll do that today.

@brianc
Copy link
Owner

brianc commented May 15, 2020

uhhhh github closed this automatically. So i published a new patch version w/ the suggested fix...would y'all try it out for me & see if it fixes your problem?!

@BenSjoberg
Copy link

8.2.1 fixes the issue for me. Thanks so much! 😄 Have a good weekend.

@brianc
Copy link
Owner

brianc commented May 16, 2020 via email

@brianc brianc closed this as completed May 16, 2020
@thynson
Copy link
Author

thynson commented May 17, 2020

Confirmed that it works for me. Thank you very much.

@brandonlehmann
Copy link

Works here too. Awesome work. Thanks for the quick fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants