Node-API/LibraryEngine: $disconnect does not free up memory / kill engine #9044

Open
janpio opened this issue Aug 31, 2021 · 6 comments
Labels
bug/1-unconfirmed Bug should have enough information for reproduction, but confirmation has not happened yet. kind/bug A reported bug. team/client Issue for team Client. topic: node-api formerly `nApi` topic: performance/memory

Comments

@janpio
Member

janpio commented Aug 31, 2021

When Node-API/engineType=library is active, calling $disconnect() does not free up the memory the Engine instance is using. This is because the engine is not killed, as it is/was with the BinaryEngine (where the child process was completely killed on stop()); only a disconnect is sent to the engine. When another $connect or query (which implicitly connects) comes in, the same QueryEngine instance is used again (vs. a newly started one, as with the binary engine).

If you (unrealistically) create many PrismaClient instances and just $disconnect them after use, this leads to growing memory usage until the script ends.
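
For illustration, here is a minimal sketch of that pattern (hedged: it assumes a generated @prisma/client and a reachable database; the loop count and query are illustrative, not the actual reproduction script):

// Minimal sketch of the leaky pattern described above: repeatedly create a
// client, run one query, disconnect. With the library engine, rss keeps
// growing because the engine instance is never freed on $disconnect().
import { PrismaClient } from '@prisma/client';

async function main() {
  for (let i = 0; i < 10_000; i++) {
    const prisma = new PrismaClient();
    await prisma.$queryRaw`SELECT 1`; // implicitly $connect()s
    await prisma.$disconnect(); // sends disconnect, but keeps the engine instance

    if (i % 100 === 0) {
      const rssMb = process.memoryUsage().rss / 1024 / 1024;
      console.log(`${i} rss ${rssMb.toFixed(2)} MB`);
    }
  }
}

main();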

Context:

Reproduction above, running for 60 minutes:

  • Currently:
    25467 rss 1605.77 MB
  • With library engine deleted on $disconnect:
    43100 rss 288.34 MB

We think this is a problem, but not of the highest priority; it can be solved down the line.

@janpio janpio added bug/1-unconfirmed Bug should have enough information for reproduction, but confirmation has not happened yet. kind/bug A reported bug. team/client Issue for team Client. topic: node-api formerly `nApi` labels Aug 31, 2021
@haleksandre

haleksandre commented Sep 17, 2021

This also seems to cause problems in integration testing environments where you have to point the client to a new datasource for each isolated test.

An example of this is using Cypress, where you need to spin up a test server (with the client inside the server context) to then be able to test requests against that server.
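
For context, a stripped-down sketch of that setup (hedged: Koa is the framework used in the reproduction below, but the names, port, and middleware body here are illustrative, not the actual repro code):

// Illustrative sketch: a Koa test server that hands the current Prisma client
// to route handlers, so a test setup hook can swap in a client pointing at a
// fresh, isolated database before each test.
import Koa from 'koa';
import { PrismaClient } from '@prisma/client';

const ctx = { prisma: new PrismaClient() };

const koa = new Koa();
koa.use(async (kctx, next) => {
  kctx.state.prisma = ctx.prisma; // always resolves to the current client
  await next();
});

koa.listen(4000);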

I tried 2 solutions (neither works as intended):

// schema.prisma
datasource db {
  provider = "postgres"
  url      = env("POSTGRES_URL")
}

Disconnect + Connect

// Within a setup hook
await ctx.prisma.$disconnect();

process.env.POSTGRES_URL = 'new url'

await ctx.prisma.$connect();

The client doesn't seem to update and still points to the old datasource. This method worked in previous Prisma versions (not sure at which version it started breaking).

Disconnect + New Client

// Within a setup hook
await ctx.prisma.$disconnect();

process.env.POSTGRES_URL = 'new url'

ctx.prisma = new PrismaClient({
  datasources: {
    db: {
      url: process.env.POSTGRES_URL
    },
  },
}); 

The new client points to the right datasource, but since $disconnect leaves the old connections "active", the tests will eventually hit a "too many connections" error.

@janpio
Member Author

janpio commented Sep 17, 2021

> This also seems to cause problems in integration testing environments where you have to point the client to a new datasource for each isolated test.
>
> An example of this is using Cypress, where you need to spin up a test server (with the client inside the server context) to then be able to test requests against that server.

Can you tell us a bit more about the problems this is causing, and ideally provide a way to reproduce them? Right now we cannot fully understand the impact this observation has in real-world scenarios, so this would be very welcome.

> The client doesn't seem to update and still points to the old datasource. This method worked in previous Prisma versions (not sure at which version it started breaking).

This sounds like an unrelated bug - please open a separate issue for it with a reproduction so we can confirm it. If that is really the case, we might indeed need to fix it.

> The new client points to the right datasource, but since $disconnect leaves the old connections "active", the tests will eventually hit a "too many connections" error.

That is also unexpected, as $disconnect should definitely close the database connections. The issue above only explains how the Engine instance (not the connection) is not killed/removed, possibly leading to higher memory usage. Please also open an issue for this with a reproduction so we can look deeper into it. If we can reproduce this, it would be quite a drastic bug.
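
(A hedged sketch of one way to check this, using pg_stat_activity; the helper name and query are illustrative. Run it from a separate client or a psql session, since querying through the just-disconnected client would implicitly reconnect it:)

// Illustrative helper: count the backends connected to the current database.
// Compare the count before and after $disconnect(); if it doesn't drop,
// connections really are being left behind.
import { PrismaClient } from '@prisma/client';

const countConnections = async (observer: PrismaClient): Promise<number> => {
  const rows = await observer.$queryRaw<{ count: bigint }[]>`
    SELECT count(*) AS count
    FROM pg_stat_activity
    WHERE datname = current_database()`;
  return Number(rows[0].count); // Postgres returns count(*) as bigint
};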

@haleksandre

Here is a somewhat simplified reproduction: https://github.com/haleksandre/prisma-disconnect

Start the test server

npm run start:test

Start Cypress

cypress open -P .

The setup/teardown hooks where the dynamic POSTGRES_URL is generated, and the middleware that initializes a new client, are located in ./src/test/routes.ts.

Next, you can test the $disconnect + new client issue by simply running Cypress. Tests should pass until they hit a "too many connections" database error. Currently it loops for 40 tests. If you need to adjust the number of loops, the tests are located at ./cypress/integration/user.spec.ts.

Finally, to test the $disconnect + $connect issue, you can remove the middleware here: koa.use(middleware);. Once you run the tests, they will all fail with an error that user doesn't exist on the schema, because the client isn't pointing to the schema generated in the setup hook (where the migration + seed happens via execa).

Let me know if you need more help locating stuff.

Thanks.

@haleksandre

haleksandre commented Oct 6, 2021

I found out that the connections go into an "idle" state. Either $disconnect isn't working properly, or it merely puts the connections into an "idle" state instead of closing them.

I found a workaround that keeps the number of connections down with a SQL script, which you can call either within the middleware or in the teardown, with the help of the $executeRawUnsafe function.

This is the SQL script:

WITH inactive_connections AS (
  SELECT
    pid,
    rank() over (partition by client_addr order by backend_start ASC) as rank
  FROM
    pg_stat_activity
  WHERE
    -- Exclude the thread owned connection (ie no auto-kill)
    pid <> pg_backend_pid()
  AND
    -- Exclude known applications connections
    application_name !~ '(?:psql)|(?:pgAdmin.+)'
  AND
    -- Include connections to the same database the thread is connected to
    datname = current_database()
  AND
    -- Include connections using the same thread username connection
    usename = current_user
  AND
    -- Include inactive connections only
    state in ('idle', 'idle in transaction', 'idle in transaction (aborted)', 'disabled')
  AND
    -- Include old connections (found with the state_change field)
    current_timestamp - state_change > interval '10 seconds'
)

SELECT
  pg_terminate_backend(pid)
FROM
  inactive_connections
WHERE
  rank > 1; -- Leave one connection for each application connected to the database

// Teardown
await ctx.prisma.$executeRawUnsafe(`
  WITH inactive_connections AS (
    SELECT
      pid,
      rank() over (partition by client_addr order by backend_start ASC) as rank
    FROM
      pg_stat_activity
    WHERE
      -- Exclude the thread owned connection (ie no auto-kill)
      pid <> pg_backend_pid()
    AND
      -- Exclude known applications connections
      application_name !~ '(?:psql)|(?:pgAdmin.+)'
    AND
      -- Include connections to the same database the thread is connected to
      datname = current_database()
    AND
      -- Include connections using the same thread username connection
      usename = current_user
    AND
      -- Include inactive connections only
      state in ('idle', 'idle in transaction', 'idle in transaction (aborted)', 'disabled')
    AND
      -- Include old connections (found with the state_change field)
      current_timestamp - state_change > interval '10 seconds'
  )

  SELECT
    pg_terminate_backend(pid)
  FROM
    inactive_connections
  WHERE
    rank > 1 -- Leave one connection for each application connected to the database
`);

This cleans up all the "idle" connections that have been idling for 10 seconds or more.

Should I open another issue so someone could try to replicate & investigate whether it is indeed a bug with $disconnect?

@janpio
Member Author

janpio commented Oct 6, 2021

Yes, that would be awesome. It might be related to the original problem I describe above, but it is fundamentally a different area. If your reproduction helps us understand that problem in isolation, that would be amazing.

That cleanup code is pretty neat by the way - although a tiny bit scary. Good that you scoped it to the current user etc.

@janpio
Member Author

janpio commented Sep 2, 2023

Related (and probably better solution than changing behavior of $disconnect()): #12153
