Memory leak in QueryExecutorImpl #1431

Closed
1 of 2 tasks
atuldaemon opened this issue Mar 6, 2019 · 13 comments
@atuldaemon

I'm submitting a ...

  • [x] bug report
  • [ ] feature request

Describe the issue
The JDBC driver does not free memory after reading large text values from the database. The leak occurs in UTF-8 decoding, where the decoderArray grows without bound: it is enlarged to hold large strings but is never shrunk back.
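
The grow-only pattern described above can be sketched in isolation. `GrowOnlyDecoder` is a simplified, hypothetical stand-in for the driver's decoding path (it decodes ASCII only), not the actual pgjdbc code:

```java
// Simplified model of the leak: a decode buffer that grows to fit the
// largest input ever seen and is never shrunk afterwards.
public class GrowOnlyDecoder {
    private char[] decoderArray = new char[1024]; // initial size, per the report

    public String decode(byte[] data) {
        if (data.length > decoderArray.length) {
            // Grow to fit the current input...
            decoderArray = new char[data.length];
        }
        // ...but never shrink, so one huge row pins roughly 2x its length
        // in bytes for as long as the encoding object (and connection) lives.
        for (int i = 0; i < data.length; i++) {
            decoderArray[i] = (char) (data[i] & 0x7f); // ASCII-only stand-in for UTF-8
        }
        return new String(decoderArray, 0, data.length);
    }

    public int bufferLength() {
        return decoderArray.length;
    }
}
```

After decoding one very large value, `bufferLength()` stays at that size for the lifetime of the object, which matches the large `char[]` retained in the heap dump reported later in this thread.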

Driver Version?
42.2.2

Java Version?
1.8

OS Version?
Ubuntu

PostgreSQL Version?
10.4 (Ubuntu 10.4-2.pgdg14.04+1)

To Reproduce

Here is a sample program to demonstrate the problem.

    // Note: serverConnection, tryResetAutocommit, and tryClose are the
    // reporter's own helpers; articles collects the fetched strings.
    List<String> articles = new ArrayList<>();
    Connection conn = null;
    PreparedStatement stmt = null;
    Boolean prev = null;
    try {
        conn = serverConnection.getConnection();

        prev = conn.getAutoCommit();
        conn.setAutoCommit(false);
        final String query = "select * from table_with_large_text limit 10";
        stmt = conn.prepareStatement(query, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);

        stmt.setFetchSize(1);
        ResultSet rs = stmt.executeQuery();
        Thread.sleep(60 * 1000);    // (1) take first heap dump here

        while (rs.next()) {
            articles.add(rs.getString("large_text_column"));
            Thread.sleep(60 * 1000);    // (2) take second heap dump here
        }
        rs.close();
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        tryResetAutocommit(conn, prev);
        tryClose(stmt);
        tryClose(conn);
    }

If we take a heap dump at (1) and compare it with one taken at (2), QueryExecutorImpl retains roughly as much additional memory as the size of the text column retrieved by rs.getString(), and that memory is never released, even after stmt.close() and conn.close().

The reason is that decoderArray in UTF8Encoding.java is expanded whenever it must decode text larger than its current capacity, but it is never shrunk back to its initial size of 1024.

Expected behaviour
The driver's memory footprint should remain bounded. Instead, it grows to match the largest value decoded and never returns to its initial state.

Logs

Here is one of the heap dumps I profiled using YourKit:

+--Class-----------------------------------------------------------------------Retained Size-------Shallow Size-----
+--org.postgresql.core.v3.QueryExecutorImpl--------------------------------------80509768--------------184
  +--pgStream  org.postgresql.core.PGStream--------------------------------------79947872--------------24
     +--encoding  org.postgresql.core.UTF8Encoding-------------------------------79947872--------------24
        +--decoderArray  char[39973916]---------------------------------------79947848--------79947848
@davecramer
Member

@atuldaemon is it possible to test this against 42.2.5 ?

atuldaemon added a commit to atuldaemon/pgjdbc that referenced this issue Mar 6, 2019
The decoderArray used for decoding UTF-8 strings is now kept at a constant size of 1024.
If the input is larger than that, only the temporary cdata array is allocated
at the larger size instead of growing decoderArray.
This keeps the retained memory constant.
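
The approach described in that commit can be sketched as follows. `BoundedDecoder` is a hypothetical illustration of the idea (a long-lived buffer capped at 1024 chars, with a throwaway `cdata` array for oversized inputs), not the actual patch:

```java
// Hedged sketch of the fix idea: the long-lived decoderArray never grows;
// oversized inputs borrow a short-lived cdata array that becomes garbage
// as soon as decode() returns. ASCII-only decoding for brevity.
public class BoundedDecoder {
    private final char[] decoderArray = new char[1024]; // capped for the object's lifetime

    public String decode(byte[] data) {
        char[] cdata = data.length <= decoderArray.length
                ? decoderArray            // small inputs reuse the cached buffer
                : new char[data.length];  // large inputs get a throwaway buffer
        for (int i = 0; i < data.length; i++) {
            cdata[i] = (char) (data[i] & 0x7f); // ASCII-only stand-in for UTF-8
        }
        return new String(cdata, 0, data.length);
    }
}
```

With this shape, the retained footprint per connection stays at 1024 chars regardless of the largest row ever decoded.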
@anshum17

anshum17 commented May 25, 2020

Is this issue fixed in any newer version? We are facing the same issue on version 42.2.2: memory is allocated and never released.

Class Name                              | Objects | Shallow Heap | Retained Heap | Percentage
----------------------------------------------------------------------------------------------
org.postgresql.core.v3.QueryExecutorImpl|     255 |       46,920 |   362,665,848 |     14.00%
|- org.postgresql.util.LruCache         |     252 |       12,096 |   218,611,296 |      8.44%
|- org.postgresql.core.v3.SimpleQuery   |  21,826 |    1,396,864 |    98,560,864 |      3.80%
----------------------------------------------------------------------------------------------

@davecramer
Member

Any chance you can test it with a newer version?

@anshum17

Sure. I will test it on 42.2.6.

@davecramer
Member

The latest is 42.2.12

@vlsi
Member

vlsi commented May 27, 2020

@anshum17 , I guess your case is different.
It looks like the heap table in your comment above shows the statement cache, not the decoder buffer.

I would suggest trying the following:

  1. Check the connection pool size. Do you expect to have 255 connections? Are you really using them? Do you think you could reduce the connection pool size?
  2. Check the contents of SimpleQuery for one of the connections in your heap dump. It might be you have similar queries or you might have extremely long queries.
  3. You might try reducing preparedStatementCacheQueries (defaults to 256) or preparedStatementCacheSizeMiB (defaults to 5). For more information, see https://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters
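
For point 3, the cache limits are plain pgjdbc connection properties. A minimal sketch (the property names are real driver parameters; the helper name and values are arbitrary examples):

```java
import java.util.Properties;

public class CacheTuning {
    // Builds connection properties with a smaller per-connection statement
    // cache than the defaults (256 queries, 5 MiB). Example values only.
    public static Properties cacheTuningProperties() {
        Properties props = new Properties();
        props.setProperty("preparedStatementCacheQueries", "64");
        props.setProperty("preparedStatementCacheSizeMiB", "2");
        return props;
    }
}
```

These would then be passed to `DriverManager.getConnection("jdbc:postgresql://...", props)` alongside the usual credentials.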

@david-sauvage

Sorry to revive this old issue, but I think I have a memory issue in my project, and I know I have some extremely long queries because I insert or update a field with a very long string.
@vlsi, what is wrong with having very long queries?

@davecramer
Member

Why do you think you have this issue?

When you say long, how big?

@david-sauvage

Actually, I'm not sure if it's this issue specifically or this one: #1360
I was interested in this issue because, as I said, I have pretty long queries from updating very long varchars (close to 1 million characters for the biggest ones).

Here is what brought me to the pgjdbc repo:
[screenshot: heap dump dominated by large byte arrays]
My heap dump seems full of byte arrays of this sort.
I'd gladly welcome some feedback.

@davecramer
Member

Well, we don't have a concrete plan for removing the finalizer, but I suppose we can create a new branch and provide a snapshot if you can test it?

@david-sauvage

What would be the point of testing a snapshot if, in the end, the finalizer won't be removed? 😅
My first comment was more about understanding what is wrong with long queries, so I could maybe find another way to do what I'm doing with a better memory footprint.

@davecramer
Member

To find out whether that is the real issue; we currently don't have a way to replicate it. I'm fairly certain the finalizer will go away once we figure out if/how to replace it.

@vlsi
Member

vlsi commented Feb 18, 2023

I've prepared a fix in #2817 that reduces the heap overhead of finalizers and prevents blocking the finalizer thread.
It would be nice if you could review it and test it with your setup:

git clone --depth 10 --branch reduce_finalize_overhead https://github.com/vlsi/pgjdbc.git pgjdbc_reduce_finalize_overhead
cd pgjdbc_reduce_finalize_overhead

# It will publish to the local maven repository
./gradlew :postgresql:publishToMavenLocal

# It will prepare `pgjdbc/build/libs/postgresql-42.6.0-SNAPSHOT-all.jar` and `postgresql-42.6.0-SNAPSHOT-osgi.jar`
./gradlew :postgresql:osgiJar

vlsi added a commit to vlsi/pgjdbc that referenced this issue Feb 19, 2023
…d StreamWrapper

Previously, an abandoned PgConnection could retain a significant amount of heap.
Now the bare minimum is moved to a separate class, so abandoned connections
can be reclaimed from the heap earlier.

This does not remove the use of `.finalize()`; it only reduces the amount of retained heap memory.

Fixes pgjdbc#1431
Fixes pgjdbc#1360
@vlsi vlsi closed this as completed in c40855f Feb 19, 2023