Memory leak in QueryExecutorImpl #1431

Closed
1 of 2 tasks
atuldaemon opened this issue Mar 6, 2019 · 13 comments
@atuldaemon

I'm submitting a ...

  • [x] bug report
  • [ ] feature request

Describe the issue
The JDBC driver does not free memory after reading large text values from the database. The leak occurs in UTF-8 decoding, where the decoderArray grows without bound: it is enlarged to hold large strings but is never shrunk back.
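
The grow-only pattern described above can be sketched in isolation. `GrowOnlyDecoder` is a simplified, hypothetical stand-in for the driver's decoding path (it decodes ASCII only), not the actual pgjdbc code:

```java
// Simplified model of the leak: a decode buffer that grows to fit the
// largest input ever seen and is never shrunk afterwards.
public class GrowOnlyDecoder {
    private char[] decoderArray = new char[1024]; // initial size, per the report

    public String decode(byte[] data) {
        if (data.length > decoderArray.length) {
            // Grow to fit the current input...
            decoderArray = new char[data.length];
        }
        // ...but never shrink, so one huge row pins roughly 2x its length
        // in bytes for as long as the encoding object (and connection) lives.
        for (int i = 0; i < data.length; i++) {
            decoderArray[i] = (char) (data[i] & 0x7f); // ASCII-only stand-in for UTF-8
        }
        return new String(decoderArray, 0, data.length);
    }

    public int bufferLength() {
        return decoderArray.length;
    }
}
```

After decoding one very large value, `bufferLength()` stays at that size for the lifetime of the object, which matches the large `char[]` retained in the heap dump reported later in this thread.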

Driver Version?
42.2.2

Java Version?
1.8

OS Version?
Ubuntu

PostgreSQL Version?
10.4 (Ubuntu 10.4-2.pgdg14.04+1)

To Reproduce

Here is a sample program to demonstrate the problem.

    // Note: serverConnection, tryResetAutocommit, and tryClose are the
    // reporter's own helpers; articles collects the fetched strings.
    List<String> articles = new ArrayList<>();
    Connection conn = null;
    PreparedStatement stmt = null;
    Boolean prev = null;
    try {
        conn = serverConnection.getConnection();

        prev = conn.getAutoCommit();
        conn.setAutoCommit(false);
        final String query = "select * from table_with_large_text limit 10";
        stmt = conn.prepareStatement(query, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);

        stmt.setFetchSize(1);
        ResultSet rs = stmt.executeQuery();
        Thread.sleep(60 * 1000);    // (1) take first heap dump here

        while (rs.next()) {
            articles.add(rs.getString("large_text_column"));
            Thread.sleep(60 * 1000);    // (2) take second heap dump here
        }
        rs.close();
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        tryResetAutocommit(conn, prev);
        tryClose(stmt);
        tryClose(conn);
    }

If we take a heap dump at (1) and compare it with one taken at (2), QueryExecutorImpl retains roughly as much additional memory as the size of the text column retrieved by rs.getString(), and that memory is never released, even after stmt.close() and conn.close().

The reason is that decoderArray in UTF8Encoding.java is expanded whenever it must decode text larger than its current capacity, but it is never shrunk back to its initial size of 1024.

Expected behaviour
The driver's memory footprint should remain bounded. Instead, it grows to match the largest value decoded and never returns to its initial state.

Logs

Here is one of the heap dumps I profiled using YourKit:

+--Class-----------------------------------------------------------------------Retained Size-------Shallow Size-----
+--org.postgresql.core.v3.QueryExecutorImpl--------------------------------------80509768--------------184
  +--pgStream  org.postgresql.core.PGStream--------------------------------------79947872--------------24
     +--encoding  org.postgresql.core.UTF8Encoding-------------------------------79947872--------------24
        +--decoderArray  char[39973916]---------------------------------------79947848--------79947848
@davecramer
Member

@atuldaemon is it possible to test this against 42.2.5 ?

atuldaemon added a commit to atuldaemon/pgjdbc that referenced this issue Mar 6, 2019
The decoderArray used for decoding UTF-8 strings is now kept at a constant size of 1024.
If the input is larger than that, only the temporary cdata array is allocated
at the larger size instead of growing decoderArray.
This keeps the retained memory constant.
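
The approach described in that commit can be sketched as follows. `BoundedDecoder` is a hypothetical illustration of the idea (a long-lived buffer capped at 1024 chars, with a throwaway `cdata` array for oversized inputs), not the actual patch:

```java
// Hedged sketch of the fix idea: the long-lived decoderArray never grows;
// oversized inputs borrow a short-lived cdata array that becomes garbage
// as soon as decode() returns. ASCII-only decoding for brevity.
public class BoundedDecoder {
    private final char[] decoderArray = new char[1024]; // capped for the object's lifetime

    public String decode(byte[] data) {
        char[] cdata = data.length <= decoderArray.length
                ? decoderArray            // small inputs reuse the cached buffer
                : new char[data.length];  // large inputs get a throwaway buffer
        for (int i = 0; i < data.length; i++) {
            cdata[i] = (char) (data[i] & 0x7f); // ASCII-only stand-in for UTF-8
        }
        return new String(cdata, 0, data.length);
    }
}
```

With this shape, the retained footprint per connection stays at 1024 chars regardless of the largest row ever decoded.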
@anshum17

anshum17 commented May 25, 2020

Is this issue fixed in any newer version? We are facing the same issue on version 42.2.2: memory is allocated and never released.

Class Name                              | Objects | Shallow Heap | Retained Heap | Percentage
----------------------------------------------------------------------------------------------
org.postgresql.core.v3.QueryExecutorImpl|     255 |       46,920 |   362,665,848 |     14.00%
|- org.postgresql.util.LruCache         |     252 |       12,096 |   218,611,296 |      8.44%
|- org.postgresql.core.v3.SimpleQuery   |  21,826 |    1,396,864 |    98,560,864 |      3.80%
----------------------------------------------------------------------------------------------

@davecramer
Member

Any chance you can test it with a newer version?

@anshum17

Sure. I will test it on 42.2.6.

@davecramer
Member

The latest is 42.2.12

@vlsi
Member

vlsi commented May 27, 2020

@anshum17 , I guess your case is different.
It looks like the heap table in your comment above shows the statement cache, not the decoder buffer.

I would suggest trying the following:

  1. Check the connection pool size. Do you expect to have 255 connections? Are you really using them? Do you think you could reduce the connection pool size?
  2. Check the contents of SimpleQuery for one of the connections in your heap dump. It might be you have similar queries or you might have extremely long queries.
  3. You might try reducing preparedStatementCacheQueries (defaults to 256) or preparedStatementCacheSizeMiB (defaults to 5). For more information, see https://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters
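
For point 3, the cache limits are plain pgjdbc connection properties. A minimal sketch (the property names are real driver parameters; the helper name and values are arbitrary examples):

```java
import java.util.Properties;

public class CacheTuning {
    // Builds connection properties with a smaller per-connection statement
    // cache than the defaults (256 queries, 5 MiB). Example values only.
    public static Properties cacheTuningProperties() {
        Properties props = new Properties();
        props.setProperty("preparedStatementCacheQueries", "64");
        props.setProperty("preparedStatementCacheSizeMiB", "2");
        return props;
    }
}
```

These would then be passed to `DriverManager.getConnection("jdbc:postgresql://...", props)` alongside the usual credentials.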

@david-sauvage

Sorry to revive this old issue, but I think I have a memory issue in my project, and I know I have some extremely long queries because I insert or update a field with a very long string.
@vlsi, what is wrong with having very long queries?

@davecramer
Member

Why do you think you have this issue?

When you say long, how big?

@david-sauvage

Actually, I'm not sure if it's this issue specifically or this one: #1360
I was interested in this issue because, as I said, I have pretty long queries from updating very long varchars (close to 1 million characters for the biggest ones).

Here is what brought me to the pgjdbc repo:
[screenshot: heap dump dominated by large byte arrays]
My heap dump seems full of byte arrays of this sort.
I'd gladly welcome some feedback.

@davecramer
Member

Well, we don't have a concrete plan for removing the finalizer, but I suppose we can create a new branch and provide a snapshot if you can test it?

@david-sauvage

What would be the point of testing a snapshot if, in the end, the finalizer won't be removed? 😅
My first comment was more about understanding what is wrong with long queries, so I could maybe find another way to do what I'm doing with a better memory footprint.

@davecramer
Member

To find out whether that is the real issue; we currently don't have a way to replicate it. I'm fairly certain the finalizer will go away once we figure out if/how to replace it.

@vlsi
Member

vlsi commented Feb 18, 2023

I've prepared a fix in #2817 that reduces the heap overhead of finalizers and prevents blocking the finalizer thread.
It would be nice if you could review it and test it with your setup:

git clone --depth 10 --branch reduce_finalize_overhead https://github.com/vlsi/pgjdbc.git pgjdbc_reduce_finalize_overhead
cd pgjdbc_reduce_finalize_overhead

# It will publish to the local maven repository
./gradlew :postgresql:publishToMavenLocal

# It will prepare `pgjdbc/build/libs/postgresql-42.6.0-SNAPSHOT-all.jar` and `postgresql-42.6.0-SNAPSHOT-osgi.jar`
./gradlew :postgresql:osgiJar

vlsi added a commit to vlsi/pgjdbc that referenced this issue Feb 19, 2023
…d StreamWrapper

Previously, an abandoned PgConnection could retain a significant amount of heap.
Now the bare minimum is moved to a separate class, so abandoned connections
can be reclaimed from the heap earlier.

This does not remove the use of `.finalize()`; it only reduces the amount of retained heap memory.

Fixes pgjdbc#1431
Fixes pgjdbc#1360
@vlsi vlsi closed this as completed in c40855f Feb 19, 2023