@suddendust
Contributor

Description

Currently, the library uses a single connection and serialises all queries on it, which results in poor performance under concurrent load. This PR configures it to use the connection pool by default.

Testing

Setup: See the Queries section for all the queries that were used in the load test. An artificial delay of 0.1s was introduced using pg_sleep() to simulate some load on the system.
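The per-query metrics in the tables that follow (average, median, P90, P99) are summaries of raw latency samples. A minimal sketch of how such summaries could be computed is shown here. It assumes the nearest-rank percentile definition; the PR does not say which percentile method the load-test tool used, and the class name and sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.Locale;

final class LatencySummary {
    /** Nearest-rank percentile over an ascending-sorted array; p in (0, 100]. */
    static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p * sorted.length / 100.0); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Arithmetic mean of the samples (NaN for an empty array). */
    static double average(long[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Hypothetical latencies in ms; these are NOT taken from the load test.
        long[] samples = {1200, 1000, 1100, 1050, 1300, 1020, 1080, 1010, 1150, 1250};
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        System.out.printf(Locale.ROOT, "avg=%.1f median=%d p90=%d p99=%d%n",
            average(samples), percentile(sorted, 50), percentile(sorted, 90), percentile(sorted, 99));
    }
}
```

Other percentile definitions (for example, linear interpolation between ranks) would give slightly different P90/P99 values on small samples.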

Overall Performance Improvement

| Concurrency | Serial Avg (ms) | Pool Avg (ms) | Improvement | Speedup |
|---|---|---|---|---|
| 50 | 5,154 | 1,053 | 79.6% | 4.9x |
| 75 | 6,394 | 1,232 | 80.7% | 5.2x |
| 100 | 7,064 | 1,853 | 73.8% | 3.8x |
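The Improvement and Speedup columns appear to be derived from the two averages as (serial - pool) / serial and serial / pool respectively. These formulas are an assumption, since the PR does not state them, but they reproduce the reported figures. A small sketch with a hypothetical class name:

```java
import java.util.Locale;

final class PoolGain {
    /** Percentage reduction in latency: (serial - pool) / serial * 100. */
    static double improvementPercent(double serialMs, double poolMs) {
        return (serialMs - poolMs) / serialMs * 100.0;
    }

    /** How many times faster the pooled run is: serial / pool. */
    static double speedup(double serialMs, double poolMs) {
        return serialMs / poolMs;
    }

    public static void main(String[] args) {
        // {concurrency, serial avg ms, pool avg ms} from the overall table.
        double[][] rows = {{50, 5154, 1053}, {75, 6394, 1232}, {100, 7064, 1853}};
        for (double[] r : rows) {
            System.out.printf(Locale.ROOT, "concurrency=%d improvement=%.1f%% speedup=%.1fx%n",
                (int) r[0], improvementPercent(r[1], r[2]), speedup(r[1], r[2]));
        }
    }
}
```

For concurrency 50 this gives (5,154 - 1,053) / 5,154 ≈ 79.6% and 5,154 / 1,053 ≈ 4.9x, matching the table.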

Detailed Query Comparison

Concurrency: 50

| Query | Metric | Serial (ms) | Pool (ms) | Improvement | Speedup |
|---|---|---|---|---|---|
| aggregation_avg_risk_by_service | Average | 5,152 | 1,061 | 79.4% | 4.9x |
| | Median | 5,081 | 1,102 | 78.3% | 4.6x |
| | P90 | 8,340 | 1,226 | 85.3% | 6.8x |
| | P99 | 10,601 | 1,311 | 87.6% | 8.1x |
| aggregation_count_by_method | Average | 5,128 | 1,050 | 79.5% | 4.9x |
| | Median | 5,120 | 1,061 | 79.3% | 4.8x |
| | P90 | 8,555 | 1,266 | 85.2% | 6.8x |
| | P99 | 9,761 | 1,368 | 86.0% | 7.1x |
| filter_by_api_type | Average | 5,172 | 1,052 | 79.7% | 4.9x |
| | Median | 5,018 | 1,072 | 78.6% | 4.7x |
| | P90 | 9,370 | 1,216 | 87.0% | 7.7x |
| | P99 | 10,150 | 1,317 | 87.0% | 7.7x |
| filtered_by_environment | Average | 4,792 | 1,071 | 77.6% | 4.5x |
| | Median | 4,324 | 1,087 | 74.9% | 4.0x |
| | P90 | 8,446 | 1,236 | 85.4% | 6.8x |
| | P99 | 10,243 | 1,343 | 86.9% | 7.6x |
| min_max_risk_scores | Average | 5,234 | 1,056 | 79.8% | 5.0x |
| | Median | 5,039 | 1,086 | 78.4% | 4.6x |
| | P90 | 8,163 | 1,201 | 85.3% | 6.8x |
| | P99 | 9,534 | 1,297 | 86.4% | 7.4x |
| multi_dimension_groupby | Average | 5,435 | 1,033 | 81.0% | 5.3x |
| | Median | 5,271 | 1,061 | 79.9% | 5.0x |
| | P90 | 8,698 | 1,178 | 86.5% | 7.4x |
| | P99 | 11,296 | 1,307 | 88.4% | 8.6x |
| multiple_columns_select | Average | 5,172 | 1,070 | 79.3% | 4.8x |
| | Median | 5,153 | 1,098 | 78.7% | 4.7x |
| | P90 | 8,539 | 1,308 | 84.7% | 6.5x |
| | P99 | 10,660 | 1,381 | 87.0% | 7.7x |
| simple_select_with_orderby | Average | 6,148 | 1,030 | 83.2% | 6.0x |
| | Median | 6,170 | 1,062 | 82.8% | 5.8x |
| | P90 | 9,678 | 1,184 | 87.8% | 8.2x |
| | P99 | 12,718 | 1,214 | 90.5% | 10.5x |

Concurrency: 75

| Query | Metric | Serial (ms) | Pool (ms) | Improvement | Speedup |
|---|---|---|---|---|---|
| aggregation_avg_risk_by_service | Average | 6,737 | 1,203 | 82.1% | 5.6x |
| | Median | 6,984 | 1,351 | 80.7% | 5.2x |
| | P90 | 10,719 | 1,645 | 84.7% | 6.5x |
| | P99 | 11,021 | 1,805 | 83.6% | 6.1x |
| aggregation_count_by_method | Average | 5,994 | 1,226 | 79.5% | 4.9x |
| | Median | 5,539 | 1,369 | 75.3% | 4.0x |
| | P90 | 10,174 | 1,660 | 83.7% | 6.1x |
| | P99 | 10,366 | 1,828 | 82.4% | 5.7x |
| filter_by_api_type | Average | 5,956 | 1,211 | 79.7% | 4.9x |
| | Median | 6,008 | 1,357 | 77.4% | 4.4x |
| | P90 | 10,133 | 1,666 | 83.6% | 6.1x |
| | P99 | 10,268 | 1,795 | 82.5% | 5.7x |
| filtered_by_environment | Average | 6,572 | 1,242 | 81.1% | 5.3x |
| | Median | 7,478 | 1,479 | 80.2% | 5.1x |
| | P90 | 10,568 | 1,603 | 84.8% | 6.6x |
| | P99 | 10,919 | 1,691 | 84.5% | 6.5x |
| min_max_risk_scores | Average | 5,667 | 1,211 | 78.6% | 4.7x |
| | Median | 5,523 | 1,359 | 75.4% | 4.1x |
| | P90 | 9,458 | 1,613 | 82.9% | 5.9x |
| | P99 | 10,799 | 1,699 | 84.3% | 6.4x |
| multi_dimension_groupby | Average | 5,967 | 1,226 | 79.5% | 4.9x |
| | Median | 5,808 | 1,404 | 75.8% | 4.1x |
| | P90 | 10,203 | 1,645 | 83.9% | 6.2x |
| | P99 | 10,339 | 1,804 | 82.6% | 5.7x |
| multiple_columns_select | Average | 7,194 | 1,221 | 83.0% | 5.9x |
| | Median | 7,800 | 1,357 | 82.6% | 5.7x |
| | P90 | 10,867 | 1,694 | 84.4% | 6.4x |
| | P99 | 11,341 | 1,911 | 83.1% | 5.9x |
| simple_select_with_orderby | Average | 7,063 | 1,314 | 81.4% | 5.4x |
| | Median | 7,649 | 1,541 | 79.9% | 5.0x |
| | P90 | 10,726 | 1,675 | 84.4% | 6.4x |
| | P99 | 11,072 | 1,791 | 83.8% | 6.2x |

Concurrency: 100

| Query | Metric | Serial (ms) | Pool (ms) | Improvement | Speedup |
|---|---|---|---|---|---|
| aggregation_avg_risk_by_service | Average | 6,959 | 1,710 | 75.4% | 4.1x |
| | Median | 8,428 | 1,697 | 79.9% | 5.0x |
| | P90 | 9,938 | 2,303 | 76.8% | 4.3x |
| | P99 | 10,210 | 2,538 | 75.1% | 4.0x |
| aggregation_count_by_method | Average | 6,751 | 1,802 | 73.3% | 3.7x |
| | Median | 7,397 | 1,912 | 74.1% | 3.9x |
| | P90 | 9,563 | 2,426 | 74.6% | 3.9x |
| | P99 | 10,731 | 2,690 | 74.9% | 4.0x |
| filter_by_api_type | Average | 7,290 | 1,793 | 75.4% | 4.1x |
| | Median | 8,135 | 1,795 | 77.9% | 4.5x |
| | P90 | 9,860 | 2,640 | 73.2% | 3.7x |
| | P99 | 10,116 | 2,949 | 70.9% | 3.4x |
| filtered_by_environment | Average | 6,563 | 1,838 | 72.0% | 3.6x |
| | Median | 6,964 | 1,899 | 72.7% | 3.7x |
| | P90 | 10,105 | 2,493 | 75.3% | 4.1x |
| | P99 | 10,420 | 2,869 | 72.5% | 3.6x |
| min_max_risk_scores | Average | 6,117 | 1,790 | 70.7% | 3.4x |
| | Median | 7,420 | 1,827 | 75.4% | 4.1x |
| | P90 | 9,672 | 2,579 | 73.3% | 3.8x |
| | P99 | 10,926 | 3,041 | 72.2% | 3.6x |
| multi_dimension_groupby | Average | 6,764 | 2,199 | 67.5% | 3.1x |
| | Median | 7,431 | 2,234 | 69.9% | 3.3x |
| | P90 | 10,686 | 2,758 | 74.2% | 3.9x |
| | P99 | 10,890 | 2,980 | 72.6% | 3.7x |
| multiple_columns_select | Average | 8,388 | 1,806 | 78.5% | 4.6x |
| | Median | 8,766 | 1,812 | 79.3% | 4.8x |
| | P90 | 12,907 | 2,489 | 80.7% | 5.2x |
| | P99 | 13,082 | 2,811 | 78.5% | 4.7x |
| simple_select_with_orderby | Average | 6,683 | 1,884 | 71.8% | 3.5x |
| | Median | 8,023 | 1,944 | 75.8% | 4.1x |
| | P90 | 9,706 | 2,526 | 74.0% | 3.8x |
| | P99 | 10,357 | 2,709 | 73.8% | 3.8x |

Tail Latency Analysis

P99 Latency Improvements

| Concurrency | Serial P99 Avg (ms) | Pool P99 Avg (ms) | Improvement |
|---|---|---|---|
| 50 | 10,324 | 1,317 | 87.2% |
| 75 | 10,482 | 1,790 | 82.9% |
| 100 | 10,592 | 2,773 | 73.8% |

P90 Latency Improvements

| Concurrency | Serial P90 Avg (ms) | Pool P90 Avg (ms) | Improvement |
|---|---|---|---|
| 50 | 8,724 | 1,222 | 86.0% |
| 75 | 10,356 | 1,650 | 84.1% |
| 100 | 10,305 | 2,527 | 75.5% |

Queries

[
  {
    "name": "simple_select_with_orderby",
    "description": "Basic SELECT with ORDER BY score",
    "query": {
      "entityType": "Entity",
      "selection": [
        {
          "columnIdentifier": {
            "columnName": "Entity.name"
          }
        },
        {
          "columnIdentifier": {
            "columnName": "Entity.score"
          }
        }
      ],
      "orderBy": [
        {
          "expression": {
            "columnIdentifier": {
              "columnName": "Entity.score"
            }
          },
          "order": "DESC"
        }
      ],
      "limit": 100
    }
  },
  {
    "name": "filtered_by_environment",
    "description": "Filter by environment and order by timestamp",
    "query": {
      "entityType": "Entity",
      "selection": [
        {
          "columnIdentifier": {
            "columnName": "Entity.name"
          }
        },
        {
          "columnIdentifier": {
            "columnName": "Entity.environment"
          }
        },
        {
          "columnIdentifier": {
            "columnName": "Entity.timestamp"
          }
        }
      ],
      "filter": {
        "childFilter": [
          {
            "lhs": {
              "columnIdentifier": {
                "columnName": "Entity.environment"
              }
            },
            "operator": "EQ",
            "rhs": {
              "literal": {
                "value": {
                  "string": "env1"
                }
              }
            }
          }
        ],
        "operator": "AND"
      },
      "orderBy": [
        {
          "expression": {
            "columnIdentifier": {
              "columnName": "Entity.timestamp"
            }
          },
          "order": "DESC"
        }
      ],
      "limit": 50
    }
  },
  {
    "name": "multiple_columns_select",
    "description": "Select multiple columns without filter",
    "query": {
      "entityType": "Entity",
      "selection": [
        {
          "columnIdentifier": {
            "columnName": "Entity.name"
          }
        },
        {
          "columnIdentifier": {
            "columnName": "Entity.method"
          }
        },
        {
          "columnIdentifier": {
            "columnName": "Entity.type"
          }
        },
        {
          "columnIdentifier": {
            "columnName": "Entity.service"
          }
        },
        {
          "columnIdentifier": {
            "columnName": "Entity.environment"
          }
        }
      ],
      "orderBy": [
        {
          "expression": {
            "columnIdentifier": {
              "columnName": "Entity.name"
            }
          },
          "order": "ASC"
        }
      ],
      "limit": 200
    }
  },
  {
    "name": "aggregation_count_by_method",
    "description": "Count entities grouped by method",
    "query": {
      "entityType": "Entity",
      "selection": [
        {
          "columnIdentifier": {
            "columnName": "Entity.method"
          }
        },
        {
          "function": {
            "functionName": "COUNT",
            "arguments": [
              {
                "columnIdentifier": {
                  "columnName": "Entity.name"
                }
              }
            ],
            "alias": "entity_count"
          }
        }
      ],
      "groupBy": [
        {
          "columnIdentifier": {
            "columnName": "Entity.method"
          }
        }
      ],
      "orderBy": [
        {
          "expression": {
            "function": {
              "functionName": "COUNT",
              "arguments": [
                {
                  "columnIdentifier": {
                    "columnName": "Entity.name"
                  }
                }
              ],
              "alias": "entity_count"
            }
          },
          "order": "DESC"
        }
      ],
      "limit": 10
    }
  },
  {
    "name": "aggregation_avg_score_by_service",
    "description": "Average score by service name",
    "query": {
      "entityType": "Entity",
      "selection": [
        {
          "columnIdentifier": {
            "columnName": "Entity.service"
          }
        },
        {
          "function": {
            "functionName": "AVG",
            "arguments": [
              {
                "columnIdentifier": {
                  "columnName": "Entity.score"
                }
              }
            ],
            "alias": "avg_score"
          }
        },
        {
          "function": {
            "functionName": "COUNT",
            "arguments": [
              {
                "columnIdentifier": {
                  "columnName": "Entity.name"
                }
              }
            ],
            "alias": "entity_count"
          }
        }
      ],
      "groupBy": [
        {
          "columnIdentifier": {
            "columnName": "Entity.service"
          }
        }
      ],
      "orderBy": [
        {
          "expression": {
            "function": {
              "functionName": "AVG",
              "arguments": [
                {
                  "columnIdentifier": {
                    "columnName": "Entity.score"
                  }
                }
              ],
              "alias": "avg_score"
            }
          },
          "order": "DESC"
        }
      ],
      "limit": 20
    }
  },
  {
    "name": "filter_by_type",
    "description": "Filter by entity type",
    "query": {
      "entityType": "Entity",
      "selection": [
        {
          "columnIdentifier": {
            "columnName": "Entity.name"
          }
        },
        {
          "columnIdentifier": {
            "columnName": "Entity.type"
          }
        },
        {
          "columnIdentifier": {
            "columnName": "Entity.service"
          }
        }
      ],
      "filter": {
        "childFilter": [
          {
            "lhs": {
              "columnIdentifier": {
                "columnName": "Entity.type"
              }
            },
            "operator": "EQ",
            "rhs": {
              "literal": {
                "value": {
                  "string": "TYPE_A"
                }
              }
            }
          }
        ],
        "operator": "AND"
      },
      "orderBy": [
        {
          "expression": {
            "columnIdentifier": {
              "columnName": "Entity.name"
            }
          },
          "order": "ASC"
        }
      ],
      "limit": 150
    }
  },
  {
    "name": "min_max_scores",
    "description": "Min, Max and Average scores",
    "query": {
      "entityType": "Entity",
      "selection": [
        {
          "function": {
            "functionName": "MIN",
            "arguments": [
              {
                "columnIdentifier": {
                  "columnName": "Entity.score"
                }
              }
            ],
            "alias": "min_score"
          }
        },
        {
          "function": {
            "functionName": "MAX",
            "arguments": [
              {
                "columnIdentifier": {
                  "columnName": "Entity.score"
                }
              }
            ],
            "alias": "max_score"
          }
        },
        {
          "function": {
            "functionName": "AVG",
            "arguments": [
              {
                "columnIdentifier": {
                  "columnName": "Entity.score"
                }
              }
            ],
            "alias": "avg_score"
          }
        }
      ],
      "limit": 1
    }
  },
  {
    "name": "multi_dimension_groupby",
    "description": "Complex multi-dimensional GROUP BY: method + environment + type",
    "query": {
      "entityType": "Entity",
      "selection": [
        {
          "columnIdentifier": {
            "columnName": "Entity.method"
          }
        },
        {
          "columnIdentifier": {
            "columnName": "Entity.environment"
          }
        },
        {
          "columnIdentifier": {
            "columnName": "Entity.type"
          }
        },
        {
          "function": {
            "functionName": "COUNT",
            "arguments": [
              {
                "columnIdentifier": {
                  "columnName": "Entity.name"
                }
              }
            ],
            "alias": "entity_count"
          }
        },
        {
          "function": {
            "functionName": "AVG",
            "arguments": [
              {
                "columnIdentifier": {
                  "columnName": "Entity.score"
                }
              }
            ],
            "alias": "avg_score"
          }
        }
      ],
      "groupBy": [
        {
          "columnIdentifier": {
            "columnName": "Entity.method"
          }
        },
        {
          "columnIdentifier": {
            "columnName": "Entity.environment"
          }
        },
        {
          "columnIdentifier": {
            "columnName": "Entity.type"
          }
        }
      ],
      "orderBy": [
        {
          "expression": {
            "columnIdentifier": {
              "columnName": "Entity.type"
            }
          }
        }
      ],
      "limit": 100
    }
  }
]

Checklist:

  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • Any dependent changes have been merged and published in downstream modules

@codecov

codecov bot commented Nov 10, 2025

Codecov Report

❌ Patch coverage is 83.33333% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.48%. Comparing base (e5e9562) to head (750a6c0).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ore/documentstore/postgres/PostgresCollection.java | 85.36% | 5 Missing and 1 partial ⚠️ |
| ...documentstore/postgres/PostgresConnectionPool.java | 75.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               main     #248      +/-   ##
============================================
+ Coverage     80.44%   80.48%   +0.03%     
  Complexity     1151     1151              
============================================
  Files           215      215              
  Lines          5502     5533      +31     
  Branches        487      489       +2     
============================================
+ Hits           4426     4453      +27     
- Misses          750      753       +3     
- Partials        326      327       +1     
| Flag | Coverage Δ |
|---|---|
| integration | 80.48% <83.33%> (+0.03%) ⬆️ |
| unit | 57.59% <46.29%> (-0.09%) ⬇️ |


@suddendust
Contributor Author

@suresh-prakash Shall we switch to using connection pooling for update/write queries as well? (We are only using it for READ queries in this PR).

@suddendust
Contributor Author

@suresh-prakash One issue I was seeing with pooling is connections getting stuck, as can be seen in this table:


 conns |             state
-------+-------------------------------
     1 | active
    20 | idle in transaction (aborted)

From what I found online, this happens because the queries on these connections were never committed (even SELECT statements implicitly execute inside a transaction). Calling connection.setAutoCommit(true) resolves this. However, for methods like update that manage transactions manually, enabling auto-commit might be a problem, so I am setting autoCommit back to false every time we close a connection. Can you review this approach? Ref: https://dba.stackexchange.com/a/246411

@suddendust changed the title from "[Draft] Use Connection Pool in PG Queries" to "Use Connection Pool in PG Queries" on Nov 10, 2025

resultSet.next();
long count = resultSet.getLong(1);
// Reset autoCommit before returning connection to pool
connection.setAutoCommit(false);
Contributor

Any reason why this is necessary?

Contributor Author

In methods like PostgresCollection#update, we commit manually:

if (documentOptional.isEmpty()) {
  connection.commit();
  return empty();
}

Enabling auto-commit for such queries can be risky (I haven't validated what the exact behaviour is in this case).

So we turn auto-commit off whenever we return a connection to the pool. That way, when update() gets a pooled connection, its auto-commit is already set to false.
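The reset-on-return idea described above could look roughly like the sketch below: a read runs with auto-commit enabled, and auto-commit is switched back to false before the connection is closed (which, for a pooled connection, hands it back to the pool). The names AutoCommitReset, ReadWork, readWithAutoCommit and recordingConnection are hypothetical and not from the PR. The recording proxy stands in for a real pooled connection so the call order is visible without a database. As background, the JDBC Javadoc for Connection.commit() states that it throws SQLException when the connection is in auto-commit mode, which is one reason manually committing code should not receive an auto-commit connection.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class AutoCommitReset {
    /** Hypothetical callback for work done on a borrowed connection. */
    interface ReadWork<T> {
        T run(Connection connection) throws SQLException;
    }

    /**
     * Runs a read with auto-commit on, then restores auto-commit to false
     * before closing the connection (closing returns it to the pool).
     */
    static <T> T readWithAutoCommit(Connection connection, ReadWork<T> work) throws SQLException {
        try {
            connection.setAutoCommit(true);
            return work.run(connection);
        } finally {
            try {
                connection.setAutoCommit(false); // reset for manual-commit callers
            } finally {
                connection.close();
            }
        }
    }

    /** Stand-in connection that records method calls instead of hitting a database. */
    static Connection recordingConnection(List<String> calls) {
        return (Connection) Proxy.newProxyInstance(
            AutoCommitReset.class.getClassLoader(),
            new Class<?>[] {Connection.class},
            (proxy, method, args) -> {
                calls.add(args == null ? method.getName() : method.getName() + "(" + args[0] + ")");
                return null; // only void methods are invoked in this sketch
            });
    }

    public static void main(String[] args) throws SQLException {
        List<String> calls = new ArrayList<>();
        Long count = readWithAutoCommit(recordingConnection(calls), c -> 42L);
        System.out.println(count + " " + calls);
    }
}
```

Putting the reset in a finally block means the connection goes back with auto-commit off even when the read throws.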

@suddendust
Contributor Author

@suresh-prakash It works today with a single connection because client.getConnection() returns a connection with auto-commit set to true (the default). For pools, however, we explicitly set it to false in PostgresConnectionPool#setFactoryProperties:

[image: PostgresConnectionPool#setFactoryProperties]

@suresh-prakash
Contributor

> @suresh-prakash Shall we switch to using connection pooling for update/write queries as well? (We are only using it for READ queries in this PR).

Ideally, we should be using connection pooling everywhere.

@suresh-prakash
Contributor

> @suresh-prakash One issue I was seeing with pooling is connections getting stuck, as can be seen in this table:
>
>  conns |             state
> -------+-------------------------------
>      1 | active
>     20 | idle in transaction (aborted)
>
> From what I found online, this happens because the queries on these connections were never committed (even SELECT statements implicitly execute inside a transaction). Calling connection.setAutoCommit(true) resolves this. However, for methods like update that manage transactions manually, enabling auto-commit might be a problem, so I am setting autoCommit back to false every time we close a connection. Can you review this approach? Ref: https://dba.stackexchange.com/a/246411

Do we really have any queries that use transactions? Can you point to those queries?

@suddendust
Contributor Author

@suresh-prakash Added a separate pool for manual transactions as discussed.

}

@Test
public void testGetConnection() throws SQLException {
Contributor

Thanks for doing this. 🙂

@suresh-prakash merged commit 23c12e2 into hypertrace:main on Nov 13, 2025
6 checks passed