feat: add connection pooling to fix db connection exhaustion (#536) #816

Draft

ismisepaul wants to merge 19 commits into dev from dev#536
Conversation

ismisepaul (Member) commented Mar 29, 2026

Note: This PR supersedes draft #800 (same work; head branch is now OWASP/SecurityShepherd:dev#536 instead of a fork). You can close #800 in favor of this PR.

Summary

Fixes #536 — automated attack tools (e.g. ZAP spider/fuzzer) exhaust all MySQL connections, causing multi-hour outages.

  • HikariCP connection pooling for MySQL/MariaDB with configurable pool size, connection timeouts, idle timeouts, and max lifetime — connections are bounded and reused instead of created per-request
  • Singleton MongoClient pattern for MongoDB, leveraging the driver's built-in connection pooling
  • DatabaseLifecycleListener for clean pool initialization on startup and shutdown on context destroy — no more leaked connections requiring manual Tomcat restarts
  • Documentation for database configuration, capacity planning, and troubleshooting
  • Tests for ConnectionPool, MongoDatabase singleton, and lifecycle listener
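
The singleton-with-internal-pooling approach for MongoDB can be sketched as below. This is a minimal illustration of the double-checked locking pattern the PR describes, not the actual `MongoDatabase.java` code: `MongoClientHolder` and the plain `Object` client are stand-ins for the real MongoDB driver types, which are not shown in this excerpt.

```java
// Sketch of double-checked locking for a lazily created singleton client.
// The real implementation would hold a MongoClient, whose driver pools
// connections internally.
public final class MongoClientHolder {
  // volatile is required: without it another thread could observe a
  // partially constructed instance after the unsynchronized first check.
  private static volatile Object client;

  private MongoClientHolder() {}

  public static Object getClient() {
    Object local = client; // single volatile read on the fast path
    if (local == null) {
      synchronized (MongoClientHolder.class) {
        local = client;
        if (local == null) { // second check, now under the lock
          local = createClient();
          client = local;
        }
      }
    }
    return local;
  }

  // Stand-in for "new MongoClient(...)".
  private static Object createClient() {
    return new Object();
  }
}
```

Every caller then shares one client (and therefore one driver-managed pool) instead of constructing a client per request.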

Changes

| File | Change |
| --- | --- |
| ConnectionPool.java | New — HikariCP pool manager with core + per-challenge pools |
| Database.java | Refactored to delegate all connections through ConnectionPool |
| MongoDatabase.java | Refactored to singleton pattern with double-checked locking |
| DatabaseLifecycleListener.java | New — ServletContextListener for pool lifecycle |
| web.xml | Registered lifecycle listener |
| pom.xml | Added HikariCP dependency |
| docs/database-configuration.md | New — pool config reference, sizing guide, troubleshooting |
| docs/development-workflow.md | New — dev cycle guide |
| docs/testing.md | New — test running guide |
| CONTRIBUTING.md | Links to new docs |
| *.properties.example | New — example config files for MySQL and MongoDB |
| *Test.java | New/updated tests for pool and singleton patterns |

Default Pool Configuration

| Setting | Default | Purpose |
| --- | --- | --- |
| pool.maximumPoolSize | 10 | Max connections per pool |
| pool.minimumIdle | 2 | Min idle connections maintained |
| pool.connectionTimeout | 30s | Max wait for a connection before failing |
| pool.idleTimeout | 10min | Idle connection cleanup |
| pool.maxLifetime | 30min | Max connection lifetime |

All values are configurable via database.properties.
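
As a hedged sketch of how those defaults might be resolved, the helper below reads the keys from the table out of a `Properties` object (as loaded from `database.properties`), falling back to the documented defaults when a key is absent. The class and its shape are illustrative, not the PR's actual `ConnectionPool` code.

```java
import java.util.Properties;

// Illustrative holder for the documented pool settings, with the table's
// defaults applied when database.properties omits a key.
public final class PoolSettings {
  public final int maximumPoolSize;
  public final int minimumIdle;
  public final long connectionTimeoutMs;
  public final long idleTimeoutMs;
  public final long maxLifetimeMs;

  private PoolSettings(int max, int min, long connMs, long idleMs, long lifeMs) {
    this.maximumPoolSize = max;
    this.minimumIdle = min;
    this.connectionTimeoutMs = connMs;
    this.idleTimeoutMs = idleMs;
    this.maxLifetimeMs = lifeMs;
  }

  public static PoolSettings fromProperties(Properties p) {
    return new PoolSettings(
        Integer.parseInt(p.getProperty("pool.maximumPoolSize", "10")),
        Integer.parseInt(p.getProperty("pool.minimumIdle", "2")),
        Long.parseLong(p.getProperty("pool.connectionTimeout", "30000")), // 30 s
        Long.parseLong(p.getProperty("pool.idleTimeout", "600000")), // 10 min
        Long.parseLong(p.getProperty("pool.maxLifetime", "1800000"))); // 30 min
  }
}
```

Note that a later commit in this PR (via #817) raises some of these defaults (maxPoolSize 20, minIdle 5, connectionTimeout 5s); the values here match the table above.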

Test plan

  • Unit tests pass (mvn test with Docker db containers running)
  • Application starts cleanly with pool initialization logged
  • Application shuts down cleanly with pool shutdown logged
  • Under load (e.g. ZAP fuzzer), connections are bounded and reused — no connection exhaustion
  • Challenge connections work correctly with per-schema pools
  • MongoDB operations work correctly with singleton client
  • Custom pool sizes in database.properties are respected

Discussion (from #800 comments)

The following was posted on PR #800 so reviewers do not need to open that draft for context.

Challenge pool sizing

After reviewing the challenge connection architecture:

Per-challenge pools are the right design — Each challenge uses different DB credentials scoped to its own schema (26 challenge properties files, each with unique username/password). This is intentional: it prevents a SQL injection in one challenge from accessing another challenge's data. A single shared pool with setCatalog() would require a single DB user with access to all schemas, defeating that isolation.

Challenge pools should be smaller than core — The connection exhaustion problem (#536) is caused by fuzzers/ZAP hammering core app endpoints (auth, scoreboard, module lookups via Getter/Setter), not challenge endpoints. Challenges are low-traffic by nature. Keeping pooling for challenges still helps (avoids TCP handshake + auth per request), but with minIdle=0 there is zero cost when a challenge is not in use, and connections spin up on demand when a student starts working.

Defaults in ConnectionPool.java for challenge vs core:

| Setting | Core pool | Challenge pool |
| --- | --- | --- |
| maxPoolSize | 10 | 3 |
| minIdle | 2 | 0 |
| idleTimeout | 10 min | 2 min |

With 26 challenges, worst case is 26 × 3 = 78 connections if every challenge is under active use simultaneously; in practice with minIdle=0 and a 2-minute idle timeout, idle challenge pools hold zero connections.
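
The per-schema pool lookup described above can be sketched as a lazily populated concurrent map: the same schema/credential pair always reuses one pool, while a different schema gets its own. `Pool` here is a stand-in for a `HikariDataSource`, and the key format and class names are illustrative assumptions, not the PR's actual code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: one pool per challenge schema + credential pair, created on first use.
public final class ChallengePools {
  static final class Pool { // stand-in for a HikariDataSource
    final String schema;
    Pool(String schema) { this.schema = schema; }
  }

  private static final Map<String, Pool> POOLS = new ConcurrentHashMap<>();

  // computeIfAbsent guarantees at most one pool is created per key, even
  // under concurrent first requests to the same challenge.
  public static Pool forChallenge(String schema, String user) {
    return POOLS.computeIfAbsent(schema + "|" + user, key -> new Pool(schema));
  }

  public static int poolCount() {
    return POOLS.size();
  }
}
```

With minIdle=0 on each of these pools, an entry in the map costs nothing while its challenge is idle.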

Connection leaks and follow-up PRs

Pooling exposes pre-existing connection leaks in Getter.java (~32 methods) and Setter.java (~24 methods). Without try-with-resources, connections borrowed from the pool may not be returned — the core pool (max 10) can exhaust within minutes under normal use.

Required follow-up: convert Database.get*Connection() usage to try-with-resources and remove manual Database.closeConnection(); add tests alongside fixes. Priority: Getter.java (every request) > Setter.java (admin / user ops) > challenge servlets (isolated pools, lower risk).

Status: Getter.java fixes are in #817 (stacked on this PR). Setter.java and servlets are tracked separately (e.g. #815).
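
The leak-and-fix pattern behind that follow-up work can be illustrated as follows. `FakeConnection` is a stand-in for `java.sql.Connection` so the example runs without a database; the real change targets callers of `Database.get*Connection()`.

```java
// Sketch of the connection-leak pattern and its try-with-resources fix.
public final class LeakFixSketch {
  static int openCount = 0;

  static final class FakeConnection implements AutoCloseable {
    FakeConnection() { openCount++; }
    @Override public void close() { openCount--; }
  }

  // Leaky style: if doQuery() throws, close() is never reached and the
  // borrowed connection never returns to the pool.
  static void leaky() {
    FakeConnection conn = new FakeConnection();
    doQuery();
    conn.close(); // manual close, skipped on any exception above
  }

  // Fixed style: close() runs on every exit path, normal or exceptional,
  // so the connection always goes back to the pool.
  static void fixed() {
    try (FakeConnection conn = new FakeConnection()) {
      doQuery();
    }
  }

  static void doQuery() { /* placeholder for JDBC work */ }
}
```

With a core pool capped at 10, even an occasional exception on the leaky path is enough to drain the pool, which is why Getter.java (hit on every request) is first in the priority order above.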

ismisepaul and others added 17 commits March 29, 2026 00:03
Replace per-request DriverManager connections with HikariCP connection
pooling for MySQL/MariaDB and singleton MongoClient for MongoDB. This
prevents automated tools (e.g. ZAP spider/fuzzer) from exhausting all
database connections and causing multi-hour outages.

- Add ConnectionPool using HikariCP with configurable pool size, timeouts
- Refactor Database.java to delegate all connections through the pool
- Refactor MongoDatabase.java to use singleton pattern with internal pooling
- Add DatabaseLifecycleListener for clean startup/shutdown of pools
- Add connection pool tests and MongoDB singleton tests
- Add database configuration, testing, and development workflow docs
- Add example properties files for MySQL and MongoDB configuration

Made-with: Cursor
- Add missing imports (Arrays, FileInputStream) in MongoDatabase.java
  lost during rebase conflict resolution
- Wrap ConnectionPool init failure as SQLException to maintain API
  compatibility with callers that catch SQLException
- Convert ConnectionPoolTest and DatabaseLifecycleListenerTest from
  JUnit 4 to JUnit 5 so surefire discovers and runs them
- Add mongo_challenge_test.properties test resource for MongoDatabaseTest
- Apply Google Java Format to all modified files

Made-with: Cursor
The pinned v3.6.0 action uses Node 12 (deprecated, scheduled for
removal Sep 2026) and may download an incompatible google-java-format
version for the current ubuntu-latest JDK. Update to v4 (pinned to
c1134eb) which uses Node 20 and auto-downloads the latest compatible
GJF release. Also update checkout to v4.

Made-with: Cursor
Made-with: Cursor
- Change allowMultiQueries=yes to allowMultiQueries=true in Database.java
  (HikariCP requires strict boolean values, not yes/no)
- Catch RuntimeException in ConnectionPoolIT.testChallengeConnection
  (HikariCP throws PoolInitializationException which extends RuntimeException)

Made-with: Cursor
ConnectionPool: read DriverType from properties with URL-based fallback
for mariadb/mysql drivers. Fixes Tomcat classloader issue where JDBC 4
auto-registration doesn't work for webapp-scoped drivers.

Setup: fix XOR logic for db host/port validation — the original condition
rejected valid input when both fields were provided. Extract validation
into testable validateHostPort() method.

Add SetupTest with 7 tests covering all host/port combinations.

Made-with: Cursor
Challenge pools use per-schema credentials by design (isolation between
challenges), but don't need the same sizing as the core pool. Set
challenge pools to max 3 / min idle 0 / 2min idle timeout so they hold
zero connections when unused.
Replace the silent-pass testChallengeConnection with tests that verify:
- Challenge connections are valid and usable
- Same credentials reuse the same pool
- Different schemas create separate pools
- Challenge pools use correct sizing (max 3, min idle 0, 2min timeout)
- Max connections are enforced at pool limit
- Shutdown clears all challenge pools
- Concurrent challenge connections succeed and share one pool

Also adds getChallengePoolCount() and getChallengePool() accessors to
ConnectionPool for test observability.
Configures HikariCP's leakDetectionThreshold to log a warning when a
connection is held for longer than 60 seconds without being closed.
Helps identify connection leaks that could lead to pool exhaustion.
Documents the MariaDB init failure caused by Docker caching an image
before the DELIMITER conversion script runs. Includes the fix
(--no-cache rebuild + volume cleanup) in both AGENTS.md and
docs/database-configuration.md.
Documents the setup.jsp flow required after docker compose up,
including how to get the auth token and the correct database hostname
(container name, not localhost).
The setup servlet uses 'dbauth' (not 'authToken') and requires MongoDB
params (mhost, mport) even when not using mongo challenges. Documents
the correct curl command and all parameter names for both agents and
human contributors.
Python load test that boots the stack, configures the platform,
registers 20 users, and simulates 17 normal users browsing alongside
3 aggressive users doing rapid automated scanning. Monitors DB
connections and app responsiveness, reports pass/fail based on
connection bounds and response times.
…#817)

* fix: return core DB connections to pool from Getter methods

- Use try-with-resources for all Getter paths that used manual closeConnection
- Keep legacy ResultSet APIs (getClassInfo all classes, getPlayersByClass, getAdmins) unchanged
- Add ConnectionPool.getCoreActiveConnections for tests
- Add GetterCorePoolLeakIT to assert bounded pool usage under repeated authUser

* style: format Getter.java with Google Java Format

* fix: cache isInstalled(), tune pool for stability under load

Setup.isInstalled() was called on every HTTP request via SetupFilter,
each time reading database.properties from disk and borrowing a core
pool connection just to check non-null. Under concurrent load this
exhausted the pool and cascaded into a full app lockup.

Cache with volatile Boolean + double-checked locking so the check
runs once, then returns constant-time on all subsequent requests.
resetInstalledCache() called after setup completes so the first
post-setup request re-evaluates. Warm the cache at startup from
DatabaseLifecycleListener.contextInitialized().

Pool tuning:
- maxPoolSize 10 → 20 (supports realistic classroom concurrency)
- connectionTimeout 30s → 5s (fail fast under overload instead of
  blocking Tomcat threads for 30s each)
- minIdle 2 → 5 (reduce cold-start latency)

Known limitation: authUser holds a DB connection during Argon2
password verification (~100-200ms). This limits throughput under
high concurrency. Follow-up will release connection before hashing.

Load test updated with --target and --concurrency flags for targeted
per-class/per-method testing instead of broad soak only.

* fix: only cache isInstalled()=true, not transient false

If the DB is temporarily unreachable during the first isInstalled()
call, caching false permanently locks the app into "not installed"
state for the JVM lifetime. Only cache the true (terminal) state;
leave installedCached=null on failure so subsequent requests retry.
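
That "only cache the terminal true state" idea can be sketched as below. `isInstalledInDb()` is a stand-in for the real probe that borrows a pool connection; the class is illustrative, not the actual `Setup` code.

```java
// Sketch: volatile Boolean + double-checked locking, caching only the
// terminal true state so a transient DB failure does not wedge the app.
public final class InstalledCheckSketch {
  private static volatile Boolean installedCached;
  static int probeCalls = 0;

  public static boolean isInstalled() {
    Boolean cached = installedCached;
    if (cached != null) {
      return cached; // constant-time fast path once installed
    }
    synchronized (InstalledCheckSketch.class) {
      if (installedCached != null) {
        return installedCached;
      }
      boolean result;
      try {
        result = isInstalledInDb();
      } catch (RuntimeException dbUnavailable) {
        return false; // transient failure: do NOT cache, so later requests retry
      }
      if (result) {
        installedCached = Boolean.TRUE; // terminal state: safe to cache forever
      }
      return result;
    }
  }

  // Called after setup completes so the next request re-evaluates.
  public static void resetInstalledCache() {
    installedCached = null;
  }

  // Stand-in for borrowing a pool connection and checking the install row.
  static boolean isInstalledInDb() {
    probeCalls++;
    return true;
  }
}
```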

* chore: update pool docs, harden SetupIT, remove stale TODO

Address Copilot review feedback on PR #817:

- Update database.properties.example and docs/database-configuration.md
  to reflect new pool defaults (maxPoolSize=20, minIdle=5,
  connectionTimeout=5000)
- Add assumeTrue guard in SetupIT cache tests so they skip instead of
  false-passing when the database is unavailable
- Remove stale TODO in Getter.authUser (try-with-resources answers it)

Development

Successfully merging this pull request may close these issues.

[BUG] Automated attack tools starve all db connections