feat: add connection pooling to fix db connection exhaustion (#536) #816
Draft
ismisepaul wants to merge 19 commits into dev
Replace per-request DriverManager connections with HikariCP connection pooling for MySQL/MariaDB and singleton MongoClient for MongoDB. This prevents automated tools (e.g. ZAP spider/fuzzer) from exhausting all database connections and causing multi-hour outages.

- Add ConnectionPool using HikariCP with configurable pool size, timeouts
- Refactor Database.java to delegate all connections through the pool
- Refactor MongoDatabase.java to use singleton pattern with internal pooling
- Add DatabaseLifecycleListener for clean startup/shutdown of pools
- Add connection pool tests and MongoDB singleton tests
- Add database configuration, testing, and development workflow docs
- Add example properties files for MySQL and MongoDB configuration

Made-with: Cursor
- Add missing imports (Arrays, FileInputStream) in MongoDatabase.java lost during rebase conflict resolution
- Wrap ConnectionPool init failure as SQLException to maintain API compatibility with callers that catch SQLException
- Convert ConnectionPoolTest and DatabaseLifecycleListenerTest from JUnit 4 to JUnit 5 so surefire discovers and runs them
- Add mongo_challenge_test.properties test resource for MongoDatabaseTest
- Apply Google Java Format to all modified files

Made-with: Cursor
The pinned v3.6.0 action uses Node 12 (deprecated, scheduled for removal Sep 2026) and may download an incompatible google-java-format version for the current ubuntu-latest JDK. Update to v4 (pinned to c1134eb) which uses Node 20 and auto-downloads the latest compatible GJF release.

Also update checkout to v4.

Made-with: Cursor
- Change allowMultiQueries=yes to allowMultiQueries=true in Database.java (HikariCP requires strict boolean values, not yes/no)
- Catch RuntimeException in ConnectionPoolIT.testChallengeConnection (HikariCP throws PoolInitializationException which extends RuntimeException)

Made-with: Cursor
ConnectionPool: read DriverType from properties with URL-based fallback for mariadb/mysql drivers. Fixes Tomcat classloader issue where JDBC 4 auto-registration doesn't work for webapp-scoped drivers.

Setup: fix XOR logic for db host/port validation. The original condition rejected valid input when both fields were provided. Extract validation into testable validateHostPort() method. Add SetupTest with 7 tests covering all host/port combinations.

Made-with: Cursor
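As a rough illustration of the host/port fix, the sketch below shows one way such a check could be written. The class name `HostPortCheck`, the method signature, and the rule itself (host and port are supplied together or both left blank, and the port must fall in 1 to 65535) are assumptions made for illustration; the actual `validateHostPort()` in Setup may apply different rules.

```java
/** Hypothetical stand-in for the validation extracted from Setup; not the project's code. */
final class HostPortCheck {

  private HostPortCheck() {}

  /**
   * Assumed rule: host and port are either both blank or both present with a port in 1..65535.
   * Supplying only one of the two is rejected.
   */
  static boolean validateHostPort(String host, String port) {
    boolean hasHost = host != null && !host.trim().isEmpty();
    boolean hasPort = port != null && !port.trim().isEmpty();
    if (hasHost != hasPort) {
      return false; // exactly one field supplied (the XOR case)
    }
    if (!hasHost) {
      return true; // neither supplied
    }
    try {
      int parsed = Integer.parseInt(port.trim());
      return parsed >= 1 && parsed <= 65535;
    } catch (NumberFormatException e) {
      return false; // port is not a number
    }
  }
}
```

Writing the XOR as `hasHost != hasPort` keeps the "both provided" case on the accepting path, which is the case the commit message says the original condition rejected.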
Challenge pools use per-schema credentials by design (isolation between challenges), but don't need the same sizing as the core pool. Set challenge pools to max 3 / min idle 0 / 2min idle timeout so they hold zero connections when unused.
Replace the silent-pass testChallengeConnection with tests that verify:

- Challenge connections are valid and usable
- Same credentials reuse the same pool
- Different schemas create separate pools
- Challenge pools use correct sizing (max 3, min idle 0, 2min timeout)
- Max connections are enforced at pool limit
- Shutdown clears all challenge pools
- Concurrent challenge connections succeed and share one pool

Also adds getChallengePoolCount() and getChallengePool() accessors to ConnectionPool for test observability.
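The reuse behaviour those tests check (same credentials share a pool, different schemas get separate pools, shutdown clears everything) can be sketched with a concurrent map. Everything named here is hypothetical: `ChallengePoolRegistry`, `PoolHandle`, and the `user@url` key format are illustration only, and a plain object stands in for the HikariCP data source that the real ConnectionPool would hold.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry; a plain object stands in for a HikariCP data source. */
final class ChallengePoolRegistry {

  /** Stand-in for a pool; carries only the sizing described in the commit messages. */
  static final class PoolHandle {
    final String key;
    final int maxPoolSize = 3;
    final int minIdle = 0;
    final long idleTimeoutMs = 120_000L; // 2 minutes

    PoolHandle(String key) {
      this.key = key;
    }
  }

  private final Map<String, PoolHandle> pools = new ConcurrentHashMap<>();

  /** Returns the pool for this URL and user, creating it at most once even under concurrency. */
  PoolHandle poolFor(String jdbcUrl, String username) {
    String key = username + "@" + jdbcUrl; // assumed key format
    return pools.computeIfAbsent(key, PoolHandle::new);
  }

  int poolCount() {
    return pools.size();
  }

  /** Drops every pool; a real implementation would also close each data source. */
  void shutdown() {
    pools.clear();
  }
}
```

`computeIfAbsent` on a `ConcurrentHashMap` runs the factory at most once per key, which is what lets concurrent callers with the same credentials end up sharing one pool.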
Configures HikariCP's leakDetectionThreshold to log a warning when a connection is held for longer than 60 seconds without being closed. Helps identify connection leaks that could lead to pool exhaustion.
Documents the MariaDB init failure caused by Docker caching an image before the DELIMITER conversion script runs. Includes the fix (--no-cache rebuild + volume cleanup) in both AGENTS.md and docs/database-configuration.md.
Documents the setup.jsp flow required after docker compose up, including how to get the auth token and the correct database hostname (container name, not localhost).
The setup servlet uses 'dbauth' (not 'authToken') and requires MongoDB params (mhost, mport) even when not using mongo challenges. Documents the correct curl command and all parameter names for both agents and human contributors.
Python load test that boots the stack, configures the platform, registers 20 users, and simulates 17 normal users browsing alongside 3 aggressive users doing rapid automated scanning. Monitors DB connections and app responsiveness, reports pass/fail based on connection bounds and response times.
…#817)

* fix: return core DB connections to pool from Getter methods

- Use try-with-resources for all Getter paths that used manual closeConnection
- Keep legacy ResultSet APIs (getClassInfo all classes, getPlayersByClass, getAdmins) unchanged
- Add ConnectionPool.getCoreActiveConnections for tests
- Add GetterCorePoolLeakIT to assert bounded pool usage under repeated authUser

* style: format Getter.java with Google Java Format

* fix: cache isInstalled(), tune pool for stability under load

Setup.isInstalled() was called on every HTTP request via SetupFilter, each time reading database.properties from disk and borrowing a core pool connection just to check non-null. Under concurrent load this exhausted the pool and cascaded into a full app lockup.

Cache with volatile Boolean + double-checked locking so the check runs once, then returns constant-time on all subsequent requests. resetInstalledCache() called after setup completes so the first post-setup request re-evaluates. Warm the cache at startup from DatabaseLifecycleListener.contextInitialized().

Pool tuning:
- maxPoolSize 10 → 20 (supports realistic classroom concurrency)
- connectionTimeout 30s → 5s (fail fast under overload instead of blocking Tomcat threads for 30s each)
- minIdle 2 → 5 (reduce cold-start latency)

Known limitation: authUser holds a DB connection during Argon2 password verification (~100-200ms). This limits throughput under high concurrency. Follow-up will release connection before hashing.

Load test updated with --target and --concurrency flags for targeted per-class/per-method testing instead of broad soak only.

* fix: only cache isInstalled()=true, not transient false

If the DB is temporarily unreachable during the first isInstalled() call, caching false permanently locks the app into "not installed" state for the JVM lifetime. Only cache the true (terminal) state; leave installedCached=null on failure so subsequent requests retry.

* chore: update pool docs, harden SetupIT, remove stale TODO

Address Copilot review feedback on PR #817:
- Update database.properties.example and docs/database-configuration.md to reflect new pool defaults (maxPoolSize=20, minIdle=5, connectionTimeout=5000)
- Add assumeTrue guard in SetupIT cache tests so they skip instead of false-passing when the database is unavailable
- Remove stale TODO in Getter.authUser (try-with-resources answers it)
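The isInstalled() caching described in these commits (volatile Boolean, double-checked locking, only the true state cached) might look roughly like the sketch below. `InstalledCache` and the `BooleanSupplier` standing in for the properties-plus-database check are hypothetical; only the `installedCached` field name comes from the commit message, and the real logic lives in Setup.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of the caching idea; not the project's Setup class. */
final class InstalledCache {

  // null means "not yet known"; only TRUE is ever stored.
  private volatile Boolean installedCached;
  private final BooleanSupplier check; // stand-in for the properties + DB check

  InstalledCache(BooleanSupplier check) {
    this.check = check;
  }

  boolean isInstalled() {
    Boolean cached = installedCached;
    if (cached != null) {
      return cached; // fast path: no lock, no DB access
    }
    synchronized (this) {
      if (installedCached != null) {
        return installedCached; // another thread finished the check first
      }
      boolean result = check.getAsBoolean();
      if (result) {
        installedCached = Boolean.TRUE; // terminal state, safe to cache
      }
      return result; // false is not cached, so the next call retries
    }
  }

  /** Clears the cache so the next call re-runs the check. */
  void reset() {
    installedCached = null;
  }
}
```

Because a false result is never stored, a database that is briefly unreachable at startup costs one extra check per request until it recovers, rather than pinning the app in the "not installed" state.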
Summary
Fixes #536 — automated attack tools (e.g. ZAP spider/fuzzer) exhaust all MySQL connections, causing multi-hour outages.
Changes
- ConnectionPool.java
- Database.java
- MongoDatabase.java
- DatabaseLifecycleListener.java
- web.xml
- pom.xml
- docs/database-configuration.md
- docs/development-workflow.md
- docs/testing.md
- CONTRIBUTING.md
- *.properties.example
- *Test.java
Default Pool Configuration
- pool.maximumPoolSize
- pool.minimumIdle
- pool.connectionTimeout
- pool.idleTimeout
- pool.maxLifetime
All values are configurable via database.properties.
Test plan
- … (mvn test with Docker db containers running)
- … database.properties are respected
Discussion (from #800 comments)
The following was posted on PR #800 so reviewers do not need to open that draft for context.
Challenge pool sizing
After reviewing the challenge connection architecture:
Per-challenge pools are the right design: Each challenge uses different DB credentials scoped to its own schema (26 challenge properties files, each with unique username/password). This is intentional: it prevents a SQL injection in one challenge from accessing another challenge's data. A single shared pool with setCatalog() would require a single DB user with access to all schemas, defeating that isolation.
Challenge pools should be smaller than core: The connection exhaustion problem (#536) is caused by fuzzers/ZAP hammering core app endpoints (auth, scoreboard, module lookups via Getter/Setter), not challenge endpoints. Challenges are low-traffic by nature. Keeping pooling for challenges still helps (avoids TCP handshake + auth per request), but with minIdle=0 there is zero cost when a challenge is not in use, and connections spin up on demand when a student starts working.
Defaults in ConnectionPool.java for challenge vs core:
- maxPoolSize
- minIdle
- idleTimeout
With 26 challenges, worst case is 26 × 3 = 78 connections if every challenge is under active use simultaneously; in practice with minIdle=0 and a 2-minute idle timeout, idle challenge pools hold zero connections.
Connection leaks and follow-up PRs
Pooling exposes pre-existing connection leaks in Getter.java (~32 methods) and Setter.java (~24 methods). Without try-with-resources, connections borrowed from the pool may not be returned, so the core pool (max 10) can exhaust within minutes under normal use.
Required follow-up: convert Database.get*Connection() usage to try-with-resources and remove manual Database.closeConnection(); add tests alongside fixes. Priority: Getter.java (every request) > Setter.java (admin / user ops) > challenge servlets (isolated pools, lower risk).
Status: Getter.java fixes are in #817 (stacked on this PR). Setter.java and servlets are tracked separately (e.g. #815).
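To make the leak mechanism concrete, here is a minimal JDK-only sketch contrasting a manual close with try-with-resources. `TinyPool` and `Lease` are hypothetical stand-ins (a Semaphore plays the role of the pool's connection limit); the real code borrows a `java.sql.Connection` from the HikariCP-backed pool, where `close()` returns the connection in the same way.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical bounded pool; a Semaphore stands in for the pool's connection limit. */
final class TinyPool {

  private final Semaphore permits;
  private final int max;

  TinyPool(int max) {
    this.max = max;
    this.permits = new Semaphore(max);
  }

  /** A borrowed slot; close() returns it, like close() on a pooled connection. */
  final class Lease implements AutoCloseable {
    private boolean closed;

    @Override
    public void close() {
      if (!closed) {
        closed = true;
        permits.release();
      }
    }
  }

  /** Borrows a slot, or fails immediately if the pool is exhausted. */
  Lease borrow() {
    if (!permits.tryAcquire()) {
      throw new IllegalStateException("pool exhausted");
    }
    return new Lease();
  }

  int active() {
    return max - permits.availablePermits();
  }

  /** Leaky style: if work throws, the manual close below never runs. */
  static void leaky(TinyPool pool, Runnable work) {
    Lease lease = pool.borrow();
    work.run();
    lease.close();
  }

  /** Safe style: close() runs whether or not work throws. */
  static void safe(TinyPool pool, Runnable work) {
    try (Lease lease = pool.borrow()) {
      work.run();
    }
  }
}
```

With a pool of size 2, two failing calls through `leaky` strand both slots and every later `borrow()` fails, while any number of failing calls through `safe` leave `active()` at zero.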