Skip to content

Perf issue 1641 select for update#1990

Merged
crivetimihai merged 34 commits intomainfrom
perf_issue_1641_select_for_update
Jan 13, 2026
Merged

Perf issue 1641 select for update#1990
crivetimihai merged 34 commits intomainfrom
perf_issue_1641_select_for_update

Conversation

@rakdutta
Copy link
Copy Markdown
Collaborator

@rakdutta rakdutta commented Jan 9, 2026

closes issue ##1641
This PR addresses performance and data consistency issues under high concurrency by introducing and testing row-level locking using get_for_update across critical service methods. The changes ensure that updates, toggles, and conflict checks on key resources (tools, servers, resources, prompts, gateways, A2A agents) are safe from race conditions, especially when running on PostgreSQL

Key Changes

  • Implements and tests the use of get_for_update for row-level locking in all major service classes.

  • Adds a comprehensive new test suite: test_row_level_locking.pyto verify correct locking behavior and prevent race conditions.

  • Refactors and improves mocking in existing service tests to accurately simulate locking and database behavior.

  • Updates tool, server, and agent service tests to ensure atomic operations and correct handling of concurrent modifications.

  • Enhances test coverage for edge cases, including conflict detection and concurrent toggles.


Run the Concurrency Row Locking Integration Test

Option 1: Using Docker (Recommended)

# Start a PostgreSQL container
docker run -d \
  --name postgres-test \
  -e POSTGRES_PASSWORD=test \
  -e POSTGRES_DB=test \
  -p 5432:5432 \
  postgres:15

Option 2: Using Local PostgreSQL

Ensure PostgreSQL is running locally, then create a test database and user:

CREATE DATABASE test;
CREATE USER postgres WITH PASSWORD 'test';
GRANT ALL PRIVILEGES ON DATABASE test TO postgres;

Run the Test

# Set the database URL
export DATABASE_URL=postgresql+psycopg://postgres:mysecretpassword@localhost:5432/test

# Execute the integration test
pytest tests/integration/test_concurrency_row_locking.py -v --with-integration

@rakdutta rakdutta force-pushed the perf_issue_1641_select_for_update branch from 3932617 to 4006cae Compare January 12, 2026 13:30
@rakdutta rakdutta marked this pull request as ready for review January 13, 2026 14:38
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
rakdutta and others added 7 commits January 13, 2026 18:50
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Cleanup leftover comments from get_for_update() refactor.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
- Change get_for_update default to skip_locked=False to properly wait
  for locked rows during conflict checks (prevents race conditions)
- Remove FOR UPDATE from get_server (read-only operation)
- Remove FOR UPDATE from toggle_gateway_status (holds lock during
  network I/O which causes timeouts/deadlocks)
- Fix unused imports and trailing whitespace in test files
- Fix no-op test bodies with proper pytest.skip decorators
- Update gateway toggle test to verify regular select is used

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Add is_postgresql() helper that checks both DB env var and DATABASE_URL,
so tests run correctly when using either configuration method per PR docs.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
@crivetimihai crivetimihai force-pushed the perf_issue_1641_select_for_update branch from 56ddb78 to dd25b90 Compare January 13, 2026 20:52
DELETE...RETURNING is not supported on MySQL/MariaDB. Use rowcount
check instead which works across all supported databases.

Also fixes trailing whitespace in test_tool_service.py and updates
test mock to use side_effect for different rowcount values.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Remove unused imports and trailing whitespace from test files
modified in this PR.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
@crivetimihai
Copy link
Copy Markdown
Member

Code Review Fixes Applied

After reviewing this PR, I've applied several fixes to address issues found during review:

Critical Fixes

Issue Fix
skip_locked=True default could miss conflicts Changed default to skip_locked=False - now waits for locked rows instead of skipping them
get_server using FOR UPDATE for read-only operation Reverted to regular db.execute(select(...)) - no locking needed for reads
toggle_gateway_status holding lock during network I/O Reverted to regular select - holding DB locks during _initialize_gateway network calls would cause timeouts/deadlocks
DELETE...RETURNING not supported on MySQL/MariaDB Changed to rowcount check which works across all databases

Test Improvements

Issue Fix
Test gating only checked DB=postgres Added is_postgresql() helper that checks both DB env var and DATABASE_URL
Unused imports (F401) in test files Removed with autoflake
Trailing whitespace (W293) in test files Fixed with black
No-op test bodies Added @pytest.mark.skip decorators with reasons
Test mock for delete operations Updated to use side_effect for different rowcount values

Commits Added

6d513f9c chore: Fix lint issues in PR test files
ebf03e89 fix: Use rowcount instead of RETURNING for MySQL compatibility
dd25b908 fix: Improve PostgreSQL detection in concurrency tests
ce55fa07 fix: Address row-level locking issues in PR #1990
7a0f9485 chore: Remove commented out db.get() calls

Verification

  • make flake8 passes
  • make ruff passes
  • ✅ All 1611 unit tests pass
  • ✅ No F401/W293 lint errors in PR files

Design Note

toggle_gateway_status intentionally does NOT use row locking because it calls _initialize_gateway which performs network I/O. Holding a database lock during potentially slow network operations would block other database operations and risk timeouts. The code includes a comment explaining this tradeoff.

@crivetimihai crivetimihai merged commit dae4290 into main Jan 13, 2026
52 checks passed
@crivetimihai crivetimihai deleted the perf_issue_1641_select_for_update branch January 13, 2026 21:52
kcostell06 pushed a commit to kcostell06/mcp-context-forge that referenced this pull request Feb 24, 2026
* tool,serve

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* resource

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* prompt

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* name conflict

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* conflict

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* gateway

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* a2a

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* tools

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* server

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* test

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* text fixed

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* ruff

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* test file

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* pylint

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* entity_id fix

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* gateway conflict

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* test

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* delete

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* test cases fixed

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* test fixed

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* test cases fixed

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* black

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* server

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* test cases fix

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* flake8

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* doctest

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* concurrency prompt

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* concurrency

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* ruff

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>

* chore: Remove commented out db.get() calls

Cleanup leftover comments from get_for_update() refactor.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix: Address row-level locking issues in PR IBM#1990

- Change get_for_update default to skip_locked=False to properly wait
  for locked rows during conflict checks (prevents race conditions)
- Remove FOR UPDATE from get_server (read-only operation)
- Remove FOR UPDATE from toggle_gateway_status (holds lock during
  network I/O which causes timeouts/deadlocks)
- Fix unused imports and trailing whitespace in test files
- Fix no-op test bodies with proper pytest.skip decorators
- Update gateway toggle test to verify regular select is used

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix: Improve PostgreSQL detection in concurrency tests

Add is_postgresql() helper that checks both DB env var and DATABASE_URL,
so tests run correctly when using either configuration method per PR docs.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* fix: Use rowcount instead of RETURNING for MySQL compatibility

DELETE...RETURNING is not supported on MySQL/MariaDB. Use rowcount
check instead which works across all supported databases.

Also fixes trailing whitespace in test_tool_service.py and updates
test mock to use side_effect for different rowcount values.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

* chore: Fix lint issues in PR test files

Remove unused imports and trailing whitespace from test files
modified in this PR.

Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>

---------

Signed-off-by: rakdutta <rakhibiswas@yahoo.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants