Security audit: tenant isolation crossover report#217

Merged
RafaelPo merged 1 commit into main from audit-tenant-crossover
Feb 25, 2026

Conversation

@RafaelPo
Contributor

Summary

  • Comprehensive defensive tenant isolation audit of the MCP server (everyrow-mcp/src/everyrow_mcp/)
  • Covers all data access paths: MCP tool endpoints, REST API endpoints, Redis store, result cache, uploads, and OAuth auth
  • Audited files: tool_helpers.py, redis_store.py, result_store.py, routes.py, tools.py, app.py, auth.py, uploads.py, config.py, http_config.py, middleware.py

Findings: 0 Critical, 2 High, 4 Medium, 5 Low, 3 Info

High: REST endpoints (/api/progress/, /api/results/.../download) authenticate with poll tokens only — no user-level ownership check. Poll token leakage allows cross-tenant progress polling and full CSV data exfiltration.

Medium: Poll tokens in query params (log leakage), Redis keys not tenant-scoped, result cache has no intrinsic ownership check, upload TOCTOU window.

Positive: Fail-closed ownership model, per-request API clients, Fernet-encrypted tokens at rest, atomic auth code replay prevention, SSRF protection, HMAC-signed uploads.

Test plan

  • Review findings against codebase
  • Prioritise HIGH-1 and HIGH-2 for remediation
  • Validate recommended fixes don't break widget polling flow

🤖 Generated with Claude Code

@RafaelPo
Contributor Author

@claude code review

@github-actions
Contributor

github-actions bot commented Feb 25, 2026

Claude finished @RafaelPo's task in 1m 52s


Code Review: Tenant Isolation Security Audit

  • Read CLAUDE.md and repo guidelines
  • Read the audit report (TENANT_CROSSOVER_AUDIT.md)
  • Cross-reference findings against source code
  • Provide review feedback

Verification Summary

I cross-referenced every finding (HIGH through INFO) against the actual source code. All 14 findings are accurate — line numbers, code behavior, and severity assessments all check out. This is a thorough and well-structured audit.


Review Feedback

Findings are sound

HIGH-1 & HIGH-2 are the core issues and are correctly identified. The REST endpoints in routes.py (api_progress, api_download) authenticate solely via _validate_poll_token() — there is no call to _check_task_ownership() or any user identity verification. Verified at routes.py:90 and routes.py:143. This is a real gap since the MCP tool endpoints (tools.py:746, tools.py:843) do enforce ownership checks.

MEDIUM-1 through MEDIUM-4 all verified. The build_key() function at redis_store.py:43-46 confirms flat mcp:{type}:{task_id} keys with no tenant prefix. The try_cached_result() function at result_store.py:165 confirms no intrinsic ownership check. The peek-then-pop window in uploads.py:205-285 is real.
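
To make the MEDIUM finding on Redis keys concrete, here is a minimal sketch contrasting the flat key layout described above with one possible tenant-scoped layout. The two-argument `build_key` signature is inferred from how it is called elsewhere in this PR; `build_tenant_key` and its validation rule are hypothetical and are not part of the audited code.

```python
def build_key(key_type: str, task_id: str) -> str:
    """Flat layout described in the review: mcp:{type}:{task_id}."""
    return f"mcp:{key_type}:{task_id}"


def build_tenant_key(tenant_id: str, key_type: str, task_id: str) -> str:
    """Hypothetical tenant-scoped layout: mcp:{tenant}:{type}:{task_id}.

    Rejecting empty or colon-containing tenant IDs is an assumption made
    here so that one tenant cannot forge a prefix that collides with another.
    """
    if not tenant_id or ":" in tenant_id:
        raise ValueError("tenant_id must be non-empty and must not contain ':'")
    return f"mcp:{tenant_id}:{key_type}:{task_id}"
```

With a tenant prefix, a lookup that uses the wrong tenant ID misses the key entirely, rather than relying on a separate ownership check after the read.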

LOW and INFO findings all verified and appropriately scoped.

One additional observation

The audit notes that REST endpoints bypass the per-request _http_client_factory() pattern (INFO-2). Worth calling out more explicitly: routes.py:93 decrypts the stored API key from Redis and creates its own API client, which means the secondary tenant isolation defense (per-request token scoping) doesn't apply to the REST code path. This slightly strengthens the case for HIGH-1/HIGH-2 urgency.

Issues to address in this PR

1. Local file paths in findings: All file references use absolute local paths (/Users/rafaelpoyiadzi/Documents/git/everyrow-sdk/worktrees/audit-tenant-crossover/everyrow-mcp/...). These should use repo-relative paths (e.g., everyrow-mcp/src/everyrow_mcp/routes.py:76-126) for portability and readability.

2. Minor formatting: The architecture diagram and summary table are well-structured. No issues there.

Overall assessment

Approve with minor changes. The audit is technically accurate, well-prioritized, and actionable. The HIGH findings represent a real gap in the REST API layer that should be addressed. The recommended fixes are practical and appropriately scoped. Just clean up the local file paths before merging.

@RafaelPo RafaelPo force-pushed the audit-tenant-crossover branch from e6b6db3 to 90ddd39 Compare February 25, 2026 11:22
@RafaelPo
Contributor Author

@claude code review

@github-actions
Contributor

github-actions bot commented Feb 25, 2026

Claude finished @RafaelPo's task in 1m 36s


Code Review: Tenant Isolation Crossover

  • Read CLAUDE.md and repo guidelines
  • Read PR diff and changed files
  • Cross-reference ownership checks against source code
  • Review test coverage
  • Provide review feedback

Summary

This PR adds a _validate_task_owner() check to both REST endpoints (api_progress and api_download), plus corresponding tests. The intent is to close the gap identified in the tenant isolation audit (HIGH-1/HIGH-2) where REST endpoints only validated poll tokens without any ownership check.


Key Finding: Ownership check is incomplete

The Sentry review is correct. _validate_task_owner() verifies a task has an owner, but never checks who is calling.

Compare the two implementations:

MCP tool endpoints (tools.py:69-98) — full identity check:

owner = await redis_store.get_task_owner(task_id)
# ...checks owner exists...
access_token = get_access_token()
user_id = access_token.client_id if access_token else None
if not user_id or user_id != owner:  # ← compares caller vs owner
    return [TextContent(..., text="Access denied: this task belongs to another user.")]

REST endpoints (routes.py:76-97) — existence check only:

owner = await redis_store.get_task_owner(task_id)
if not owner:  # ← only checks existence, not identity
    return JSONResponse({"error": "Task ownership could not be verified"}, ...)
logger.info("REST access granted for task %s (owner=%s)", task_id, owner)
return None  # ← always grants access if any owner exists

The docstring at routes.py:80-83 claims this is "the REST-layer equivalent of _check_task_ownership() in tools.py" — but it's materially weaker. Any valid poll token still grants access regardless of who the owner is.

This is still useful as defense-in-depth (fail-closed for tasks without an owner record), but it does not close the cross-tenant gap from HIGH-1/HIGH-2. If an attacker leaks a poll token, this check doesn't stop them.

What should happen next

The fundamental constraint is that REST endpoints have no caller identity — they authenticate via poll token only, with no OAuth context. Possible approaches for a follow-up:

  1. Bind poll tokens to user identity at creation — encode or associate the user ID in the poll token, then verify on the REST side
  2. Require an additional authentication factor on REST endpoints (e.g., a short-lived HMAC-signed cookie set during the OAuth flow)
  3. Accept poll-token-as-identity as the security model — if poll tokens have sufficient entropy (128 bits via secrets.token_urlsafe(16)) and are encrypted at rest, the token is the access control, and the ownership check only needs to be fail-closed for unregistered tasks

Option 3 is arguably what this PR implements, and it's a reasonable position — but the docstring should be honest about it rather than claiming equivalence with the MCP-layer check.
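
As an illustration of option 1, the sketch below binds a poll token to a user identity by signing the task ID, user ID, and a random nonce with HMAC. It is one possible design under stated assumptions, not what the codebase does: `SECRET_KEY`, `issue_poll_token`, and `verify_poll_token` are hypothetical names, and a real deployment would load the key from configuration rather than generate it at import time.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side signing key; generated here only for the sketch.
SECRET_KEY = secrets.token_bytes(32)


def _sign(task_id: str, user_id: str, nonce: str) -> str:
    """HMAC-SHA256 over the fields the token is bound to."""
    message = f"{task_id}|{user_id}|{nonce}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_poll_token(task_id: str, user_id: str) -> str:
    """Create a token of the form '<nonce>.<mac>' bound to task and user."""
    nonce = secrets.token_urlsafe(16)  # 128 bits of entropy, never contains '.'
    return f"{nonce}.{_sign(task_id, user_id, nonce)}"


def verify_poll_token(token: str, task_id: str, expected_owner: str) -> bool:
    """Return True only if the token was issued for this task and this owner."""
    nonce, sep, mac = token.partition(".")
    if not sep:
        return False  # malformed token: fail closed
    expected = _sign(task_id, expected_owner, nonce)
    # Constant-time comparison on bytes avoids timing side channels.
    return hmac.compare_digest(mac.encode(), expected.encode())
```

On the REST side, `expected_owner` would be the stored task owner, so a token issued to one user fails verification against a task owned by another.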

Specific feedback

1. Misleading docstring (routes.py:80-83)

The docstring should not claim equivalence with _check_task_ownership(). Suggested rewording:

"Defence-in-depth: ensures the task was properly created with an ownership record. Unlike _check_task_ownership() in tools.py, this cannot verify caller identity because REST endpoints authenticate via poll token only (no OAuth context). The poll token itself serves as the primary access control for the REST layer."


2. Owner logged at INFO level (routes.py:96)

logger.info("REST access granted for task %s (owner=%s)", task_id, owner)

Logging the owner ID at INFO level on every successful poll is noisy (widgets poll every few seconds) and leaks user identifiers into application logs. Consider DEBUG level, or removing the owner value from the message.
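
A minimal sketch of the suggested change, assuming a module-level logger: the message moves to DEBUG and omits the owner value. The logger name and the helper `log_access_granted` are hypothetical and exist only to make the example self-contained.

```python
import logging

logger = logging.getLogger("rest_access_example")  # hypothetical logger name


def log_access_granted(task_id: str) -> None:
    """Record a successful REST access at DEBUG, without the owner identifier."""
    logger.debug("REST access granted for task %s", task_id)
```

With the logger at its usual INFO level, frequent widget polls produce no log records, and the owner identifier never reaches the log stream.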


3. Tests are well-structured

The new test_denied_without_owner tests in all three test files (test_routes.py, test_http_integration.py, test_result_store.py) correctly verify the fail-closed behavior, and existing tests are properly updated to set up owner records. Good coverage.

4. Missing negative test: valid poll token but wrong owner

There's no test that stores a different owner and verifies that access is still granted (which it would be, since the check doesn't compare identities). Such a test would document the current behavior explicitly. If the intent is to close the gap later, marking the test with a # TODO: should return 403 once identity-bound comment would be helpful.
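
A sketch of what such a test could look like, using a toy in-memory stand-in for the existence-only check. `validate_task_owner_existence_only` and the dict-based owner store are hypothetical simplifications; the real check is async and reads from Redis.

```python
from typing import Dict, Optional


def validate_task_owner_existence_only(
    owners: Dict[str, str], task_id: str
) -> Optional[dict]:
    """Toy stand-in for the existence-only REST check.

    Returns None when access is granted, or an error payload otherwise.
    """
    owner = owners.get(task_id)
    if not owner:
        return {"error": "Task ownership could not be verified"}
    return None


def test_granted_despite_different_owner() -> None:
    # The task belongs to "other-user"; the caller only holds a valid poll token.
    owners = {"task-1": "other-user"}
    # TODO: should return 403 once identity-bound.
    assert validate_task_owner_existence_only(owners, "task-1") is None


def test_denied_without_owner() -> None:
    # No owner record at all: the check fails closed.
    assert validate_task_owner_existence_only({}, "task-1") is not None
```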

Verdict

Approve with minor changes. The code change is safe and adds a useful defense-in-depth layer. The two items to address before merge:

  1. Fix the docstring to accurately describe what the check does (existence, not identity)
  2. Lower the log level from INFO to DEBUG

The broader cross-tenant gap (no caller identity on REST endpoints) is an architectural limitation that should be tracked separately.



@RafaelPo
Contributor Author

@claude code review

@github-actions
Contributor

github-actions bot commented Feb 25, 2026

Claude finished @RafaelPo's task in 1m 27s


Code Review: Tenant Isolation Crossover

  • Read CLAUDE.md and repo guidelines
  • Read PR diff and changed files
  • Cross-reference ownership checks against source code
  • Review test coverage
  • Provide review feedback

Summary

This PR closes the HIGH-1/HIGH-2 gap from the tenant isolation audit. The two commits form a clean progression:

  1. Commit 1: Adds _validate_task_owner() — fail-closed existence check on the REST layer
  2. Commit 2: Binds poll tokens to user identity at creation, enabling a poll_owner == task_owner cross-check

The REST security model is now: poll token as bearer credential + ownership cross-check as tamper detection. This is architecturally sound given that REST endpoints have no OAuth context.


Verification

Compared the MCP-layer check (tools.py:69-98) against the REST-layer check (routes.py:76-113):

| Check | MCP layer | REST layer |
| --- | --- | --- |
| Owner exists (fail-closed) | tools.py:80 | routes.py:87 |
| Caller identity vs owner | tools.py:91 (via get_access_token()) | N/A (no OAuth context) |
| Poll-token-to-owner binding | N/A | routes.py:98-109 (cross-check) |

The cross-check at routes.py:98-109 detects a scenario where task_owner is tampered with after poll-token issuance — the stored poll_owner won't match. This is a meaningful defense-in-depth layer beyond the previous review's assessment.

The binding in tool_helpers.py:99-111 correctly extracts user_id from the OAuth access token and passes it to store_poll_token(), ensuring the poll token carries the user's identity into the REST layer.
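
The decision logic of the REST-layer check can be sketched as a pure function over the two stored values. This is a simplified illustration, not the code in routes.py: the function name and the second and third error strings are hypothetical, and the real check is async and returns a JSON response.

```python
import hmac
from typing import Optional


def check_rest_ownership(
    task_owner: Optional[str], poll_owner: Optional[str]
) -> Optional[str]:
    """Return None when access may proceed, otherwise an error message.

    Every branch that cannot positively confirm ownership denies access
    (fail-closed).
    """
    if not task_owner:
        return "Task ownership could not be verified"
    if not poll_owner:
        return "Poll token is not bound to a user"  # hypothetical message
    # Constant-time comparison; the owners are compared as bytes.
    if not hmac.compare_digest(task_owner.encode(), poll_owner.encode()):
        return "Poll token owner does not match task owner"  # hypothetical message
    return None
```

Keeping the decision separate from the Redis lookups makes each denial path straightforward to test in isolation.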


Issues

1. INFO-level log on every successful poll (routes.py:112)

logger.info("REST access granted for task %s (owner=%s)", task_id, owner)

Widgets poll every few seconds. This will flood application logs at INFO level and leak user identifiers. Should be DEBUG.


2. poll_owner stored unencrypted (redis_store.py:262-266)

The poll token itself is encrypted via encrypt_value(), but the user_id (OAuth client_id) is stored as plaintext:

if user_id:
    await client.setex(
        name=build_key("poll_owner", task_id),
        time=TOKEN_TTL,
        value=user_id,  # ← plaintext
    )

task_owner (line 286) is also stored as plaintext, so this is at least consistent. Still, if the threat model treats user identifiers in Redis as sensitive, both values should be encrypted. This is low priority and could be addressed in a follow-up if needed.


Test coverage

Tests are thorough and well-structured:

  • test_denied_without_owner — verifies fail-closed (all 3 test files) ✓
  • test_denied_on_owner_mismatch — verifies the cross-check catches tampered ownership (test_routes.py:131-145) ✓
  • Existing tests properly updated with user_id="test-user" and store_task_owner() setup ✓
  • Negative path: no owner → 403; mismatched owner → 403 ✓
  • Happy path: matching owner → proceeds to next check ✓

Verdict

Approve with one minor fix. The ownership cross-check is well-designed and the docstring (routes.py:77-84) now accurately describes what the check does. Lower the log level on line 112 from info to debug before merging.



The REST endpoints /api/progress and /api/results/.../download
previously authenticated solely via poll tokens with no user-level
ownership verification. This adds a three-layer fail-closed check:

1. Task must have a recorded owner (task_owner)
2. Poll token must be bound to a user identity (poll_owner)
3. poll_owner must match task_owner (cross-check)

Changes:
- routes: add _validate_task_owner() called by api_progress/api_download
- redis_store: store_poll_token accepts user_id, new get_poll_token_owner
- tool_helpers: bind user identity when creating poll token
- tests: deny-without-owner, deny-without-poll-owner, deny-on-mismatch

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@RafaelPo RafaelPo force-pushed the audit-tenant-crossover branch from 375da25 to fee28fc Compare February 25, 2026 11:38
@RafaelPo RafaelPo merged commit 45717bf into main Feb 25, 2026
5 checks passed
@RafaelPo RafaelPo deleted the audit-tenant-crossover branch February 25, 2026 11:39
Comment on lines 100 to 105
if settings.is_http:
    access_token = get_access_token()
    if not access_token or not access_token.client_id:
        raise RuntimeError(
            f"Cannot record task owner for {task_id}: no authenticated user"
        )

Bug: In no-auth HTTP mode, task submission will always fail with a RuntimeError because _submission_ui_json unconditionally tries to get an access token which doesn't exist.
Severity: HIGH

Suggested Fix

Modify _submission_ui_json to handle the no-auth HTTP case. Either bypass the ownership check entirely when no access token is present in no-auth mode, or assign a default system/synthetic user as the task owner for these submissions. This will prevent the RuntimeError and allow tasks to be submitted successfully.
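
One way the second variant of this fix might look is sketched below as a standalone helper. Everything here is an assumption for illustration: `resolve_task_owner`, the `auth_enabled` flag, the `ANONYMOUS_OWNER` value, and the choice to record no owner outside HTTP mode are not taken from the codebase.

```python
from typing import Optional

ANONYMOUS_OWNER = "anonymous"  # hypothetical synthetic owner for no-auth mode


def resolve_task_owner(
    is_http: bool, auth_enabled: bool, client_id: Optional[str], task_id: str
) -> Optional[str]:
    """Decide which owner to record for a newly submitted task."""
    if not is_http:
        return None  # assumption: non-HTTP mode records no owner
    if client_id:
        return client_id  # authenticated caller owns the task
    if auth_enabled:
        # Auth is required but no identity is present: keep failing closed.
        raise RuntimeError(
            f"Cannot record task owner for {task_id}: no authenticated user"
        )
    # No-auth HTTP mode: fall back to a synthetic owner instead of raising.
    return ANONYMOUS_OWNER
```

Using a synthetic owner rather than skipping the ownership record keeps the REST-layer existence check satisfiable in no-auth mode, at the cost of all anonymous tasks sharing one owner.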

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: everyrow-mcp/src/everyrow_mcp/tool_helpers.py#L100-L105

Potential issue: In the `_submission_ui_json` function, when running in HTTP mode
(`settings.is_http` is true), the code attempts to retrieve an `access_token`. In the
supported no-auth HTTP mode, `get_access_token()` correctly returns `None`. This
triggers a conditional check that raises a `RuntimeError`, halting task submission. This
occurs because the implementation does not distinguish between authenticated and
unauthenticated HTTP modes, enforcing an ownership check that cannot be satisfied
without a user. As a result, any tool attempting to submit a task in no-auth mode will
fail, rendering this mode unusable.
