Feat/remote mcp server#165

Closed
RafaelPo wants to merge 24 commits into main from feat/remote-mcp-server

@RafaelPo
Contributor

@RafaelPo RafaelPo commented Feb 19, 2026

Here's how I'd break this into 4 reviewable PRs, ordered by dependency:

PR 1: Refactor server.py into modules (pure refactor, no new behavior)

  • Split monolithic server.py into app.py, tools.py, models.py, utils.py, templates.py, state.py, settings.py
  • Thin server.py entry point
  • Existing tests updated for new import paths
  • ~0 net new lines — just moves code around

PR 2: HTTP transport + OAuth + Redis + GCS (the core remote server)

  • auth.py — OAuth 2.1 provider (Supabase + JWKS)
  • http_config.py — HTTP mode setup, middleware
  • routes.py — REST endpoints (progress polling, results download)
  • redis_utils.py — Redis client + key helpers
  • gcs_results.py + gcs_storage.py — GCS result caching
  • state.py updates for Redis-backed token storage
  • deploy/ (Dockerfile, docker-compose, .env.example)
  • Tests: test_auth.py, test_gcs_storage.py, test_redis_utils.py
  • New deps in pyproject.toml + uv.lock

PR 3: Results widget redesign (self-contained UI)

  • templates.py — interactive data explorer, pagination, filtering, clipboard export
  • Widget-related changes in tools.py (structured content, CSV URL)
  • test_server.py widget tests

PR 4: input_url support (builds on PR 1's utils.py)

  • utils.py — validate_url, normalize_google_url, fetch_csv_from_url, load_input
  • models.py — input_url, left_url/right_url fields
  • tools.py — switch load_csv → load_input
  • test_utils.py — URL tests

Each PR is independently reviewable and testable. PR 1 is risk-free (refactor). PR 2 is the meatiest but
self-contained behind the --http flag. PRs 3 and 4 are small and focused.

RafaelPo and others added 3 commits February 19, 2026 14:14
- OAuth 2.1 with PKCE using Supabase JWT passthrough (no API key creation)
- Fernet encryption for tokens at rest in Redis
- Redis-backed OAuth state (codes, tokens, clients) with Sentinel support
- GCS result storage with signed download URLs
- MCP Apps UI widgets for session progress and paginated results
- Poll-token auth for progress endpoint, security headers on results
- GKE Helm chart, Dockerfile, and CI/CD deployment workflow
- input_json support and scalar result handling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Research fields returned as nested objects (e.g. {"research": {"answer": "hi"}})
were previously hidden in the widget table. Now flattened client-side to
dot-notation columns (research.answer) for display. Data returned to the LLM
is unchanged.

Also flip RESULT_STORAGE to gcs for testing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
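
The flattening described in this commit happens client-side in the widget. The same idea is sketched here in Python purely for illustration; `flatten_row` is a hypothetical name, and leaving empty dicts untouched is an assumption.

```python
from typing import Any


def flatten_row(row: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into dot-notation keys; other values pass through."""
    flat: dict[str, Any] = {}
    for key, value in row.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            # Recurse so {"research": {"answer": ...}} becomes "research.answer".
            flat.update(flatten_row(value, name))
        else:
            flat[name] = value
    return flat
```

Only the display copy would be flattened; the nested structure returned to the LLM stays as it is.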
…json to screen/rank

- Replace scattered os.environ.get() calls with pydantic-settings HttpSettings
  and StdioSettings classes for centralised config with type coercion and
  validation
- Extract HTML UI templates (progress, results, session) into templates.py
- Add input_json support to ScreenInput and RankInput (matching AgentInput)
- Minor redis_utils cleanup (extract health_check_interval constant)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
return hashlib.sha256(jwt.encode()).hexdigest()[:16]


class EveryRowAuthProvider(
Contributor

This is more complicated than I was expecting. Is there maybe something off-the-shelf that exists for this? I don't think we're doing anything especially bespoke, so I'm surprised we're having to roll so much of our own.

Contributor Author

CC keeps arguing that what we have is a good solution... of course, there is some bias. Do you think it's something we can revisit later?

"pytest>=9.0.2",
"pytest-asyncio>=1.3.0",
"basedpyright>=1.22.0",
"fakeredis>=2.21.0",
Contributor

I experimented with fakeredis a bit while testing square. It's good for some things, but has a fair few traps in there. If it's good enough for you now, then that's all good, but eventually it might be easier to spin up a small real Redis and wipe it between tests.

Split the ~1860-line server.py into 6 modules:
- models.py: input models and schema helpers
- state.py: ServerState dataclass with Redis-backed token storage
- routes.py: REST endpoints (progress polling, results download)
- http_config.py: HTTP mode setup (OAuth, routes, middleware)
- gcs_results.py: GCS result cache/upload and response building

The multi-pod bug where in-memory token dicts caused 403s on different
pods is fixed by state.py's async methods (store_task_token, get_task_token,
etc.) which write to both the local dict and Redis, with Redis fallback
on reads.

Also: remove 150-char cell truncation in results widget, add .env.example
for deploy config, fix Fernet key in test compose.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
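
The write-to-both, fall-back-on-read pattern described for state.py can be sketched as follows. This is an illustration under assumptions, not the PR's code: `InMemoryStore` stands in for Redis, and `TokenState`, the key format, and the default TTL are hypothetical.

```python
import asyncio
from typing import Optional


class InMemoryStore:
    """Stand-in for a shared store such as Redis (the TTL argument is ignored)."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    async def set(self, key: str, value: str, ex: Optional[int] = None) -> None:
        self._data[key] = value

    async def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


class TokenState:
    """Per-pod state: a local dict backed by a store shared across pods."""

    def __init__(self, shared: InMemoryStore, ttl_seconds: int = 24 * 60 * 60) -> None:
        self._local: dict[str, str] = {}
        self._shared = shared
        self._ttl = ttl_seconds

    async def store_task_token(self, task_id: str, token: str) -> None:
        # Write to both places so other pods can see the token.
        self._local[task_id] = token
        await self._shared.set(f"task_token:{task_id}", token, ex=self._ttl)

    async def get_task_token(self, task_id: str) -> Optional[str]:
        token = self._local.get(task_id)
        if token is None:
            # Local miss: the token may have been issued by another pod.
            token = await self._shared.get(f"task_token:{task_id}")
            if token is not None:
                self._local[task_id] = token
        return token
```

With two `TokenState` instances sharing one store, a token stored through the first is readable through the second, which is the multi-pod case that previously produced 403s.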
@RafaelPo
Contributor Author

@claude code review

- Replace _get_result_dataframe with focused _fetch_task_result that
  returns a DataFrame or raises TaskNotReady
- Move cache eviction to state.evict_stale_results()
- Fix token TTL bug: task/poll tokens now use 24h TTL (TOKEN_TTL)
  instead of 10min RESULT_CACHE_TTL, preventing 403s on long tasks
- Remove docker-compose.test.yaml (fake Supabase, not usable)
- Remove test scripts from tracking (kept locally)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@RafaelPo
Contributor Author

@claude code review

Completes the remaining ServerState fixes from the review:
- Use locked get/pop methods for result_cache in routes.py and server.py
- Replace state.transport != "stdio" with state.is_http in _build_inline_response

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
RafaelPo and others added 14 commits February 19, 2026 15:59
…tr access

- Make evict_stale_results async and acquire _lock before mutating
  result_cache, matching all other cache methods
- Add redis field directly to ServerState, removing the _redis property
  that reached into auth_provider._redis
- Set state.redis in configure_http_mode alongside auth_provider

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
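
A minimal sketch of the locking change in the first bullet, assuming an `asyncio.Lock` guards the cache. `ResultCache`, its `put`/`get` methods, and the `now` parameter (added so the example is deterministic) are hypothetical; only `evict_stale_results` and `_lock` are names from the commit.

```python
import asyncio
import time
from typing import Any, Optional


class ResultCache:
    """Cache whose reads, writes, and eviction all run under the same lock."""

    def __init__(self, ttl_seconds: float = 600.0) -> None:
        self._lock = asyncio.Lock()
        self._entries: dict[str, tuple[float, Any]] = {}
        self._ttl = ttl_seconds

    async def put(self, key: str, value: Any, now: Optional[float] = None) -> None:
        stored_at = time.monotonic() if now is None else now
        async with self._lock:
            self._entries[key] = (stored_at, value)

    async def get(self, key: str) -> Any:
        async with self._lock:
            entry = self._entries.get(key)
            return None if entry is None else entry[1]

    async def evict_stale_results(self, now: Optional[float] = None) -> int:
        """Drop entries older than the TTL; return how many were removed."""
        current = time.monotonic() if now is None else now
        async with self._lock:
            stale = [k for k, (ts, _) in self._entries.items() if current - ts > self._ttl]
            for key in stale:
                del self._entries[key]
        return len(stale)
```

Holding the lock during eviction keeps a concurrent `get` or `put` from observing the dict while entries are being deleted.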
…to pydantic-settings

- Fix all 36 basedpyright errors in auth.py (generic TypeVar for redis
  helpers, _client_id() narrowing helper, redis async stub ignores),
  http_config.py (lifespan typing, GCS bucket narrowing), and server.py
  (create_model type ignore)
- Fix GCS pagination bug: cached preview from first request was served
  for all offsets. Now uses per-page Redis caching with GCS fetch-on-miss
- Add download_json() to GCSResultStore for page cache misses
- Move PREVIEW_SIZE into pydantic-settings _BaseSettings shared by both
  HttpSettings and StdioSettings; remove redundant constant and property
- Initialize StdioSettings in test conftest so state.settings is never None

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
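
The pagination fix amounts to keying the cache by offset and page size rather than by task alone. A sketch under assumptions: a plain dict stands in for Redis, the injected `fetch_all_rows` callable stands in for the GCS download, and `PageCache` and the key format are hypothetical.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable


class PageCache:
    """Per-page cache with fetch-on-miss."""

    def __init__(self, fetch_all_rows: Callable[[str], Awaitable[list[Any]]]) -> None:
        self._cache: dict[str, str] = {}  # stand-in for Redis
        self._fetch_all_rows = fetch_all_rows

    @staticmethod
    def _key(task_id: str, offset: int, page_size: int) -> str:
        # Offset and page size are part of the key, so pages never collide.
        return f"result:{task_id}:page:{offset}:{page_size}"

    async def get_page(self, task_id: str, offset: int, page_size: int) -> list[Any]:
        key = self._key(task_id, offset, page_size)
        cached = self._cache.get(key)
        if cached is not None:
            return json.loads(cached)
        # Cache miss: fetch the full result and cache only the requested slice.
        rows = await self._fetch_all_rows(task_id)
        page = rows[offset : offset + page_size]
        self._cache[key] = json.dumps(page)
        return page
```

The earlier bug corresponds to a key without the offset, where the first page cached would be returned for every later request.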
- Make redis_host/redis_port/redis_db optional with defaults; add
  validator requiring either Sentinel or direct Redis config
- Hoist everyrow_api_url to _BaseSettings (shared by both transports)
- Add EVERYROW_API_URL to Helm values.yaml
- Add REDIS_ENCRYPTION_KEY to SOPS-encrypted secrets
- Type state.settings as _BaseSettings for pyright compatibility

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract FastMCP instance, lifespans, and resource handlers into app.py.
Move all 8 MCP tool functions and their helpers into tools.py.
Slim server.py to just main() and a bare import for decorator registration.
Remove local dict token storage from ServerState (Redis-only).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reorder columns in the results table so input columns (no dot) appear
first and flattened research columns (dot-prefixed from nested dicts)
appear last.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
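
The ordering rule reads as a stable partition on whether a column name contains a dot. A small Python sketch of that rule (the widget itself does this client-side; `order_columns` is a hypothetical name):

```python
def order_columns(columns: list[str]) -> list[str]:
    """Input columns (no dot) first, flattened research columns last.

    Relative order inside each group is preserved.
    """
    plain = [c for c in columns if "." not in c]
    dotted = [c for c in columns if "." in c]
    return plain + dotted
```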
In HTTP mode, Claude AI hallucinates paths like /mnt/user-data/outputs/
that don't exist in the container. Since output_path is already ignored
in HTTP mode, only validate the .csv extension.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract _SingleSourceInput base with input_csv/input_data/input_json
- DedupeInput extends _SingleSourceInput (was csv-only)
- MergeInput adds left/right_input_data and left/right_input_json
- Both tools now use load_csv() and shared submission helpers
- Bump default page_size from 5 to 20 (max 100)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Estimate tokens from serialized JSON size and recommend an optimal
page_size when the current one would significantly over/undershoot
a 4K token budget. The recommendation appears in the summary text
and in the next-page call hint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
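
One way the page_size recommendation could work is sketched below. The 4K budget and the 100-row maximum come from the commit messages; the four-characters-per-token ratio, the 50% tolerance, and the function names are assumptions made for illustration.

```python
import json
from typing import Any, Optional

TOKEN_BUDGET = 4000   # "4K token budget" from the commit message
CHARS_PER_TOKEN = 4   # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(rows: list[Any]) -> int:
    """Estimate tokens from the size of the serialized JSON."""
    return len(json.dumps(rows)) // CHARS_PER_TOKEN


def recommend_page_size(
    rows: list[Any],
    page_size: int,
    max_page_size: int = 100,
    tolerance: float = 0.5,
) -> Optional[int]:
    """Return a better page_size, or None if the current one is close enough."""
    if not rows:
        return None
    tokens_per_row = max(1.0, estimate_tokens(rows) / len(rows))
    ideal = max(1, min(max_page_size, int(TOKEN_BUDGET / tokens_per_row)))
    projected = tokens_per_row * page_size
    # Only speak up when the projection misses the budget by a wide margin.
    if abs(projected / TOKEN_BUDGET - 1) <= tolerance or ideal == page_size:
        return None
    return ideal
```

With rows of about 100 estimated tokens each, a page_size of 5 would use roughly an eighth of the budget, so the sketch suggests a larger page; very small rows are capped at the maximum page size.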
- Add Redis-based IP rate limiter (10 req/60s) on /auth/start and
  /auth/callback to prevent abuse
- Add Fernet encryption key canary check on HTTP startup so a wrong key
  fails fast instead of silently breaking all token lookups
- Fix handle_callback race: use _redis_get + delete-after-persist instead
  of _redis_pop so pending auth survives transient failures
- Downgrade request logging to DEBUG and stop leaking token prefixes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
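
The commit does not say which rate-limiting algorithm is used. A common choice with Redis is a fixed window: increment a per-IP counter and set its expiry when the window opens. The sketch below mimics that with an in-memory dict and an injectable clock; `FixedWindowRateLimiter` is a hypothetical name, and the 10 requests per 60 seconds default is taken from the first bullet.

```python
import time
from typing import Callable


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per key in each `window_seconds` window."""

    def __init__(
        self,
        limit: int = 10,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        # key -> (count, expires_at); a dict standing in for Redis counters.
        self._counters: dict[str, tuple[int, float]] = {}

    def allow(self, ip: str) -> bool:
        now = self._clock()
        count, expires_at = self._counters.get(ip, (0, 0.0))
        if now >= expires_at:
            # Window elapsed: reset, like an expired key being recreated.
            count, expires_at = 0, now + self._window
        count += 1
        self._counters[ip] = (count, expires_at)
        return count <= self._limit
```

A fixed window is simple but lets a client burst at a window boundary; a sliding window avoids that at the cost of more bookkeeping.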
Replace opaque-token-in-Redis model with direct JWT pass-through.
SupabaseTokenVerifier validates Supabase JWTs via the project JWKS
endpoint, eliminating Fernet encryption, refresh tokens, secondary
indexes, and rate limiting. Swap cryptography dep for PyJWT[crypto].

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace basic table with a full-featured widget: sortable columns,
inline filters, row selection (click/ctrl/shift), clipboard copy as TSV,
research data popover on hover, host theme integration via ext-apps SDK,
and a toolbar with row count summary + select all / copy buttons.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Six improvements to the auth module:
1. Wrap blocking JWKS call in asyncio.to_thread()
2. Read JWT algorithm from header instead of private _algorithm attr
3. Derive expires_in from JWT's actual exp claim
4. Add 30-day TTL to client registrations (prevents unbounded Redis growth)
5. Reuse httpx.AsyncClient instead of creating one per call
6. Full refresh token support with rotation and Supabase delegation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
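
Items 1 and 3 above can be sketched with the standard library alone. `fetch_signing_key_blocking` is a hypothetical placeholder for the blocking JWKS lookup, and the claims are passed in as an already-verified dict; neither function is the PR's implementation.

```python
import asyncio
import time
from typing import Optional


def fetch_signing_key_blocking(token: str) -> str:
    """Hypothetical placeholder for a blocking JWKS lookup over the network."""
    time.sleep(0.01)  # simulate blocking I/O
    return "signing-key"


async def fetch_signing_key(token: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays responsive.
    return await asyncio.to_thread(fetch_signing_key_blocking, token)


def expires_in_from_claims(claims: dict, now: Optional[float] = None) -> int:
    """Derive expires_in from the token's own exp claim, never below zero."""
    current = time.time() if now is None else now
    return max(0, int(claims["exp"] - current))
```

Deriving `expires_in` from `exp` keeps the value reported to the client consistent with the lifetime the token actually has, instead of a separately configured constant.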
- Add input_url field to _SingleSourceInput and left_url/right_url to MergeInput
  for fetching CSV data from URLs (Google Sheets, Drive, S3, any hosted file)
- Add validate_url(), normalize_google_url(), fetch_csv_from_url(), load_input()
  to utils.py with Google Sheets/Drive URL auto-normalization
- Switch all 5 tool functions from sync load_csv() to async load_input()
- Add csv_url to widget JSON and CSV download link in summary text for all result sizes
- Simplify widget export buttons to clipboard-only (Copy CSV / Copy JSON)
  since MCP App widgets run in sandboxed iframes without download permissions
- Add tests for URL validation, Google URL normalization, URL fetching, and load_input

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
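
A rough sketch of what URL validation and Google Sheets normalization might look like. The function names match the commit, but the bodies are assumptions, not the PR's utils.py. The sketch rewrites a Sheets link to its CSV export form and leaves every other URL unchanged; Drive links are not handled here.

```python
import re
from urllib.parse import parse_qs, urlparse

_SHEETS_PATH = re.compile(r"^/spreadsheets/d/([A-Za-z0-9_-]+)")


def validate_url(url: str) -> str:
    """Accept only absolute http(s) URLs."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Unsupported URL: {url!r}")
    return url


def normalize_google_url(url: str) -> str:
    """Turn a Google Sheets edit/view link into a CSV export link."""
    parsed = urlparse(url)
    if parsed.netloc != "docs.google.com":
        return url
    match = _SHEETS_PATH.match(parsed.path)
    if match is None:
        return url
    # The sheet tab id (gid) may appear in the query string or the fragment.
    gid = None
    for part in (parsed.query, parsed.fragment):
        values = parse_qs(part).get("gid")
        if values:
            gid = values[0]
            break
    export = f"https://docs.google.com/spreadsheets/d/{match.group(1)}/export?format=csv"
    return f"{export}&gid={gid}" if gid else export
```

A loader would typically validate first, then normalize, then fetch, so that a shared Sheets link and a plain hosted CSV go through the same download path.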
@RafaelPo
Contributor Author

@claude code review

RafaelPo and others added 3 commits February 20, 2026 09:45
Resolve merge conflicts in pyproject.toml (combined deps), server.py
(kept refactored entry point), and uv.lock (regenerated). Port
_validate_response_schema and _validate_screen_response_schema from
main's monolithic server.py into models.py. Update manifest.json to
add everyrow_single_agent and fix everyrow_results description.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@RafaelPo
Contributor Author

@claude code review

@RafaelPo RafaelPo marked this pull request as draft February 20, 2026 09:55
RafaelPo added a commit that referenced this pull request Feb 20, 2026
Part 2 of the remote MCP server split: adds Streamable HTTP transport
with OAuth 2.1 (Supabase JWKS), Redis-backed state for multi-pod
deployments, GCS result caching, paginated inline results, and MCP App
widget templates (progress, results, session UIs).

New modules: auth.py, http_config.py, routes.py, redis_utils.py,
gcs_storage.py, gcs_results.py, state.py, settings.py, templates.py.
Updated: models.py (_SingleSourceInput base, input_data/input_json,
SingleAgentInput, optional output_path with pagination), tools.py
(HTTP client, GCS upload, inline pagination, single_agent), server.py
(--http flag, argparse, re-exports).

Ref: feat/remote-mcp-server branch (PR #165)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@RafaelPo RafaelPo closed this Feb 21, 2026
RafaelPo added a commit that referenced this pull request Feb 24, 2026
Re-adds the deployment infra removed in PR #165 (deferred to follow-up).
Helm chart deploys to GKE with Redis Sentinel, health probes, Gateway API
HTTPRoute, and SOPS-encrypted secrets. GitHub Actions workflow runs checks,
builds Docker image to GAR, and deploys with helm upgrade --atomic on main.

Changes from the original:
- Drop RESULT_STORAGE/GCS_RESULTS_BUCKET env vars (no GCS implementation)
- Drop REDIS_ENCRYPTION_KEY from secrets (not used in code)
- Bump appVersion to 0.3.4
- Fix .gitignore to not exclude templates/secrets.yaml

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
RafaelPo added a commit that referenced this pull request Feb 24, 2026
Restore deployment infra deferred from PR #165, updated for current codebase:

Helm chart (everyrow-mcp/deploy/chart/):
- Deployment with /health probes, rolling update, hyperdisk tolerations
- ClusterIP Service (80 -> 8000)
- Gateway API HTTPRoute using .Release.Namespace for staging isolation
- SOPS-encrypted secrets via GCP KMS (Supabase keys, URLs, Redis config)

GitHub Actions workflow (.github/workflows/deploy-mcp.yaml):
- PR trigger: runs checks only (ruff, pytest, basedpyright)
- Manual dispatch with deploy_production / deploy_staging toggles
- Pipeline: setup -> checks -> build-and-push (GAR) -> deploy (Helm)
- Slack notification on failure, --atomic rollback

Staging environment:
- Namespace: everyrow-mcp-staging
- Staging Supabase, Redis DB 14, mcp-staging.everyrow.io
- Values overlay: values.yaml + values.staging.yaml + staging secrets

Sensitive values (Supabase URLs, Redis endpoints) are in SOPS-encrypted
files, not in plain values.yaml, since the repo is public.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
RafaelPo added a commit that referenced this pull request Feb 24, 2026
…201)

Restore deployment infra deferred from PR #165, updated for current codebase:

Helm chart (everyrow-mcp/deploy/chart/):
- Deployment with /health probes, rolling update, hyperdisk tolerations
- ClusterIP Service (80 -> 8000)
- Gateway API HTTPRoute using .Release.Namespace for staging isolation
- SOPS-encrypted secrets via GCP KMS (Supabase keys, URLs, Redis config)

GitHub Actions workflow (.github/workflows/deploy-mcp.yaml):
- PR trigger: runs checks only (ruff, pytest, basedpyright)
- Manual dispatch with deploy_production / deploy_staging toggles
- Pipeline: setup -> checks -> build-and-push (GAR) -> deploy (Helm)
- Slack notification on failure, --atomic rollback

Staging environment:
- Namespace: everyrow-mcp-staging
- Staging Supabase, Redis DB 14, mcp-staging.everyrow.io
- Values overlay: values.yaml + values.staging.yaml + staging secrets

Sensitive values (Supabase URLs, Redis endpoints) are in SOPS-encrypted
files, not in plain values.yaml, since the repo is public.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>