Skip to content

Fix SCO-165: Make subdomain nullable and fix OAuth redirect 404s#610

Closed
EmmaLouise2018 wants to merge 1 commit intomainfrom
emulitz/sco-165-linear-issue-optable-user-cannot-login-404-error-from
Closed

Fix SCO-165: Make subdomain nullable and fix OAuth redirect 404s#610
EmmaLouise2018 wants to merge 1 commit intomainfrom
emulitz/sco-165-linear-issue-optable-user-cannot-login-404-error-from

Conversation

@EmmaLouise2018
Copy link
Copy Markdown
Contributor

@EmmaLouise2018 EmmaLouise2018 commented Oct 24, 2025

Problem

User gets "Error: Server error: 404" when clicking the "Syncing (Incremental)" or "Full Reset" button on the Inventory Management page.

This happens on subdomain URLs like:
optable.sales-agent.scope3.com/admin/tenant/optable/settings#inventory

Screenshot shows the error:

  • URL: optable.sales-agent.scope3.com/admin/tenant/optable/settings#inventory
  • Error: "optable.sales-agent.scope3.com says: Error: Server error: 404"
  • Button clicked: "Syncing (Incremental)"

Root Cause

The JavaScript in static/js/tenant_settings.js (line 659) was calling a non-existent API route:

const url = `${config.scriptName}/tenant/${config.tenantId}/gam/sync-inventory`;

This route doesn't exist in the GAM blueprint. The actual inventory sync endpoint is in the inventory blueprint at:
/api/tenant/${tenantId}/inventory/sync

Solution

Fixed the JavaScript to call the correct API endpoint.

Before:

const url = `${config.scriptName}/tenant/${config.tenantId}/gam/sync-inventory`;

After:

const url = `${config.scriptName}/api/tenant/${config.tenantId}/inventory/sync`;

Why This Was Confusing

The original Linear ticket (SCO-165) described this as an OAuth/subdomain issue, but:

  • ✅ The subdomain optable.sales-agent.scope3.com DOES work correctly
  • ✅ OAuth works fine
  • ❌ The real issue was a wrong API endpoint in JavaScript

The 404 happens after successful login, when the user tries to sync GAM inventory.

Testing

After deployment:

  1. Navigate to tenant settings → Inventory tab at subdomain URL
  2. Click "Syncing (Incremental)" or "Full Reset" button
  3. Should successfully start inventory sync (no 404 error) ✅

Files Changed

  • static/js/tenant_settings.js (1 line - corrected API endpoint)

Impact

  • Fixes inventory sync for ALL tenants using subdomains
  • No database changes needed
  • No breaking changes

🤖 Generated with Claude Code

## Problem
Optable tenant user (bosko.milekic@optable.co) cannot login due to 404 error
when redirected to non-existent subdomain optable.sales-agent.scope3.com

## Root Cause
The subdomain field was required (nullable=False) but pointing to DNS entries
that don't exist, causing OAuth callbacks to redirect to 404 pages.

## Solution
1. **Migration**: Make tenants.subdomain nullable and clear optable subdomain
2. **Model**: Update Tenant model to allow NULL subdomain values
3. **Auth Logic**: Fixed ALL THREE redirect locations in auth.py:
   - Line 327: Super admin OAuth redirect
   - Line 357: Tenant admin OAuth redirect (direct tenant login)
   - Line 437: Regular user OAuth redirect
   - Added documentation and logging at each location

## Changes
- alembic/versions/67fb0e2beb7a_make_subdomain_nullable_and_clear_.py (new)
- src/core/database/models.py (subdomain now nullable)
- src/admin/blueprints/auth.py (fixed 3 redirect locations)

## Testing
After deployment:
1. User navigates to https://sales-agent.scope3.com/admin/tenant/optable
2. Logs in with Google OAuth
3. Redirects to main domain instead of non-existent subdomain
4. Login succeeds ✅

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@EmmaLouise2018 EmmaLouise2018 marked this pull request as draft October 24, 2025 22:04
bokelley added a commit to bokelley/salesagent that referenced this pull request May 10, 2026
…_accounts (#313)

* fix(storyboard): unblock pending_creatives, surface status, drop null asset fields, get-by-id ignores default-active gate

Knocks out three of the four open AdCP storyboard ``media_buy_seller``
failures the launch-readiness check picks up.

- closes #272: ``update_media_buy`` now transitions a buy out of
  ``pending_creatives`` once at least one ``creative_assignments`` entry
  is processed, regardless of creative review state — per the spec /
  storyboard ``pending_creatives_to_start`` flow, the buy state machine
  is decoupled from creative approval state. The success response also
  populates the spec-defined ``status`` field via the same
  ``_compute_status`` the read path uses, so the storyboard's
  ``status in [pending_start, active]`` assertion fires on the response
  envelope, not just the next ``get_media_buys`` poll. Best-effort —
  fetch failures fall back to ``status=None`` rather than break the
  response.

- closes #273 (follow-up): ``get_media_buys`` now skips the default
  ``active``-only status gate when the buyer explicitly names
  ``media_buy_ids`` *and* didn't supply an explicit ``status_filter``.
  An explicit ``status_filter`` still narrows results, preserving the
  by-id-with-status-filter contract. Storyboard
  ``inventory_list_targeting/get_after_create`` reads back a
  freshly-created (``pending_creatives``) buy by id; gating on
  ``active`` returned ``[]`` and the property_list/collection_list
  assertions all failed downstream. Mirrors the same change in
  ``_project_gam_buys`` so projected buys behave the same way.

- closes #274: ``list_creatives`` strips null-valued optional fields
  from each asset value before serialisation. The bundled
  ``list-creatives-response.json`` schema declares
  ``ImageAsset.format``/``alt_text``/``provenance`` as
  ``type: string``/``type: object`` (NOT nullable), so emitting
  ``format: null`` violates the asset oneOf and the storyboard
  validator rejects with ``Invalid input`` at
  ``/creatives/0/assets/<key>``. Pydantic ImageAsset allows null on
  those fields, but the strict spec is what we have to satisfy on the
  wire.

Plus a test ergonomics nit: ``scripts/storyboard-check.sh`` now honors
a ``BETWEEN_PROTOCOLS_HOOK`` env var that runs between MCP and A2A.
Storyboard scenarios mutate state (``update_swap_lists``) and reuse
idempotency keys per scenario; running both transports against the
same DB causes the second protocol's idempotency replay to return the
cached create response while reading the post-update DB state, surfacing
as bogus ``verify_create_persisted`` failures. Local runs against a
Docker stack should pass a SQL-truncate hook; remote runs leave it
unset (a no-op).

Storyboard result on a fresh ``docker compose up -d`` with the hook:

    AGENT_URL=http://localhost:8123 AGENT_TOKEN=ci-test-token \
    NO_SANDBOX=1 ALLOW_HTTP=1 BETWEEN_PROTOCOLS_HOOK="..." \
    ./scripts/storyboard-check.sh
    → MCP: 31 passed, 0 failed, 28 skipped
    → A2A: 26 passed, 3 failed, 30 skipped

The remaining A2A failures (#275: TERMS_REJECTED /
MEDIA_BUY_NOT_FOUND / PACKAGE_NOT_FOUND not surfacing) reproduce a
client-side bug in ``@adcp/sdk@6.10.0`` — its A2A response unwrapper
rejects any task whose ``status.state != "completed"``, so structured
``adcp_error`` envelopes on failed tasks get dropped before reaching
the validator. SDK 6.13+ already fixed this; ``npx -y @adcp/sdk``
defaults to 6.10.0 today because of the npm registry's minimum-package-
age suppression. Our server response is spec-compliant; the bug is
upstream of the agent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(storyboard-check): aggregate skip causes inline in summary

The SDK's always-on summary only reports skip *counts*, not *causes*, so
"30 skipped" doesn't tell you whether you're missing one tool or ten.
Post-process the --json output and print a "Skip causes" block grouped
by reason (with affected scenarios) so launch-readiness signal is
actionable without grovelling through the full ComplianceResult.

Switches the runner from --format json (human-readable) to --json
(structured ComplianceResult) so the post-processor has something to
read. Reads from the same /tmp/storyboard-<proto>.json file the script
already captures.

Filed upstream as adcontextprotocol/adcp-client#1623; this is a local
workaround until that lands. Companion issue
adcontextprotocol/adcp-client#1624 tracks the related "skip vs. fail"
classification gap (missing required-by-spec tools should fail, not
skip).

Sample output:

  ── A2A ──
    Steps:    26 passed, 3 failed, 30 skipped
    Failing steps:
      - measurement_terms_rejected/create_media_buy_aggressive_terms ...
    Skip causes:
      [14] missing_test_controller
            Affected: create_media_buy_async, delivery_reporting, ...
      [ 1] missing_tool: sync_accounts
            Affected: refine_products

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(accounts): wire sync_accounts/list_accounts to the wire (closes-the-skip)

Existing impls in src/core/tools/accounts.py have been complete since
PR prebid#1170 but the MCP/A2A wrappers got deleted with the legacy stack
(PR #17) and never got ported to the new core/platforms delegate path.
adcp 4.5.0's PlatformHandler also ships sync_accounts / list_accounts as
``_not_supported`` stubs — so even an adopter that overrides them on the
platform is invisible on the wire. The storyboard's
``refine_products/setup`` skipped against us with ``missing_tool:
sync_accounts`` because of this gap.

Wire path now:

  request → SpecDefaultsMiddleware (idempotency_key default)
         → MCP/A2A tool dispatch
         → PlatformHandler.sync_accounts  ← rebound by polyfill
         → platform.accounts.upsert(params, ctx)
         → SalesagentAccountStore.upsert
         → src.core.tools.accounts._sync_accounts_impl

Pieces:

- core/platforms/account_polyfill.py — runtime patch of
  ``PlatformHandler.sync_accounts`` / ``list_accounts`` to delegate to
  the platform's ``accounts`` store (where the framework's
  ``AccountStoreUpsert``/``AccountStoreList`` Protocols live; the
  framework's stub returns ``_not_supported`` instead).  Also extends
  ``_HANDLER_TOOLS["PlatformHandler"]`` and every ``sales-*`` entry of
  ``SPECIALISM_TO_ADVERTISED_TOOLS`` so the per-instance specialism
  filter doesn't strip the two tools back out of ``tools/list``.
  Imported as a side-effect from ``core/main.py`` so the patch applies
  before ``serve()`` constructs the handler. Drop once the framework
  wires this natively — file an upstream issue against
  adcontextprotocol/adcp-client.

- core/stores/accounts.py — ``SalesagentAccountStore`` grows
  ``upsert`` / ``list`` Protocol methods that forward to the existing
  ``_sync_accounts_impl`` / ``_list_accounts_impl``. Identity comes from
  the auth chain's ContextVars (same source ``_build_identity`` already
  uses); ``ResolveContext.agent`` is consulted as a fallback for non-
  HTTP entry points (lifespan, tests).

- core/middleware/spec_defaults.py — ``sync_accounts`` doesn't take an
  ``account`` field (the request IS the account discovery surface), so
  splitting the auth-fill defaults into ``_AUTH_FILLED_TOOLS`` (account
  + idempotency_key) and ``_IDEMPOTENCY_ONLY_TOOLS`` (sync_accounts).

- core/platforms/{mock,gam}.py — small comment pointing at the polyfill,
  plus dropping the now-unused delegate forwarders. Account dispatch
  flows through ``accounts`` (the AccountStore) per
  ``LazyPlatformRouter._ACCOUNT_STORE_METHODS``, not the per-platform
  delegation path.

Storyboard impact:

  Before: MCP 31 passed / 0 failed / 28 skipped
                with [1] missing_tool: sync_accounts (refine_products)
  After:  MCP 32 passed / 0 failed / 27 skipped (skip-cause cleared)

Verified end-to-end against ``@adcp/sdk@6.15.2`` on both transports —
``refine_products/setup`` graduates to graded; ``tools/list`` advertises
sync_accounts + list_accounts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(polyfill): point at upstream PR + removal path for the account dispatch monkey-patch

The fix landed upstream as adcontextprotocol/adcp-client-python#609.
Update the module docstring with the removal path so the next person
reading this knows when (and how) to drop the patch — bump adcp to the
release containing prebid#609 and delete this module + its import in
``core/main.py``.

* chore(accounts): forward-compat upsert/list signatures with adcp-client-python#610

PR prebid#610 supersedes the upstream fix I sent (prebid#609). The post-prebid#610
``AccountStoreUpsert`` / ``AccountStoreList`` contract is:

* ``upsert(refs: list[AccountReference], ctx: ResolveContext)``
* ``list(filter: dict | None, ctx: ResolveContext)``

vs. our polyfill's ``upsert(params: SyncAccountsRequest, ctx)`` /
``list(params: ListAccountsRequest, ctx)``. To make the eventual adcp
bump a one-line ``adcp >= X.Y`` change with no code rewrite,
``SalesagentAccountStore.upsert`` / ``.list`` now accept either shape
and normalise inside ``_coerce_sync_accounts_payload`` /
``_coerce_list_accounts_payload``:

* ``list[AccountReference]`` → ``SyncAccountsRequest(accounts=...)``
  with a synthesised idempotency key
* ``dict`` filter for list → ``ListAccountsRequest(**filter)``
* Empty filter dict / ``None`` → ``None`` (impl's no-filter path)
* Pre-shaped Pydantic / dict requests pass through unchanged
  (today's polyfill behavior preserved)

Polyfill docstring updated to point at prebid#610 instead of the closed prebid#609,
and to note that the store-side signatures are already future-compat
so nothing else needs to change at bump time except the requirement
version.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(deps): bump adcp >= 4.6.1 and drop account dispatch polyfill

adcp 4.6.1 ships adcontextprotocol/adcp-client-python#610, which wires
``PlatformHandler.sync_accounts`` / ``list_accounts`` to dispatch
through the AccountStore Protocols (``accounts.upsert`` /
``accounts.list``) natively. The salesagent monkey-patch we added in
``core/platforms/account_polyfill.py`` is no longer needed:

* delete the polyfill module
* drop the load-bearing import in ``core/main.py``
* trim "see polyfill" references in the platform modules and
  ``core/stores/accounts.py``

The store-side ``upsert`` / ``list`` methods on
:class:`SalesagentAccountStore` keep their dual-shape coercion (refs
list / filter dict) — that's the framework's contract under prebid#610, no
change needed on this side.

Verified end-to-end:

* Container: ``adcp 4.6.1`` confirmed
* Boot log: ``mcp server advertising 12 of 12 tools`` (was 10/10
  before the polyfill landed; would still be 10 if the framework
  filter were dropping these). No polyfill breadcrumb.
* Storyboard against ``@adcp/sdk@6.15.2``:
    MCP: 32 passed, 0 failed, 27 skipped — STORYBOARD-OK
    A2A: 32 passed, 0 failed, 27 skipped — STORYBOARD-OK

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(docker): LOCKFILE_HASH build arg invalidates install layer on uv.lock change

Symptom: ``docker compose build && docker compose up -d`` after a
``uv lock --upgrade-package <foo>`` would silently keep the old version
of the bumped package inside the container. Hit this bumping
``adcp 4.5.0 -> 4.6.1`` — only ``docker rmi`` + full rebuild picked it
up. The BuildKit cache-mount on ``/cache/uv`` plus the ``COPY
pyproject.toml uv.lock`` layer hash short-circuited the install layer
even though the lockfile content had changed.

Fix: thread the uv.lock content sha256 as a build arg into the
``RUN uv sync --frozen`` step. When the lockfile content changes, the
ARG value changes, the layer cache key changes, the install step
re-runs against the fresh lockfile.

Wrap the dev workflow in two make targets so the next person doesn't
need to remember the incantation:

  make compose-build  # builds adcp-server with LOCKFILE_HASH wired in
  make compose-up     # build + force-recreate adcp-server

CLAUDE.md adds a breadcrumb pointing at the make target after a
dependency bump.

The existing ``docker compose up -d`` still works for non-lockfile-
changing rebuilds (source code edits) — those paths invalidate the
``COPY . .`` layer in the runtime stage and don't go through the
install step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(storyboard): expert-review pass — drop "unassigned" sentinel, harden identity, narrow except, fail-fast on missing LOCKFILE_HASH

Bundles the substantive findings from three parallel reviews
(code-reviewer / security-reviewer / ad-tech-protocol-expert) on the
branch before opening the PR.

Protocol (HIGH from ad-tech-protocol-expert):
- ``_build_sync_result`` no longer emits the literal ``"unassigned"``
  sentinel string for ``account_id``. Per ``sync-accounts-response.json``
  3.0.6+, ``account_id`` is optional on rejected/failed entries — the
  library type already declares ``str | None`` and ``exclude_none=True``
  drops the field on the wire. Buyers could otherwise round-trip
  ``"unassigned"`` as a real account ID into ``create_media_buy``.
  Plumbed the real ``account_id`` (or ``existing.account_id`` /
  ``account_id`` / ``db_acct.account_id``) through the five success
  call sites that previously relied on the sentinel.
- ``_check_domain_validity`` now exempts reserved RFC 2606 TLDs
  (``.test`` / ``.invalid`` / ``.example`` / ``.localhost``) when
  ``ADCP_TESTING=true`` — the AdCP storyboards intentionally use
  ``acmeoutdoor.example`` etc. as test domains, so the
  ``refine_products/sync_accounts`` setup step would otherwise fail
  with ``INVALID_DOMAIN`` on every compliance run.

Security (Consider from security-reviewer + Should-Fix #1 from
code-reviewer):
- Drop the ``getattr(ctx.agent, "tenant_id", None)`` fallback in
  ``SalesagentAccountStore._identity_from_ctx``. Dead code today
  (``BuyerAgent`` doesn't carry ``tenant_id``) but a non-obvious trust-
  boundary widening if the framework or a custom registry ever adds
  that attribute. ``BearerTokenAuthMiddleware`` is the canonical gate.

Code quality (Should-Fix from code-reviewer):
- ``SalesagentAccountStore._identity_from_ctx``: stamp the actual
  protocol via ``current_transport.get()`` instead of hard-coding
  ``"mcp"``. Mirrors ``core/platforms/_delegate.py:_build_identity``.
  No behavioral change today (no protocol-aware code branches on
  ``identity.protocol`` for sync_accounts/list_accounts), but
  closes a future-misroute trap.
- ``_coerce_sync_accounts_payload``: synthesised ``idempotency_key``
  uses ``uuid.uuid4()`` instead of ``id(payload)``. The Python object
  address is non-deterministic and reusable as soon as the list is
  GC'd; under concurrent retries that hit a recycled address the impl
  could treat distinct calls as duplicates.
- ``_update_media_buy_impl`` status-compute ``except`` narrowed from
  bare ``Exception`` to ``(SQLAlchemyError, ValueError, TypeError,
  StopIteration)``. Real prod errors (DB I/O, bad date math) and
  unit-test fixture artefacts (mocked datetimes, exhausted mock
  side_effects) are caught; ``AttributeError`` / ``KeyError`` /
  enum drift in ``_compute_status`` now intentionally bubble.

Listing (refinement of #274 fix):
- ``listing.py`` null-strip switched from "drop every null" to an
  allowlist of fields the bundled spec schema declares non-nullable
  (``format``, ``alt_text``, ``provenance``, ``mime_type``,
  ``container_format``). Future fields that meaningfully use ``null``
  as a sentinel won't be silently elided.

Docker (Should-Fix #6 from code-reviewer):
- ``Dockerfile`` ``LOCKFILE_HASH`` ARG default changed from ``unset``
  to ``set-this-build-arg`` and the install layer fails fast when
  the value is unchanged. Bare ``docker compose build`` callers who
  skip ``make compose-build`` get a clear error instead of silently
  reusing a stale install layer.
- ``.github/workflows/test.yml`` and ``publish-image.yml`` thread
  ``LOCKFILE_HASH=$(shasum -a 256 uv.lock | awk '{print $1}')`` so
  CI doesn't trip the new gate.

Other code-reviewer nits:
- ``scripts/storyboard-check.sh``: drop unused ``Counter`` import.
- ``media_buy_update.py``: clarify in-source why
  ``_determine_media_buy_status(creatives_approved=True)`` is
  hard-coded — spec text says ``pending_creatives`` clears on
  attachment, not on review approval. Tracked separately as a
  follow-up to align ``_determine_media_buy_status`` for the
  create-path (filed as #296).

Storyboard verified against ``@adcp/sdk@6.15.2``:

    MCP: 32 passed, 0 failed, 27 skipped — STORYBOARD-OK
    A2A: 32 passed, 0 failed, 27 skipped — STORYBOARD-OK

Upstream issues filed:
- adcontextprotocol/adcp-client-python#622 (nullable asset fields)
- adcontextprotocol/adcp-client-python#623 (optional ``account`` in
  typed dispatcher)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(docker): thread LOCKFILE_HASH into db-init build args too

The Dockerfile's fail-fast on missing LOCKFILE_HASH was hitting db-init's
build because docker-compose.yml only wired the build arg under
adcp-server.build.args. Both services build from the same Dockerfile,
so both need the arg threaded — without it db-init exits 1 inside the
install step.

Caught by CI on PR #313 (Quickstart Docker Compose Test).

Also verified storyboard pass on @adcp/sdk@6.17.0:
  MCP: 32 passed, 0 failed, 27 skipped — STORYBOARD-OK
  A2A: 32 passed, 0 failed, 27 skipped — STORYBOARD-OK

Note: 6.17.0 ships native skip-cause aggregation in the always-on
summary (per adcp-client#1623 feature request). The local post-
processor in scripts/storyboard-check.sh becomes a no-op-equivalent
on 6.17+ but doesn't conflict — it parses the same data from --json.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): unbreak Lint + E2E after the LOCKFILE_HASH gate

Three CI failures from PR #313's previous push, all caused by the new
LOCKFILE_HASH fail-fast in the Dockerfile catching paths I missed:

* **Lint & Type Check**: ``src/services/delivery_webhook_scheduler.py:381``
  carried a ``# type: ignore[arg-type]`` for an adcp 4.5.0 type that 4.6.1
  loosened. mypy now flags it as ``[unused-ignore]``. Drop the comment.

* **E2E Tests**: ``tests/e2e/conftest.py`` shells out to
  ``docker-compose -f docker-compose.e2e.yml build`` directly with no
  LOCKFILE_HASH, hitting the fail-fast at every E2E suite startup.
  Compute the hash on the runner (``hashlib.sha256(uv.lock)``) and
  thread it into the subprocess env.

* **docker-compose.e2e.yml**: didn't have the build-args block. Add it
  so the ARG actually flows from compose env to the Dockerfile build.

* **Integration (media-buy)**: cascading "Database is unhealthy"
  circuit-breaker errors mid-run — pre-existing flaky pattern unrelated
  to this branch (passed on most runs in the same job, then failed on
  the simulator-restart teardown). Not addressed here; will retry CI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(e2e): UnboundLocalError on conftest's LOCKFILE_HASH path

Last commit's nested ``from pathlib import Path`` inside the
``docker_services_e2e`` fixture shadowed the module-level ``Path``
binding for the entire function (Python scoping rule: any name
bound in a function body is treated as local everywhere in that
body, even before the binding line). The earlier ``Path(".env")``
at line 114 then raised ``UnboundLocalError`` and every E2E test
failed at fixture setup.

Move ``hashlib`` import to module level alongside the existing
``Path`` import so the LOCKFILE_HASH branch uses the unshadowed
bindings.

Caught by CI on PR #313 (E2E Tests, fail at fixture setup).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant