Reconcile main with origin/main + CrewAI 1.10 integration fixes#84
Merged
bead: ar-8doh Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Aligns buyer-side agentic spec hashing with the seller-side approach so cross-repo drift detection works consistently. Per Quinn ar-8doh review. bead: ar-8doh Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…kupTool Per proposal §5.2 (data model) and §5.5 (tools). bead: ar-50cm Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds compat shim for legacy list[str] rows (first -> primary, rest -> extensions, source=inferred). Adds audience_strictness policy. Adds Content Taxonomy 2.x->3.x deletion validation at brief ingestion. Per proposal §6 row 4. bead: ar-fe0h Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
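The legacy-row shim described above can be sketched roughly as follows. This is an illustration only: `AudienceRefSketch`, `AudiencePlanSketch`, and `migrate_legacy_audience` are hypothetical names, not the types this commit adds; only the mapping rule (first entry becomes primary, the rest become extensions, all tagged `source=inferred`) comes from the commit message.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AudienceRefSketch:
    """Hypothetical stand-in for the real audience reference type."""
    identifier: str
    source: str = "explicit"


@dataclass
class AudiencePlanSketch:
    """Hypothetical, trimmed-down plan: one primary plus optional extensions."""
    primary: Optional[AudienceRefSketch] = None
    extensions: List[AudienceRefSketch] = field(default_factory=list)


def migrate_legacy_audience(rows: List[str]) -> AudiencePlanSketch:
    """Map a legacy list[str] onto primary + extensions, tagging each ref as inferred."""
    refs = [AudienceRefSketch(identifier=row, source="inferred") for row in rows]
    if not refs:
        # An empty legacy row yields an empty plan rather than an error (assumption).
        return AudiencePlanSketch()
    return AudiencePlanSketch(primary=refs[0], extensions=refs[1:])
```

Treating an empty legacy list as an empty plan is an assumption of the sketch; the real shim may reject it instead.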
…alBookingRequest Threads the audience surface through the orchestrator data classes with backward-compatible None default. Per proposal §5.2 + §6 row 5. bead: ar-9nwu Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Audience Planner agent now instantiated between brief ingest and orchestrator handoff; produces stub AudiencePlan (full reasoning loop is bead §7).
- Three UCP audience tools moved from Research Agent to Audience Planner where they belong.
- Mock EmbeddingMintTool added (delegates to existing UCPClient mock embedding generator).
- UCP modules carry "Agentic Audiences (UCP)" rename header comments per §5.6 locked decision.
bead: ar-fgyq
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per proposal §5.5: classify → pick primary → add constraints/extensions → validate → emit plan + rationale. Pure-Python core with CrewAI shell for rationale. Graceful degradation when discovery unavailable. bead: ar-9u25 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Channel-crew factories now accept typed AudiencePlan; the audience-context formatter renders all 4 roles + rationale. Backward compat for legacy dict input preserved. Per proposal §5.3 + §6 row 19. bead: ar-5y8v Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
BuyerDealFlow now invokes the audience planner step alongside CampaignPipeline; AudiencePlan threads through any seller-bound data classes / HTTP calls. Per proposal §5.3 + §6 row 18. bead: ar-ts30 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers happy-path with all 3 audience types, legacy migration, serialization parity at flow→seller boundary, mocked capability-degradation scenario, and pre-set state.audience_plan precedence. Per proposal §6 row 20. bead: ar-6ipo Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the dead-code "Audience Planner (UCP)" section in docs/architecture/agent-hierarchy.md with the §8 drop-in MkDocs block from the audience-extension proposal:
- Renames the agent to "Audience Planner (Agentic Audiences / UCP)" per §5.6 dual-naming policy
- Adds the three-audience-types table (Standard / Contextual / Agentic)
- Documents the composable overlay model (primary + constraints + extensions + exclusions) with set semantics
- Documents the reasoning loop, configuration, and tools
- Cross-references the parent-repo capability-negotiation guide, naming explainer, and wire-format spec
- Updates the mermaid diagram label and the channel-crew "Audience context" callout to use the dual name
bead: ar-nd3i
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ported Per proposal §5.7 layer 2 + §6 row 12. Composable with bead §13's pre-flight integration (the two together implement full capability negotiation per §5.7). bead: ar-0w48 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per proposal §5.7 + §6 row 13a. Append-only audit trail keyed by audience_plan_id; emits degradation, capability_rejection, snapshot_honor events. bead: ar-q2uh Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per proposal §5.6 + §6 row 14b. Buyer now emits both application/vnd.ucp.embedding+json; v=1 and the new application/vnd.iab.agentic-audiences+json; v=1; logs audience_plan_id at INFO for audit-trail correlation. bead: ar-y6ki Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per proposal §5.7 layers 1+2 + §6 row 13. Orchestrator now calls /.well-known/agent.json before booking (TTL <=1h cache, honors Cache-Control), applies degrade_plan_for_seller per audience_strictness, composes with §12's retry path for stale-cache cases. bead: ar-gkbr Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
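As a rough sketch of the capability-cache behavior described above (TTL capped at one hour, `Cache-Control` honored), something along these lines would work. `CapabilityCache` and `ttl_from_cache_control` are hypothetical names, and two choices are assumptions of the sketch rather than facts from this PR: a missing header falls back to the one-hour cap, and `no-store` / `no-cache` are both treated as "do not reuse".

```python
import re
import time
from typing import Any, Callable, Dict, Optional, Tuple

MAX_TTL_SECONDS = 3600  # the commit caps cached capabilities at one hour


def ttl_from_cache_control(header: Optional[str]) -> int:
    """Return a TTL in seconds: honor max-age when present, never exceed the cap."""
    if not header:
        return MAX_TTL_SECONDS  # assumption: no header means "use the cap"
    if re.search(r"\bno-(store|cache)\b", header, re.IGNORECASE):
        return 0  # simplification: treat both directives as "do not reuse"
    match = re.search(r"\bmax-age=(\d+)", header, re.IGNORECASE)
    if match is None:
        return MAX_TTL_SECONDS
    return min(int(match.group(1)), MAX_TTL_SECONDS)


class CapabilityCache:
    """Tiny in-memory cache keyed by seller URL (hypothetical helper)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, seller_url: str, capabilities: Any,
            cache_control: Optional[str] = None) -> None:
        ttl = ttl_from_cache_control(cache_control)
        if ttl > 0:
            self._entries[seller_url] = (self._clock() + ttl, capabilities)

    def get(self, seller_url: str) -> Optional[Any]:
        entry = self._entries.get(seller_url)
        if entry is None or entry[0] <= self._clock():
            # Expired or absent: drop the entry so the caller re-fetches.
            self._entries.pop(seller_url, None)
            return None
        return entry[1]
```

Injecting the clock keeps expiry testable without sleeping; a caller that gets `None` back would re-fetch `/.well-known/agent.json` before booking.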
Per proposal §5.1 Step 4 + §6 row 15. Builder maps Standard→user.data[].segment[].id, Contextual→site.cat/cattax=7, Agentic→user.ext.iab_agentic_audiences.refs[] (feature-flagged). bead: ar-8vzg Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
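For illustration, the three carrier mappings named above could be assembled like this. `build_openrtb_audience` and its parameters are hypothetical; only the target paths (`user.data[].segment[].id`, `site.cat` with `cattax=7`, `user.ext.iab_agentic_audiences.refs[]`) and the feature flag come from the commit message. The sketch omits every other OpenRTB field.

```python
from typing import Any, Dict, Sequence


def build_openrtb_audience(
    standard_ids: Sequence[str],
    contextual_cats: Sequence[str],
    agentic_refs: Sequence[str],
    enable_agentic: bool = False,
) -> Dict[str, Any]:
    """Place each audience type on its carrier field; agentic refs only when flagged on."""
    request: Dict[str, Any] = {}
    if standard_ids:
        # Standard segments ride in user.data[].segment[].id
        request.setdefault("user", {})["data"] = [
            {"segment": [{"id": seg_id} for seg_id in standard_ids]}
        ]
    if contextual_cats:
        # Contextual categories ride in site.cat with cattax=7
        request["site"] = {"cat": list(contextual_cats), "cattax": 7}
    if agentic_refs and enable_agentic:
        # Agentic refs ride in a user.ext extension, behind the feature flag
        ext = request.setdefault("user", {}).setdefault("ext", {})
        ext["iab_agentic_audiences"] = {"refs": list(agentic_refs)}
    return request
```

With the flag off, agentic refs are silently left out of the request; whether the real builder logs or records that omission is not stated here.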
Per proposal §6 row 16 (part 1 of 2). Scenarios 2-4 follow. bead: ar-lk23 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…trip Part 2 of 2 for §16. Adds capability degradation (mocked legacy seller), hard-reject on zero standard overlap, and cross-repo AudiencePlan JSON round-trip schema-drift backstop. bead: ar-lk23 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per E2-1's locked decision (docs/decisions/EMBEDDING_STRATEGY_2026-04-25.md). EMBEDDING_MODE switch in UCPClient (mock|local|advertiser|hybrid); sentence-transformers/all-MiniLM-L6-v2 for local; advertiser-supplied vectors accepted verbatim; mock fallback for CI. Adds embedding_provenance to ComplianceContext per E2-7 Gap 6.
- Settings: new embedding_mode field (default hybrid; override via EMBEDDING_MODE env var)
- UCPClient: new create_query_embedding_with_provenance() returns QueryEmbeddingResult(embedding, provenance, dimension); existing create_query_embedding() preserved as backward-compat wrapper
- Lazy local model load with graceful fallback if sentence-transformers not installed
- Out-of-range advertiser vectors (dim < 256 or > 1024) rejected with warning; falls back to local/mock per mode
- Adds 10 unit tests covering all 4 modes + provenance + backward compat
bead: ar-0abx
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
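A minimal sketch of the mode switch and dimension guard described above. The function names, the fallback order (advertiser vector, then a local encoder, then mock), and the shape of the mock generator are all assumptions for illustration; the commit only states the four modes, the 256 to 1024 dimension window, and that mock is the CI fallback. The local encoder is passed in as a plain callable so the sketch does not depend on sentence-transformers.

```python
import hashlib
import logging
from typing import Callable, List, Optional, Sequence, Tuple

logger = logging.getLogger(__name__)
MIN_DIM, MAX_DIM = 256, 1024  # advertiser vectors outside this range are rejected


def mock_embedding(text: str, dim: int = 384) -> List[float]:
    """Deterministic SHA-256-derived vector; a stand-in for the real mock generator."""
    values: List[float] = []
    counter = 0
    while len(values) < dim:
        digest = hashlib.sha256(f"{text}:{counter}".encode("utf-8")).digest()
        values.extend(byte / 255.0 for byte in digest)
        counter += 1
    return values[:dim]


def resolve_embedding(
    text: str,
    mode: str,
    advertiser_vector: Optional[Sequence[float]] = None,
    local_encoder: Optional[Callable[[str], Sequence[float]]] = None,
) -> Tuple[List[float], str]:
    """Return (embedding, provenance) for the given mode, falling back step by step."""
    if mode in ("advertiser", "hybrid") and advertiser_vector is not None:
        if MIN_DIM <= len(advertiser_vector) <= MAX_DIM:
            return list(advertiser_vector), "advertiser"
        logger.warning(
            "advertiser vector dim %d outside [%d, %d]; falling back",
            len(advertiser_vector), MIN_DIM, MAX_DIM,
        )
    if mode != "mock" and local_encoder is not None:
        return list(local_encoder(text)), "local"
    return mock_embedding(text), "mock"
```

Returning the provenance alongside the vector mirrors the idea behind `create_query_embedding_with_provenance()`, though the real return type is `QueryEmbeddingResult`, not a tuple.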
Per E2-2's hybrid strategy. EmbeddingMintTool now reads settings.embedding_mode and renders a per-mode descriptive string (MOCK/LOCAL/ADVERTISER-SUPPLIED/HYBRID). EMBEDDING_MODE_LABEL_MOCK preserved as a static constant for backward compat with existing imports. _format_ref pulls the dynamic label so audit-trail entries record the actual provenance per booking. bead: ar-c2vp Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per E2-2 + E2-1. Replaces "256-1024 dim, cosine similarity" generic language with the locked hybrid strategy: sentence-transformers/all-MiniLM-L6-v2 local + advertiser-supplied + mock CI fallback. Adds Embedding Provenance subsection cross-linking the strategy decision and the E2-7 consent review. bead: ar-espk Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New `ad_buyer.eval` package with `evaluate_embedding_modes()` function that runs a fixed corpus of audience briefs through each EMBEDDING_MODE and reports per-mode metrics (determinism, dimension, distinctiveness, provenance). Used by §17 release-gate audits and informs E2-4 threshold recalibration. Cosine-distance distinctiveness over pairwise fixture comparisons surfaces the difference between mock SHA256 and real sentence-transformers embeddings on semantically related-but-distinct briefs. bead: ar-f2y2 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per E2-3's eval harness — mock SHA256 vectors saturate quickly so the "strong" threshold has to be tighter (≥0.85) to avoid false matches. Real sentence-transformers vectors live in a smoother semantic space and tolerate the original 0.70 strong threshold. Advertiser and hybrid modes follow the local convention. UCPClient.validate_audience_with_seller now reads thresholds from _similarity_thresholds_for_mode() (which honors settings.embedding_mode). MkDocs configuration table updated. Re-derive via ad_buyer.eval.evaluate_embedding_modes() when the model swaps. bead: ar-318x Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
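The per-mode threshold idea can be illustrated with a small lookup table plus a cosine-similarity check. Only the "strong" values (0.85 for mock, 0.70 for the other modes) are taken from the commit message; `STRONG_THRESHOLDS`, `cosine_similarity`, and `is_strong_match` are hypothetical names, and the real `_similarity_thresholds_for_mode()` presumably returns more than one threshold.

```python
import math
from typing import Dict, Sequence

# "Strong" match thresholds per mode, as stated in the commit message.
STRONG_THRESHOLDS: Dict[str, float] = {
    "mock": 0.85,
    "local": 0.70,
    "advertiser": 0.70,  # follows the local convention
    "hybrid": 0.70,      # follows the local convention
}


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_strong_match(a: Sequence[float], b: Sequence[float], mode: str) -> bool:
    """Apply the mode-specific strong threshold to a pair of embeddings."""
    return cosine_similarity(a, b) >= STRONG_THRESHOLDS[mode]
```

The same pair of vectors can therefore count as a strong match under `local` but not under `mock`, which is the behavior the tighter mock threshold is meant to produce.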
bead: ar-tuac Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Exercises EMBEDDING_MODE=local + hybrid through the buyer's UCPClient, asserting: local model produces 384-dim or falls back gracefully (no crash), per-mode threshold tightening per E2-4, dynamic label per E2-5, eval harness reports real provenance per E2-3, embedding_provenance field on ComplianceContext per E2-7 Gap 6, full AudiencePlan round-trip through JSON. bead: ar-zyqd Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…zs0) Production code uses datetime.now(timezone.utc); test was comparing datetime.now() (local) which fails after local-time midnight passes UTC midnight. Match production by using UTC in the test. bead: ar-szs0 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per Quinn's note during ar-lk23 review. The cross-repo round-trip test hard-coded `.worktrees/audience-extension/src`; now derives the path from the buyer worktree name (so any worktree name works) with an AD_SELLER_SRC_PATH env-var override and a graceful fallback to the seller repo's main src/ when no companion worktree exists. bead: ar-840n Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per consent surface review (E2-7) Gap 5 + proposal §7. Single ComplianceContext can't honestly express per-region consent for global agentic campaigns. Until per-jurisdiction fan-out lands as a follow-up to E2-2, brief ingestion rejects agentic refs declared with jurisdiction='GLOBAL'. Standard / Contextual GLOBAL refs are allowed (they don't carry per-region consent semantics). New validate_no_global_agentic() validator + GlobalAgenticUnsupported exception. Wired into CampaignBrief's model_validator alongside the existing Content Taxonomy 2.x→3.x check. bead: ar-ei0s Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
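A sketch of the validation rule described above. The names `validate_no_global_agentic` and `GlobalAgenticUnsupported` come from the commit message, but the signature and the dict-based ref shape (`type`, `jurisdiction`, `id` keys) are assumptions made for the example; the real validator operates on the project's own models.

```python
from typing import Any, Dict, Iterable, List


class GlobalAgenticUnsupported(ValueError):
    """Raised when an agentic ref is declared with jurisdiction='GLOBAL'."""


def validate_no_global_agentic(refs: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Reject agentic refs scoped to GLOBAL; Standard/Contextual GLOBAL refs pass."""
    checked = []
    for ref in refs:
        # Assumed ref shape: {"type": ..., "jurisdiction": ..., "id": ...}
        if ref.get("type") == "agentic" and ref.get("jurisdiction") == "GLOBAL":
            raise GlobalAgenticUnsupported(
                f"agentic ref {ref.get('id')!r} cannot use jurisdiction='GLOBAL'"
            )
        checked.append(ref)
    return checked
```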
Per Quinn's flag during ar-ts30 verification. Adds a small `time_utils.utc_now()`
helper that returns naive UTC datetime (matching the prior datetime.utcnow()
semantic) without the Python 3.12+ deprecation warning. Updates 23 call
sites across 8 files: events/models, flows/dsp_deal_flow,
interfaces/api/main, models/{flow_state,state_machine,ucp},
negotiation/{models,strategy}.
Tightens the obsolete §22-in-label assertion in test_audience_planner_wiring
that E2-5 superseded with the dynamic per-mode label.
bead: ar-4e9b
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
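For reference, a helper with the behavior described above (naive UTC, no `datetime.utcnow()` deprecation warning) can be written as below. The body is a sketch of one common way to do it, not necessarily what `time_utils.utc_now()` contains.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Return the current UTC time as a naive datetime (tzinfo stripped)."""
    # datetime.now(timezone.utc) is timezone-aware; dropping tzinfo keeps the
    # value comparable with naive timestamps written by the old utcnow() calls.
    return datetime.now(timezone.utc).replace(tzinfo=None)
```

Keeping the result naive avoids `TypeError` when comparing against existing naive timestamps, at the cost of leaving the UTC convention implicit.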
Per Quinn's flag during UAT 2026-04-25 (mkdocs build --strict aborts).
Anchor fixes (deployment-ops-guide.md): replace '&' in 4 H2 headings with 'and' so the auto-generated slugs match the TOC link targets (`#environment-variables-and-configuration` etc.). The default slugifier dropped '&' producing inconsistent dash counts.
Cross-page link fixes: 8 missing-target links in event-bus/overview.md, state-machines/order-lifecycle.md, and architecture/mcp-server.md referenced docs/pages that were never written (`deal-store.md`, `state-machine.md`, `booking-flow.md`, `event-bus.md`, `ai-assistant/overview.md`). Replaced with plain-text source-file references so mkdocs --strict passes without losing the information.
Also fixed the embedding-strategy link I introduced in E2-9 (cross-mkdocs relative path was wrong-depth; now described in plain text).
bead: ar-w9xv
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per Quinn's flag during ar-ts30 verification. PR #81 was advertised as "DSP → BuyerDealFlow rename" but missed the file `flows/dsp_deal_flow.py`, class `DSPDealFlow`, state model `DSPFlowState`, and status enum `DSPFlowStatus`. This commit completes that rename:
- File: src/ad_buyer/flows/dsp_deal_flow.py → buyer_deal_flow.py
- Class: DSPDealFlow → BuyerDealFlow
- State: DSPFlowState → BuyerDealFlowState
- Status enum: DSPFlowStatus → BuyerDealFlowStatus
- Updated all 89 reference sites across 9 files (src/ + tests/)
- Updated import paths and __init__ exports
Out of scope: `agents/level2/dsp_agent.py` and `tools/dsp/` directory (separate followups; bead description scoped to flow + state).
Full buyer suite 3013/3013 passing post-rename.
bead: ar-62g7
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Epic 1 (ar-hrwc): 24/24 beads — Standard + Contextual + Agentic audience support across buyer + seller, with composable overlay model, capability negotiation, structured rejection, snapshot honor, OpenRTB carrier mapping. Epic 2 (ar-wi9x): 10/10 beads — sentence-transformers hybrid embedding model, per-mode similarity thresholds, canonical JSON Schema drift backstop, embedding_provenance metadata. 8 P3 follow-ups closed during the run. Live UAT R3: byte-identical wire round-trip on real seller booking (deal DEMO-78215162AD80, plan_id sha256:cc87ed44...199b292e). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace string-comparison if/elif chain with a registry dict keyed by Tool class. Unknown tools fall back to .description / .name rather than producing empty strings. Each registered audience/dsp/research tool tested for a non-empty mapping or fallback. bead: ar-yt4 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
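The registry-with-fallback pattern described above looks roughly like this. The tool classes and the `describe_tool` helper are hypothetical stand-ins; only the shape (a dict keyed by tool class, falling back to `.description` then `.name`) is taken from the commit message.

```python
from typing import Dict


class InventorySearchTool:
    """Hypothetical tool with a registered natural-language phrase."""
    name = "inventory_search"
    description = "Search seller inventory"


class UnregisteredTool:
    """Hypothetical tool with no registry entry and an empty description."""
    name = "unregistered"
    description = ""


# Registry keyed by tool class instead of an if/elif chain on name strings.
_NATURAL_LANGUAGE: Dict[type, str] = {
    InventorySearchTool: "searching seller inventory",
}


def describe_tool(tool: object) -> str:
    """Look up the registered phrase; otherwise fall back to description, then name."""
    phrase = _NATURAL_LANGUAGE.get(type(tool))
    if phrase:
        return phrase
    return getattr(tool, "description", "") or getattr(tool, "name", "")
```

Keying on the class means a renamed tool string no longer silently falls through to an empty phrase, which is the failure mode the if/elif chain had.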
Replace the eager module-top `settings = Settings()` with a lazy `get_settings()` cached factory + `_LazySettings` proxy alias for existing import sites. Tests that need to override env vars now see the override on first attribute access rather than fighting the import-time instantiation. bead: ar-le3 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
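A minimal sketch of the lazy-settings pattern described above, assuming a plain class in place of the project's real `Settings` model. The names `get_settings` and `_LazySettings` come from the commit message; the bodies are illustrative, and this read-only proxy does not attempt to support attribute patching the way the real one must.

```python
import os
from functools import lru_cache


class Settings:
    """Stand-in for the real settings model; reads env vars at construction time."""

    def __init__(self) -> None:
        self.embedding_mode = os.environ.get("EMBEDDING_MODE", "hybrid")


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    """Build Settings on first call and reuse the same instance afterwards."""
    return Settings()


class _LazySettings:
    """Proxy that defers Settings construction until the first attribute read."""

    def __getattr__(self, name: str):
        return getattr(get_settings(), name)


# Existing `from ...settings import settings` call sites keep working,
# but nothing is instantiated at import time.
settings = _LazySettings()
```

A test can set `EMBEDDING_MODE`, call `get_settings.cache_clear()`, and then read `settings.embedding_mode` to see the override.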
Audit found all 44 existing methods already have return annotations (22 sync + 22 async). This parametrized regression test walks ad_buyer.tools.* for BaseTool subclasses and asserts every _run/_arun declares a return type — locks in the property going forward. bead: ar-gsd Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
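The kind of check this regression test performs can be sketched without pytest as follows. `BaseToolSketch` and the two example tools are hypothetical; the real test walks `ad_buyer.tools.*` and is parametrized, which this sketch does not reproduce.

```python
import inspect
from typing import List


class BaseToolSketch:
    """Hypothetical stand-in for the real BaseTool base class."""


class AnnotatedTool(BaseToolSketch):
    def _run(self, query: str) -> str:
        return query.upper()

    async def _arun(self, query: str) -> str:
        return query.upper()


class UnannotatedTool(BaseToolSketch):
    def _run(self, query):  # deliberately missing a return annotation
        return query


def all_subclasses(cls: type) -> List[type]:
    """Collect direct and indirect subclasses, depth-first."""
    found: List[type] = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found


def missing_return_annotations(base: type) -> List[str]:
    """List 'Class.method' for every _run/_arun defined without a return annotation."""
    missing = []
    for sub in all_subclasses(base):
        for method_name in ("_run", "_arun"):
            method = sub.__dict__.get(method_name)  # only methods defined on this class
            if method is None:
                continue
            if inspect.signature(method).return_annotation is inspect.Signature.empty:
                missing.append(f"{sub.__name__}.{method_name}")
    return missing
```

A regression guard would then assert that the returned list is empty for the real tool base class.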
Extracted the 4 nearly-identical create_*_crew bodies into a single _build_channel_crew helper parameterized by frozen _ChannelCrewSpec dataclasses. Public signatures preserved; existing callers (CampaignPipeline, BuyerDealFlow, channel-crew tests) unaffected. Per-channel variation (manager-agent factory, research/recommendation task descriptions, expected_output strings) lives in 4 spec instances. Task descriptions now use .format() with named keys for clarity. Line count: 636 → 563 (~12% reduction). All 44 channel-crew tests still green; full buyer suite 3076/3077 (1 unrelated flake = ar-0isf). bead: ar-w5g Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Documents the unit/integration/smoke tier layout, run commands, conventions (PYTHONPATH, worktree venv, AD_SELLER_SRC_PATH, ANTHROPIC_API_KEY, EMBEDDING_MODE), the regression-guard tests that lock in invariants, known flakes, and an audience-extension-tests-by-epic index. bead: ar-7p3 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`python -m ad_buyer.demo.campaign_demo --headless` drives all 6 stages (submit-brief → approve-plan → approve-booking → approve-creative → activate → report) via Flask test client without binding a port. Emits one JSON object per stage to stdout (default) or `--summary` for a short human-readable line per stage. Stage sequence proven against the actual sample-brief #0; exits non-zero on any stage failure or invalid --sample-index. Useful for CI smoke, demo canaries, and one-shot validation without a browser. 4 new tests cover summary mode, JSON mode, default-is-JSON, and invalid sample-index → non-zero exit. bead: ar-jzek Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…erience cleanup
Epic ar-v5lt:
- ar-0vtg: POST /api/v1/quotes endpoint (was 500, now 200 in 16ms)
- ar-le3: lazy Settings init via _LazySettings proxy
- ar-yt4: tool _natural_language_ registry dict + case-insensitive lookup
- ar-gsd: regression guard on _run/_arun return type annotations
- ar-w5g: channel crew DRY (4 factories → 1 builder + 4 specs; 12% line cut)
- ar-7p3: tests/README.md test-suite map
- ar-jzek: campaign demo --headless / --json mode
Buyer suite 3076/3077 passing (1 order-dep flake = ar-0isf, separate followup). Seller suite 767/767.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…th-origin-buyer
# Conflicts:
#	src/ad_buyer/flows/buyer_deal_flow.py
#	tests/unit/test_buyer_deal_discovery_pricing.py
#	tests/unit/test_buyer_deal_flow.py
Origin's PR #81 renamed src/ad_buyer/tools/dsp/ → src/ad_buyer/tools/buyer_deals/. Two test files added locally on the pre-rename base still imported from the old path; the merge picked up the rename via git's rename detection but the hard-coded `from ad_buyer.tools.dsp.request_deal import ...` lines did not update. Point them at `tools.buyer_deals.request_deal` to match the renamed module. bead: ar-z40x Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test imported `settings` at module level and then used
`patch.object(settings, "embedding_mode", mode)`. That pattern is fine in
isolation, but `tests/unit/test_settings_lazy_init.py` deletes
`ad_buyer.config.settings` from `sys.modules` and reimports it to verify
lazy construction. After that reload, the module's `settings` symbol is a
new `_LazySettings` proxy backed by a fresh `lru_cache`, while the
threshold test's captured `settings` symbol still points at the old
proxy backed by the old cache. `_similarity_thresholds_for_mode` does
`from ..config.settings import settings` at call time, so it reads the
new proxy — meaning our patch wrote to the wrong cached `Settings` and
the threshold lookup returned the default ("hybrid") instead of the
patched mode.
Fix: resolve `settings` dynamically each iteration via
`sys.modules["ad_buyer.config.settings"].settings`, and add an explicit
side-effect import of the settings module so the entry exists in
`sys.modules` even when the threshold test runs in isolation. Production
code is unchanged.
bead: ar-0isf
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
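The stale-reference problem described above can be reproduced in miniature. Everything here is a stand-in: `demo_settings_module` is a made-up module name and `SimpleNamespace` replaces the real `_LazySettings` proxy. The sketch only shows why a captured symbol goes stale after a module is replaced in `sys.modules`, and why resolving through `sys.modules` at use time avoids it.

```python
import sys
import types

MODULE_NAME = "demo_settings_module"  # hypothetical module name for the demo


def _install_module(mode: str) -> types.ModuleType:
    """Simulate (re)importing the settings module with a fresh settings object."""
    module = types.ModuleType(MODULE_NAME)
    module.settings = types.SimpleNamespace(embedding_mode=mode)
    sys.modules[MODULE_NAME] = module
    return module


def current_settings():
    """Resolve settings through sys.modules each time, as the fix does."""
    return sys.modules[MODULE_NAME].settings


first = _install_module("hybrid")
captured = first.settings            # what a module-level import holds on to
_install_module("hybrid")            # another test deletes and reimports the module

captured.embedding_mode = "mock"     # write lands on the stale object
stale_write_visible = current_settings().embedding_mode

current_settings().embedding_mode = "mock"   # write lands on the live object
fresh_write_visible = current_settings().embedding_mode
```

After the simulated reload, the write through `captured` is invisible to code that looks the module up again, which matches the symptom of the threshold lookup returning the default mode.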
The cross-repo AudiencePlan JSON round-trip test resolved the seller src path via parents[2] from the test file, which assumed the file lived inside `ad_buyer_system/.worktrees/<name>/...`. When the test ran from the canonical buyer repo path (no worktree), the math walked too high, producing nonexistent sibling paths and a ModuleNotFoundError on `ad_seller`. Replace the parent-index math with explicit ancestry search for `ad_buyer_system`, then look for a matching seller worktree only when the test itself is running inside a buyer worktree, falling back to `ad_seller_system/src` otherwise. Honors the existing `AD_SELLER_SRC_PATH` override. bead: ar-e2rj Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
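The path-resolution order described above can be sketched like this. `find_ancestor` and `resolve_seller_src` are hypothetical names, and the layout is an assumption: the seller repo is taken to be a sibling directory of `ad_buyer_system`, with worktrees under `.worktrees/<name>/`.

```python
from pathlib import Path
from typing import Mapping, Optional


def find_ancestor(start: Path, name: str) -> Optional[Path]:
    """Walk up from `start` and return the first path component named `name`."""
    for candidate in (start, *start.parents):
        if candidate.name == name:
            return candidate
    return None


def resolve_seller_src(test_file: Path, env: Mapping[str, str]) -> Optional[Path]:
    """Resolve the seller src dir: env override, matching worktree, then main src/."""
    override = env.get("AD_SELLER_SRC_PATH")
    if override:
        return Path(override)
    buyer_root = find_ancestor(test_file, "ad_buyer_system")
    if buyer_root is None:
        return None
    # Assumption: the seller repo sits next to the buyer repo.
    seller_root = buyer_root.parent / "ad_seller_system"
    relative = test_file.relative_to(buyer_root).parts
    if len(relative) >= 2 and relative[0] == ".worktrees":
        # Only look for a companion worktree when running inside a buyer worktree.
        worktree_src = seller_root / ".worktrees" / relative[1] / "src"
        if worktree_src.is_dir():
            return worktree_src
    return seller_root / "src"
```

Searching by directory name instead of a fixed `parents[N]` index is what makes the result independent of how deep the test file sits.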
…34o)
Mirrors seller-side fix 5df5c38. CrewAI 1.10.1 made Flow.state read-only; buyer POST /bookings and CLI were assigning to flow.state and crashing with:
ValueError: property 'state' of 'DealBookingFlow' object has no setter
Both call sites fixed:
- src/ad_buyer/interfaces/api/main.py:549 (POST /bookings background task)
- src/ad_buyer/interfaces/cli/main.py:103 (CLI booking command)
Fix: DealBookingFlow.__init__ now accepts **state_kwargs forwarded to Flow.__init__(**state_kwargs). Call sites pass campaign_brief= at construction instead of assigning flow.state = BookingState(...) post-construction.
Regression tests added in TestCrewAI110FlowStateRegression:
- construction with no state (defaults)
- construction with campaign_brief kwargs
- state setter raises AttributeError/ValueError guard
Also updated test_get_booking_status_after_creation to accept "awaiting_approval" as a valid terminal status now that the flow runs.
bead: ar-x34o
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
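The constructor-injection fix can be illustrated with a stand-in base class. `ReadOnlyStateFlow` is not CrewAI's `Flow`; it only imitates the one property that matters here (a `state` attribute with no setter), and `DealBookingFlowSketch` is a hypothetical analogue of the real flow class.

```python
from typing import Any, Dict, Optional


class ReadOnlyStateFlow:
    """Stand-in base class whose `state` property has no setter (not CrewAI itself)."""

    def __init__(self, **state_kwargs: Any) -> None:
        self._state: Dict[str, Any] = dict(state_kwargs)

    @property
    def state(self) -> Dict[str, Any]:
        return self._state


class DealBookingFlowSketch(ReadOnlyStateFlow):
    """Accepts initial state at construction and forwards it to the base class."""

    def __init__(self, **state_kwargs: Any) -> None:
        super().__init__(**state_kwargs)


# New call-site style: pass the initial state when constructing the flow.
flow = DealBookingFlowSketch(campaign_brief={"name": "demo"})

# Old call-site style: assigning to the read-only property fails.
setter_error: Optional[Exception] = None
try:
    flow.state = {"campaign_brief": {"name": "other"}}
except AttributeError as exc:
    setter_error = exc
```

In plain Python a property without a setter raises `AttributeError` on assignment; the regression tests above accept either `AttributeError` or `ValueError`, which suggests the real exception type depends on the base class in use.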
Starlette sub-app routing causes `/mcp/sse` to 307→404; the working URL for AI clients (Claude Desktop, ChatGPT, Cursor, Windsurf) is `/mcp/sse/sse`. Update all setup guides, config examples, and the architecture reference to use the canonical path.
Files updated:
- docs/claude-desktop-setup.md: remote URL, local config, troubleshooting heading
- docs/multi-client-setup.md: all ChatGPT, Codex, Cursor, Windsurf config examples
- docs/architecture/mcp-server.md: canonical URL note, Mermaid diagram label
- docs/ai-assistant/developer-setup.md: hand-off URL and verify curl command
- docs/guides/deployment-ops-guide.md: MCP endpoint box, claude_desktop_config example, Python client snippet
bead: ar-yptd
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- … `main` with `origin/main`: local was 2 commits behind (`cc9c4d7` PR #81 DSP→BuyerDealFlow rename, `73a45e3` litellm→CrewAI native provider migration). Conflicts resolved in favor of origin's renames.
- … `flow.state` read-only crash, MCP doc URL paths.
What's in
- … `test_lookup_per_mode` (sys.modules reload created two `_LazySettings` proxies)
- … `.worktrees/audience-extension/` refs)
- `Flow.state` is read-only: switched `DealBookingFlow` to constructor injection (API + CLI)
- … `/mcp/sse/sse` (Starlette sub-app path-doubling)
Test posture
- `POST /bookings` returns 422 (handler reached, zero `property 'state'` errors)
- `docs/reports/QUINN_VERIFICATION_ar-{0isf,e2rj,x34o,fvap}_2026-04-26.md`
Test plan
- `pytest tests/`, expect 3081/3081 passed
- `POST /bookings` with empty `{}` body: expect 422 (not 500)
- `mkdocs build --strict` clean
🤖 Generated with Claude Code