Conversation
- Add comprehensive property-based tests for TrendSearchInputSchema and TrendSearchEventDataSchema using fast-check
- Validate input serialization round-trip through JSON to ensure data integrity
- Test valid input acceptance with various query lengths, company context, and optional categories
- Test invalid input rejection for empty, whitespace-only, and oversized inputs
- Verify event payload structure and field preservation across serialization cycles
- Add type definitions for trend search feature (TrendSearchInputSchema, TrendSearchEventDataSchema, SearchCategoryEnum)
- Update package.json and pnpm-lock.yaml with fast-check dependency
- Ensures type safety and data validation for ai-trend-search-engine feature
…for trend search pipeline with test cases
Deodat-Lawson left a comment:
Would it be possible to move the trend-search folder to 'src/lib/tools'? I wanted that to be our unified tools folder.
Pull request overview
Adds an AI Trend Search Engine module to the Marketing Engine service: a stateless trend-search pipeline (LLM query planning → Tavily web search → LLM synthesis) orchestrated via an Inngest background job, with persistence handled by a new trend_search_jobs table and Next.js API routes for job creation and polling.
Changes:
- Introduces `src/server/trend-search/*` pipeline modules (types, query planner, web search, synthesizer, runner) plus DB helpers for job persistence.
- Adds Inngest function + registration and typed event schemas for the trend-search workflow.
- Adds Next.js API routes for starting searches, listing searches, and polling job status/results; adds property-based tests via `fast-check`.
Reviewed changes
Copilot reviewed 24 out of 26 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| src/server/trend-search/web-search.ts | Tavily web search executor with retries + URL deduping. |
| src/server/trend-search/types.ts | Shared Zod schemas + types for inputs, events, jobs, and results. |
| src/server/trend-search/synthesizer.ts | LLM-based synthesis into structured results (with placeholder padding). |
| src/server/trend-search/run.ts | Stateless pipeline runner wiring planner → search → synthesizer. |
| src/server/trend-search/query-planner.ts | LLM-based sub-query planning with structured output. |
| src/server/trend-search/index.ts | Public module entry-point exports for programmatic invocation. |
| src/server/trend-search/db.ts | Drizzle-based job store + helper functions for CRUD/status/results. |
| src/server/inngest/functions/trendSearch.ts | Inngest orchestrator for the multi-step pipeline + persistence. |
| src/server/inngest/functions/processDocument.ts | Adjusts event typing usage to align with new Inngest schemas typing. |
| src/server/inngest/client.ts | Adds EventSchemas union typing for Inngest events (incl. trend-search). |
| src/server/db/schema/trend-search.ts | New Drizzle schema for trend_search_jobs table + indexes. |
| src/server/db/schema.ts | Exposes the new trend-search schema from the schema barrel. |
| src/app/api/trend-search/route.ts | POST to enqueue searches + GET to list searches for a company. |
| src/app/api/trend-search/[jobId]/route.ts | GET endpoint to poll job status/results scoped by company. |
| src/app/api/inngest/route.ts | Registers the new trend-search Inngest function. |
| scripts/test-trend-search.ts | Smoke-test script intended to run the pipeline without DB/Inngest. |
| tests/api/trendSearch/web-search.pbt.test.ts | Property/unit tests for web search executor (fast-check + fetch mocks). |
| tests/api/trendSearch/types.pbt.test.ts | Property tests for input/event schemas and validation expectations. |
| tests/api/trendSearch/synthesizer.pbt.test.ts | Property/unit tests for synthesizer output shape/traceability. |
| tests/api/trendSearch/query-planner.pbt.test.ts | Property tests for planner category constraints and output sizing. |
| tests/api/trendSearch/persistence.pbt.test.ts | Property tests for in-memory persistence helpers and company isolation. |
| tests/api/trendSearch/inngest-completion.pbt.test.ts | Property tests for Inngest completion flow/status transitions. |
| package.json | Adds fast-check dependency for property-based testing. |
| pnpm-lock.yaml | Lockfile updates for fast-check and related dependency graph changes. |
| .gitignore | Ignores .kiro directory. |
| .env.example | Removes the repo’s env example file. |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
```ts
// Zod min(1) rejects empty strings; whitespace-only strings have length ≥ 1,
// but the schema uses min(1) on raw string length, not trimmed length.
// Per requirements 1.4: "empty or whitespace-only" should be rejected.
// The schema enforces min(1), which rejects empty strings.
// Whitespace-only strings pass min(1) by character count but fail semantically.
// We verify the schema rejects truly empty strings (length 0).
// For whitespace-only, we check the trimmed length is 0 to confirm the intent.
const trimmed = whitespaceQuery.trim();
if (trimmed.length === 0) {
  // Pure whitespace — schema should reject (min(1) catches empty after trim if we add .trim()).
  // Current schema uses min(1) on raw length; whitespace strings of length ≥ 1 pass raw min(1).
  // This test documents the behavior: raw whitespace passes min(1) but fails semantic intent.
  // The schema correctly rejects empty string ("") via min(1).
  expect(result.success).toBe(whitespaceQuery.length >= 1); // documents current behavior
}
```
The whitespace-only query property test is not actually asserting rejection: it ends up expecting result.success to be true for any whitespace string (since whitespaceQuery.length >= 1 is always true). After tightening TrendSearchInputSchema to trim + min(1), update this test to assert success === false for whitespace-only input so it matches the stated property and requirements.
Suggested change — replace the comment block and conditional above with a direct assertion:

```ts
// TrendSearchInputSchema trims the query and applies min(1),
// so whitespace-only strings should be rejected.
expect(result.success).toBe(false);
```
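The trim-then-validate behavior this suggestion assumes can be sketched without Zod. The real fix would be `z.string().trim().min(1).max(1000)` on `TrendSearchInputSchema`; `validateQuery` below is a hypothetical dependency-free stand-in, not the PR's code:

```typescript
// Hypothetical stand-in for TrendSearchInputSchema's query rule:
// trim first, then require 1..1000 characters (mirrors .trim().min(1).max(1000)).
type QueryResult =
  | { success: true; data: string }
  | { success: false; error: string };

function validateQuery(raw: string): QueryResult {
  const trimmed = raw.trim();
  if (trimmed.length === 0) {
    return { success: false, error: "Query must not be empty or whitespace-only" };
  }
  if (trimmed.length > 1000) {
    return { success: false, error: "Query must be at most 1000 characters" };
  }
  return { success: true, data: trimmed };
}
```

With this rule in place, the property test can assert `success === false` for every whitespace-only input instead of documenting the old behavior.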
```ts
const SearchResultSchema = z.object({
  sourceUrl: z.string().describe("URL of the source (must be one of the provided raw result URLs)"),
  summary: z.string().describe("Short summary of the result"),
  description: z.string().describe("Longer description of relevance to the query and company"),
});

const SynthesizerOutputSchema = z.object({
  results: z
    .array(SearchResultSchema)
    .max(5)
    .describe("Up to 5 selected and ranked results with summary and description"),
});

const PLACEHOLDER_RESULT: SearchResult = {
  sourceUrl: "",
  summary: "Insufficient results",
  description: "Not enough search results were found for this query.",
};
```
SearchResultSchema permits empty strings for sourceUrl, summary, and description, and the code pads missing items with placeholders. For cases where rawResults.length >= 5, this can violate the documented property that synthesized results must have non-empty fields and real URLs. Tighten the Zod schema (e.g., .min(1)) and handle short/invalid model outputs by retrying or filling with additional real rawResults instead of placeholders when enough inputs exist.
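The "fill with real raw results instead of placeholders" idea can be sketched as below. The `RawResult` shape (`url`, `content`) and the fallback summary are assumptions for illustration, not the PR's actual types:

```typescript
interface RawResult { url: string; content: string }
interface SearchResult { sourceUrl: string; summary: string; description: string }

const TARGET_COUNT = 5;

// Sketch: top up a short synthesizer output with unused raw results so every
// returned item points at a real URL, rather than padding with placeholders.
function padWithRawResults(
  synthesized: SearchResult[],
  rawResults: RawResult[],
): SearchResult[] {
  const used = new Set(synthesized.map((r) => r.sourceUrl));
  const padded = [...synthesized];
  for (const raw of rawResults) {
    if (padded.length >= TARGET_COUNT) break;
    if (used.has(raw.url)) continue; // skip URLs the model already selected
    used.add(raw.url);
    padded.push({
      sourceUrl: raw.url,
      summary: raw.content.slice(0, 120), // naive fallback summary (assumption)
      description: "Included as a fallback; not ranked by the synthesizer.",
    });
  }
  return padded.slice(0, TARGET_COUNT);
}
```

This keeps Property 7 (non-empty fields) intact whenever `rawResults.length >= 5`; placeholders would then only appear when there genuinely aren't enough inputs.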
```json
"eslint": "^9.34.0",
"eslint-config-next": "^15.5.2",
"eslint-plugin-drizzle": "^0.2.3",
"fast-check": "^4.5.3",
"jest": "^30.2.0",
"jest-environment-jsdom": "^30.2.0",
"postcss": "^8.5.6",
```
fast-check is only referenced from test files (__tests__/...) and not from src/. It should be a devDependency rather than a production dependency to avoid inflating runtime installs/bundles.
```diff
-  retry-axios@2.6.0(axios@1.7.4(debug@4.4.3)):
+  retry-axios@2.6.0(axios@1.7.4):
     dependencies:
       axios: 1.7.4(debug@4.4.3)
```
The retry-axios lockfile entry looks partially updated: the snapshot key is retry-axios@2.6.0(axios@1.7.4) but its dependency still resolves axios: 1.7.4(debug@4.4.3). This kind of inconsistency often comes from manual edits or an interrupted install and can cause non-reproducible installs. Regenerate the lockfile via a clean pnpm install (or pnpm -w install) and commit the resulting pnpm-lock.yaml.
Suggested change:

```diff
-      axios: 1.7.4(debug@4.4.3)
+      axios: 1.7.4
```
```ts
// ─── GET /api/trend-search ──────────────────────────────────────────────────
export async function GET() {
  try {
    const { userId } = await auth();
    if (!userId) {
      return NextResponse.json(
        { error: "Unauthorized" },
        { status: 401 },
      );
    }

    const [userInfo] = await db
      .select()
      .from(users)
      .where(eq(users.userId, userId));

    if (!userInfo) {
      return NextResponse.json(
        { error: "User not found" },
        { status: 400 },
      );
    }

    const jobs = await getJobsByCompanyId(userInfo.companyId);

    const results = jobs.map((job) => ({
      id: job.id,
      status: job.status,
      query: job.input.query,
      categories: job.input.categories ?? [],
      createdAt: job.createdAt.toISOString(),
    }));

    return NextResponse.json({ searches: results }, { status: 200 });
```
The PR description says GET /api/trend-search returns a paginated list, but this handler does not accept limit/offset (or cursor) query params and always returns up to the store default (currently 100). Either implement request-driven pagination (and return paging metadata) or adjust the PR description to match the current behavior.
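Request-driven pagination could be parsed along these lines; the parameter names (`limit`/`offset`) and bounds here are assumptions, not the PR's API:

```typescript
// Sketch: parse and clamp pagination params for GET /api/trend-search.
const DEFAULT_LIMIT = 20;
const MAX_LIMIT = 100;

function parsePagination(searchParams: URLSearchParams): { limit: number; offset: number } {
  const rawLimit = Number(searchParams.get("limit") ?? DEFAULT_LIMIT);
  const rawOffset = Number(searchParams.get("offset") ?? 0);
  // Clamp limit to [1, MAX_LIMIT]; fall back to the default on non-numeric input.
  const limit = Number.isFinite(rawLimit)
    ? Math.min(Math.max(Math.trunc(rawLimit), 1), MAX_LIMIT)
    : DEFAULT_LIMIT;
  // Offset must be a non-negative integer.
  const offset = Number.isFinite(rawOffset) ? Math.max(Math.trunc(rawOffset), 0) : 0;
  return { limit, offset };
}
```

The handler would then pass `{ limit, offset }` to the job query and echo them (plus a total count) in the response as paging metadata.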
```ts
  const humanPrompt = buildHumanPrompt(rawResults, query, companyContext, categories);

  const response = await structuredModel.invoke([
    new SystemMessage(SYSTEM_PROMPT),
    new HumanMessage(humanPrompt),
  ]);

  const parsed = SynthesizerOutputSchema.safeParse(response);
  if (!parsed.success) {
    throw new Error(`Synthesizer output validation failed: ${parsed.error.message}`);
  }

  const synthesized = parsed.data.results as SearchResult[];
  const padded: SearchResult[] = [...synthesized];

  while (padded.length < TARGET_COUNT) {
    padded.push({ ...PLACEHOLDER_RESULT });
  }

  return padded.slice(0, TARGET_COUNT);
}
```
There is no runtime enforcement that each synthesized sourceUrl actually appears in the input raw results URL set (requirement/property 8). The prompt asks the model not to invent URLs, but without verification an LLM can still output arbitrary URLs and they’ll be returned to callers. Add a validation step after parsing to ensure every non-empty sourceUrl is included in rawResults.map(r => r.url), and fail/retry if any are not.
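The validation step could look like the following sketch (`assertTraceableUrls` is a hypothetical helper name):

```typescript
interface SearchResult { sourceUrl: string; summary: string; description: string }

// Sketch: enforce Property 8 at runtime — every synthesized sourceUrl must come
// from the raw-result URL set. Throwing lets the caller retry the synthesis.
function assertTraceableUrls(results: SearchResult[], rawUrls: string[]): void {
  const allowed = new Set(rawUrls);
  const invented = results
    .map((r) => r.sourceUrl)
    .filter((url) => url !== "" && !allowed.has(url));
  if (invented.length > 0) {
    throw new Error(
      `Synthesizer invented URLs not present in raw results: ${invented.join(", ")}`,
    );
  }
}
```

Calling `assertTraceableUrls(synthesized, rawResults.map((r) => r.url))` right after the `safeParse` step would close the gap.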
```ts
export type TrendSearchPipelineStage = "searching" | "synthesizing";

export interface RunTrendSearchOptions {
  onStageChange?: (stage: TrendSearchPipelineStage) => Promise<void> | void;
}

/**
 * Trend-search pipeline: planQueries → executeSearch → synthesizeResults.
 *
 * Pure pipeline execution — no DB writes, no side effects.
 * Callers (e.g. Inngest) own persistence and status tracking.
 */
export async function runTrendSearch(
  input: TrendSearchInput,
  options: RunTrendSearchOptions = {},
): Promise<TrendSearchOutput> {
  // Step 1: Plan queries
  const categories = input.categories;
  const plannedQueries = await planQueries(input.query, input.companyContext, categories);

  // Step 2: Execute web searches
  await options.onStageChange?.("searching");
  const rawResults = await executeSearch(plannedQueries);
```
The PR description and status enum include a planning stage, but the pipeline stage type and stage-change callbacks never emit it. This means jobs will transition from queued straight to searching, and UIs/pollers won’t be able to reflect the planning step. Consider adding a planning stage to TrendSearchPipelineStage and invoking onStageChange("planning") before planQueries (and updating the Inngest wrapper accordingly).
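One way to add the missing stage, sketched with the pipeline steps stubbed out (`runPipelineStages` is a hypothetical name; the real change would go in `runTrendSearch`):

```typescript
// Sketch: extend the stage union with "planning" and emit it before planQueries.
type TrendSearchPipelineStage = "planning" | "searching" | "synthesizing";

interface RunOptions {
  onStageChange?: (stage: TrendSearchPipelineStage) => Promise<void> | void;
}

async function runPipelineStages(options: RunOptions = {}): Promise<TrendSearchPipelineStage[]> {
  const visited: TrendSearchPipelineStage[] = [];
  const enter = async (stage: TrendSearchPipelineStage) => {
    visited.push(stage);
    await options.onStageChange?.(stage);
  };
  await enter("planning");     // before planQueries(...)
  await enter("searching");    // before executeSearch(...)
  await enter("synthesizing"); // before synthesizeResults(...)
  return visited;
}
```

With this shape, pollers see `queued → planning → searching → synthesizing → completed` instead of skipping straight to `searching`.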
```ts
    const output = (await step.run("run-pipeline", async () => {
      let lastStage: TrendSearchPipelineStage | null = null;

      const onStageChange = async (stage: TrendSearchPipelineStage) => {
        if (stage === lastStage) return;
        lastStage = stage;
        await updateStatusOrThrow(jobId, companyId, stage);
      };

      await onStageChange("searching");

      return runTrendSearch(toPipelineInput(eventData), {
        onStageChange,
      });
    })) as TrendSearchOutput;
```
The Inngest job status updates never set planning, despite the status enum and PR design doc including it. If you add a planning stage in the pipeline, also update this wrapper to record it (ideally before running planQueries) so job status progression matches the documented pipeline stages.
```ts
import { runTrendSearch } from "~/lib/tools/trend-search/index";

async function main() {
  const input = {
    query: "latest AI trends in retail marketing",
    companyContext:
      "We are a mid-size fashion retailer focused on Gen Z customers in the US market.",
    categories: ["fashion", "tech"] as ("fashion" | "tech")[],
```
This script imports runTrendSearch from ~/lib/tools/trend-search/index, but there is no src/lib/tools/trend-search/ directory in the repo. Update the import to use the new module entry point (~/server/trend-search or ~/server/trend-search/index) or add the intended lib/tools wrapper so the smoke test can run.
```ts
    // Create job record in DB
    await createJob({
      id: jobId,
      companyId,
      userId,
      query: input.query,
      companyContext: input.companyContext,
      categories: input.categories,
    });

    // Dispatch Inngest event
    await inngest.send({
      name: "trend-search/run.requested",
      data: {
        jobId,
        companyId: companyId.toString(),
        userId,
        query: input.query,
        companyContext: input.companyContext,
        ...(input.categories ? { categories: input.categories } : {}),
      },
    });
```
If inngest.send() throws after the DB job is created, the handler returns 500 but leaves a persisted job stuck in queued forever. Consider wrapping the send in its own try/catch and marking the job failed (with errorMessage) when dispatch fails, or creating the job only after a successful dispatch (if acceptable).
…into feature/ai-trend-search-engine-local
Design Document: AI Trend Search Engine
Overview
The AI Trend Search Engine is a new module within the Marketing Engine service that accepts a natural language prompt and company context, searches the web for recent news and events, and returns 5 structured results with citations. It runs as an Inngest background job and is designed as a self-contained module that can later be invoked by AI agents.
The pipeline follows four stages: query planning (LLM), web search (Tavily), content synthesis (LLM), and result persistence.
Persistence is handled by the caller (Inngest job), not the core pipeline. This keeps the module stateless and reusable.
The module lives under `src/server/trend-search/` as a standalone directory, with an Inngest function in `src/server/inngest/functions/` and an API route in `src/app/api/trend-search/`.
Architecture
```mermaid
sequenceDiagram
  participant Client
  participant API as API Route<br/>/api/trend-search
  participant Inngest as Inngest Job
  participant QP as Query Planner<br/>(LLM)
  participant WS as Web Search<br/>(Tavily)
  participant CS as Content Synthesizer<br/>(LLM)
  participant DB as PostgreSQL
  Client->>API: POST { query, companyContext, categories? }
  API->>API: Validate input (Zod)
  API->>Inngest: Dispatch "trend-search/run.requested"
  API-->>Client: 202 { jobId, status: "queued" }
  Inngest->>QP: Step 1: Plan queries
  QP-->>Inngest: sub-queries[]
  loop For each sub-query
    Inngest->>WS: Step 2: Execute search
    WS-->>Inngest: raw results[]
  end
  Inngest->>CS: Step 3: Synthesize results
  CS-->>Inngest: SearchResult[5]
  Inngest->>DB: Step 4: Persist results
  DB-->>Inngest: saved
  Note over Client,DB: Client polls GET /api/trend-search/[jobId] for results
```

The module integrates with the existing architecture:
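The 202-then-poll handshake in the sequence diagram can be sketched from the client side. `fetchStatus` is an injected function (e.g. wrapping `fetch("/api/trend-search/" + jobId)`) so the loop stays testable; the interval and attempt limit are illustrative assumptions:

```typescript
// Status values mirror the job status enum described in this design doc.
type JobStatus = "queued" | "planning" | "searching" | "synthesizing" | "completed" | "failed";

interface JobSnapshot { status: JobStatus; results?: unknown }

// Sketch: poll a job until it reaches a terminal state.
async function pollUntilDone(
  fetchStatus: () => Promise<JobSnapshot>,
  { intervalMs = 1000, maxAttempts = 50 } = {},
): Promise<JobSnapshot> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const snapshot = await fetchStatus();
    if (snapshot.status === "completed" || snapshot.status === "failed") return snapshot;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Timed out waiting for trend-search job");
}
```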
Components and Interfaces
1. Input Types (`src/server/trend-search/types.ts`)
2. Query Planner (`src/server/trend-search/query-planner.ts`)
Responsible for taking the user's prompt and company context and generating optimized sub-queries for Tavily.
Implementation approach:
- Uses the LLM client (`ChatOpenAI`) with a structured output prompt
3. Web Search Executor (`src/server/trend-search/web-search.ts`)
Executes sub-queries against Tavily and collects raw results.
Implementation approach:
- Uses the `@langchain/community` TavilySearchResults tool or the direct Tavily API
- Uses `search_depth: "advanced"` and `topic: "news"` to focus on news/events
4. Content Synthesizer (`src/server/trend-search/synthesizer.ts`)
Takes raw search results and produces exactly 5 structured results.
Implementation approach:
- Pads short output with placeholder results (`sourceUrl: ""`, `summary: "Insufficient results"`)
5. Inngest Function (`src/server/inngest/functions/trendSearch.ts`)
Orchestrates the pipeline as a multi-step Inngest function.
6. API Routes
- POST `/api/trend-search` — Initiate a search; responds `202 { jobId, status: "queued" }`
- GET `/api/trend-search/[jobId]` — Poll for results
- GET `/api/trend-search` — List past searches
7. Module Entry Point (`src/server/trend-search/index.ts`)
Exposes the public interface for programmatic invocation (agent-callable in the future).
This function runs the pipeline directly (without Inngest, without DB) for synchronous invocation by agents or any caller. It accepts only the search input and returns the result — no side effects. Persistence is the responsibility of the caller.
The Inngest function is the only caller that persists results to the DB. This keeps `runTrendSearch()` stateless and reusable across contexts (agents, tests, scripts) without requiring DB access.
Data Models
New Table: `trend_search_jobs`
Drizzle schema definition in `src/server/db/schema/trend-search.ts`:
Correctness Properties
A property is a characteristic or behavior that should hold true across all valid executions of a system — essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.
Property 1: Valid input creates a job
For any valid Search_Query (non-empty, ≤1000 chars) and valid Company_Context (non-empty, ≤2000 chars), submitting a trend search should create a job record with status "queued" and return a job ID.
Validates: Requirements 1.1
Property 2: Invalid input is rejected
For any Search_Query composed entirely of whitespace characters, or for any Company_Context exceeding 2000 characters, the Trend_Search_Engine should reject the request with a validation error and not create a job record.
Validates: Requirements 1.4, 1.5
Property 3: Category inference produces valid categories
For any valid Search_Query and Company_Context where no Search_Categories are specified, the Query_Planner should return planned queries where every category is a member of the valid SearchCategory enum (fashion, finance, business, tech).
Validates: Requirements 1.2
Property 4: Specified categories are preserved in planned queries
For any set of specified Search_Categories, all planned queries produced by the Query_Planner should only reference categories from the specified set.
Validates: Requirements 1.3
Property 5: Query planner always produces sub-queries
For any valid Search_Query and Company_Context, the Query_Planner should return at least one PlannedQuery.
Validates: Requirements 2.1
Property 6: Every sub-query triggers a search call
For any list of PlannedQueries, the web search executor should invoke the search provider exactly once per sub-query (before retries).
Validates: Requirements 3.1
Property 7: Synthesizer output structure
For any set of at least 5 raw search results, the Content_Synthesizer should produce exactly 5 SearchResult objects, each containing a non-empty `sourceUrl`, a non-empty `summary`, and a non-empty `description`.
Validates: Requirements 4.1, 4.2
Property 8: Source URL traceability
For any output from the Content_Synthesizer, every `sourceUrl` in the results should be present in the set of URLs from the raw input results.
Validates: Requirements 4.3
Property 9: Persistence round-trip
For any completed trend search, persisting the Result_Set and then retrieving it by job ID should return an equivalent Result_Set, including the original query, company context, categories, and a timestamp.
Validates: Requirements 5.1, 5.2, 5.3
Property 10: Company data isolation
For any two distinct company IDs, trend search results persisted under company A should never appear when querying results for company B.
Validates: Requirements 5.4
Property 11: Successful pipeline sets completed status
For any trend search pipeline that completes all steps without error, the job record status should be "completed" and the `completedAt` timestamp should be non-null.
Validates: Requirements 6.4
Property 12: Input serialization round-trip
For any valid TrendSearchInput, serializing it to the Inngest event JSON payload and then deserializing it back should produce an object equal to the original input.
Validates: Requirements 8.1, 8.2
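The round-trip at the heart of Property 12 can be sketched without fast-check; `TrendSearchInput` here is a simplified stand-in for the real Zod-inferred type, and `deepEqual` is a minimal helper valid only for plain JSON data:

```typescript
// Simplified shape of the input (assumption; the real type is inferred from Zod).
interface TrendSearchInput {
  query: string;
  companyContext: string;
  categories?: string[];
}

// Serialize to the event JSON payload and back.
function roundTrip<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

// Structural equality for plain JSON values (key order assumed stable here).
function deepEqual(a: unknown, b: unknown): boolean {
  return JSON.stringify(a) === JSON.stringify(b);
}
```

Under fast-check this becomes a property over an arbitrary of valid inputs, e.g. `fc.assert(fc.property(arbInput, (input) => deepEqual(roundTrip(input), input)))`, run for at least 100 iterations.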
Error Handling
Testing Strategy
Property-Based Testing
Use `fast-check` as the property-based testing library (already compatible with the Jest setup in this project).
Each correctness property maps to a single property-based test with a minimum of 100 iterations. Tests should be tagged with the property reference.
Tag format: `Feature: ai-trend-search-engine, Property {N}: {title}`
Key property tests:
Unit Testing
Unit tests complement property tests for specific examples and edge cases:
Integration Testing