AI shopping assistant that sees what you see and tells you when you're getting a deal.
Unlike voice assistants that wait for commands, Delphi watches your shopping session and proactively alerts you when it spots a great price—while staying silent when it doesn't.
Two Gemini models run simultaneously: a silent recorder extracts prices from video frames, while a voice assistant handles conversation. When the recorder spots a great deal, it injects an alert into the assistant's Live API session.
# live/realtime_client.py:691-702
if verdict in ["great_deal", "good"] and normalized_key not in notified_deals:
notified_deals.add(normalized_key)
deal_msg = f"[SYSTEM: Great deal detected - {product_key} at ${price}. Alert the user briefly.]"
# Inject into Live API session
await assistant.send_client_content(
turns=types.Content(role="user", parts=[types.Part(text=deal_msg)])
)"Is Kirkland vitamins good?" → Gemini searches the web and synthesizes certifications, country of origin, and reputation.
# live/realtime_client.py:259-265
search_model = client.models.generate_content(
model="gemini-3-flash-preview",
contents=f"Research the brand '{brand}' for {product_type}...",
config=types.GenerateContentConfig(
tools=[types.Tool(google_search=types.GoogleSearch())]
)
)"Does this look ripe?" → Vision analysis with expert criteria for produce, meat quality, and supplement evaluation.
All extraction uses JSON schemas—no regex parsing of free-form text.
// server/extractor.ts:8-49
const productSchema = {
type: SchemaType.OBJECT,
properties: {
products: {
type: SchemaType.ARRAY,
items: {
type: SchemaType.OBJECT,
properties: {
name: { type: SchemaType.STRING },
price: { type: SchemaType.NUMBER, nullable: true },
category: { type: SchemaType.STRING, enum: ['grocery', 'car', 'electronics', ...] },
// ...
}
}
}
}
};Chrome extension that overlays price comparisons directly on product pages.
Five fallback methods ensure extraction works across sites:
flowchart LR
subgraph PAGE["Product Page"]
HTML[HTML]
JSONLD["JSON-LD"]
end
subgraph EXTRACT["Extraction Priority"]
E1["1. JSON-LD"]
E2["2. Learned Selectors"]
E3["3. Site Rules"]
E4["4. Meta Tags"]
E5["5. Vision Fallback"]
end
subgraph WIDGET["Price Widget"]
BADGE["Verdict Badge"]
ALT["Alternatives"]
end
PAGE --> E1 --> E2 --> E3 --> E4 --> E5
E5 --> WIDGET
// extension/product-extractor.ts:560-583
export function extractProductFromPage(): ExtractedProductInfo | null {
const methods = [
extractFromJsonLd,
extractFromLearnedSelectors,
extractFromSite,
extractFromMeta,
extractGeneric,
];
for (const method of methods) {
const result = method();
if (result && result.name) return result;
}
return null;
}When CSS selectors break (sites update their HTML), the extension:
- Falls back to Gemini Vision to extract product from screenshot
- Uses extracted values to teach Gemini which HTML elements contain them
- Gemini returns stable, semantic CSS selectors
- New selectors are cached for future visits
// server/extractor.ts:620-627
const valueHints = knownValues ? `
KNOWN VALUES - Find the elements containing these EXACT values:
${knownValues.name ? `- Product name: "${knownValues.name}"` : ''}
${knownValues.price ? `- Price: $${knownValues.price}` : ''}
` : '';The selector learner rejects unstable utility classes (Tailwind, CSS-in-JS hashes) and only accepts semantic selectors:
// server/extractor.ts:673-685
const badSelectorPatterns = [
/^\.(?:flex|block|hidden|inline|grid)/i, // Layout utilities
/^\.(?:text-|font-|bg-|border-)/i, // Tailwind-style
/^\.[a-z]{1,2}$/i, // Single/double letter classes
/^\.(?:css|sc|styled)-[a-z0-9]+$/i, // CSS-in-JS generated
];- Great Deal — Below typical price
- Good — At or slightly below average
- Fair — At market price
- Pricey — Above average
Each verdict includes alternatives: budget options (with tradeoffs), similar products, and premium upgrades (with benefits).
flowchart TB
subgraph USER["User"]
CAM["Camera/Screen"]
MIC["Microphone"]
end
subgraph DUAL["Dual-LLM System"]
subgraph REC["Silent Recorder"]
R["gemini-3-flash"]
end
subgraph AST["Voice Assistant"]
A["gemini-2.5-flash-live"]
end
R -->|"deal alert"| A
end
subgraph API["Delphi API Server"]
DB[("SQLite DB")]
EXT["Gemini Extractor"]
end
CAM --> R
CAM --> A
MIC --> A
R -->|"add_product()"| API
A -->|"check_price()"| API
EXT --> DB
sequenceDiagram
participant U as User
participant R as Recorder<br/>(gemini-3-flash)
participant A as Assistant<br/>(Live API)
participant API as Delphi API
U->>R: Video frames (1 fps)
U->>A: Audio + Video stream
loop Every Frame
R->>R: Extract visible prices
R->>API: add_product(name, price, store)
API->>API: Store in SQLite
end
R->>API: check_price(product)
API-->>R: verdict: "great_deal"
R->>A: [SYSTEM: Great deal detected!]
A->>U: "Hey, that D3 is actually a great price!"
U->>A: "Is Kirkland good quality?"
A->>API: get_brand_info("Kirkland", "vitamins")
API->>API: Google Search grounding
API-->>A: Brand info + certifications
A->>U: "Kirkland is USP verified..."
flowchart LR
subgraph CAPTURE["Frame Capture"]
SC["Screenshot<br/>1 fps"]
end
subgraph EXTRACT["Extraction"]
GEM["gemini-3-flash<br/>+ tools"]
end
subgraph EVAL["Evaluation"]
CHK["check_price()"]
DB[("Price DB")]
end
subgraph NOTIFY["Notification"]
INJ["Inject into<br/>Live session"]
end
SC --> GEM
GEM -->|"add_product()"| DB
GEM --> CHK
DB --> CHK
CHK -->|"great_deal"| INJ
flowchart TB
subgraph FAIL["Selector Failure"]
CSS["CSS selectors<br/>return null"]
end
subgraph VISION["Vision Fallback"]
SS["Capture screenshot"]
VIS["gemini-3-flash<br/>vision extraction"]
end
subgraph LEARN["Learn New Selectors"]
HTML["Page HTML"]
LLM["gemini-3-flash<br/>selector learning"]
end
subgraph CACHE["Cache"]
LS["localStorage<br/>7-day TTL"]
end
CSS --> SS
SS --> VIS
VIS -->|"name, price"| LLM
HTML --> LLM
LLM -->|"semantic selectors"| LS
LS -->|"next visit"| CSS
| Feature | Model | API | Code |
|---|---|---|---|
| Voice Assistant | gemini-2.5-flash-native-audio |
Live API (WebSocket) | realtime_client.py:549 |
| Silent Recorder | gemini-3-flash-preview |
Vision + Tool Calling | realtime_client.py:645 |
| Brand Research | gemini-3-flash-preview |
Google Search grounding | realtime_client.py:259 |
| Product Extraction | gemini-3-flash-preview |
Structured Output | extractor.ts:63 |
| Vision Fallback | gemini-3-flash-preview |
Vision + Structured Output | extractor.ts:519 |
| Selector Learning | gemini-3-flash-preview |
Structured Output | extractor.ts:598 |
| Alternative Suggestions | gemini-3-flash-preview |
Structured Output | extractor.ts:292 |
| Fuzzy Product Matching | gemini-3-flash-preview |
Structured Output | extractor.ts:420 |
cd server && npm install
GEMINI_API_KEY=... npm run devcd live
uv pip install -r requirements.txt
GEMINI_API_KEY=... python realtime_client.py --livechrome://extensions→ Enable Developer Mode- Load unpacked → select
extension/folder
├── server/ # API server (Express + SQLite)
│ ├── index.ts # API routes
│ ├── extractor.ts # Gemini extraction & alternatives
│ └── db.ts # Database operations
├── live/ # Real-time assistant
│ └── realtime_client.py # Dual-LLM live assistant
├── extension/ # Chrome extension
│ ├── content.ts # Product detection
│ ├── widget.ts # Price comparison overlay
│ └── product-extractor.ts # Multi-tier extraction
└── shared/ # TypeScript types