Skip to content

anyuan-chen/delphi

Repository files navigation

Delphi

AI shopping assistant that sees what you see and tells you when you're getting a deal.

Unlike voice assistants that wait for commands, Delphi watches your shopping session and proactively alerts you when it spots a great price—while staying silent when it doesn't.


How It Works

Proactive Deal Detection

Two Gemini models run simultaneously: a silent recorder extracts prices from video frames, while a voice assistant handles conversation. When the recorder spots a great deal, it injects an alert into the assistant's Live API session.

# live/realtime_client.py:691-702
if verdict in ["great_deal", "good"] and normalized_key not in notified_deals:
    notified_deals.add(normalized_key)
    deal_msg = f"[SYSTEM: Great deal detected - {product_key} at ${price}. Alert the user briefly.]"

    # Inject into Live API session
    await assistant.send_client_content(
        turns=types.Content(role="user", parts=[types.Part(text=deal_msg)])
    )

Brand Research with Google Search

"Is Kirkland vitamins good?" → Gemini searches the web and synthesizes certifications, country of origin, and reputation.

# live/realtime_client.py:259-265
search_model = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=f"Research the brand '{brand}' for {product_type}...",
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())]
    )
)

Quality Evaluation

"Does this look ripe?" → Vision analysis with expert criteria for produce, meat quality, and supplement evaluation.

Structured Output Everywhere

All extraction uses JSON schemas—no regex parsing of free-form text.

// server/extractor.ts:8-49
const productSchema = {
  type: SchemaType.OBJECT,
  properties: {
    products: {
      type: SchemaType.ARRAY,
      items: {
        type: SchemaType.OBJECT,
        properties: {
          name: { type: SchemaType.STRING },
          price: { type: SchemaType.NUMBER, nullable: true },
          category: { type: SchemaType.STRING, enum: ['grocery', 'car', 'electronics', ...] },
          // ...
        }
      }
    }
  }
};

Browser Extension

Chrome extension that overlays price comparisons directly on product pages.

Multi-Tier Extraction

Five fallback methods ensure extraction works across sites:

flowchart LR
    subgraph PAGE["Product Page"]
        HTML[HTML]
        JSONLD["JSON-LD"]
    end

    subgraph EXTRACT["Extraction Priority"]
        E1["1. JSON-LD"]
        E2["2. Learned Selectors"]
        E3["3. Site Rules"]
        E4["4. Meta Tags"]
        E5["5. Vision Fallback"]
    end

    subgraph WIDGET["Price Widget"]
        BADGE["Verdict Badge"]
        ALT["Alternatives"]
    end

    PAGE --> E1 --> E2 --> E3 --> E4 --> E5
    E5 --> WIDGET
Loading
// extension/product-extractor.ts:560-583
export function extractProductFromPage(): ExtractedProductInfo | null {
  const methods = [
    extractFromJsonLd,
    extractFromLearnedSelectors,
    extractFromSite,
    extractFromMeta,
    extractGeneric,
  ];

  for (const method of methods) {
    const result = method();
    if (result && result.name) return result;
  }
  return null;
}

Self-Healing Selectors

When CSS selectors break (sites update their HTML), the extension:

  1. Falls back to Gemini Vision to extract product from screenshot
  2. Uses extracted values to teach Gemini which HTML elements contain them
  3. Gemini returns stable, semantic CSS selectors
  4. New selectors are cached for future visits
// server/extractor.ts:620-627
const valueHints = knownValues ? `
KNOWN VALUES - Find the elements containing these EXACT values:
${knownValues.name ? `- Product name: "${knownValues.name}"` : ''}
${knownValues.price ? `- Price: $${knownValues.price}` : ''}
` : '';

The selector learner rejects unstable utility classes (Tailwind, CSS-in-JS hashes) and only accepts semantic selectors:

// server/extractor.ts:673-685
const badSelectorPatterns = [
  /^\.(?:flex|block|hidden|inline|grid)/i,  // Layout utilities
  /^\.(?:text-|font-|bg-|border-)/i,        // Tailwind-style
  /^\.[a-z]{1,2}$/i,           // Single/double letter classes
  /^\.(?:css|sc|styled)-[a-z0-9]+$/i,       // CSS-in-JS generated
];

Widget Verdicts

  • Great Deal — Below typical price
  • Good — At or slightly below average
  • Fair — At market price
  • Pricey — Above average

Each verdict includes alternatives: budget options (with tradeoffs), similar products, and premium upgrades (with benefits).


Under the Hood

System Architecture

flowchart TB
    subgraph USER["User"]
        CAM["Camera/Screen"]
        MIC["Microphone"]
    end

    subgraph DUAL["Dual-LLM System"]
        subgraph REC["Silent Recorder"]
            R["gemini-3-flash"]
        end
        subgraph AST["Voice Assistant"]
            A["gemini-2.5-flash-live"]
        end
        R -->|"deal alert"| A
    end

    subgraph API["Delphi API Server"]
        DB[("SQLite DB")]
        EXT["Gemini Extractor"]
    end

    CAM --> R
    CAM --> A
    MIC --> A
    R -->|"add_product()"| API
    A -->|"check_price()"| API
    EXT --> DB
Loading

Shopping Session Flow

sequenceDiagram
    participant U as User
    participant R as Recorder<br/>(gemini-3-flash)
    participant A as Assistant<br/>(Live API)
    participant API as Delphi API

    U->>R: Video frames (1 fps)
    U->>A: Audio + Video stream

    loop Every Frame
        R->>R: Extract visible prices
        R->>API: add_product(name, price, store)
        API->>API: Store in SQLite
    end

    R->>API: check_price(product)
    API-->>R: verdict: "great_deal"
    R->>A: [SYSTEM: Great deal detected!]
    A->>U: "Hey, that D3 is actually a great price!"

    U->>A: "Is Kirkland good quality?"
    A->>API: get_brand_info("Kirkland", "vitamins")
    API->>API: Google Search grounding
    API-->>A: Brand info + certifications
    A->>U: "Kirkland is USP verified..."
Loading

Proactive Deal Detection Flow

flowchart LR
    subgraph CAPTURE["Frame Capture"]
        SC["Screenshot<br/>1 fps"]
    end

    subgraph EXTRACT["Extraction"]
        GEM["gemini-3-flash<br/>+ tools"]
    end

    subgraph EVAL["Evaluation"]
        CHK["check_price()"]
        DB[("Price DB")]
    end

    subgraph NOTIFY["Notification"]
        INJ["Inject into<br/>Live session"]
    end

    SC --> GEM
    GEM -->|"add_product()"| DB
    GEM --> CHK
    DB --> CHK
    CHK -->|"great_deal"| INJ
Loading

Self-Healing Selector Flow

flowchart TB
    subgraph FAIL["Selector Failure"]
        CSS["CSS selectors<br/>return null"]
    end

    subgraph VISION["Vision Fallback"]
        SS["Capture screenshot"]
        VIS["gemini-3-flash<br/>vision extraction"]
    end

    subgraph LEARN["Learn New Selectors"]
        HTML["Page HTML"]
        LLM["gemini-3-flash<br/>selector learning"]
    end

    subgraph CACHE["Cache"]
        LS["localStorage<br/>7-day TTL"]
    end

    CSS --> SS
    SS --> VIS
    VIS -->|"name, price"| LLM
    HTML --> LLM
    LLM -->|"semantic selectors"| LS
    LS -->|"next visit"| CSS
Loading

Gemini API Integration

Feature Model API Code
Voice Assistant gemini-2.5-flash-native-audio Live API (WebSocket) realtime_client.py:549
Silent Recorder gemini-3-flash-preview Vision + Tool Calling realtime_client.py:645
Brand Research gemini-3-flash-preview Google Search grounding realtime_client.py:259
Product Extraction gemini-3-flash-preview Structured Output extractor.ts:63
Vision Fallback gemini-3-flash-preview Vision + Structured Output extractor.ts:519
Selector Learning gemini-3-flash-preview Structured Output extractor.ts:598
Alternative Suggestions gemini-3-flash-preview Structured Output extractor.ts:292
Fuzzy Product Matching gemini-3-flash-preview Structured Output extractor.ts:420

Quick Start

Server

cd server && npm install
GEMINI_API_KEY=... npm run dev

Live Assistant

cd live
uv pip install -r requirements.txt
GEMINI_API_KEY=... python realtime_client.py --live

Browser Extension

  1. chrome://extensions → Enable Developer Mode
  2. Load unpacked → select extension/ folder

Project Structure

├── server/              # API server (Express + SQLite)
│   ├── index.ts         # API routes
│   ├── extractor.ts     # Gemini extraction & alternatives
│   └── db.ts            # Database operations
├── live/                # Real-time assistant
│   └── realtime_client.py   # Dual-LLM live assistant
├── extension/           # Chrome extension
│   ├── content.ts       # Product detection
│   ├── widget.ts        # Price comparison overlay
│   └── product-extractor.ts # Multi-tier extraction
└── shared/              # TypeScript types

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors