# FeedPrism PoC Evaluation Notebook

This notebook validates the core pipeline of FeedPrism:
1.  **Fetch** email via Gmail API
2.  **Extract** event details using OpenAI (GPT-4)
3.  **Embed & Store** in Qdrant (In-Memory)
4.  **Search & Retrieve** using vector similarity

In [1]:
import sys, os
print("Python executable:", sys.executable)
print("Virtual env:", os.getenv("VIRTUAL_ENV"))

Python executable: /Users/Shared/ALL WORKSPACE/Hackathons/mom_hack/feedprism-poc/.venv_new/bin/python
Virtual env: /Users/Shared/ALL WORKSPACE/Hackathons/mom_hack/feedprism-poc/.venv_new


In [2]:
# 1. Imports & Setup
import os
import json
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build
import openai
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

print("✅ Libraries imported and environment loaded.")

✅ Libraries imported and environment loaded.


In [4]:
# 2. Define Helper Functions

def fetch_sample_email():
    """Return the snippet of the first message listed by the Gmail API, or None on failure."""
    print("📧 Fetching email from Gmail...")
    try:
        # token.json must already hold authorized OAuth user credentials.
        creds = Credentials.from_authorized_user_file('token.json')
        service = build('gmail', 'v1', credentials=creds)
        results = service.users().messages().list(userId='me', maxResults=1).execute()
        messages = results.get('messages', [])
        
        if not messages:
            print("⚠️ No emails found in inbox.")
            return None

        msg_id = messages[0]['id']
        msg = service.users().messages().get(userId='me', id=msg_id, format='full').execute()
        # The snippet is Gmail's short preview text, not the full message body.
        snippet = msg.get('snippet', '')
        print(f"✅ Fetched email snippet: {snippet[:100]}...")
        return snippet
    except Exception as e:
        print(f"❌ Gmail API Error: {e}")
        return None

def extract_event(email_text):
    """Ask the LLM for event fields as JSON; return a placeholder event on any error."""
    print("🤖 Extracting event with LLM...")
    if not email_text:
        # No email was fetched, so fall back to a hard-coded sample.
        email_text = "Join us for the AI Summit 2024 on Dec 15 at the Convention Center. It will be a great event about LLMs."
        print("⚠️ Using fallback sample text.")

    prompt = f"""
    Extract event details from this email text in JSON format.
    Return ONLY raw JSON, no markdown formatting.
    
    Fields: title, date, location, description
    
    Email:
    {email_text}
    """
    
    try:
        client = openai.OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}]
        )
        content = response.choices[0].message.content
        # Strip markdown code fences in case the model adds them anyway.
        content = content.replace("```json", "").replace("```", "").strip()
        # Logged before parsing, so this line also appears when the reply is not valid JSON.
        print(f"✅ Extracted JSON: {content}")
        return json.loads(content)
    except Exception as e:
        print(f"❌ LLM Error: {e}")
        # Placeholder event; downstream steps cannot tell it apart from a real one.
        return {
            "title": "Error Event",
            "date": "2024-01-01",
            "location": "Error Land",
            "description": "Failed to extract"
        }

def store_in_qdrant(event_data):
    """Embed the event's title and description and store it in a fresh in-memory Qdrant collection."""
    print("ww Storing in Qdrant...")
    try:
        # A new in-memory client per call: nothing persists between calls.
        client = QdrantClient(":memory:")
        
        # size=1536 matches the dimension of text-embedding-ada-002 vectors.
        client.create_collection(
            collection_name="events",
            vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
        )
        
        openai_client = openai.OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
        text_to_embed = f"{event_data.get('title', '')} {event_data.get('description', '')}"
        
        embedding_resp = openai_client.embeddings.create(
            input=text_to_embed,
            model="text-embedding-ada-002"
        )
        embedding = embedding_resp.data[0].embedding
        
        # The point ID is fixed at 1, so a second event in the same collection would overwrite the first.
        client.upsert(
            collection_name="events",
            points=[
                PointStruct(
                    id=1,
                    vector=embedding,
                    payload=event_data
                )
            ]
        )
        print("✅ Stored vector in Qdrant")
        return client
    except Exception as e:
        print(f"❌ Qdrant Error: {e}")
        return None

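def search_events(client, query):
    """Embed the query and return up to five nearest points from the "events" collection."""
    print(f"🔍 Searching for: '{query}'...")
    if not client:
        return []
        
    try:
        openai_client = openai.OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
        # The query must be embedded with the same model used for the stored vectors.
        embedding_resp = openai_client.embeddings.create(
            input=query,
            model="text-embedding-ada-002"
        )
        query_embedding = embedding_resp.data[0].embedding
        
        # Recent qdrant-client releases may flag `search` as deprecated in favor of `query_points`.
        results = client.search(
            collection_name="events",
            query_vector=query_embedding,
            limit=5
        )
        return results
    except Exception as e:
        print(f"❌ Search Error: {e}")
        return []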

### Step 1: Fetch Email

In [5]:
email_text = fetch_sample_email()
print(f"\n📝 Email Content:\n{email_text}")

📧 Fetching email from Gmail...
✅ Fetched email snippet: Dishant Ghai, you can view the changes on our website. Hello, Dishant Ghai We&#39;re making some cha...

📝 Email Content:
Dishant Ghai, you can view the changes on our website. Hello, Dishant Ghai We&#39;re making some changes to our legal agreements that will apply to you. There is no action needed from you today, but if
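
The text above is Gmail's `snippet` field: a short preview that is HTML-escaped (note `&#39;`) and cut off mid-sentence, so the extractor in Step 2 sees only a fragment of the message. Since `fetch_sample_email` already requests `format='full'`, the complete body should be available under the message's `payload`. The following is a minimal sketch of how it could be decoded, assuming the message contains a `text/plain` part; the helper names (`decode_part`, `extract_plain_text`, `email_text_from_message`) are hypothetical and not part of this notebook.

```python
import base64
import html


def decode_part(data):
    """Decode a base64url-encoded Gmail body string into text."""
    # Gmail may omit trailing '=' padding, so restore it before decoding.
    padded = data + "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")


def extract_plain_text(payload):
    """Depth-first search for the first non-empty text/plain part."""
    if payload.get("mimeType") == "text/plain":
        data = payload.get("body", {}).get("data")
        if data:
            return decode_part(data)
    for part in payload.get("parts", []):
        text = extract_plain_text(part)
        if text:
            return text
    return None


def email_text_from_message(msg):
    """Prefer the full plain-text body; fall back to the unescaped snippet."""
    body = extract_plain_text(msg.get("payload", {}))
    return body if body else html.unescape(msg.get("snippet", ""))
```

With this in place, `fetch_sample_email` could return `email_text_from_message(msg)` instead of `snippet`. HTML-only messages would still fall back to the snippet, which at least has its entities unescaped.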


### Step 2: Extract Event Data

In [6]:
event_data = extract_event(email_text)
print("\n📄 Extracted Data:")
print(json.dumps(event_data, indent=2))

🤖 Extracting event with LLM...
✅ Extracted JSON: This email text does not contain any event details.
❌ LLM Error: Expecting value: line 1 column 1 (char 0)

📄 Extracted Data:
{
  "title": "Error Event",
  "date": "2024-01-01",
  "location": "Error Land",
  "description": "Failed to extract"
}
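
In this run the model answered in prose ("This email text does not contain any event details."), `json.loads` raised, and the placeholder "Error Event" was returned as if it were real data. One way to separate "no event found" from a real result is to parse the reply defensively and return `None` when it contains no JSON object. The sketch below is one possible approach, not the notebook's implementation; `parse_event_json` and `REQUIRED_FIELDS` are hypothetical names.

```python
import json

# Fields requested in the extraction prompt.
REQUIRED_FIELDS = ("title", "date", "location", "description")


def parse_event_json(reply):
    """Return the event dict found in an LLM reply, or None if there is none."""
    # Locate the outermost braces so fences or surrounding prose are ignored.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        return None  # plain prose with no JSON object
    try:
        data = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    # Keep only the expected fields; missing ones become None.
    return {name: data.get(name) for name in REQUIRED_FIELDS}
```

A caller can then skip the storage step when the result is `None` rather than indexing a placeholder.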


### Step 3: Store in Vector Database

In [7]:
event_data

{'title': 'Error Event',
 'date': '2024-01-01',
 'location': 'Error Land',
 'description': 'Failed to extract'}

In [8]:
qdrant_client = store_in_qdrant(event_data)

ww Storing in Qdrant...
✅ Stored vector in Qdrant
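
Two details of this step are worth noting: the placeholder "Error Event" was stored like a real event, and `store_in_qdrant` always uses `id=1`, so a second event written to the same collection would overwrite the first. Qdrant accepts unsigned integers and UUID strings as point IDs, so one option is to derive a stable UUID from the event's content. The sketch below assumes that title, date, and location together identify an event; `EVENT_NAMESPACE`, `event_point_id`, and `should_store` are hypothetical names.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
EVENT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "feedprism/events")


def event_point_id(event):
    """Derive a deterministic UUID string from the fields that identify an event."""
    key = "|".join(
        str(event.get(name, "")).strip().lower()
        for name in ("title", "date", "location")
    )
    return str(uuid.uuid5(EVENT_NAMESPACE, key))


def should_store(event):
    """Reject missing events and the placeholder returned on extraction errors."""
    return event is not None and event.get("title") != "Error Event"
```

Because the ID depends only on the content, re-processing the same email would update the existing point instead of creating a duplicate.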


### Step 4: Search & Evaluate

In [12]:
query = "upcoming AI events"
results = search_events(qdrant_client, query)

print(f"\n📊 Results for '{query}':")
for res in results:
    print(f"- Score: {res.score:.4f} | Title: {res.payload.get('title')}")

🔍 Searching for: 'upcoming AI events'...

📊 Results for 'upcoming AI events':
- Score: 0.7443 | Title: Error Event
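
The only stored point is the placeholder, yet it comes back with a score of 0.7443: a nearest-neighbour search returns the closest points however unrelated they are, so the raw score alone does not show that a hit is relevant. A minimal sketch of post-filtering the results follows. The `Hit` class is a stand-in for the objects returned by `search_events`, and the `MIN_SCORE` value of 0.80 is an illustrative assumption that would need calibrating against real queries.

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    """Stand-in for a Qdrant scored point (only the fields used here)."""
    score: float
    payload: dict = field(default_factory=dict)


MIN_SCORE = 0.80  # hypothetical cutoff, not derived from this notebook's data


def relevant_hits(hits, min_score=MIN_SCORE):
    """Drop low-scoring hits and placeholder events, best match first."""
    kept = [
        hit for hit in hits
        if hit.score >= min_score and hit.payload.get("title") != "Error Event"
    ]
    return sorted(kept, key=lambda hit: hit.score, reverse=True)
```

The same cutoff can also be applied on the server side by passing `score_threshold` to the Qdrant `search` call.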
