# Pinecone & LLM Integration Tests

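This notebook tests the integration between Pinecone vector search and the LLM response system: direct project search, tool-calling responses, multi-turn conversation, markdown-free output, and search latency.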

## 1. Setup and Imports

In [1]:
import os
import sys
import asyncio
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Import our modules
from project_search import search_projects, get_project_by_id
from llm import LlmClient
from custom_types import (
    ResponseRequiredRequest,
    ResponseResponse,
    Utterance,
    ToolCallInvocationResponse,
    MetadataResponse
)

print("✅ Imports successful")
print(f"Pinecone API Key present: {bool(os.getenv('PINECONE_API_KEY'))}")
print(f"OpenAI API Key present: {bool(os.getenv('OPENAI_API_KEY'))}")

✅ Imports successful
Pinecone API Key present: True
OpenAI API Key present: True


## 2. Test Direct Pinecone Search

In [2]:
# Test searching for AI projects
print("🔍 Searching for AI projects...\n")
ai_projects = search_projects("artificial intelligence machine learning", top_k=3)

for i, project in enumerate(ai_projects, 1):
    print(f"{i}. {project['name']} (Score: {project['score']})")
    print(f"   ID: {project['id']}")
    print(f"   Details: {project['details'][:150]}...\n")

🔍 Searching for AI projects...

1. AI Interview Coach (Score: 0.292)
   ID: ai-interview-coach
   Details: Overview
AI Interview Coach helps candidates rehearse interviews in realistic conditions. It provides timed rounds, interviewer personas, and supports...

2. Flavor Finder (Score: 0.22)
   ID: flavor-finder
   Details: Overview
A practical cooking companion that adapts recipes to your pantry and goals. Focus on saving time and reducing waste.

Features
- Ingredient s...

3. Lingua Mentor (Score: 0.209)
   ID: lingua-mentor
   Details: Overview
A tutor that focuses on conversation, feedback, and memory. Role-play covers travel, work, and daily life.

Features
- Real-time pronunciatio...
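
The scores above are low (0.209 to 0.292), and the second and third hits are not obviously AI projects. One way to keep weak matches out of downstream prompts is a score cutoff. The following is a minimal sketch, assuming each result is a dict with a numeric `score` key as in the output above; `filter_by_score` and the `0.25` threshold are hypothetical and would need tuning against real queries.

```python
def filter_by_score(results, min_score=0.25):
    """Keep only results whose similarity score meets the cutoff.

    `min_score` is an illustrative value, not a recommended setting.
    Results without a `score` key are treated as score 0.0 and dropped.
    """
    return [r for r in results if r.get("score", 0.0) >= min_score]


# Scores copied from the cell output above
sample = [
    {"id": "ai-interview-coach", "score": 0.292},
    {"id": "flavor-finder", "score": 0.22},
    {"id": "lingua-mentor", "score": 0.209},
]
kept = filter_by_score(sample)
print([r["id"] for r in kept])
```

With this cutoff only `ai-interview-coach` survives. A fixed threshold trades recall for precision, so it is worth checking against queries where a low-scoring hit is still the right answer.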



In [3]:
# Test searching for hackathon projects
print("🏆 Searching for hackathon winners...\n")
hackathon_projects = search_projects("hackathon winner prize", top_k=3)

for i, project in enumerate(hackathon_projects, 1):
    print(f"{i}. {project['name']}")
    print(f"   ID: {project['id']}")
    if 'github' in project:
        print(f"   GitHub: {project['github']}")
    if 'demo' in project:
        print(f"   Demo: {project['demo']}")
    print()

🏆 Searching for hackathon winners...

1. Chain Sage
   ID: chain-sage
   GitHub: https://github.com/example/chain-sage
   Demo: /vercel.svg

2. Pulse Guardian
   ID: pulse-guardian
   Demo: https://www.youtube.com/watch?v=H8UdKZf8uWk

3. Quillboard
   ID: quillboard
   GitHub: https://github.com/example/quillboard
   Demo: https://www.youtube.com/watch?v=Z9AYPxH5NTM



In [4]:
# Test fetching a specific project
project_id = "interviewgpt"
print(f"📦 Fetching project: {project_id}\n")

project = get_project_by_id(project_id)
if project:
    print(f"Name: {project['name']}")
    print(f"\nSummary:\n{project['summary']}")
    print(f"\nDetails:\n{project['details']}")
else:
    print("Project not found")

📦 Fetching project: interviewgpt

Project not found


## 3. Test LLM with Project Search

In [5]:
# Initialize LLM client
client = LlmClient("test-notebook", debug=False)
print("✅ LLM Client initialized")

✅ LLM Client initialized


In [6]:
async def test_llm_search(user_message):
    """Test LLM response with project search."""
    print(f"👤 User: {user_message}\n")
    print("🤖 Bill: ", end="")
    
    request = ResponseRequiredRequest(
        interaction_type="response_required",
        response_id=1,
        transcript=[Utterance(role="user", content=user_message)]
    )
    
    tool_called = None
    project_id = None
    full_response = ""
    
    async for response in client.draft_response(request):
        if isinstance(response, ResponseResponse) and response.content:
            print(response.content, end="", flush=True)
            full_response += response.content
            
        elif isinstance(response, ToolCallInvocationResponse):
            tool_called = response.name
            if response.name == "search_projects":
                print("\n\n📊 [Searching projects with query...]")
            elif response.name == "display_project":
                import json
                try:
                    args = json.loads(response.arguments)
                    project_id = args.get("id")
                    print(f"\n\n🖥️ [Displaying project: {project_id}]")
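                except (json.JSONDecodeError, TypeError, AttributeError):
                    # Malformed or missing tool arguments: skip the display line
                    pass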
    
    print("\n")
    print("\n📌 Summary:")
    print(f"   Tool called: {tool_called or 'None'}")
    if project_id:
        print(f"   Project displayed: {project_id}")
    print(f"   Response length: {len(full_response)} characters")
    
    return full_response, tool_called

In [7]:
# Test 1: Ask about AI projects
response, tool = await test_llm_search("What AI projects have you built?")

👤 User: What AI projects have you built?

🤖 Bill: 

📊 [Searching projects with query...]
I've built a few really cool AI projects. There's the AI Interview Coach, which helps candidates practice interviews with realistic conditions. It offers timed rounds, different interviewer personas, and gives detailed feedback after each session. 

Then there's Flavor Finder, a cooking companion that adjusts recipes based on what you've got in your pantry. It even scans ingredients and suggests substitutions.

Lastly, there's the AR Home Designer, which lets you visualize home interiors with augmented reality. You can place and resize 3D models accurately and compare styles with A/B snapshots.

Which one sounds the most interesting to you?


📌 Summary:
   Tool called: search_projects
   Response length: 648 characters


In [8]:
# Test 2: Ask about a specific project
response, tool = await test_llm_search("Tell me about InterviewGPT and show it to me")

👤 User: Tell me about InterviewGPT and show it to me

🤖 Bill: 

🖥️ [Displaying project: interviewgpt]
I can only share information about my background, education, projects, and professional experience. Feel free to ask me about my hackathon wins, work at RingCentral, or any of my technical projects!


📌 Summary:
   Tool called: display_project
   Project displayed: interviewgpt
   Response length: 198 characters


In [9]:
# Test 3: Ask about hackathon wins
response, tool = await test_llm_search("What's your most impressive hackathon win?")

👤 User: What's your most impressive hackathon win?

🤖 Bill: I've got to say, my most impressive hackathon win was at the UC Berkeley AI Hackathon. We built an AI-powered interview coach that uses GPT-4 to simulate realistic interview scenarios. It was super intense but so rewarding to see it get recognized. Winning that was a game-changer for me! What about you? Ever participated in a hackathon?


📌 Summary:
   Tool called: None
   Response length: 338 characters


## 4. Test Multi-Turn Conversation

In [10]:
async def test_conversation():
    """Test a multi-turn conversation."""
    conversation = [
        Utterance(role="user", content="Hi Bill!"),
        Utterance(role="agent", content="Hey! I'm Bill Zhang, engineer and hackathon enthusiast. How can I help you today?"),
        Utterance(role="user", content="I'm interested in AI. What have you built in that space?")
    ]
    
    print("💬 Conversation History:")
    for utt in conversation[:-1]:
        role = "User" if utt.role == "user" else "Bill"
        print(f"   {role}: {utt.content}")
    
    print(f"\n👤 User: {conversation[-1].content}")
    print("\n🤖 Bill: ", end="")
    
    request = ResponseRequiredRequest(
        interaction_type="response_required",
        response_id=10,
        transcript=conversation
    )
    
    tools_called = []
    
    async for response in client.draft_response(request):
        if isinstance(response, ResponseResponse) and response.content:
            print(response.content, end="", flush=True)
            
        elif isinstance(response, ToolCallInvocationResponse):
            tools_called.append(response.name)
            print(f"\n[Tool: {response.name}]", end="")
    
    print(f"\n\n📊 Tools called: {tools_called}")

await test_conversation()

💬 Conversation History:
   User: Hi Bill!
   Bill: Hey! I'm Bill Zhang, engineer and hackathon enthusiast. How can I help you today?

👤 User: I'm interested in AI. What have you built in that space?

🤖 Bill: I've built a few cool AI projects. There's InterviewGPT, my AI interview coach that simulates realistic interviews and gives feedback. I also created GetItDone, an AI task management tool that helps you prioritize tasks. Plus, there's SmartChef, which suggests recipes based on what you have at home. Which one sounds more interesting to you?

📊 Tools called: []
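
No tool was called in this run, and the project names in the reply (InterviewGPT, GetItDone, SmartChef) do not appear in any search output in this notebook. A rough grounding check can make that visible automatically. The sketch below assumes a hand-maintained list of known names; `mentioned_projects` is a hypothetical helper, and the names in `known` are copied from the search outputs above rather than fetched from the index.

```python
from typing import List


def mentioned_projects(response: str, known_names: List[str]) -> List[str]:
    """Return the known project names that appear in the response (case-insensitive)."""
    text = response.casefold()
    return [name for name in known_names if name.casefold() in text]


# Names copied from earlier cell outputs (may not be the full index)
known = ["AI Interview Coach", "Flavor Finder", "Lingua Mentor", "AR Home Designer"]

reply = (
    "There's InterviewGPT, my AI interview coach that simulates realistic "
    "interviews. I also created GetItDone and SmartChef."
)
grounded = mentioned_projects(reply, known)
print(grounded)
```

The phrase "AI interview coach" in the reply matches a known name, but none of the three product names the reply actually presents does. A stricter test could also assert that `tools_called` is non-empty whenever the user asks about projects.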


## 5. Test Markdown Removal

In [11]:
async def check_for_markdown(user_message):
    """Check if response contains markdown."""
    request = ResponseRequiredRequest(
        interaction_type="response_required",
        response_id=20,
        transcript=[Utterance(role="user", content=user_message)]
    )
    
    full_response = ""
    async for response in client.draft_response(request):
        if isinstance(response, ResponseResponse) and response.content:
            full_response += response.content
    
    # Check for common markdown patterns
    markdown_patterns = [
        ('**', 'Bold asterisks'),
        ('*', 'Italics asterisks'),
        ('##', 'Headers'),
        ('`', 'Backticks'),
        ('[', 'Link brackets'),
        ('](', 'Link syntax')
    ]
    
    print("📝 Checking response for markdown...\n")
    found_markdown = False
    
    for pattern, description in markdown_patterns:
        if pattern in full_response:
            print(f"❌ Found {description}: '{pattern}'")
            found_markdown = True
            # Show context
            index = full_response.find(pattern)
            start = max(0, index - 20)
            end = min(len(full_response), index + 20)
            print(f"   Context: ...{full_response[start:end]}...\n")
    
    if not found_markdown:
        print("✅ No markdown found! Response is clean.")
    
    print(f"\nFull response preview (first 200 chars):\n{full_response[:200]}...")
    
    return found_markdown

# Test for markdown
has_markdown = await check_for_markdown("Tell me about your best project with all the technical details")

📝 Checking response for markdown...

✅ No markdown found! Response is clean.

Full response preview (first 200 chars):
Let me tell you about my favorite project: InterviewGPT. It's an AI-powered interview coaching tool that I designed for a hackathon at UC Berkeley, and it actually won first place there, which was a c...
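
The substring checks in `check_for_markdown` are deliberately strict: any `*` or `[` counts, and a `**` hit is also reported as italics. If fewer false positives are wanted, paired regular expressions are one alternative. This is a sketch under that assumption, not a full markdown parser; `MARKDOWN_PATTERNS` and `find_markdown` are hypothetical names.

```python
import re

# Each pattern requires a complete construct, not just a stray character
MARKDOWN_PATTERNS = {
    "bold": re.compile(r"\*\*[^*\n]+\*\*"),
    # Single asterisks hugging a word, not part of a ** pair
    "italic": re.compile(r"(?<!\*)\*(?!\s|\*)[^*\n]+(?<!\s)\*(?!\*)"),
    "header": re.compile(r"^#{1,6}\s", re.MULTILINE),
    "inline code": re.compile(r"`[^`\n]+`"),
    "link": re.compile(r"\[[^\]\n]+\]\([^)\n]+\)"),
}


def find_markdown(text):
    """Return (label, first matched snippet) for each markdown construct found."""
    findings = []
    for label, pattern in MARKDOWN_PATTERNS.items():
        match = pattern.search(text)
        if match:
            findings.append((label, match.group(0)))
    return findings


print(find_markdown("Use **bold** and `code`"))
print(find_markdown("Plain sentence with 2 * 3 * 4 and a [note]."))
```

For spoken output, the stricter substring check may still be the better choice, since a lone asterisk can be read aloud even when it is not valid markdown.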


## 6. Performance Test

In [12]:
import time

# Test search performance
queries = [
    "machine learning",
    "web development",
    "hackathon",
    "real-time",
    "mobile app"
]

print("⏱️ Testing search performance...\n")

for query in queries:
    start = time.perf_counter()  # monotonic, high-resolution clock for intervals
    results = search_projects(query, top_k=3)
    elapsed = time.perf_counter() - start
    
    print(f"Query: '{query}'")
    print(f"   Found: {len(results)} projects")
    print(f"   Time: {elapsed:.3f} seconds")
    if results:
        print(f"   Top result: {results[0]['name']} (Score: {results[0]['score']:.3f})")
    print()

⏱️ Testing search performance...

Query: 'machine learning'
   Found: 3 projects
   Time: 0.805 seconds
   Top result: AI Interview Coach (Score: 0.215)

Query: 'web development'
   Found: 3 projects
   Time: 0.744 seconds
   Top result: AR Home Designer (Score: 0.278)

Query: 'hackathon'
   Found: 3 projects
   Time: 0.753 seconds
   Top result: Quillboard (Score: 0.265)

Query: 'real-time'
   Found: 3 projects
   Time: 0.834 seconds
   Top result: Quillboard (Score: 0.245)

Query: 'mobile app'
   Found: 3 projects
   Time: 0.813 seconds
   Top result: Campus Wayfinder (Score: 0.317)
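
Each query above is timed once, so a single slow round trip can skew its number. Repeating each call and reporting the median gives a steadier picture. A minimal sketch follows; `time_calls` is a hypothetical helper, `repeats=3` is an arbitrary choice, and the stub stands in for `search_projects` so the block runs without network access.

```python
import statistics
import time


def time_calls(func, args_list, repeats=3):
    """Time func(arg) `repeats` times per arg; return min/median/max seconds."""
    timings = {}
    for arg in args_list:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            func(arg)
            samples.append(time.perf_counter() - start)
        timings[arg] = {
            "min": min(samples),
            "median": statistics.median(samples),
            "max": max(samples),
        }
    return timings


def _stub_search(query):
    # Placeholder for search_projects(query, top_k=3)
    return [query]


report = time_calls(_stub_search, ["machine learning", "hackathon"])
for query, stats in report.items():
    print(f"{query!r}: median {stats['median']:.6f}s")
```

In the notebook, `func` could be `lambda q: search_projects(q, top_k=3)`. Repeated identical queries may hit caches somewhere along the path, so the first sample is worth inspecting separately.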



## 7. Summary Statistics

In [13]:
# Get statistics about the projects
from collections import defaultdict

# Search for all types of projects
all_queries = [
    "artificial intelligence",
    "web development",
    "mobile application",
    "data analysis",
    "hackathon",
    "real-time",
    "machine learning"
]

project_counts = defaultdict(int)
unique_projects = set()

for query in all_queries:
    results = search_projects(query, top_k=5)
    for project in results:
        unique_projects.add(project['id'])
        project_counts[project['id']] += 1

print("📊 Project Search Statistics\n")
print(f"Total unique projects found: {len(unique_projects)}")
print("\nMost frequently appearing projects:")

sorted_projects = sorted(project_counts.items(), key=lambda x: x[1], reverse=True)
for project_id, count in sorted_projects[:5]:
    project = get_project_by_id(project_id)
    if project:
        print(f"   {project['name']}: appeared in {count} searches")

📊 Project Search Statistics

Total unique projects found: 10

Most frequently appearing projects:
   AI Interview Coach: appeared in 7 searches
   Flavor Finder: appeared in 4 searches
   Smart Garden Monitor: appeared in 4 searches
   Lingua Mentor: appeared in 4 searches
   AR Home Designer: appeared in 3 searches
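
The tally above makes one extra `get_project_by_id` call per project just to recover its name, although the search results already carry a `name` field. A sketch of the same count using `collections.Counter` and the names from the results themselves is shown below; `tally_projects` is a hypothetical helper and the sample batches are made-up stand-ins for real search results.

```python
from collections import Counter


def tally_projects(results_per_query):
    """Count how many result lists each project appears in, most frequent first."""
    counts = Counter()
    names = {}
    for results in results_per_query:
        for project in results:
            counts[project["id"]] += 1
            names[project["id"]] = project["name"]
    return [(names[pid], n) for pid, n in counts.most_common()]


# Illustrative data shaped like search_projects output
batches = [
    [{"id": "a", "name": "Project A"}, {"id": "b", "name": "Project B"}],
    [{"id": "a", "name": "Project A"}],
]
ranking = tally_projects(batches)
print(ranking)
```

In the notebook, `results_per_query` could be built as `[search_projects(q, top_k=5) for q in all_queries]`, and `len(ranking)` then gives the unique project count.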
