# Part 2: Building AI-Powered Semantic Product Search with pgvector and Amazon Bedrock
### Building and Validating Semantic Search

Welcome to Part 2 of our workshop on building an AI-powered semantic search system. In this section, we'll bring together Amazon Bedrock's embedding capabilities and Aurora PostgreSQL's pgvector extension to create a powerful, context-aware product search engine. The semantic search implementation will understand the meaning behind user queries rather than just matching keywords.

## Contents
1. Basic Semantic Search: Learn how to implement pure vector similarity search
2. Advanced Search with Filters: Combine semantic search with traditional database filters
3. Example Queries and Testing: Explore real-world applications and test the system

## Setting Up Our Environment
First, let's set up our development environment by installing the necessary libraries and establishing connections to our services. This setup includes:
- psycopg (version 3): PostgreSQL database driver
- pgvector: Python adapter for the pgvector `vector` type
- boto3: AWS SDK for Python
- ipywidgets: Interactive UI elements
- Additional utilities for progress tracking and data handling (tqdm, numpy, pandarallel)

The code creates a client for Amazon Bedrock and reads the Aurora PostgreSQL connection details (host, port, username, password) from AWS Secrets Manager, so no credentials are hard-coded in the notebook. The database connection itself is opened later, inside each search function.
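The secret is expected to be a JSON document with `host`, `port`, `username`, and `password` keys. As a minimal sketch, the payload could be validated before it is used, so that a malformed secret fails early with a clear message. The `parse_db_secret` helper and the example values are hypothetical and not part of the workshop code; the sketch runs without AWS access.

```python
import json

# Keys this notebook reads from the secret
REQUIRED_KEYS = ("host", "port", "username", "password")

def parse_db_secret(secret_string):
    """Parse a SecretString payload and check the expected keys.

    Hypothetical helper: the key names mirror the ones read in this notebook.
    """
    secret = json.loads(secret_string)
    missing = [key for key in REQUIRED_KEYS if key not in secret]
    if missing:
        raise KeyError(f"Secret is missing keys: {', '.join(missing)}")
    return {key: secret[key] for key in REQUIRED_KEYS}

# Made-up payload for illustration (not real credentials)
example = json.dumps({
    "host": "db.example.internal",
    "port": 5432,
    "username": "demo",
    "password": "not-a-real-password",
})
settings = parse_db_secret(example)
print(settings["host"], settings["port"])
```

In the notebook, the string passed in would be `response['SecretString']` from `get_secret_value`.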

In [1]:
# Install Required Libraries
%pip install setuptools==65.5.0
%pip install "psycopg[binary]" pgvector pandarallel boto3 tqdm numpy ipywidgets

# Import Libraries and Set Up Connections
import boto3
import json
import psycopg
from pgvector.psycopg import register_vector
from IPython.display import display, HTML, clear_output
import ipywidgets as widgets
from tqdm.notebook import tqdm

# Initialize AWS and database connections
client = boto3.client('secretsmanager')
response = client.get_secret_value(SecretId='apgpg-pgvector-secret')
database_secrets = json.loads(response['SecretString'])

dbhost = database_secrets['host']
dbport = database_secrets['port']
dbuser = database_secrets['username']
dbpass = database_secrets['password']

# Initialize Bedrock client
bedrock_runtime = boto3.client('bedrock-runtime')

ResourceNotFoundException: An error occurred (ResourceNotFoundException) when calling the GetSecretValue operation: Secrets Manager can't find the specified secret.

## Semantic Search Implementation
Our implementation brings together several key components to create an intuitive search experience:

1. **Embedding Generation**: We use the Amazon Titan Text Embeddings V2 model (`amazon.titan-embed-text-v2:0`) on Amazon Bedrock to convert text queries into high-dimensional vectors that capture semantic meaning. These embeddings allow us to find products based on contextual similarity rather than exact keyword matches.

2. **Vector Similarity Search**: Using pgvector's specialized operators, we can efficiently find the closest matching products in our database. The `<=>` operator computes the cosine distance between two vectors, so we order results by ascending distance and report `1 - distance` as the similarity score.

3. **Interactive Interface**: We've created a user-friendly interface with both basic and advanced search capabilities:
   - Basic Search: Simple query input with adjustable number of results
   - Advanced Search: Additional filters for price, ratings, and categories
   - Example Queries: Quick-access buttons to demonstrate various search scenarios
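To make the ranking step concrete, the relationship between cosine distance and the similarity score can be sketched in plain Python. The three-dimensional vectors and product names below are toy values chosen for illustration, not Titan embeddings, and `cosine_distance` is a hypothetical helper that mirrors what pgvector's `<=>` operator returns.

```python
import math

def cosine_distance(a, b):
    """Cosine distance between two vectors: 1 minus their cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy "embeddings" (illustrative values only)
query = [1.0, 0.0, 0.0]
catalog = {
    "phone case": [0.9, 0.1, 0.0],
    "camping tent": [0.0, 1.0, 0.0],
    "usb charger": [0.7, 0.0, 0.7],
}

# Smallest distance first, like ORDER BY embedding <=> query
ranked = sorted(catalog, key=lambda name: cosine_distance(query, catalog[name]))
for name in ranked:
    similarity = 1.0 - cosine_distance(query, catalog[name])
    print(f"{name}: similarity={similarity:.3f}")
```

Sorting by ascending distance is equivalent to sorting by descending similarity, which is why the SQL query orders by `<=>` but displays `1 - distance`.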

## Implementation Details
Our search interface combines several sophisticated features:

1. **Dual Search Modes**:
   - Basic mode for quick, straightforward searches
   - Advanced mode with filters for refined product discovery

2. **Real-time Feedback**:
   - Loading indicators during searches
   - Clear result displays with product details
   - Similarity scores to show match relevance

3. **Enhanced User Experience**:
   - Hover effects on product cards
   - Star ratings visualization
   - Price and category highlighting
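The advanced mode narrows the semantic ranking with optional SQL conditions. A minimal sketch of how such conditions could be assembled with bound parameters is shown below. `build_filter_clause` is a hypothetical helper, and the column names are taken from the `product_catalog` query used in this notebook. Values are compared against `None` so that a slider value of 0 still counts as a filter.

```python
def build_filter_clause(category=None, max_price=None, min_stars=None):
    """Return a WHERE clause and its parameter list for the optional filters.

    Hypothetical helper; only placeholders go into the SQL text, and the
    values travel separately as bound parameters.
    """
    conditions = []
    params = []
    if category and category != "All Categories":
        conditions.append("category_name = %s")
        params.append(category)
    if max_price is not None:
        conditions.append("price <= %s")
        params.append(max_price)
    if min_stars is not None:
        conditions.append("stars >= %s")
        params.append(min_stars)
    clause = ("WHERE " + " AND ".join(conditions)) if conditions else ""
    return clause, params

clause, params = build_filter_clause(
    category="Gift Cards", max_price=50, min_stars=4.0
)
print(clause)
print(params)
```

Keeping user input out of the SQL string and in the parameter list avoids SQL injection regardless of which filters are active.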

## Results Display
The search results are presented in an easy-to-scan format, with each product card showing:
- Product image and description
- Price and rating information
- Number of reviews
- Category classification
- Semantic match score

The interface updates dynamically as users:
- Switch between basic and advanced search
- Adjust filter parameters
- Try different example queries
- Explore search results
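The match score and star display on each card are simple transformations of the query result. The sketch below isolates them so they can be checked on their own. The function names `format_match_score` and `format_stars` are hypothetical; the logic follows what the result cards in this notebook do.

```python
def format_match_score(similarity):
    """Convert a cosine similarity into the percentage shown on a card.

    A missing similarity (None) is treated as 0.
    """
    return f"{round((similarity or 0) * 100, 2)}%"

def format_stars(rating):
    """Render whole stars; the fractional part of the rating is dropped."""
    return "⭐" * int(rating) if rating else ""

print(format_match_score(0.8312))
print(format_stars(4.5))
```

Because `int()` truncates, a 4.5-star product is shown with four stars. Rounding instead would be a design choice to make explicitly.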

In [2]:
def generate_embedding(text):
    """Generate embedding for a single text using Amazon Titan"""
    try:
        payload = json.dumps({'inputText': text})
        response = bedrock_runtime.invoke_model(
            body=payload,
            modelId='amazon.titan-embed-text-v2:0',
            accept="application/json",
            contentType="application/json"
        )
        response_body = json.loads(response.get("body").read())
        return response_body.get("embedding")
    except Exception as e:
        print(f"Error generating embedding: {str(e)}")
        return None

def search_products(query, num_results=3):
    """Basic semantic search for products"""
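    query_embedding = generate_embedding(query)
    if query_embedding is None:
        # generate_embedding returns None on failure; stop before querying
        raise ValueError("Could not generate an embedding for the query")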

    conn = psycopg.connect(
        host=dbhost,
        port=dbport,
        user=dbuser,
        password=dbpass,
        autocommit=True
    )

    register_vector(conn)

    results = conn.execute("""
        SELECT 
            \"productId\",
            product_description,
            imgUrl,
            stars,
            reviews,
            price,
            category_name,
            1 - (embedding <=> %s::vector) as similarity
        FROM bedrock_integration.product_catalog
        ORDER BY embedding <=> %s::vector
        LIMIT %s;
    """, (query_embedding, query_embedding, num_results)).fetchall()

    conn.close()
    return results

def advanced_search(query, category=None, max_price=None, min_stars=None, num_results=3):
    """Advanced search with multiple filters"""
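    query_embedding = generate_embedding(query)
    if query_embedding is None:
        # generate_embedding returns None on failure; stop before querying
        raise ValueError("Could not generate an embedding for the query")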

    conn = psycopg.connect(
        host=dbhost,
        port=dbport,
        user=dbuser,
        password=dbpass,
        autocommit=True
    )

    register_vector(conn)

    sql = """
        SELECT 
            \"productId\",
            product_description,
            imgUrl,
            stars,
            reviews,
            price,
            category_name,
            1 - (embedding <=> %s::vector) as similarity
        FROM bedrock_integration.product_catalog
        WHERE 1=1
    """
    params = [query_embedding]

    if category and category != 'All Categories':
        sql += " AND category_name = %s"
        params.append(category)
    # Compare against None so that a filter value of 0 is still applied
    if max_price is not None:
        sql += " AND price <= %s"
        params.append(max_price)
    if min_stars is not None:
        sql += " AND stars >= %s"
        params.append(min_stars)

    sql += """
        ORDER BY embedding <=> %s::vector
        LIMIT %s;
    """
    params.extend([query_embedding, num_results])

    results = conn.execute(sql, params).fetchall()
    conn.close()
    return results

def create_search_interface():
    """Create and display the interactive search interface"""
    # Create results area for displaying search results
    results_area = widgets.Output(
        layout=widgets.Layout(
            border='1px solid #ddd',
            padding='10px',
            margin='10px 0',
            min_height='100px'
        )
    )

    # Create search widgets for basic search
    basic_search_text = widgets.Text(
        value='',
        placeholder='Enter your search query...',
        description='Search:',
        layout=widgets.Layout(width='80%')
    )

    basic_results_slider = widgets.IntSlider(
        value=3,
        min=1,
        max=10,
        step=1,
        description='Results:',
        continuous_update=False
    )

    # Create widgets for advanced search
    advanced_search_text = widgets.Text(
        value='',
        placeholder='Enter your search query...',
        description='Search:',
        layout=widgets.Layout(width='80%')
    )

    category_dropdown = widgets.Dropdown(
        options=['All Categories', 
                'Smart Home: Security Cameras and Systems',
                'Smart Home: Voice Assistants and Hubs', 
                'Household Supplies',
                'Kitchen & Dining', 
                'Outdoor Recreation', 
                'Hair Care Products',
                'Gift Cards', 
                'Skin Care Products'],
        value='All Categories',
        description='Category:'
    )

    max_price_slider = widgets.FloatSlider(
        value=100,
        min=0,
        max=200,
        step=5,
        description='Max Price:$',
        continuous_update=False
    )

    min_stars_slider = widgets.FloatSlider(
        value=3.0,
        min=0,
        max=5,
        step=0.5,
        description='Min Stars:',
        continuous_update=False
    )

    advanced_results_slider = widgets.IntSlider(
        value=3,
        min=1,
        max=10,
        step=1,
        description='Results:',
        continuous_update=False
    )

    # Create search tabs
    basic_search_box = widgets.VBox([
        widgets.HTML(value="<h3>Basic Search</h3>"),
        basic_search_text,
        basic_results_slider
    ])

    advanced_search_box = widgets.VBox([
        widgets.HTML(value="<h3>Advanced Search</h3>"),
        advanced_search_text,
        category_dropdown,
        max_price_slider,
        min_stars_slider,
        advanced_results_slider
    ])

    search_type_tabs = widgets.Tab(children=[basic_search_box, advanced_search_box])
    search_type_tabs.set_title(0, 'Basic Search')
    search_type_tabs.set_title(1, 'Advanced Search')

    # Create search button and loading indicator
    search_button = widgets.Button(
        description='Search',
        button_style='primary',
        tooltip='Click to search',
        layout=widgets.Layout(width='150px')
    )

    loading_indicator = widgets.HTML(value="")

    def display_results(results):
        """Display search results with enhanced styling"""
        results_area.clear_output()

        with results_area:
            html_output = """
            <style>
                .search-results {
                    margin-top: 20px;
                    padding: 10px;
                }
                .product-card { 
                    margin: 15px 0; 
                    padding: 20px; 
                    border: 1px solid #ddd; 
                    border-radius: 8px; 
                    box-shadow: 0 2px 4px rgba(0,0,0,0.1);
                    transition: transform 0.2s ease-in-out;
                    background-color: white;
                }
                .product-card:hover {
                    transform: translateY(-5px);
                    box-shadow: 0 4px 8px rgba(0,0,0,0.2);
                }
                .product-grid { 
                    display: grid; 
                    grid-template-columns: 200px 1fr; 
                    gap: 20px; 
                }
                .product-info {
                    display: flex;
                    flex-direction: column;
                    gap: 8px;
                }
                .product-price { 
                    color: #B12704; 
                    font-weight: bold; 
                    font-size: 1.2em; 
                }
                .product-stars { color: #FFA41C; }
                .product-reviews { color: #007185; }
                .product-category { 
                    color: #565959; 
                    font-size: 0.9em;
                }
                .similarity-score { 
                    color: #007600; 
                    font-weight: bold;
                    background: #f0f8f0;
                    padding: 5px 10px;
                    border-radius: 4px;
                    display: inline-block;
                }
                .results-header {
                    color: #444;
                    margin-bottom: 20px;
                    padding-bottom: 10px;
                    border-bottom: 2px solid #eee;
                }
            </style>
            <div class="search-results">
                <h3 class="results-header">Search Results</h3>
            """

            if not results:
                html_output += "<p>No results found.</p>"
            else:
                for row in results:
                    similarity = round((row[-1] or 0) * 100, 2)
                    stars = "⭐" * int(row[3]) if row[3] else ""

                    html_output += f"""
                    <div class="product-card">
                        <div class="product-grid">
                            <div>
                                <img src="{row[2]}" style="max-width: 180px; height: auto;">
                            </div>
                            <div class="product-info">
                                <h3>{row[1][:100]}...</h3>
                                <div class="product-price">${row[5]:.2f}</div>
                                <div class="product-stars">{stars}</div>
                                <div class="product-reviews">({row[4]} reviews)</div>
                                <div class="product-category">Category: {row[6]}</div>
                                <div class="similarity-score">Match Score: {similarity}%</div>
                            </div>
                        </div>
                    </div>
                    """

            html_output += "</div>"
            display(HTML(html_output))

    def on_search_button_clicked(b):
        """Handle search button clicks"""
        loading_indicator.value = "<h4 style='color: #007bff'>🔍 Searching...</h4>"
        try:
            if search_type_tabs.selected_index == 0:
                # Basic search
                results = search_products(
                    basic_search_text.value,
                    basic_results_slider.value
                )
            else:
                # Advanced search
                results = advanced_search(
                    advanced_search_text.value,
                    category=category_dropdown.value,
                    max_price=max_price_slider.value,
                    min_stars=min_stars_slider.value,
                    num_results=advanced_results_slider.value
                )
            display_results(results)
        except Exception as e:
            loading_indicator.value = f"<h4 style='color: #dc3545'>❌ Error: {str(e)}</h4>"
            return
        loading_indicator.value = ""

    search_button.on_click(on_search_button_clicked)

    # Create example queries
    example_queries = [
        "phone charger and case",
        "smart home automation",
        "outdoor camping gear",
        "pet supplies and toys",
        "home office essentials"
    ]

    def create_example_button(query):
        """Create a button for an example query"""
        button = widgets.Button(
            description=query,
            layout=widgets.Layout(width='auto'),
            style={'button_color': '#e9ecef'}
        )

        def on_click(b):
            basic_search_text.value = query
            advanced_search_text.value = query

        button.on_click(on_click)
        return button

    example_buttons = [create_example_button(query) for query in example_queries]

    examples_box = widgets.VBox([
        widgets.HTML(value="<h4>Try these examples:</h4>"),
        widgets.HBox(example_buttons)
    ])

    # Combine all elements
    main_interface = widgets.VBox([
        search_type_tabs,
        widgets.HBox([search_button], layout=widgets.Layout(justify_content='center')),
        loading_indicator,
        examples_box,
        results_area
    ], layout=widgets.Layout(padding='10px'))

    display(main_interface)

# Initialize the interface
create_search_interface()


## Conclusion
This implementation demonstrates the power of combining:
- Vector embeddings for semantic understanding
- Efficient vector similarity search with pgvector
- Traditional SQL filters for precise results

The result is a product search system that ranks items by how closely their meaning matches the query, narrows them with SQL filters when needed, and presents them through an interactive interface.

### Next Steps
After completing this notebook, you'll be ready to:
- Experiment with different search queries
- Fine-tune the search parameters
- Explore the advanced filtering options
- Build upon this foundation for your own applications

Continue to the Blaize Bazaar lab to see these concepts in action in a real-world scenario.