<a href="https://colab.research.google.com/github/ashwin-yedte/visual-intelligence-travel-finance/blob/main/VLM_Pipeline.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# VLM Intelligence Layer


## System Overview

A complete pipeline that takes user images and recommends visually similar travel destinations.

Pipeline steps:

1. Image Analysis - Validate, preprocess, extract CLIP embeddings
2. Destination Matching - Match each image with top-10 destinations
3. Theme Aggregation - Majority vote, re-rank by frequency
4. Recommended Destinations - Enrich with metadata, display in gallery


---
## From: image_analysis.ipynb
---


STEP 1: IMAGE ANALYSIS

Comprehensive image validation, preprocessing, and CLIP embedding extraction.

Features:
- Strict validation (size, format, integrity)
- EXIF orientation fixing
- Aspect ratio preservation
- Color statistics extraction
- Error handling with user-friendly messages

# =================================================================
Step 1: CONFIGURATION
# =================================================================


In [2]:
print("="*80)
print("STEP 1: IMAGE ANALYSIS - VLM INTELLIGENCE LAYER")
print("="*80)

class Config:
    """Centralized configuration for image analysis"""

    # Model configuration
    CLIP_MODEL_NAME = "openai/clip-vit-base-patch32"
    EMBEDDING_DIMENSION = 512

    # Image validation
    TARGET_IMAGE_SIZE = (224, 224)
    MAX_IMAGE_SIZE_MB = 10.0
    SUPPORTED_FORMATS = ['jpg', 'jpeg', 'png']  # compared with PIL's img.format.lower(); JPEG files report 'jpeg'
    MIN_DIMENSION = 100
    MAX_DIMENSION = 4000

    # Batch processing
    MIN_IMAGES = 1
    MAX_IMAGES = 5

    # Color analysis
    NUM_DOMINANT_COLORS = 3
    COLOR_SAMPLE_SIZE = 10000

    # Output
    OUTPUT_FILE = "step1_user_analysis.json"

print("Configuration loaded")
print("="*80)

STEP 1: IMAGE ANALYSIS - VLM INTELLIGENCE LAYER
Configuration loaded


# =================================================================
Step 2: IMAGE VALIDATION AND PREPROCESSING CLASS
# =================================================================


In [3]:
import io
import numpy as np
from PIL import Image, ImageOps
from typing import Dict, Any, List, Tuple
from sklearn.cluster import KMeans

class ImageValidator:
    """
    Comprehensive image validation with detailed error reporting.
    Returns user-friendly error messages for frontend display.
    """

    def __init__(self):
        self.max_size_mb = Config.MAX_IMAGE_SIZE_MB
        self.supported_formats = Config.SUPPORTED_FORMATS
        self.min_dimension = Config.MIN_DIMENSION
        self.max_dimension = Config.MAX_DIMENSION

    def validate_image(self, image_bytes: bytes, filename: str) -> Dict[str, Any]:
        """
        Comprehensive validation with user-friendly error messages.

        Returns:
            Dictionary with validation results
        """

        # Check 1: File size
        size_mb = len(image_bytes) / (1024 * 1024)

        if size_mb > self.max_size_mb:
            return {
                'valid': False,
                'error': f"Image is too large. Maximum allowed is {self.max_size_mb:g}MB. Please compress or resize the image.",
                'error_code': 'FILE_TOO_LARGE',
                'size_mb': size_mb,
                'format': None,
                'dimensions': None
            }

        if size_mb == 0:
            return {
                'valid': False,
                'error': "Image appears to be empty. Please select a valid image file.",
                'error_code': 'FILE_EMPTY',
                'size_mb': 0,
                'format': None,
                'dimensions': None
            }

        try:
            # Attempt to open image
            img = Image.open(io.BytesIO(image_bytes))

            # Check 2: Format validation
            img_format = img.format.lower() if img.format else 'unknown'

            if img_format not in self.supported_formats:
                return {
                    'valid': False,
                    'error': "Unsupported format. Please upload JPG or PNG images only.",
                    'error_code': 'UNSUPPORTED_FORMAT',
                    'size_mb': size_mb,
                    'format': img_format,
                    'dimensions': None
                }

            # Check 3: Dimensions validation
            width, height = img.size

            if width < self.min_dimension or height < self.min_dimension:
                return {
                    'valid': False,
                    'error': f"Image is too small. Minimum size is {self.min_dimension}x{self.min_dimension} pixels.",
                    'error_code': 'IMAGE_TOO_SMALL',
                    'size_mb': size_mb,
                    'format': img_format,
                    'dimensions': (width, height)
                }

            if width > self.max_dimension or height > self.max_dimension:
                return {
                    'valid': False,
                    'error': f"Image dimensions are too large. Maximum size is {self.max_dimension}x{self.max_dimension} pixels. Please resize.",
                    'error_code': 'IMAGE_TOO_LARGE',
                    'size_mb': size_mb,
                    'format': img_format,
                    'dimensions': (width, height)
                }

            # Check 4: Image integrity
            img.verify()

            # Re-open after verify
            img = Image.open(io.BytesIO(image_bytes))

            # Try to load pixel data
            try:
                img.load()
            except Exception:
                return {
                    'valid': False,
                    'error': "Image appears to be corrupted. Please try a different image.",
                    'error_code': 'IMAGE_CORRUPTED',
                    'size_mb': size_mb,
                    'format': img_format,
                    'dimensions': (width, height)
                }

            # All checks passed
            return {
                'valid': True,
                'error': None,
                'error_code': None,
                'size_mb': size_mb,
                'format': img_format,
                'dimensions': (width, height)
            }

        except IOError:
            return {
                'valid': False,
                'error': "Unable to read file. The file may be corrupted or not a valid image.",
                'error_code': 'INVALID_IMAGE_FILE',
                'size_mb': size_mb,
                'format': None,
                'dimensions': None
            }

        except Exception as e:
            return {
                'valid': False,
                'error': "Error processing image: " + str(e),
                'error_code': 'PROCESSING_ERROR',
                'size_mb': size_mb,
                'format': None,
                'dimensions': None
            }
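
The size checks at the top of `validate_image` can be read in isolation. The sketch below is a minimal, dependency-free illustration of that one step; `check_file_size` is a hypothetical helper (not part of the notebook) and it returns a reduced result dictionary with only the keys needed here.

```python
from typing import Any, Dict

MAX_IMAGE_SIZE_MB = 10.0  # mirrors Config.MAX_IMAGE_SIZE_MB


def check_file_size(image_bytes: bytes,
                    max_size_mb: float = MAX_IMAGE_SIZE_MB) -> Dict[str, Any]:
    """Hypothetical helper: run only the size checks and return a result dict."""
    size_mb = len(image_bytes) / (1024 * 1024)

    if size_mb == 0:
        code = 'FILE_EMPTY'
    elif size_mb > max_size_mb:
        code = 'FILE_TOO_LARGE'
    else:
        code = None

    return {'valid': code is None, 'error_code': code, 'size_mb': size_mb}


# A caller can branch on the stable error_code rather than on the message text.
empty = check_file_size(b"")
small = check_file_size(b"\x00" * 1024)
too_big = check_file_size(b"\x00" * (2 * 1024 * 1024), max_size_mb=1.0)
print(empty['error_code'], small['error_code'], too_big['error_code'])
```

Keeping `error_code` separate from the human-readable `error` string lets a frontend localize or reword messages without changing the validation logic.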


In [4]:
class ImagePreprocessor:
    """
    CLIP-optimized image preprocessing with comprehensive transformations.
    """

    def __init__(self):
        self.target_size = Config.TARGET_IMAGE_SIZE
        self.validator = ImageValidator()

    def preprocess_image(self, image_bytes: bytes) -> Image.Image:
        """
        Complete preprocessing pipeline for CLIP.

        Returns:
            PIL Image ready for CLIP (224x224, RGB)
        """

        # Load image
        img = Image.open(io.BytesIO(image_bytes))

        # Fix orientation from EXIF
        img = self._fix_orientation(img)

        # Convert to RGB
        if img.mode != 'RGB':
            if img.mode == 'RGBA':
                # Handle transparency
                background = Image.new('RGB', img.size, (255, 255, 255))
                background.paste(img, mask=img.split()[3] if len(img.split()) == 4 else None)
                img = background
            else:
                img = img.convert('RGB')

        # Resize with padding
        img = self._resize_with_padding(img, self.target_size)

        return img

    def _fix_orientation(self, img: Image.Image) -> Image.Image:
        """Auto-rotate image based on EXIF orientation tag."""
        try:
            img = ImageOps.exif_transpose(img)
        except Exception:
            pass
        return img

    def _resize_with_padding(self, img: Image.Image, target_size: Tuple[int, int]) -> Image.Image:
        """Resize maintaining aspect ratio, add white padding."""

        # Calculate aspect ratios
        img_ratio = img.width / img.height
        target_ratio = target_size[0] / target_size[1]

        # Determine new dimensions
        if img_ratio > target_ratio:
            new_width = target_size[0]
            new_height = int(new_width / img_ratio)
        else:
            new_height = target_size[1]
            new_width = int(new_height * img_ratio)

        # Resize using LANCZOS filter
        img = img.resize((new_width, new_height), Image.Resampling.LANCZOS)

        # Create white canvas
        canvas = Image.new('RGB', target_size, (255, 255, 255))

        # Center image
        offset_x = (target_size[0] - new_width) // 2
        offset_y = (target_size[1] - new_height) // 2
        canvas.paste(img, (offset_x, offset_y))

        return canvas

    def extract_color_statistics(self, img: Image.Image) -> Dict[str, Any]:
        """Extract color features for analysis."""

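        # Note: when img is the padded output of preprocess_image, the white
        # padding pixels are included in every statistic computed below.
        img_array = np.array(img)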

        return {
            'dominant_colors': self._get_dominant_colors(img_array),
            'brightness': float(np.mean(img_array)),
            'color_variance': float(np.var(img_array))
        }

    def _get_dominant_colors(self, img_array: np.ndarray) -> List[List[int]]:
        """Use K-means clustering to find dominant colors."""

        # Reshape to list of pixels
        pixels = img_array.reshape(-1, 3)

        # Sample if too many pixels
        if len(pixels) > Config.COLOR_SAMPLE_SIZE:
            rng = np.random.default_rng(42)  # fixed seed keeps the sample reproducible
            indices = rng.choice(len(pixels), Config.COLOR_SAMPLE_SIZE, replace=False)
            pixels = pixels[indices]

        # K-means clustering
        kmeans = KMeans(n_clusters=Config.NUM_DOMINANT_COLORS, random_state=42, n_init=10)
        kmeans.fit(pixels)

        # Get cluster centers and counts
        colors = kmeans.cluster_centers_.astype(int)
        counts = np.bincount(kmeans.labels_)

        # Sort by frequency
        sorted_indices = np.argsort(-counts)
        dominant_colors = colors[sorted_indices].tolist()

        return dominant_colors


print("Validation and Preprocessing classes loaded")
print("="*80)

Validation and Preprocessing classes loaded
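
The arithmetic inside `_resize_with_padding` is easier to check without an image in hand. The following sketch reproduces only the geometry (scaled size and centering offsets) in plain Python; `letterbox_geometry` is a hypothetical name used for illustration.

```python
from typing import Tuple

Size = Tuple[int, int]


def letterbox_geometry(width: int, height: int,
                       target: Size = (224, 224)) -> Tuple[Size, Size]:
    """Return ((new_width, new_height), (offset_x, offset_y)) for a centered fit."""
    img_ratio = width / height
    target_ratio = target[0] / target[1]

    if img_ratio > target_ratio:
        # Wider than the target: width fills the canvas, height shrinks.
        new_width = target[0]
        new_height = int(new_width / img_ratio)
    else:
        # Taller than (or same shape as) the target: height fills the canvas.
        new_height = target[1]
        new_width = int(new_height * img_ratio)

    offset_x = (target[0] - new_width) // 2
    offset_y = (target[1] - new_height) // 2
    return (new_width, new_height), (offset_x, offset_y)


landscape = letterbox_geometry(4000, 2000)
portrait = letterbox_geometry(100, 200)
square = letterbox_geometry(512, 512)
print(landscape, portrait, square)
```

For a 2:1 landscape image, half of the 224x224 canvas ends up as padding, which is worth keeping in mind when interpreting statistics computed on the padded image.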


# =================================================================
Step 3: INITIALIZE PREPROCESSOR
# =================================================================


In [5]:
print("\nInitializing preprocessor...")
preprocessor = ImagePreprocessor()
print("Preprocessor ready")
print("="*80)



Initializing preprocessor...
Preprocessor ready


# =================================================================
Step 4: LOAD CLIP MODEL
# =================================================================


In [6]:
print("\n" + "="*80)
print("LOADING CLIP MODEL")
print("="*80)

import torch
from transformers import CLIPModel, CLIPProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
print("Device: " + device)

model = CLIPModel.from_pretrained(Config.CLIP_MODEL_NAME)
processor = CLIPProcessor.from_pretrained(Config.CLIP_MODEL_NAME)

model.to(device)
model.eval()

print("Model loaded: " + Config.CLIP_MODEL_NAME)
print("Embedding dimension: " + str(Config.EMBEDDING_DIMENSION))
print("="*80)



LOADING CLIP MODEL
Device: cpu



CLIPModel LOAD REPORT from: openai/clip-vit-base-patch32
Key                                  | Status     |  | 
-------------------------------------+------------+--+-
text_model.embeddings.position_ids   | UNEXPECTED |  | 
vision_model.embeddings.position_ids | UNEXPECTED |  | 

Notes:
- UNEXPECTED	:can be ignored when loading from different task/architecture; not ok if you expect identical arch.
The image processor of type `CLIPImageProcessor` is now loaded as a fast processor by default, even if the model checkpoint was saved with a slow processor. This is a breaking change and may produce slightly different outputs. To continue using the slow processor, instantiate this class with `use_fast=False`. 


Model loaded: openai/clip-vit-base-patch32
Embedding dimension: 512


# =================================================================
Step 5: CLIP EMBEDDING EXTRACTION
# =================================================================


In [7]:
def extract_clip_features(outputs):
    """Universal tensor extraction from CLIP outputs."""
    if torch.is_tensor(outputs):
        return outputs

    if hasattr(outputs, 'pooler_output') and outputs.pooler_output is not None:
        return outputs.pooler_output

    if hasattr(outputs, 'image_embeds') and outputs.image_embeds is not None:
        return outputs.image_embeds

    if hasattr(outputs, 'last_hidden_state') and outputs.last_hidden_state is not None:
        return outputs.last_hidden_state[:, 0, :]

    raise ValueError("Cannot extract features from output type: " + str(type(outputs)))


def extract_clip_embedding(image: Image.Image) -> np.ndarray:
    """
    Extract CLIP embedding from preprocessed image.

    Returns:
        Normalized 512-dim embedding as numpy array
    """

    try:
        # Process with CLIP
        # padding only applies to text inputs, so it is omitted for image-only calls
        inputs = processor(images=image, return_tensors="pt")
        inputs = {k: v.to(device) for k, v in inputs.items()}

        with torch.no_grad():
            outputs = model.get_image_features(**inputs)
            image_features = extract_clip_features(outputs)
            # L2 normalization
            image_features = image_features / image_features.norm(dim=-1, keepdim=True)

        return image_features.cpu().numpy()[0]

    except Exception as e:
        # Chain the original exception so the full traceback is preserved
        raise RuntimeError("CLIP embedding extraction failed: " + str(e)) from e


print("CLIP extraction functions ready")
print("="*80)


CLIP extraction functions ready
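
The L2 normalization step matters because, once two vectors have unit length, their dot product is their cosine similarity. The sketch below demonstrates this with plain Python lists standing in for the 512-dimensional embeddings; the helper names are illustrative and the notebook itself performs this step on tensors.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


a = l2_normalize([3.0, 4.0])
b = l2_normalize([6.0, 8.0])    # same direction as a, different length
c = l2_normalize([4.0, -3.0])   # perpendicular to a

print(dot(a, b))  # close to 1: same direction
print(dot(a, c))  # close to 0: orthogonal
```

Vectors that differ only in magnitude score the same after normalization, so matching depends on direction alone.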


# =================================================================
Step 6: MAIN ANALYSIS FUNCTION
# =================================================================


In [8]:
def analyze_user_images(files_dict: Dict[str, bytes]) -> Dict[str, Any]:
    """
    Complete Step 1: Validate, preprocess, and extract embeddings.

    Args:
        files_dict: Dictionary mapping filenames to image bytes

    Returns:
        Dictionary with status, results, embeddings, and summary
    """

    print("\n" + "="*80)
    print("ANALYZING " + str(len(files_dict)) + " USER IMAGES")
    print("="*80)

    # Check batch size
    if len(files_dict) < Config.MIN_IMAGES:
        return {
            'status': 'error',
            'error': "Please upload at least " + str(Config.MIN_IMAGES) + " image(s).",
            'error_code': 'TOO_FEW_IMAGES'
        }

    if len(files_dict) > Config.MAX_IMAGES:
        return {
            'status': 'error',
            'error': "Maximum " + str(Config.MAX_IMAGES) + " images allowed. You uploaded " + str(len(files_dict)) + ".",
            'error_code': 'TOO_MANY_IMAGES'
        }

    validation_errors = []
    results = []
    embeddings = []

    for filename, image_bytes in files_dict.items():
        print("\nProcessing: " + filename)

        # Step 1: Validate
        validation = preprocessor.validator.validate_image(image_bytes, filename)

        if not validation['valid']:
            validation_errors.append({
                'filename': filename,
                'error': validation['error'],
                'error_code': validation['error_code']
            })
            print("  Validation failed: " + validation['error'])
            continue

        print("  Validated (" + validation['format'].upper() + ", " +
              str(validation['dimensions'][0]) + "x" + str(validation['dimensions'][1]) +
              ", " + str(round(validation['size_mb'], 2)) + "MB)")

        try:
            # Step 2: Preprocess
            processed_img = preprocessor.preprocess_image(image_bytes)
            print("  Preprocessed to " + str(processed_img.size))

            # Step 3: Extract colors
            color_stats = preprocessor.extract_color_statistics(processed_img)
            print("  Color analysis complete")

            # Step 4: Extract CLIP embedding
            embedding = extract_clip_embedding(processed_img)
            print("  CLIP embedding extracted (" + str(embedding.shape) + ")")

            # Store results
            results.append({
                'filename': filename,
                'original_dimensions': validation['dimensions'],
                'file_size_mb': validation['size_mb'],
                'format': validation['format'],
                'color_statistics': color_stats,
                'embedding_shape': embedding.shape
            })

            embeddings.append(embedding)

        except Exception as e:
            validation_errors.append({
                'filename': filename,
                'error': "Processing failed: " + str(e),
                'error_code': 'PROCESSING_FAILED'
            })
            print("  Processing error: " + str(e))

    # Check if we have any successful results
    if len(embeddings) == 0:
        return {
            'status': 'error',
            'error': 'All images failed validation or processing. Please check the error messages and try again.',
            'error_code': 'ALL_IMAGES_FAILED',
            'validation_errors': validation_errors
        }

    # Calculate summary statistics
    avg_brightness = np.mean([r['color_statistics']['brightness'] for r in results])

    print("\n" + "="*80)
    print("STEP 1 COMPLETE")
    print("="*80)
    print("Successfully processed: " + str(len(embeddings)) + "/" + str(len(files_dict)) + " images")
    if validation_errors:
        print("Failed: " + str(len(validation_errors)) + " images")
    print("Average brightness: " + str(round(avg_brightness, 1)))
    print("="*80)

    return {
        # At least one image succeeded at this point; per-image failures are
        # reported through num_failed and validation_errors.
        'status': 'success',
        'num_uploaded': len(files_dict),
        'num_processed': len(embeddings),
        'num_failed': len(validation_errors),
        'validation_errors': validation_errors,
        'results': results,
        'embeddings': embeddings,
        'summary': {
            'avg_brightness': float(avg_brightness),
            'total_images': len(embeddings)
        }
    }


print("Main analysis function ready")
print("="*80)

Main analysis function ready
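
`analyze_user_images` reports `'success'` whenever at least one image yields an embedding, even if other images in the batch failed. If a caller needs to tell a clean batch from a mixed one, a classification along the following lines is one option. `batch_status` is a hypothetical helper, and adopting a `'partial'` status would also require `save_step1_outputs` to accept it.

```python
def batch_status(num_uploaded: int, num_processed: int) -> str:
    """Hypothetical helper: classify a batch by how many images were processed."""
    if num_processed == 0:
        return 'error'
    if num_processed < num_uploaded:
        return 'partial'
    return 'success'


print(batch_status(5, 5))  # every image processed
print(batch_status(5, 3))  # some images failed
print(batch_status(5, 0))  # nothing usable
```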


# =================================================================
Step 7: SAVE EMBEDDINGS FOR NEXT STEPS
# =================================================================


In [9]:
def save_step1_outputs(analysis_output: Dict) -> None:
    """
    Save Step 1 outputs for use in Steps 2 and 3.

    Saves:
    - user_embeddings.npy: Individual embeddings
    - step1_user_analysis.json: Full analysis results
    """

    if analysis_output['status'] != 'success':
        print("Cannot save - analysis did not succeed")
        return

    # Save embeddings as numpy array
    embeddings_array = np.array(analysis_output['embeddings'])
    np.save('/content/user_embeddings.npy', embeddings_array)
    print("\nSaved embeddings: " + str(embeddings_array.shape))

    # Save full analysis
    import json
    analysis_json = {
        'status': analysis_output['status'],
        'num_processed': analysis_output['num_processed'],
        'results': analysis_output['results'],
        'summary': analysis_output['summary'],
        'validation_errors': analysis_output['validation_errors']
    }

    with open('/content/' + Config.OUTPUT_FILE, 'w') as f:
        json.dump(analysis_json, f, indent=2)

    print("Saved analysis: " + Config.OUTPUT_FILE)
    print("\nReady for Step 2: Destination Matching")


print("Save functions ready")
print("="*80)
print("\nSTEP 1 INITIALIZED - Ready to analyze images")
print("="*80)

Save functions ready

STEP 1 INITIALIZED - Ready to analyze images
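
One detail of the JSON output is easy to miss: tuples such as `original_dimensions` and `embedding_shape` are written as JSON arrays and come back as lists. The round trip below illustrates this with made-up sample values and a temporary directory instead of `/content`.

```python
import json
import os
import tempfile

record = {
    'filename': 'beach.jpg',             # made-up sample values
    'original_dimensions': (640, 480),
    'embedding_shape': (512,),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'step1_user_analysis.json')
    with open(path, 'w') as f:
        json.dump({'results': [record]}, f, indent=2)
    with open(path) as f:
        loaded = json.load(f)

restored = loaded['results'][0]
print(type(restored['original_dimensions']).__name__)
```

Code that reads the file back should therefore compare against lists, or convert with `tuple(...)` where a tuple is needed.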



---
## From: destination_matching.ipynb
---


**VLM INTELLIGENCE LAYER**

STEP 2: DESTINATION MATCHING

Match each user image separately against the VL Encoding database.

Features:
- Load VL encoding embeddings (one per destination; 168 in the recorded run)
- Match each user image independently
- Get top-10 destinations per image
- Track theme distribution
- Prepare for theme aggregation in Step 3

# =================================================================
 PREREQUISITES
# =================================================================

MUST RUN BEFORE THIS:
1. Step 1 cells (creates user_embeddings.npy)
2. VL Encoding must be complete in Google Drive

# =================================================================
Step 1: CONFIGURATION
# =================================================================


In [10]:
print("="*80)
print("STEP 2: DESTINATION MATCHING - VLM INTELLIGENCE LAYER")
print("="*80)

class Step2Config:
    """Configuration for Step 2"""

    # Paths (Google Drive)
    BASE_PATH = '/content/drive/MyDrive/visual-intelligence-travel-finance'
    EMBEDDINGS_PATH = BASE_PATH + '/data/vl_encoding/embeddings'
    PROMPTS_PATH = BASE_PATH + '/data/vl_encoding/prompts'
    METADATA_PATH = BASE_PATH + '/data/landmarks/metadata.json'

    # Matching parameters
    TOP_K_PER_IMAGE = 10
    MIN_SIMILARITY_SCORE = 0.20

    # Output
    OUTPUT_FILE = "step2_matches.json"

print("Configuration loaded")
print("="*80)

STEP 2: DESTINATION MATCHING - VLM INTELLIGENCE LAYER
Configuration loaded


# =================================================================
Step 2: MOUNT GOOGLE DRIVE
# =================================================================


In [11]:
print("\n" + "="*80)
print("MOUNTING GOOGLE DRIVE")
print("="*80)

from google.colab import drive
drive.mount('/content/drive')

print("Google Drive mounted")
print("="*80)



MOUNTING GOOGLE DRIVE
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Google Drive mounted


# =================================================================
Step 3: LOAD VL ENCODING DATABASE
# =================================================================


In [12]:
print("\n" + "="*80)
print("LOADING VL ENCODING DATABASE")
print("="*80)

import numpy as np
import json
from typing import Dict, List, Any

# Load destination embeddings
print("Loading embeddings...")
vl_data = np.load(Step2Config.EMBEDDINGS_PATH + '/all_embeddings.npz')
destination_ids = vl_data['destination_ids']
destination_embeddings = vl_data['destination_embeddings']

print("Loaded " + str(len(destination_ids)) + " destination embeddings")
print("Embedding shape: " + str(destination_embeddings.shape))

# Load embedding index
print("\nLoading embedding index...")
with open(Step2Config.EMBEDDINGS_PATH + '/embedding_index.json', 'r') as f:
    embedding_index = json.load(f)
print("Embedding index loaded")

# Load destination prompts
print("\nLoading destination prompts...")
with open(Step2Config.PROMPTS_PATH + '/destination_prompts.json', 'r') as f:
    destination_prompts = json.load(f)
print("Loaded prompts for " + str(len(destination_prompts)) + " destinations")

# Load metadata
print("\nLoading metadata...")
with open(Step2Config.METADATA_PATH, 'r') as f:
    metadata = json.load(f)
print("Metadata loaded: " + str(metadata['total_destinations']) + " destinations")

print("\n" + "="*80)
print("VL DATABASE LOADED SUCCESSFULLY")
print("="*80)



LOADING VL ENCODING DATABASE
Loading embeddings...
Loaded 168 destination embeddings
Embedding shape: (168, 512)

Loading embedding index...
Embedding index loaded

Loading destination prompts...
Loaded prompts for 168 destinations

Loading metadata...
Metadata loaded: 168 destinations

VL DATABASE LOADED SUCCESSFULLY


# =================================================================
Step 4: HELPER FUNCTIONS
# =================================================================


In [13]:
def get_destination_theme(dest_id: str) -> str:
    """
    Get theme for a destination from metadata.

    Args:
        dest_id: Destination ID (e.g., 'BEACH_GOA_AGONDA_BEACH')

    Returns:
        Theme name (e.g., 'Beach', 'Waterfall')
    """

    for theme in metadata['themes']:
        for state in theme['states']:
            for destination in state['destinations']:
                if destination['destination_id'] == dest_id:
                    return theme['theme_name']

    return 'Unknown'


def get_destination_info(dest_id: str) -> Dict[str, Any]:
    """
    Get full destination information from metadata.

    Returns:
        Dictionary with name, state, theme, etc.
    """

    for theme in metadata['themes']:
        for state in theme['states']:
            for destination in state['destinations']:
                if destination['destination_id'] == dest_id:
                    return {
                        'destination_id': dest_id,
                        'name': destination['destination_name'],
                        'state': state['state_name'],
                        'theme': theme['theme_name'],
                        'image_count': destination.get('image_count', 0),
                        'geo_location': destination.get('geo_location', {}),
                        'offers': destination.get('offers', {})
                    }

    return {
        'destination_id': dest_id,
        'name': 'Unknown',
        'state': 'Unknown',
        'theme': 'Unknown',
        'image_count': 0,
        'geo_location': {},
        'offers': {}
    }


print("Helper functions loaded")
print("="*80)


Helper functions loaded
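
Both helpers walk the nested theme, state, and destination structure on every call. When many lookups are made, a flat dictionary built once is a common alternative. The sketch below assumes the same key names the helpers use; `build_destination_index` and the sample metadata (including the display name) are hypothetical.

```python
from typing import Any, Dict

# Made-up sample in the same nested shape the helper functions expect.
sample_metadata = {
    'themes': [
        {'theme_name': 'Beach', 'states': [
            {'state_name': 'Goa', 'destinations': [
                {'destination_id': 'BEACH_GOA_AGONDA_BEACH',
                 'destination_name': 'Agonda Beach'},
            ]},
        ]},
    ],
}


def build_destination_index(meta: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Flatten themes -> states -> destinations into one dict keyed by ID."""
    index: Dict[str, Dict[str, Any]] = {}
    for theme in meta['themes']:
        for state in theme['states']:
            for destination in state['destinations']:
                index[destination['destination_id']] = {
                    'name': destination['destination_name'],
                    'state': state['state_name'],
                    'theme': theme['theme_name'],
                }
    return index


index = build_destination_index(sample_metadata)
info = index.get('BEACH_GOA_AGONDA_BEACH', {'theme': 'Unknown'})
missing = index.get('NOT_A_REAL_ID', {'theme': 'Unknown'})
print(info['theme'], missing['theme'])
```

The `dict.get` default plays the same role as the `'Unknown'` fallback in the original helpers.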


# =================================================================
Step 5: DESTINATION MATCHING FUNCTION
# =================================================================


In [14]:
def match_destinations_per_image(user_embeddings: np.ndarray) -> Dict[str, Any]:
    """
    Match each user image separately against all destinations.

    This is the core of Step 2:
    - Takes each user image embedding
    - Compares with all destination embeddings
    - Returns top-K matches per image
    - Tracks theme distribution

    Args:
        user_embeddings: Array of shape (N, 512) where N is number of images

    Returns:
        Dictionary with per-image matches and statistics
    """

    print("\n" + "="*80)
    print("MATCHING DESTINATIONS PER IMAGE")
    print("="*80)

    num_images = len(user_embeddings)
    print("Number of user images: " + str(num_images))
    print("Number of destinations: " + str(len(destination_ids)))
    print("Top-K per image: " + str(Step2Config.TOP_K_PER_IMAGE))

    per_image_matches = {}
    all_matched_destinations = set()
    theme_counter = {}

    for img_idx in range(num_images):
        image_key = "image_" + str(img_idx + 1)
        user_embedding = user_embeddings[img_idx]

        print("\n" + "-"*80)
        print("Processing " + image_key)
        print("-"*80)

        # Normalize user embedding
        user_embedding = user_embedding / np.linalg.norm(user_embedding)

        # Compute similarities with all destinations. The dot product equals
        # cosine similarity only if destination_embeddings rows are L2-normalized.
        similarities = np.dot(destination_embeddings, user_embedding)

        # Get top-K indices
        top_indices = np.argsort(similarities)[::-1][:Step2Config.TOP_K_PER_IMAGE]

        # Build matches for this image
        matches = []

        for rank, dest_idx in enumerate(top_indices, 1):
            dest_id = destination_ids[dest_idx]
            score = float(similarities[dest_idx])

            # Scores are sorted in descending order, so once one falls below
            # the threshold every remaining match does too
            if score < Step2Config.MIN_SIMILARITY_SCORE:
                break

            # Get destination info
            dest_info = get_destination_info(dest_id)
            theme = dest_info['theme']

            # Track theme
            if theme not in theme_counter:
                theme_counter[theme] = 0
            theme_counter[theme] += 1

            # Track destination
            all_matched_destinations.add(dest_id)

            # Build match object
            match = {
                'rank': rank,
                'destination_id': dest_id,
                'destination_name': dest_info['name'],
                'state': dest_info['state'],
                'theme': theme,
                'similarity_score': round(score * 100, 2),
                'raw_score': score
            }

            matches.append(match)

            if rank <= 3:
                print("  " + str(rank) + ". " + dest_info['name'] +
                      " (" + theme + ") - " + str(round(score * 100, 1)) + "%")

        per_image_matches[image_key] = {
            'matches': matches,
            'num_matches': len(matches),
            'top_theme': matches[0]['theme'] if matches else 'Unknown',
            'avg_score': round(np.mean([m['raw_score'] for m in matches]) * 100, 2) if matches else 0
        }

    print("\n" + "="*80)
    print("MATCHING COMPLETE")
    print("="*80)
    print("Total unique destinations matched: " + str(len(all_matched_destinations)))
    print("Theme distribution:")
    for theme, count in sorted(theme_counter.items(), key=lambda x: x[1], reverse=True):
        print("  " + theme + ": " + str(count) + " matches")
    print("="*80)

    return {
        'per_image_matches': per_image_matches,
        'theme_distribution': theme_counter,
        'total_unique_destinations': len(all_matched_destinations),
        'num_images_processed': num_images
    }


print("Matching function loaded")
print("="*80)


Matching function loaded
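
The core of the matching step is a top-K selection over similarity scores followed by a threshold filter. The sketch below shows that logic on toy two-dimensional unit vectors, using only the standard library; `top_k_matches` is a hypothetical name, and the notebook performs the same selection with NumPy on 512-dimensional embeddings.

```python
import heapq
from typing import List, Sequence, Tuple


def top_k_matches(query: Sequence[float],
                  candidates: Sequence[Sequence[float]],
                  k: int = 10,
                  min_score: float = 0.20) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k best dot-product scores >= min_score."""
    scores = [sum(q * c for q, c in zip(query, cand)) for cand in candidates]
    best = heapq.nlargest(k, enumerate(scores), key=lambda pair: pair[1])
    return [(idx, score) for idx, score in best if score >= min_score]


# Toy unit vectors standing in for CLIP embeddings.
query = [1.0, 0.0]
candidates = [
    [0.6, 0.8],   # score 0.6
    [1.0, 0.0],   # score 1.0
    [0.0, 1.0],   # score 0.0, below the threshold
    [0.8, 0.6],   # score 0.8
]

matches = top_k_matches(query, candidates, k=3)
all_matches = top_k_matches(query, candidates, k=4)
print(matches)
```

Raising `k` past the number of candidates above the threshold does not add results, since the filter removes the low scorers.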


# =================================================================
Step 6: SAVE STEP 2 OUTPUTS
# =================================================================


In [15]:
def save_step2_outputs(matching_output: Dict) -> None:
    """
    Save Step 2 outputs for use in Step 3.

    Saves:
    - step2_matches.json: Full matching results
    """

    print("\n" + "="*80)
    print("SAVING STEP 2 OUTPUTS")
    print("="*80)

    # Convert to JSON-serializable format
    output_data = {
        'per_image_matches': matching_output['per_image_matches'],
        'theme_distribution': matching_output['theme_distribution'],
        'total_unique_destinations': matching_output['total_unique_destinations'],
        'num_images_processed': matching_output['num_images_processed']
    }

    # Save to JSON
    output_path = '/content/' + Step2Config.OUTPUT_FILE
    with open(output_path, 'w') as f:
        json.dump(output_data, f, indent=2)

    print("Saved to: " + output_path)
    print("="*80)
    print("\nReady for Step 3: Theme Extraction and Aggregation")
    print("="*80)


print("Save function loaded")
print("="*80)

Save function loaded


# =================================================================
Step 7: MAIN EXECUTION FUNCTION
# =================================================================


In [16]:
def run_step2(user_embeddings_path: str = '/content/user_embeddings.npy'):
    """
    Complete Step 2 execution.

    Args:
        user_embeddings_path: Path to user embeddings from Step 1

    Returns:
        Matching results dictionary
    """

    print("\n" + "="*80)
    print("EXECUTING STEP 2: DESTINATION MATCHING")
    print("="*80)

    # Load user embeddings from Step 1
    print("\nLoading user embeddings from Step 1...")
    try:
        user_embeddings = np.load(user_embeddings_path)
        print("Loaded user embeddings: " + str(user_embeddings.shape))
    except FileNotFoundError:
        print("ERROR: User embeddings not found at " + user_embeddings_path)
        print("Please run Step 1 first to generate user embeddings")
        return None

    # Match destinations
    matching_output = match_destinations_per_image(user_embeddings)

    # Save outputs
    save_step2_outputs(matching_output)

    return matching_output


print("Main execution function loaded")
print("="*80)
print("\nSTEP 2 INITIALIZED - Ready to match destinations")
print("="*80)
print("\nTO RUN:")
print("  result = run_step2()")
print("="*80)

Main execution function loaded

STEP 2 INITIALIZED - Ready to match destinations

TO RUN:
  result = run_step2()



---
## From: theme_aggregation.ipynb
---


VLM INTELLIGENCE LAYER

STEP 3: THEME EXTRACTION AND AGGREGATION
Majority vote theme extraction and smart re-ranking

Features:
- Analyze theme distribution from Step 2 matches
- Identify dominant theme via majority vote (the most frequent theme across all matches)
- Re-rank destinations using multi-factor scoring
- Prioritize destinations that appear in multiple images
- Calculate confidence metrics

# =================================================================
PREREQUISITES
# =================================================================

MUST RUN BEFORE THIS:
1. Step 1 cells (image analysis)
2. Step 2 cells (destination matching)
   This creates step2_matches.json
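
Step 3 reads only a few fields from step2_matches.json. The sketch below shows the shape implied by the keys the Step 3 functions access. Every value (image key, destination ID, name, state, theme, score) is made up for illustration, and `check_step2_shape` is a hypothetical helper, not part of the notebook:

```python
from typing import Any, Dict, List

# Keys that the Step 3 functions read from each match.
REQUIRED_MATCH_KEYS = {"destination_id", "destination_name", "state", "theme", "raw_score"}

# Illustrative data only: every value here is made up.
example_step2 = {
    "per_image_matches": {
        "image_1": {
            "matches": [
                {
                    "destination_id": "dest_001",
                    "destination_name": "Example Beach",
                    "state": "Example State",
                    "theme": "Beach",
                    "raw_score": 0.31,
                }
            ]
        }
    },
    "theme_distribution": {"Beach": 1},
    "total_unique_destinations": 1,
    "num_images_processed": 1,
}


def check_step2_shape(step2_data: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means Step 3 can read the data."""
    per_image = step2_data.get("per_image_matches")
    if not isinstance(per_image, dict):
        return ["'per_image_matches' is missing or not a dict"]
    problems = []
    for image_key, image_data in per_image.items():
        for idx, match in enumerate(image_data.get("matches", [])):
            missing = REQUIRED_MATCH_KEYS - set(match)
            if missing:
                problems.append(image_key + " match " + str(idx) + ": missing " + str(sorted(missing)))
    return problems


print(check_step2_shape(example_step2))
```

Running a check like this before `run_step3()` would turn a `KeyError` deep inside the aggregation loop into a readable message.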



# =================================================================
Step 1: CONFIGURATION
# =================================================================


In [17]:
print("="*80)
print("STEP 3: THEME EXTRACTION AND AGGREGATION - VLM INTELLIGENCE LAYER")
print("="*80)

class Step3Config:
    """Configuration for Step 3"""

    # Scoring weights for final ranking
    AVG_SIMILARITY_WEIGHT = 0.4
    MAX_SIMILARITY_WEIGHT = 0.3
    FREQUENCY_WEIGHT = 0.2
    THEME_MATCH_WEIGHT = 0.1

    # Bonus points
    THEME_MATCH_BONUS = 20.0
    FREQUENCY_BONUS_PER_APPEARANCE = 10.0

    # Output
    OUTPUT_FILE = "step3_refined_ranking.json"
    TOP_N_RESULTS = 10

print("Configuration loaded")
print("="*80)


STEP 3: THEME EXTRACTION AND AGGREGATION - VLM INTELLIGENCE LAYER
Configuration loaded


# =================================================================
Step 2: IMPORT LIBRARIES
# =================================================================


In [18]:
import json
import numpy as np
from typing import Dict, List, Any
from collections import Counter, defaultdict

print("Imports complete")
print("="*80)



Imports complete


# =================================================================
Step 3: THEME ANALYSIS FUNCTIONS
# =================================================================


In [19]:
def analyze_theme_distribution(per_image_matches: Dict) -> Dict[str, Any]:
    """
    Analyze theme distribution across all matched destinations.

    Args:
        per_image_matches: Results from Step 2

    Returns:
        Dictionary with theme analysis
    """

    print("\n" + "="*80)
    print("ANALYZING THEME DISTRIBUTION")
    print("="*80)

    theme_counter = Counter()
    total_matches = 0

    # Count themes from all images
    for image_key, image_data in per_image_matches.items():
        matches = image_data['matches']
        for match in matches:
            theme = match['theme']
            theme_counter[theme] += 1
            total_matches += 1

    # Find dominant theme
    if not theme_counter:
        return {
            'dominant_theme': 'Unknown',
            'theme_confidence': 0.0,
            'theme_distribution': {},
            'total_matches': 0
        }

    dominant_theme = theme_counter.most_common(1)[0][0]
    dominant_count = theme_counter[dominant_theme]
    theme_confidence = dominant_count / total_matches if total_matches > 0 else 0

    print("Total matches analyzed: " + str(total_matches))
    print("\nTheme distribution:")
    for theme, count in theme_counter.most_common():
        percentage = (count / total_matches) * 100
        print("  " + theme + ": " + str(count) + " (" + str(round(percentage, 1)) + "%)")

    print("\nDominant theme: " + dominant_theme)
    print("Confidence: " + str(round(theme_confidence * 100, 1)) + "%")
    print("="*80)

    return {
        'dominant_theme': dominant_theme,
        'theme_confidence': theme_confidence,
        'theme_distribution': dict(theme_counter),
        'total_matches': total_matches
    }


print("Theme analysis functions loaded")
print("="*80)


Theme analysis functions loaded
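
The vote above picks the most frequent theme across all matches, so the reported confidence is that theme's share of the matches and can fall well below 50%. A small sketch with hypothetical themes for two uploaded images:

```python
from collections import Counter

# Hypothetical themes of the matches returned for two uploaded images.
themes_per_image = {
    "image_1": ["Beach", "Beach", "Heritage"],
    "image_2": ["Beach", "Hill Station", "Heritage"],
}

# Pool the themes of every match, regardless of which image produced it.
theme_counter = Counter(
    theme for themes in themes_per_image.values() for theme in themes
)
total_matches = sum(theme_counter.values())

# most_common(1) returns the single (theme, count) pair with the highest count.
dominant_theme, dominant_count = theme_counter.most_common(1)[0]
theme_confidence = dominant_count / total_matches

print(dominant_theme, theme_confidence)
```

When two themes are tied, `Counter.most_common` returns them in first-encountered order, so the order of the images and matches decides the winner.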


# =================================================================
Step 4: DESTINATION AGGREGATION FUNCTIONS
# =================================================================


In [20]:
def aggregate_destination_data(per_image_matches: Dict) -> Dict[str, Any]:
    """
    Aggregate data for each unique destination across all images.

    For each destination, tracks:
    - All similarity scores
    - Number of appearances
    - Theme

    Args:
        per_image_matches: Results from Step 2

    Returns:
        Dictionary mapping destination_id to aggregated data
    """

    print("\n" + "="*80)
    print("AGGREGATING DESTINATION DATA")
    print("="*80)

    destination_data = defaultdict(lambda: {
        'scores': [],
        'theme': None,
        'name': None,
        'state': None,
        'appearances': 0,
        'appeared_in_images': []
    })

    # Aggregate across all images
    for image_key, image_data in per_image_matches.items():
        matches = image_data['matches']

        for match in matches:
            dest_id = match['destination_id']

            destination_data[dest_id]['scores'].append(match['raw_score'])
            destination_data[dest_id]['theme'] = match['theme']
            destination_data[dest_id]['name'] = match['destination_name']
            destination_data[dest_id]['state'] = match['state']
            destination_data[dest_id]['appearances'] += 1
            destination_data[dest_id]['appeared_in_images'].append(image_key)

    print("Total unique destinations: " + str(len(destination_data)))

    # Show top destinations by frequency
    sorted_by_freq = sorted(
        destination_data.items(),
        key=lambda x: x[1]['appearances'],
        reverse=True
    )

    print("\nMost frequently matched destinations:")
    for i, (dest_id, data) in enumerate(sorted_by_freq[:5], 1):
        print("  " + str(i) + ". " + data['name'] + " - appeared in " +
              str(data['appearances']) + " image(s)")

    print("="*80)

    return dict(destination_data)


print("Aggregation functions loaded")
print("="*80)

Aggregation functions loaded
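
A reduced sketch of the same aggregation pattern, using hypothetical `(image_key, destination_id, raw_score)` triples, shows how a destination matched by two images accumulates both scores:

```python
from collections import defaultdict

# Hypothetical (image_key, destination_id, raw_score) triples from Step 2.
matches = [
    ("image_1", "dest_001", 0.31),
    ("image_1", "dest_002", 0.28),
    ("image_2", "dest_001", 0.27),
]

# Each new destination ID starts with empty score and image lists.
aggregated = defaultdict(lambda: {"scores": [], "appeared_in_images": []})
for image_key, dest_id, raw_score in matches:
    aggregated[dest_id]["scores"].append(raw_score)
    aggregated[dest_id]["appeared_in_images"].append(image_key)

for dest_id, data in aggregated.items():
    print(dest_id, len(data["scores"]), data["appeared_in_images"])
```

The number of appearances always equals `len(data["scores"])`, which is why this reduced version does not track it separately.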


# =================================================================
Step 5: RE-RANKING FUNCTION
# =================================================================


In [21]:
def rerank_destinations(destination_data: Dict, theme_analysis: Dict) -> List[Dict]:
    """
    Re-rank destinations using multi-factor weighted scoring.

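    Scoring formula (the components are on different scales):
    - Average similarity: mean raw score * AVG_SIMILARITY_WEIGHT (0.4)
    - Max similarity: max raw score * MAX_SIMILARITY_WEIGHT (0.3)
    - Frequency bonus: appearances * FREQUENCY_BONUS_PER_APPEARANCE (10.0)
      * FREQUENCY_WEIGHT (0.2)
    - Theme match bonus: THEME_MATCH_BONUS (20.0) * THEME_MATCH_WEIGHT (0.1),
      applied only when the destination's theme is the dominant theme

    The weights are therefore not percentage shares of the final score:
    each appearance adds 2.0 points and a theme match adds 2.0 points.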

    Args:
        destination_data: Aggregated destination data
        theme_analysis: Theme analysis results

    Returns:
        Sorted list of destinations with final scores
    """

    print("\n" + "="*80)
    print("RE-RANKING DESTINATIONS")
    print("="*80)

    dominant_theme = theme_analysis['dominant_theme']
    print("Dominant theme: " + dominant_theme)
    print("\nApplying weighted scoring...")

    ranked_destinations = []

    for dest_id, data in destination_data.items():
        # Calculate score components
        avg_score = np.mean(data['scores'])
        max_score = max(data['scores'])
        frequency = data['appearances']
        theme_match = 1.0 if data['theme'] == dominant_theme else 0.0

        # Weighted final score
        final_score = (
            avg_score * Step3Config.AVG_SIMILARITY_WEIGHT +
            max_score * Step3Config.MAX_SIMILARITY_WEIGHT +
            (frequency * Step3Config.FREQUENCY_BONUS_PER_APPEARANCE) * Step3Config.FREQUENCY_WEIGHT +
            (theme_match * Step3Config.THEME_MATCH_BONUS) * Step3Config.THEME_MATCH_WEIGHT
        )

        ranked_destinations.append({
            'destination_id': dest_id,
            'destination_name': data['name'],
            'state': data['state'],
            'theme': data['theme'],
            'avg_similarity': round(avg_score * 100, 2),
            'max_similarity': round(max_score * 100, 2),
            'appearances': frequency,
            'appeared_in_images': data['appeared_in_images'],
            'theme_match': data['theme'] == dominant_theme,
            'final_score': round(final_score, 2)
        })

    # Sort by final score
    ranked_destinations.sort(key=lambda x: x['final_score'], reverse=True)

    print("\nTop " + str(Step3Config.TOP_N_RESULTS) + " ranked destinations:")
    for i, dest in enumerate(ranked_destinations[:Step3Config.TOP_N_RESULTS], 1):
        stars = "**" if dest['appearances'] > 1 else ""
        theme_indicator = " (THEME MATCH)" if dest['theme_match'] else ""
        print("  " + str(i) + ". " + dest['destination_name'] + stars +
              " - Score: " + str(dest['final_score']) + theme_indicator)
        print("     Avg: " + str(dest['avg_similarity']) + "%, " +
              "Max: " + str(dest['max_similarity']) + "%, " +
              "Appears: " + str(dest['appearances']) + "x")

    print("\n** = Appeared in multiple images")
    print("="*80)

    return ranked_destinations


print("Re-ranking function loaded")
print("="*80)


Re-ranking function loaded
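
A worked example of the weighted score for one hypothetical destination. The weights and bonuses are copied from `Step3Config`; the two similarity scores are made up and assumed to lie on a 0-1 scale (an assumption, based on the code multiplying them by 100 for display). `score_components` is a hypothetical helper used only here:

```python
# Weights and bonuses copied from Step3Config.
AVG_SIMILARITY_WEIGHT = 0.4
MAX_SIMILARITY_WEIGHT = 0.3
FREQUENCY_WEIGHT = 0.2
THEME_MATCH_WEIGHT = 0.1
THEME_MATCH_BONUS = 20.0
FREQUENCY_BONUS_PER_APPEARANCE = 10.0


def score_components(scores, appearances, theme_match):
    """Return the four terms of the final score separately."""
    avg_score = sum(scores) / len(scores)
    return {
        "avg": avg_score * AVG_SIMILARITY_WEIGHT,
        "max": max(scores) * MAX_SIMILARITY_WEIGHT,
        "frequency": appearances * FREQUENCY_BONUS_PER_APPEARANCE * FREQUENCY_WEIGHT,
        "theme": (THEME_MATCH_BONUS if theme_match else 0.0) * THEME_MATCH_WEIGHT,
    }


# Hypothetical destination: matched by two images, theme equals the dominant theme.
components = score_components(scores=[0.30, 0.34], appearances=2, theme_match=True)
final_score = sum(components.values())

print({name: round(value, 3) for name, value in components.items()})
print(round(final_score, 2))
```

Under that assumption the two similarity terms together stay below 0.7, so the frequency and theme terms largely decide the ordering. If similarity should carry more weight, the bonuses or the score scale would need to change.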


# =================================================================
Step 6: SAVE STEP 3 OUTPUTS
# =================================================================


In [22]:
def save_step3_outputs(theme_analysis: Dict, ranked_destinations: List[Dict]) -> None:
    """
    Save Step 3 outputs for Step 4 (UI display).

    Saves:
    - step3_refined_ranking.json: Final ranked results
    """

    print("\n" + "="*80)
    print("SAVING STEP 3 OUTPUTS")
    print("="*80)

    output_data = {
        'theme_analysis': {
            'dominant_theme': theme_analysis['dominant_theme'],
            'theme_confidence': round(theme_analysis['theme_confidence'] * 100, 2),
            'theme_distribution': theme_analysis['theme_distribution'],
            'total_matches': theme_analysis['total_matches']
        },
        'ranked_destinations': ranked_destinations[:Step3Config.TOP_N_RESULTS],
        'total_destinations': len(ranked_destinations)
    }

    # Save to JSON
    output_path = '/content/' + Step3Config.OUTPUT_FILE
    with open(output_path, 'w') as f:
        json.dump(output_data, f, indent=2)

    print("Saved to: " + output_path)
    print("Top " + str(Step3Config.TOP_N_RESULTS) + " destinations saved")
    print("="*80)
    print("\nReady for Step 4: Visual Gallery Display")
    print("="*80)


print("Save function loaded")
print("="*80)

Save function loaded


# =================================================================
Step 7: MAIN EXECUTION FUNCTION
# =================================================================


In [23]:
def run_step3(step2_output_path: str = '/content/step2_matches.json'):
    """
    Complete Step 3 execution.

    Args:
        step2_output_path: Path to Step 2 output file

    Returns:
        Dictionary with theme analysis and ranked destinations
    """

    print("\n" + "="*80)
    print("EXECUTING STEP 3: THEME EXTRACTION AND AGGREGATION")
    print("="*80)

    # Load Step 2 results
    print("\nLoading Step 2 results...")
    try:
        with open(step2_output_path, 'r') as f:
            step2_data = json.load(f)
        print("Loaded Step 2 results")
    except FileNotFoundError:
        print("ERROR: Step 2 results not found at " + step2_output_path)
        print("Please run Step 2 first")
        return None

    per_image_matches = step2_data['per_image_matches']

    # Step 1: Analyze themes
    theme_analysis = analyze_theme_distribution(per_image_matches)

    # Step 2: Aggregate destination data
    destination_data = aggregate_destination_data(per_image_matches)

    # Step 3: Re-rank destinations
    ranked_destinations = rerank_destinations(destination_data, theme_analysis)

    # Step 4: Save outputs
    save_step3_outputs(theme_analysis, ranked_destinations)

    print("\n" + "="*80)
    print("STEP 3 COMPLETE")
    print("="*80)
    print("\nSummary:")
    print("  Dominant theme: " + theme_analysis['dominant_theme'])
    print("  Confidence: " + str(round(theme_analysis['theme_confidence'] * 100, 1)) + "%")
    if ranked_destinations:
        print("  Top destination: " + ranked_destinations[0]['destination_name'])
        print("  Final score: " + str(ranked_destinations[0]['final_score']))
    else:
        print("  No destinations were ranked")
    print("="*80)

    return {
        'theme_analysis': theme_analysis,
        'ranked_destinations': ranked_destinations
    }


print("Main execution function loaded")
print("="*80)
print("\nSTEP 3 INITIALIZED - Ready to extract themes and rank destinations")
print("="*80)
print("\nTO RUN:")
print("  result = run_step3()")
print("="*80)

Main execution function loaded

STEP 3 INITIALIZED - Ready to extract themes and rank destinations

TO RUN:
  result = run_step3()



---
## From: recommended_destinations.ipynb
---


**VLM INTELLIGENCE LAYER**


STEP 4: RECOMMENDED DESTINATIONS
Enrich top destinations with full metadata for visual gallery display

Features:
- Load top-ranked destinations from Step 3
- Enrich with VL encoding prompts (characteristics)
- Add geo-location data
- Add offers (hotels, activities, flights, packages)
- Prepare gallery-ready JSON for frontend

# =================================================================
PREREQUISITES
# =================================================================

MUST RUN BEFORE THIS:
1. Step 1 (image analysis)
2. Step 2 (destination matching)
3. Step 3 (theme aggregation and ranking)
   This creates step3_refined_ranking.json


# =================================================================
Step 1: CONFIGURATION
# =================================================================

In [24]:
print("="*80)
print("STEP 4: RECOMMENDED DESTINATIONS - VLM INTELLIGENCE LAYER")
print("="*80)

class Step4Config:
    """Configuration for Step 4"""

    # Paths
    BASE_PATH = '/content/drive/MyDrive/visual-intelligence-travel-finance'
    PROMPTS_PATH = BASE_PATH + '/data/vl_encoding/prompts'
    METADATA_PATH = BASE_PATH + '/data/landmarks/metadata.json'
    IMAGES_BASE_PATH = BASE_PATH + '/data/landmarks'

    # Display settings
    TOP_N_DISPLAY = 10
    MAX_PROMPTS_PER_CATEGORY = 3
    MAX_IMAGES_PER_DESTINATION = 2
    IMAGE_MAX_SIZE = (400, 300)
    IMAGE_QUALITY = 70

    # Output
    OUTPUT_FILE = "step4_recommendations.json"

print("Configuration loaded")
print("="*80)


STEP 4: RECOMMENDED DESTINATIONS - VLM INTELLIGENCE LAYER
Configuration loaded


# =================================================================
Step 2: IMPORT LIBRARIES
# =================================================================


In [25]:
import json
import os
from typing import Dict, List, Any

print("Imports complete")
print("="*80)

Imports complete


# =================================================================
Step 3: LOAD REQUIRED DATA
# =================================================================


In [26]:
print("\n" + "="*80)
print("LOADING REQUIRED DATA")
print("="*80)
# Load destination prompts
print("Loading destination prompts...")
with open(Step4Config.PROMPTS_PATH + '/destination_prompts.json', 'r') as f:
    destination_prompts = json.load(f)
print("Loaded prompts for " + str(len(destination_prompts)) + " destinations")

# Load metadata
print("\nLoading metadata...")
with open(Step4Config.METADATA_PATH, 'r') as f:
    metadata = json.load(f)
print("Metadata loaded: " + str(metadata['total_destinations']) + " destinations")

print("="*80)



LOADING REQUIRED DATA
Loading destination prompts...
Loaded prompts for 168 destinations

Loading metadata...
Metadata loaded: 168 destinations


# =================================================================
Step 4: METADATA ENRICHMENT FUNCTIONS
# =================================================================


In [27]:
def get_full_destination_metadata(dest_id: str) -> Dict[str, Any]:
    """
    Get complete destination metadata including offers and geo-location.

    Args:
        dest_id: Destination ID

    Returns:
        Full metadata dictionary
    """

    for theme in metadata['themes']:
        for state in theme['states']:
            for destination in state['destinations']:
                if destination['destination_id'] == dest_id:
                    return {
                        'destination_id': dest_id,
                        'destination_name': destination['destination_name'],
                        'state': state['state_name'],
                        'theme': theme['theme_name'],
                        'folder': destination.get('folder', ''),
                        'images': destination.get('images', []),
                        'image_count': destination.get('image_count', 0),
                        'geo_location': destination.get('geo_location', {}),
                        'offers': destination.get('offers', {})
                    }

    return None


def get_destination_characteristics(dest_id: str) -> Dict[str, List[str]]:
    """
    Get semantic characteristics from VL encoding prompts.

    Args:
        dest_id: Destination ID

    Returns:
        Dictionary of category: [prompts]
    """

    if dest_id not in destination_prompts:
        return {}

    dest_data = destination_prompts[dest_id]
    aggregated = dest_data.get('aggregated_prompts', {})

    characteristics = {}

    for category, prompts in aggregated.items():
        # Get top N prompts by weighted score
        sorted_prompts = sorted(
            prompts,
            key=lambda x: x.get('weighted_score', 0),
            reverse=True
        )

        # Extract just the text
        characteristics[category] = [
            p['text'] for p in sorted_prompts[:Step4Config.MAX_PROMPTS_PER_CATEGORY]
        ]

    return characteristics


def get_image_urls(dest_id: str, folder: str, images: List[str]) -> List[str]:
    """
    Generate image URLs/paths for gallery display.

    Args:
        dest_id: Destination ID
        folder: Image folder path
        images: List of image filenames

    Returns:
        List of image paths
    """

    if not images:
        return []

    # In production, these would be actual URLs
    # For now, return Drive paths
    image_paths = []
    for img_filename in images[:5]:
        path = Step4Config.IMAGES_BASE_PATH + '/' + folder + '/' + img_filename
        image_paths.append(path)

    return image_paths


print("Enrichment functions loaded")
print("="*80)

# -----------------------------------------------------------------
# Convert Google Drive images to base64 data URIs for web display
# -----------------------------------------------------------------

import base64
import io
import os
from typing import Optional

from PIL import Image

def convert_image_to_base64(image_path: str) -> Optional[str]:
    """Downscale an image and return it as a JPEG data URI (None on failure)."""

    try:
        if not os.path.exists(image_path):
            return None

        img = Image.open(image_path)

        # Use config values for size
        img.thumbnail(Step4Config.IMAGE_MAX_SIZE, Image.Resampling.LANCZOS)

        if img.mode != 'RGB':
            img = img.convert('RGB')

        buffer = io.BytesIO()
        img.save(buffer, format='JPEG', quality=Step4Config.IMAGE_QUALITY)  # Lower quality
        buffer.seek(0)

        img_base64 = base64.b64encode(buffer.read()).decode('utf-8')
        return "data:image/jpeg;base64," + img_base64

    except Exception as e:
        print("    ERROR: " + str(e))
        return None


def get_destination_images_base64(dest_id: str, folder: str, images: List[str],
                                  max_images: int = Step4Config.MAX_IMAGES_PER_DESTINATION) -> List[str]:
    """
    Get destination images as base64 strings.

    Args:
        dest_id: Destination ID
        folder: Image folder path
        images: List of image filenames
        max_images: Maximum number of images to include

    Returns:
        List of base64 data URIs
    """

    if not images:
        return []

    base64_images = []
    images_path = Step4Config.IMAGES_BASE_PATH + '/' + folder

    for img_filename in images[:max_images]:
        full_path = images_path + '/' + img_filename

        base64_str = convert_image_to_base64(full_path)

        if base64_str:
            base64_images.append(base64_str)

    return base64_images



Enrichment functions loaded
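
`get_full_destination_metadata` walks the themes, states, and destinations on every call. With ten lookups per request that is cheap, but a flat index built once is an alternative. A minimal sketch, assuming the nested layout that the function above reads; `build_destination_index` and the toy metadata are hypothetical:

```python
from typing import Any, Dict


def build_destination_index(metadata: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Flatten themes -> states -> destinations into a dict keyed by destination_id."""
    index = {}
    for theme in metadata.get("themes", []):
        for state in theme.get("states", []):
            for destination in state.get("destinations", []):
                index[destination["destination_id"]] = {
                    "destination_name": destination["destination_name"],
                    "state": state["state_name"],
                    "theme": theme["theme_name"],
                    "geo_location": destination.get("geo_location", {}),
                    "offers": destination.get("offers", {}),
                }
    return index


# Tiny made-up metadata with the same nesting as metadata.json.
toy_metadata = {
    "themes": [
        {
            "theme_name": "Beach",
            "states": [
                {
                    "state_name": "Example State",
                    "destinations": [
                        {"destination_id": "dest_001", "destination_name": "Example Beach"}
                    ],
                }
            ],
        }
    ]
}

index = build_destination_index(toy_metadata)
print(index.get("dest_001"))
print(index.get("missing_id"))
```

`dict.get` returns `None` for an unknown ID, which matches the behaviour of the loop-based lookup.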


# =================================================================
Step 5: BUILD RECOMMENDATIONS
# =================================================================


In [28]:
# Image conversion helpers (convert_image_to_base64, get_destination_images_base64)
# are defined in the previous cell and read their size and quality from Step4Config.


def build_recommendations(ranked_destinations: List[Dict], theme_analysis: Dict) -> Dict[str, Any]:
    """
    Build recommendations WITHOUT base64 images (memory efficient).
    """

    print("\n" + "="*80)
    print("BUILDING RECOMMENDATIONS")
    print("="*80)

    recommendations = []

    for i, dest in enumerate(ranked_destinations[:Step4Config.TOP_N_DISPLAY], 1):
        dest_id = dest['destination_id']

        print("\n[" + str(i) + "] " + dest['destination_name'])

        # Get metadata
        full_metadata = get_full_destination_metadata(dest_id)

        if not full_metadata:
            print("  SKIP: Metadata not found")
            continue

        # Get characteristics
        characteristics = get_destination_characteristics(dest_id)
        print("  Characteristics: " + str(len(characteristics)) + " categories")

        # Instead of converting images, just note they exist
        image_count = len(full_metadata.get('images', []))
        print("  Images available: " + str(image_count))

        # Build recommendation (no images array)
        recommendation = {
            'rank': i,
            'destination_id': dest_id,
            'destination_name': dest['destination_name'],
            'state': dest['state'],
            'theme': dest['theme'],
            'similarity_score': dest['avg_similarity'],
            'max_similarity': dest['max_similarity'],
            'appearances': dest['appearances'],
            'appeared_in_images': dest['appeared_in_images'],
            'theme_match': dest['theme_match'],
            'final_score': dest['final_score'],
            'characteristics': characteristics,
            'geo_location': full_metadata['geo_location'],
            'offers': full_metadata['offers'],
            'images': [],  # Empty array - no base64
            'image_count': image_count
        }

        recommendations.append(recommendation)

    print("\n" + "="*80)
    print("ENRICHMENT COMPLETE")
    print("="*80)
    print("Total recommendations: " + str(len(recommendations)))
    print("="*80)

    return {
        'user_profile': {
            'dominant_theme': theme_analysis['dominant_theme'],
            'theme_confidence': round(theme_analysis['theme_confidence'] * 100, 2),
            'theme_distribution': theme_analysis['theme_distribution']
        },
        'recommendations': recommendations,
        'total_recommendations': len(recommendations)
    }

# =================================================================
Step 6: SAVE STEP 4 OUTPUTS
# =================================================================


In [29]:
def save_step4_outputs(recommendations_data: Dict) -> None:
    """
    Save final recommendations for frontend display.

    Saves:
    - step4_recommendations.json: Complete gallery data
    """

    print("\n" + "="*80)
    print("SAVING STEP 4 OUTPUTS")
    print("="*80)

    output_path = '/content/' + Step4Config.OUTPUT_FILE
    with open(output_path, 'w') as f:
        json.dump(recommendations_data, f, indent=2)

    print("Saved to: " + output_path)
    print("="*80)
    print("\nRECOMMENDATIONS READY FOR DISPLAY")
    print("="*80)


print("Save function loaded")
print("="*80)


Save function loaded


# =================================================================
Step 7: MAIN EXECUTION FUNCTION
# =================================================================


In [30]:
def run_step4(step3_output_path: str = '/content/step3_refined_ranking.json'):
    """
    Complete Step 4 execution.

    Args:
        step3_output_path: Path to Step 3 output file

    Returns:
        Complete recommendations dictionary
    """

    print("\n" + "="*80)
    print("EXECUTING STEP 4: RECOMMENDED DESTINATIONS")
    print("="*80)

    # Load Step 3 results
    print("\nLoading Step 3 results...")
    try:
        with open(step3_output_path, 'r') as f:
            step3_data = json.load(f)
        print("Loaded Step 3 results")
    except FileNotFoundError:
        print("ERROR: Step 3 results not found at " + step3_output_path)
        print("Please run Step 3 first")
        return None

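    ranked_destinations = step3_data['ranked_destinations']

    # Step 3 saves theme_confidence as a percentage, while build_recommendations
    # expects a 0-1 fraction (it multiplies by 100), so convert it back here.
    theme_analysis = dict(step3_data['theme_analysis'])
    theme_analysis['theme_confidence'] = theme_analysis['theme_confidence'] / 100.0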

    # Build recommendations
    recommendations_data = build_recommendations(ranked_destinations, theme_analysis)

    # Save outputs
    save_step4_outputs(recommendations_data)

    print("\n" + "="*80)
    print("STEP 4 COMPLETE")
    print("="*80)
    print("\nSummary:")
    print("  Theme: " + recommendations_data['user_profile']['dominant_theme'])
    print("  Confidence: " + str(recommendations_data['user_profile']['theme_confidence']) + "%")
    print("  Recommendations: " + str(recommendations_data['total_recommendations']))
    if recommendations_data['recommendations']:
        print("  Top destination: " + recommendations_data['recommendations'][0]['destination_name'])
    print("="*80)

    return recommendations_data


print("Main execution function loaded")
print("="*80)
print("\nSTEP 4 INITIALIZED - Ready to build recommendations")
print("="*80)
print("\nTO RUN:")
print("  result = run_step4()")
print("="*80)

Main execution function loaded

STEP 4 INITIALIZED - Ready to build recommendations

TO RUN:
  result = run_step4()



---
## From: backend_api_server.ipynb
---


VLM INTELLIGENCE LAYER


UNIFIED BACKEND API


Complete pipeline: Steps 1-2-3-4 in a single API call

Run this AFTER loading all Step 1-4 cells

# =================================================================
PREREQUISITES
# =================================================================

BEFORE RUNNING THIS:
1. Run ALL cells from Step1_Production_Clean.py
2. Run ALL cells from Step2_Destination_Matching.py
3. Run ALL cells from Step3_Theme_Aggregation.py
4. Run ALL cells from Step4_Recommended_Destinations.py
5. Set ngrok auth token
6. Start server with start_complete_server()


# =================================================================
Step 1: IMPORTS
# =================================================================


In [31]:
print("="*80)
print("UNIFIED BACKEND API - VLM INTELLIGENCE LAYER")
print("="*80)

# Install pyngrok if not already installed
!pip install pyngrok

from fastapi import FastAPI, UploadFile, File
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from typing import List
import uvicorn
from pyngrok import ngrok
from threading import Thread

print("Imports complete")
print("="*80)

UNIFIED BACKEND API - VLM INTELLIGENCE LAYER
Imports complete


# =================================================================
Step 2: CREATE FASTAPI APP
# =================================================================


In [32]:
app = FastAPI(
    title="VLM Intelligence API",
    description="Visual travel destination recommendation system",
    version="1.0"
)

# Enable CORS
app.add_middleware(
    CORSMiddleware,
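    allow_origins=["*"],
    # Credentials cannot be combined with a wildcard origin under the CORS spec;
    # the endpoints here do not rely on cookies, so credentials are disabled.
    allow_credentials=False,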
    allow_methods=["*"],
    allow_headers=["*"],
)

print("="*80)
print("FastAPI app created")
print("="*80)


FastAPI app created


# =================================================================
Step 3: COMPLETE PIPELINE ENDPOINT
# =================================================================


In [33]:
@app.get("/")
async def root():
    """Health check endpoint"""
    return {
        "status": "online",
        "service": "VLM Intelligence API - Complete Pipeline",
        "version": "1.0",
        "pipeline": "Steps 1-2-3-4",
        "endpoints": {
            "health": "GET /",
            "complete_pipeline": "POST /api/recommend-destinations"
        }
    }

@app.post("/api/recommend-destinations")
async def recommend_destinations_complete(files: List[UploadFile] = File(...)):
    """
    COMPLETE PIPELINE: Steps 1-2-3-4

    User uploads images -> Returns recommended destinations

    Pipeline:
    1. Image Analysis (validation, CLIP embeddings)
    2. Destination Matching (per-image top-10)
    3. Theme Aggregation (majority vote, re-ranking)
    4. Recommendations (enrich with metadata)

    Returns:
        JSON with user profile and top 10 recommended destinations
    """

    try:
        print("\n" + "="*80)
        print("COMPLETE PIPELINE REQUEST")
        print("="*80)
        print("Files received: " + str(len(files)))

        # ==================================================================
        # STEP 1: IMAGE ANALYSIS
        # ==================================================================

        print("\n" + "-"*80)
        print("STEP 1: IMAGE ANALYSIS")
        print("-"*80)

        # Convert files to dict
        files_dict = {}
        for file in files:
            image_bytes = await file.read()
            files_dict[file.filename] = image_bytes

        # Analyze images
        step1_result = analyze_user_images(files_dict)

        if step1_result['status'] == 'error':
            return JSONResponse(
                status_code=400,
                content={
                    "status": "error",
                    "step": "step1_image_analysis",
                    "error": step1_result.get('error'),
                    "error_code": step1_result.get('error_code'),
                    "validation_errors": step1_result.get('validation_errors', [])
                }
            )

        save_step1_outputs(step1_result)
        print("Step 1 complete: " + str(step1_result['num_processed']) + " images processed")

        # ==================================================================
        # STEP 2: DESTINATION MATCHING
        # ==================================================================

        print("\n" + "-"*80)
        print("STEP 2: DESTINATION MATCHING")
        print("-"*80)

        import numpy as np
        user_embeddings = np.array(step1_result['embeddings'])

        step2_result = match_destinations_per_image(user_embeddings)
        save_step2_outputs(step2_result)
        print("Step 2 complete: " + str(step2_result['total_unique_destinations']) + " destinations matched")

        # ==================================================================
        # STEP 3: THEME AGGREGATION
        # ==================================================================

        print("\n" + "-"*80)
        print("STEP 3: THEME AGGREGATION")
        print("-"*80)

        per_image_matches = step2_result['per_image_matches']

        theme_analysis = analyze_theme_distribution(per_image_matches)
        destination_data = aggregate_destination_data(per_image_matches)
        ranked_destinations = rerank_destinations(destination_data, theme_analysis)

        save_step3_outputs(theme_analysis, ranked_destinations)
        print("Step 3 complete: Dominant theme is " + theme_analysis['dominant_theme'])

        # ==================================================================
        # STEP 4: RECOMMENDED DESTINATIONS
        # ==================================================================

        print("\n" + "-"*80)
        print("STEP 4: RECOMMENDED DESTINATIONS")
        print("-"*80)

        recommendations_data = build_recommendations(ranked_destinations, theme_analysis)
        save_step4_outputs(recommendations_data)
        print("Step 4 complete: " + str(recommendations_data['total_recommendations']) + " recommendations ready")

        # ==================================================================
        # PREPARE RESPONSE
        # ==================================================================

        print("\n" + "="*80)
        print("PIPELINE COMPLETE")
        print("="*80)

        response = {
            "status": "success",
            "pipeline_summary": {
                "images_uploaded": step1_result['num_uploaded'],
                "images_processed": step1_result['num_processed'],
                "destinations_matched": step2_result['total_unique_destinations'],
                "dominant_theme": theme_analysis['dominant_theme'],
                "theme_confidence": round(theme_analysis['theme_confidence'] * 100, 2)
            },
            "user_profile": recommendations_data['user_profile'],
            "recommendations": recommendations_data['recommendations']
        }

        return JSONResponse(content=response)

    except NameError as e:
        return JSONResponse(
            status_code=500,
            content={
                "status": "error",
                "error": "Pipeline functions not loaded. Did you run all Step 1-4 cells?",
                "error_code": "PIPELINE_NOT_LOADED",
                "details": str(e)
            }
        )

    except Exception as e:
        import traceback
        return JSONResponse(
            status_code=500,
            content={
                "status": "error",
                "error": "Server error: " + str(e),
                "error_code": "SERVER_ERROR",
                "traceback": traceback.format_exc()
            }
        )


print("Complete pipeline endpoint defined")
print("="*80)



Complete pipeline endpoint defined
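
A caller of this endpoint has two body shapes to handle: the success body built under PREPARE RESPONSE and the error bodies returned by the Step 1 check and the two `except` branches. The helper below is a minimal client-side sketch, not part of the pipeline. `summarize_response` is a hypothetical name, and it assumes `recommendations` is a list; the structure of each recommendation is not inspected.

```python
from typing import Any, Dict


def summarize_response(payload: Dict[str, Any]) -> str:
    """Turn the endpoint's JSON body into a one-line, human-readable summary."""
    if payload.get("status") != "success":
        # Error bodies carry "error" and "error_code"; "step" is only
        # present when the failure came from the Step 1 validation branch.
        step = payload.get("step", "unknown step")
        return "FAILED at {}: {} ({})".format(
            step, payload.get("error"), payload.get("error_code")
        )

    summary = payload["pipeline_summary"]
    return (
        "{}/{} images processed, {} destinations matched, "
        "theme '{}' ({}%), {} recommendations"
    ).format(
        summary["images_processed"],
        summary["images_uploaded"],
        summary["destinations_matched"],
        summary["dominant_theme"],
        summary["theme_confidence"],  # already scaled to a percentage by the server
        len(payload["recommendations"]),  # assumes a list
    )
```

The frontend could pass the decoded JSON body straight to this function to log each request in one line.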


# =================================================================
Step 4: VERIFY ALL STEPS LOADED
# =================================================================


In [35]:
def verify_pipeline_loaded():
    """Check if all Steps 1-4 are loaded and ready."""

    print("\n" + "="*80)
    print("VERIFYING COMPLETE PIPELINE")
    print("="*80)

    required_functions = [
        ('analyze_user_images', 'Step 1'),
        ('save_step1_outputs', 'Step 1'),
        ('match_destinations_per_image', 'Step 2'),
        ('save_step2_outputs', 'Step 2'),
        ('analyze_theme_distribution', 'Step 3'),
        ('aggregate_destination_data', 'Step 3'),
        ('rerank_destinations', 'Step 3'),
        ('save_step3_outputs', 'Step 3'),
        ('build_recommendations', 'Step 4'),
        ('save_step4_outputs', 'Step 4')
    ]

    all_ok = True

    for func_name, step in required_functions:
        # A name counts as loaded only if it exists and is callable
        if callable(globals().get(func_name)):
            print("  Found " + func_name + " (" + step + ")")
        else:
            print("  MISSING: " + func_name + " (" + step + ")")
            all_ok = False

    print("="*80)

    if all_ok:
        print("SUCCESS: Complete pipeline loaded and ready")
    else:
        print("ERROR: Some pipeline functions missing")
        print("Solution: Run all cells from Steps 1-4 first")

    print("="*80)
    return all_ok
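
The check above looks names up in `globals()`, which ties it to the notebook's namespace. As a sketch of the same idea in a form that can be exercised outside the notebook, the helper below takes the namespace as an argument and returns the missing names instead of printing them. `find_missing_callables` is a hypothetical name, not one of the pipeline functions.

```python
from typing import Any, Iterable, List, Mapping


def find_missing_callables(names: Iterable[str], namespace: Mapping[str, Any]) -> List[str]:
    """Return the names that are absent from `namespace` or bound to a non-callable."""
    # namespace.get() yields None for absent names, and None is not callable,
    # so missing and wrongly-typed entries are reported the same way.
    return [name for name in names if not callable(namespace.get(name))]
```

In the notebook it might be called as `find_missing_callables(["analyze_user_images", "build_recommendations"], globals())`; an empty list means every listed function is loaded.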



# ==================================================================
Step 5: START SERVER
# ==================================================================


In [36]:
def start_complete_server():
    """Start complete pipeline server with ngrok."""

    if not verify_pipeline_loaded():
        print("\nCANNOT START SERVER")
        print("Please run all Step 1-4 cells first")
        return

    print("\n" + "="*80)
    print("STARTING COMPLETE PIPELINE SERVER")
    print("="*80)

    # Start ngrok
    print("Creating ngrok tunnel...")
    ngrok_tunnel = ngrok.connect(8000)
    public_url = ngrok_tunnel.public_url

    print("\nSERVER IS RUNNING")
    print("="*80)
    print("Public URL: " + public_url)
    print("\nFrontend should use:")
    print("  " + public_url + "/api/recommend-destinations")
    print("\nEndpoints:")
    print("  GET  " + public_url + "/")
    print("  POST " + public_url + "/api/recommend-destinations")
    print("\nPipeline: Steps 1-2-3-4 (complete)")
    print("\nTo stop: Runtime -> Interrupt execution")
    print("="*80)

    # Run uvicorn
    def run_uvicorn():
        uvicorn.run(app, host="0.0.0.0", port=8000, log_level="info")

    thread = Thread(target=run_uvicorn, daemon=True)
    thread.start()

    try:
        thread.join()
    except KeyboardInterrupt:
        print("\nServer stopped")


print("\nComplete backend API ready")
print("="*80)
print("\nTO START SERVER:")
print("1. Verify pipeline: verify_pipeline_loaded()")
print("2. Set ngrok auth: ngrok.set_auth_token('YOUR_TOKEN')")
print("3. Start server: start_complete_server()")
print("="*80)


Complete backend API ready

TO START SERVER:
1. Verify pipeline: verify_pipeline_loaded()
2. Set ngrok auth: ngrok.set_auth_token('YOUR_TOKEN')
3. Start server: start_complete_server()
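
`start_complete_server()` prints "SERVER IS RUNNING" before the uvicorn thread has been started, so for a short window the public URL may point at a port nobody is listening on. One way to close that gap is to poll the local port before announcing the URL. The helper below is a minimal sketch using only the standard library; `wait_for_port` is a hypothetical name and is not called anywhere in this notebook.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.2) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is accepting on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(interval)
    return False
```

A possible use, after `thread.start()`: call `wait_for_port("127.0.0.1", 8000)` and print the public URL only if it returns `True`.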


In [37]:
verify_pipeline_loaded()


VERIFYING COMPLETE PIPELINE
  Found analyze_user_images (Step 1)
  Found save_step1_outputs (Step 1)
  Found match_destinations_per_image (Step 2)
  Found save_step2_outputs (Step 2)
  Found analyze_theme_distribution (Step 3)
  Found aggregate_destination_data (Step 3)
  Found rerank_destinations (Step 3)
  Found save_step3_outputs (Step 3)
  Found build_recommendations (Step 4)
  Found save_step4_outputs (Step 4)
SUCCESS: Complete pipeline loaded and ready


True

In [38]:
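import os

# Read the token from the environment instead of hard-coding it in the notebook.
# NGROK_AUTHTOKEN is an assumed variable name; set it before running this cell.
ngrok.set_auth_token(os.environ['NGROK_AUTHTOKEN'])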

In [39]:
# Kill process on port 8000
!fuser -k 8000/tcp

# Wait a moment
import time
time.sleep(2)




In [None]:
# Now start server again
start_complete_server()


VERIFYING COMPLETE PIPELINE
  Found analyze_user_images (Step 1)
  Found save_step1_outputs (Step 1)
  Found match_destinations_per_image (Step 2)
  Found save_step2_outputs (Step 2)
  Found analyze_theme_distribution (Step 3)
  Found aggregate_destination_data (Step 3)
  Found rerank_destinations (Step 3)
  Found save_step3_outputs (Step 3)
  Found build_recommendations (Step 4)
  Found save_step4_outputs (Step 4)
SUCCESS: Complete pipeline loaded and ready

STARTING COMPLETE PIPELINE SERVER
Creating ngrok tunnel...


INFO:     Started server process [4845]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)



SERVER IS RUNNING
Public URL: https://dextrously-nonaddicting-alaya.ngrok-free.dev

Frontend should use:
  https://dextrously-nonaddicting-alaya.ngrok-free.dev/api/recommend-destinations

Endpoints:
  GET  https://dextrously-nonaddicting-alaya.ngrok-free.dev/
  POST https://dextrously-nonaddicting-alaya.ngrok-free.dev/api/recommend-destinations

Pipeline: Steps 1-2-3-4 (complete)

To stop: Runtime -> Interrupt execution

COMPLETE PIPELINE REQUEST
Files received: 5

--------------------------------------------------------------------------------
STEP 1: IMAGE ANALYSIS
--------------------------------------------------------------------------------

ANALYZING 5 USER IMAGES

Processing: baga_beach_001.jpg
  Validated (JPEG, 1080x810, 0.1MB)
  Preprocessed to (224, 224)
  Color analysis complete
  CLIP embedding extracted ((512,))

Processing: baga_beach_002.jpg
  Validated (JPEG, 1080x810, 0.18MB)
  Preprocessed to (224, 224)
  Color analysis complete
  CLIP embedding extracted ((512,))


In [39]:
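import signal
import os

def stop_server():
    """Stop ngrok, then terminate the process that hosts uvicorn."""
    print("\nStopping server...")

    # Kill ngrok
    from pyngrok import ngrok
    ngrok.kill()

    # uvicorn runs in a thread of this process, so SIGTERM ends the whole
    # process (in a notebook, the kernel). Print first: nothing after
    # os.kill() is guaranteed to run.
    print("Server stopped")
    os.kill(os.getpid(), signal.SIGTERM)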

In [None]:
stop_server()