# FAISS Demo: Media-to-Destination Recommendations

This notebook demonstrates our backend retrieval pipeline:

**Input media → Recommended destination cities**

Data sources (sampled):
- `movie_sample_50.csv`
- `book_sample_100.csv`
- `music_sample_100.csv`
- `destination_sample_wikipedia.csv`
- Precomputed embeddings in `data/embeddings/`
- FAISS index `destinations_faiss_ip.index`

Workflow:
1. User selects a media type (movie, book, or music) and enters a title.
2. We look up the corresponding embedding vector.
3. We query the FAISS index over destination embeddings.
4. We return Top-K recommended destinations with similarity scores.
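
The four steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy vectors, the names `dest_a` to `dest_c`, and the helper `top_k_by_inner_product` are hypothetical, and the brute-force loop stands in for the FAISS lookup. The `_ip` suffix in the index filename suggests inner-product scoring, which is what the sketch computes; the real engine may differ in its details.

```python
import heapq

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def top_k_by_inner_product(query, destinations, k=5):
    """Brute-force stand-in for an inner-product index search.

    `destinations` maps a destination name to its embedding vector.
    Returns the k best matches, highest score first.
    """
    scored = ((dot(query, vec), name) for name, vec in destinations.items())
    best = heapq.nlargest(k, scored)
    return [
        {"rank": i + 1, "score": score, "name": name}
        for i, (score, name) in enumerate(best)
    ]

# Toy data (hypothetical, not taken from the sample CSVs).
media_embeddings = {"Some Title": [0.6, 0.8, 0.0]}
destination_embeddings = {
    "dest_a": [0.6, 0.8, 0.0],
    "dest_b": [0.0, 1.0, 0.0],
    "dest_c": [0.0, 0.0, 1.0],
}

# Step 2: look up the embedding for the selected title.
query = media_embeddings["Some Title"]
# Steps 3 and 4: score every destination and keep the top k.
results = top_k_by_inner_product(query, destination_embeddings, k=2)
for row in results:
    print(row)
```

If the embeddings are L2-normalized, as in this toy data, the inner product equals cosine similarity, so a higher score means a closer match.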

This notebook is a demo for the midpoint deliverable. It shows that the retrieval pipeline works end to end.


In [1]:
import os
import sys
import pandas as pd

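# Ensure we can import from src/.
# `__file__` is not defined inside a notebook, so use the working directory,
# which is assumed to be the notebook's folder, one level below the repo root.
NOTEBOOK_DIR = os.getcwd()
REPO_ROOT = os.path.dirname(NOTEBOOK_DIR)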

if REPO_ROOT not in sys.path:
    sys.path.append(REPO_ROOT)

from src.search_service import get_engine

In [2]:
engine = get_engine()
engine  # Display to confirm successful initialization

<src.search_service.RecommendationEngine at 0x164501a00>

In [3]:
from typing import Literal

MediaType = Literal["movie", "book", "music"]

def show_recommendations(media_type: MediaType, title: str, top_k: int = 5):
    """
    Convenience wrapper for the demo.

    - Calls the backend recommendation engine.
    - Prints basic context.
    - Displays a table of top-k destinations.
    """
    print(f"Media type: {media_type}")
    print(f"Input title: {title}\n")

    results = engine.recommend_from_media(media_type, title, top_k=top_k)

    if not results:
        print("No exact match found for that title.")
        suggestions = engine.suggest_titles(media_type, title, max_suggestions=5)
        if suggestions:
            print("Did you mean:")
            for s in suggestions:
                print(f" - {s}")
        else:
            print("No similar titles found in the sample dataset. Try another query.")
        return

    df = pd.DataFrame(results)
    # Keep key columns for readability if they exist
    cols = [c for c in ["rank", "score", "name", "city", "country", "region"] if c in df.columns]
    display(df[cols] if cols else df)
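
The fallback branch above relies on `engine.suggest_titles`, whose implementation is not shown in this notebook. As a rough idea of how such a "Did you mean" lookup could work, here is a minimal sketch using the standard library's `difflib`. The function name `suggest_titles_sketch`, the `cutoff` value, and the title list are assumptions for illustration, not the engine's actual behavior.

```python
from difflib import get_close_matches

def suggest_titles_sketch(query, known_titles, max_suggestions=5, cutoff=0.6):
    """Return known titles that look similar to `query`, ignoring case."""
    # Map lowercase titles back to their original spelling.
    lowered = {title.lower(): title for title in known_titles}
    matches = get_close_matches(
        query.lower(), list(lowered), n=max_suggestions, cutoff=cutoff
    )
    return [lowered[m] for m in matches]

# Hypothetical title list; the real one would come from the sample CSV.
titles = ["Inception", "The Dark Knight", "The Godfather"]
suggestions = suggest_titles_sketch("incepton", titles)
print(suggestions)
```

A character-level matcher like this catches typos but not paraphrases, so a production version might combine it with prefix or token matching.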

## Example 1: Movie → Destinations

Use one of the sample movie titles from `movie_sample_50.csv`.
For example: `Inception`, `The Dark Knight`, `The Godfather`.


In [4]:
show_recommendations("movie", "Inception", top_k=5)

Media type: movie
Input title: Inception



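| | rank | score | name | city | country | region |
|---|---|---|---|---|---|---|
| 0 | 1 | 0.176968 | Lagos | lagos | NG | 5 |
| 1 | 2 | 0.097664 | Adana | adana | TR | 81 |
| 2 | 3 | 0.093941 | Ibadan | ibadan | NG | 32 |
| 3 | 4 | 0.082915 | Douala | douala | CM | 5 |
| 4 | 5 | 0.078682 | Nairobi | nairobi | KE | 5 |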


## Example 2: Book → Destinations

Use one of the sample book titles from `book_sample_100.csv`.
For example: `Harry Potter and the Deathly Hallows (Harry Potter, #7)`.


In [5]:
show_recommendations("book", "Harry Potter and the Deathly Hallows (Harry Potter, #7)", top_k=5)

Media type: book
Input title: Harry Potter and the Deathly Hallows (Harry Potter, #7)



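| | rank | score | name | city | country | region |
|---|---|---|---|---|---|---|
| 0 | 1 | 0.17728 | New York | new york | US | NY |
| 1 | 2 | 0.13359 | Xian | xian | CN | 26 |
| 2 | 3 | 0.106699 | Delhi | delhi | IN | 07 |
| 3 | 4 | 0.103552 | New Delhi | new delhi | IN | 07 |
| 4 | 5 | 0.095273 | Bombay | bombay | IN | 16 |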


## Example 3: Music → Destinations

Use one of the sample tracks from `music_sample_100.csv` by `track_name`.
For example: `Blinding Lights`, `As It Was`, `Heat Waves` (if present in the sample).


In [6]:
show_recommendations("music", "Blinding Lights", top_k=5)

Media type: music
Input title: Blinding Lights



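| | rank | score | name | city | country | region |
|---|---|---|---|---|---|---|
| 0 | 1 | 0.204283 | Maracaibo | maracaibo | VE | 23 |
| 1 | 2 | 0.101712 | Dar es Salaam | dar es salaam | TZ | 23 |
| 2 | 3 | 0.099754 | Xian | xian | CN | 26 |
| 3 | 4 | 0.089034 | Tashkent | tashkent | UZ | 13 |
| 4 | 5 | 0.08849 | Manila | manila | PH | D9 |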


## How the Frontend Will Use This

The Streamlit / Hugging Face Spaces frontend can directly reuse the same engine:

```python
from src.search_service import get_engine

engine = get_engine()
results = engine.recommend_from_media(media_type, user_input_title, top_k=5)
```
