## Recommendation Systems Overview

### Why It Matters
- Recommendation systems filter content intelligently.
- They guide users toward options they are likely to enjoy or need.
- They improve user satisfaction and business outcomes.

### Types & Real-World Applications

1. **News Platforms**
   - **Examples:** Google News, Flipboard
   - **Use Case:** Prioritize articles based on user reading habits (e.g., climate change, international policy).

2. **E-Commerce**
   - **Examples:** Amazon, eBay, Etsy
   - **Use Case:** Suggest related products (e.g., phone cases after buying a smartphone).

3. **Video Streaming Services**
   - **Examples:** Netflix, YouTube, Disney+
   - **Use Case:** Recommend movies/shows based on viewing history and similar user preferences.

4. **Social Media**
   - **Examples:** Instagram, TikTok, Twitter/X
   - **Use Case:** Suggest accounts to follow, videos to watch, and content to engage with.

5. **Production Planning & Control**
   - **Examples:** Manufacturing systems
   - **Use Case:** Recommend optimal machine settings, maintenance schedules, and job assignments.
   - **Benefits:** Improves efficiency, reduces downtime, enhances product quality.

## Types of Recommendation Systems

### A. Content-Based Filtering
- **How it works:** The system analyzes item attributes (genre, keywords, product category, etc.) and recommends items similar to what the user has liked in the past.
- **Advantage:** Needs no data from other users, so it can recommend new or niche items as soon as their attributes are known.
- **Limitation:** Can become too narrow or repetitive.

### B. Collaborative Filtering
- **How it works:** Identifies users with similar behavior and recommends what those users liked.
- **Types:**
  - **User-based:** "People similar to you also liked..."
  - **Item-based:** "People who liked this item also liked..."
- **Example:** Amazon recommends Product B if users who bought Product A also bought B.
- **Advantage:** Uncovers hidden patterns and suggests diverse items.
- **Limitation:** Cold-start problem for new users or items (no interaction data yet).
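
The item-based idea ("people who bought A also bought B") can be sketched with a simple co-occurrence count. The purchase matrix, the product names, and the helper `also_bought` are made up for illustration; a production system would work from logged interactions and usually a more robust similarity measure.

```python
import numpy as np

# Hypothetical purchase matrix: rows = users, columns = products.
# 1 means the user bought the product, 0 means they did not.
products = ["phone", "phone_case", "charger", "novel"]
purchases = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
])

# Item-item co-occurrence: entry (i, j) counts users who bought both i and j.
co_occurrence = purchases.T @ purchases
np.fill_diagonal(co_occurrence, 0)  # ignore an item's co-occurrence with itself

def also_bought(product, top_n=2):
    """Return the products most often bought together with `product`."""
    idx = products.index(product)
    ranked = np.argsort(-co_occurrence[idx], kind="stable")[:top_n]
    # Drop candidates that were never bought together with the product.
    return [products[j] for j in ranked if co_occurrence[idx, j] > 0]

print(also_bought("phone"))
print(also_bought("novel"))
```

The last call returns an empty list because no other product was ever bought together with the novel. This is the cold-start limitation in miniature.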

**Note:** Content-based filtering focuses on item attributes, while collaborative filtering focuses on user-item interaction patterns (both user-based and item-based).


### C. Hybrid Systems
- Combines content-based and collaborative filtering for better accuracy and diversity.
- **Example:** YouTube uses both watch history (content) and similar users’ behavior (collaboration).


## Utility Matrix in Recommendation Systems

### What is a Utility Matrix?
- A utility matrix maps **user preferences or ratings** to various items (e.g., movies, books, ads).
- Used by recommendation systems to predict which content or product a user is most likely to engage with.

### Example Scenario
- Streaming platforms like **Netflix** use utility matrices for email campaigns promoting new releases.
- Users = Customers (C1 to C4), Items = Movies/TV shows.
- Example ratings table:

|        | Stranger Things | Bridgerton | Dune | The Office | Oppenheimer | The Crown |
|--------|-----------------|------------|------|------------|-------------|-----------|
| **C1** | 5               | 4          |      |            | 2           |           |
| **C2** |                 | 2          | 5    | 4          |             |           |
| **C3** | 1               |            |      | 5          | 4           |           |
| **C4** |                 |            | 4    |            | 3           | 5         |
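
One way to hold this table in code is a NumPy array in which `np.nan` marks a missing rating. This is only a sketch: the variable names and the helper `unrated_items` are illustrative, and real utility matrices are normally stored in a sparse format because most entries are empty.

```python
import numpy as np

users = ["C1", "C2", "C3", "C4"]
items = ["Stranger Things", "Bridgerton", "Dune", "The Office", "Oppenheimer", "The Crown"]

# np.nan marks a title the customer has not rated (it is not a rating of zero).
nan = np.nan
utility = np.array([
    [5,   4,   nan, nan, 2,   nan],  # C1
    [nan, 2,   5,   4,   nan, nan],  # C2
    [1,   nan, nan, 5,   4,   nan],  # C3
    [nan, nan, 4,   nan, 3,   5],    # C4
])

def unrated_items(user):
    """List the titles a user has not rated yet (the candidates for recommendation)."""
    row = utility[users.index(user)]
    return [item for item, rating in zip(items, row) if np.isnan(rating)]

# Fraction of user-item pairs that have no rating
sparsity = np.isnan(utility).mean()

print(unrated_items("C1"))
print(f"Sparsity: {sparsity:.2f}")
```

The recommender's job is to predict values for the empty cells and surface the highest predictions to each user.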

### How Recommendations Work
- **Example 1:** If C1 liked *Stranger Things* and *Bridgerton*, recommend *The Witcher* (similar genre/tone).
- **Example 2:** If C3 watched *The Office* and *Oppenheimer*, suggest *Peaky Blinders* (similar themes or patterns).



## Item Profiles in Content-Based Recommendation Systems

### What is an Item Profile?
- A structured representation of an item's important features.
- Used to compare items and recommend those most relevant to a user's preferences.

---

### Examples

#### 1. Movie Recommendation
- **Features:** Actors, Director, Release Year, Genre
- **Use Case:** Users who prefer recent sci-fi thrillers by a specific filmmaker or classic romantic comedies with certain actors.

#### 2. Product Recommendation (E-Commerce)
- **Features:** Brand, Category, Color/Size, Technical Specs
- **Use Case:** A user browsing 4K smart TVs with 55-inch screens may be recommended similar models with slight variations.


## Building Item Profiles Using Vector Space Model & Cosine Similarity

### Why Use Vector Representation?
- Represents each item (e.g., movie) as a **feature vector**.
- Enables comparison between items using **similarity measures** like cosine similarity.
- Helps recommend similar items to users based on their preferences.



### Steps to Create an Item Profile

#### **Step 1: Define Features**
- **Binary Features (0/1):**
  - Actors: Tom Hanks, Mark Rylance, Amy Ryan, Leonardo DiCaprio, Joseph Gordon-Levitt
  - Genres: Drama, History, Thriller, Action, Sci-Fi
  - Directors: Steven Spielberg, Christopher Nolan
- **Numerical Features:**
  - Average User Rating (scaled 1.0–5.0)
  - Release Year


## Step 2: Encode as Feature Vector

### How?
- Represent each item (e.g., movie) as a **vector of features**.
- Categorical Features: Use binary encoding (1 = present, 0 = absent).
- Numerical Features: Normalize or scale values.



### Scaling Rules
- **AvgRating_scaled** = AvgRating / 5  (range: 0–1)
- **Year_scaled** = (Year - 2000) / 20  (range: 0–1)


### Example: Bridge of Spies
- Actors: Tom Hanks, Mark Rylance, Amy Ryan
- Genres: Drama, History, Thriller
- Director: Steven Spielberg
- Avg Rating: 4.3 → 4.3 / 5 = 0.86
- Release Year: 2015 → (2015 − 2000) / 20 = 0.75


### Vector for Bridge of Spies:
[1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0.86, 0.75]
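
The encoding steps above can be sketched as a small helper. The function name `encode_movie` and the constant names are hypothetical. The feature order (actors, genres, directors, scaled rating, scaled year) follows Step 1, and the scaling follows the two rules stated earlier, under which a rating of 4.3 evaluates to 4.3 / 5 = 0.86.

```python
import numpy as np

# Feature order follows Step 1: actors, genres, directors, then the two scaled numbers.
ACTORS = ["Tom Hanks", "Mark Rylance", "Amy Ryan", "Leonardo DiCaprio", "Joseph Gordon-Levitt"]
GENRES = ["Drama", "History", "Thriller", "Action", "Sci-Fi"]
DIRECTORS = ["Steven Spielberg", "Christopher Nolan"]

def encode_movie(actors, genres, director, avg_rating, year):
    """Encode one movie as a 14-dimensional feature vector."""
    binary = (
        [1.0 if a in actors else 0.0 for a in ACTORS]
        + [1.0 if g in genres else 0.0 for g in GENRES]
        + [1.0 if d == director else 0.0 for d in DIRECTORS]
    )
    rating_scaled = avg_rating / 5       # AvgRating / 5
    year_scaled = (year - 2000) / 20     # (Year - 2000) / 20
    return np.array(binary + [rating_scaled, year_scaled])

bridge_of_spies = encode_movie(
    actors={"Tom Hanks", "Mark Rylance", "Amy Ryan"},
    genres={"Drama", "History", "Thriller"},
    director="Steven Spielberg",
    avg_rating=4.3,
    year=2015,
)
print(bridge_of_spies)
```

Keeping the feature lists in one place guarantees that every movie is encoded in the same order, which cosine similarity depends on.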


# How do we decide whether two movies (items) are similar or different?

## Cosine Similarity Formula & Example



### Formula
$$
\text{Cosine Similarity} = \frac{\mathbf{A} \cdot \mathbf{B}}{\|\mathbf{A}\| \times \|\mathbf{B}\|}
$$
$$
\text{Cosine Similarity} =\frac{\sum_{i=1}^{n} A_i \times B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \times \sqrt{\sum_{i=1}^{n} B_i^2}}
$$



### Bounds
$$
-1 \leq \text{Cosine Similarity} \leq 1
$$

**Interpretation:**
- **1** → Vectors point in the same direction (perfect similarity).
- **0** → Vectors are orthogonal (no similarity).
- **-1** → Vectors point in opposite directions (completely dissimilar).

✅ When every feature is non-negative (as in binary or 0–1 scaled item vectors), the practical range is:
$$
0 \leq \text{Cosine Similarity} \leq 1
$$
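
The formula translates almost directly into NumPy. The function name `cosine_sim` is illustrative, and the zero-vector check is an added safeguard that the formula itself does not cover (the denominator would be zero).

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity: dot product divided by the product of the L2 norms."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(a @ b / denom)

print(cosine_sim([1, 2], [2, 4]))    # same direction
print(cosine_sim([1, 0], [0, 1]))    # orthogonal
print(cosine_sim([1, 2], [-1, -2]))  # opposite direction
```

The three calls illustrate the three reference points of the bounds: same direction, orthogonal, and opposite direction.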


# EX: Given the following two vectors representing two items, find their similarity.
bridge_of_spies = [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0.86, 0.75] <br>
inception = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0.96, 0.5]



In [0]:
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

# Feature vectors for each movie
# Format: [Tom Hanks, Mark Rylance, Amy Ryan, Leonardo DiCaprio, Joseph Gordon-Levitt,
# Elliot Page, Action, Drama, Sci-Fi, Thriller, Steven Spielberg, Christopher Nolan,
# Avg Rating, Release Year]

bridge_of_spies = np.array([
    1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 4.3, 2015
])

inception = np.array([
    0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 4.8, 2010
])

# Compute cosine similarity
similarity = cosine_similarity([bridge_of_spies], [inception])[0][0]

print(f"Cosine Similarity between Bridge of Spies and Inception: {similarity:.6f}")

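The score is very close to 1, which suggests the two movies are almost identical. That conclusion is wrong. Why? The raw release year (around 2000) is far larger than every other feature, so it dominates both the dot product and the norms.

**Apply normalization to the input vectors**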

In [0]:

# Scaling constants
max_rating = 5
base_year = 2000
year_range = 20  # assumes release years fall in 2000–2020

def scale_vector(vec):
    vec = vec.astype(float)  # astype returns a copy, so the raw vector is left untouched
    vec[-2] = vec[-2] / max_rating  # Scale rating to [0,1]
    vec[-1] = (vec[-1] - base_year) / year_range  # Scale year to [0,1]
    return vec

# Apply scaling
bridge_scaled = scale_vector(bridge_of_spies)
inception_scaled = scale_vector(inception)

# Print scaled vectors
print("Bridge of Spies vec:", bridge_scaled)
print("Inception vec:", inception_scaled)

# Compute cosine similarity
similarity = cosine_similarity([bridge_scaled], [inception_scaled])[0][0]
print(f"Cosine Similarity after scaling: {similarity:.4f}")

## Collaborative Filtering Overview

Unlike content-based methods that rely on item features to compute similarities, collaborative filtering focuses on **user behavior**, specifically patterns of ratings or interactions. 


### Key Principles
- Each **item** is represented by its **column** in the utility matrix (ratings from all users).
- Each **user** is represented by their **row** in the utility matrix (ratings given to items).
- Users are considered similar if their rating vectors are close, based on a similarity metric (e.g., **cosine similarity**).

### How Recommendations Are Made
1. Identify users who are similar to the target user.
2. Recommend items that these similar users liked, which the target user has not yet interacted with.
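
The two steps can be sketched on the Netflix-style utility matrix from earlier. Several choices here are assumptions made only for illustration: missing ratings are stored as 0, only the single most similar user is consulted, and "liked" means a rating of 4 or more. The helper names `cosine_sim` and `recommend` are hypothetical.

```python
import numpy as np

users = ["C1", "C2", "C3", "C4"]
items = ["Stranger Things", "Bridgerton", "Dune", "The Office", "Oppenheimer", "The Crown"]

# Rows are user rating vectors. 0 stands for "not rated" (a simplifying assumption).
ratings = np.array([
    [5, 4, 0, 0, 2, 0],  # C1
    [0, 2, 5, 4, 0, 0],  # C2
    [1, 0, 0, 5, 4, 0],  # C3
    [0, 0, 4, 0, 3, 5],  # C4
], dtype=float)

def cosine_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def recommend(target, like_threshold=4):
    t = users.index(target)
    # Step 1: find the other user whose rating vector is closest to the target's.
    sims = {u: cosine_sim(ratings[t], ratings[i]) for i, u in enumerate(users) if i != t}
    neighbor = max(sims, key=sims.get)
    n = users.index(neighbor)
    # Step 2: keep items the neighbor liked that the target has not rated.
    picks = [
        items[j]
        for j in range(len(items))
        if ratings[n, j] >= like_threshold and ratings[t, j] == 0
    ]
    return neighbor, picks

print(recommend("C1"))
```

Treating a missing rating as 0 pulls users apart whenever they have rated different titles, so practical systems often compare users only on co-rated items or mean-center the ratings first.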



## Building User Profile Using Feature Aggregation

To make personalized recommendations in a collaborative filtering approach, we need to create user vectors that align with item vectors. These user profiles are built using the utility matrix, which captures user-item interactions. Entries can be:
- **Binary indicators** (e.g., 1 for purchase)
- **Numeric ratings** (e.g., 1–5 stars)



### Example: Creating a User Profile Based on Movie Ratings

Assume each movie is described by 3 main features:
- **Genres:** Action, Drama, Comedy (binary: 1 = yes, 0 = no)
- **Lead Actor:** Actor A, Actor B, Actor C (binary)
- **User Rating:** Given by the user (1 to 5 stars)



#### Movie Profiles
| Movie    | Action | Drama | Comedy | Actor A | Actor B | Actor C |
|----------|--------|-------|--------|---------|---------|---------|
| Movie X  | 1      | 0     | 0      | 1       | 0       | 0       |
| Movie Y  | 0      | 1     | 0      | 0       | 1       | 0       |
| Movie Z  | 1      | 1     | 0      | 0       | 0       | 1       |



**Next Step:** Aggregate these features weighted by user ratings to build the **user profile vector**.

To make personalized recommendations, we create a **user profile vector** by aggregating item features weighted by the user’s normalized ratings.



### Example: User Alex's Ratings
- Movie X: 5
- Movie Y: 3
- Movie Z: 4
- **Average Rating:** (5 + 3 + 4) / 3 = 4



### Normalized Ratings (difference from average)
- Movie X: 5 − 4 = **+1**
- Movie Y: 3 − 4 = **−1**
- Movie Z: 4 − 4 = **0**



### Movie Feature Matrix
| Movie    | Action | Drama | Comedy | Actor A | Actor B | Actor C |
|----------|--------|-------|--------|---------|---------|---------|
| Movie X  | 1      | 0     | 0      | 1       | 0       | 0       |
| Movie Y  | 0      | 1     | 0      | 0       | 1       | 0       |
| Movie Z  | 1      | 1     | 0      | 0       | 0       | 1       |


### Build User Profile Vector
$$
\text{User Profile} = \frac{(+1 \times X) + (-1 \times Y) + (0 \times Z)}{3}
$$

$$
= \frac{[1,0,0,1,0,0] - [0,1,0,0,1,0]}{3}
= \frac{[1,-1,0,1,-1,0]} {3}
$$

$$
\approx [0.33,\ -0.33,\ 0,\ 0.33,\ -0.33,\ 0]
$$

### Alex's Profile
| Feature  | Value  |
|----------|--------|
| Action   | 0.33   |
| Drama    | −0.33  |
| Comedy   | 0      |
| Actor A  | 0.33   |
| Actor B  | −0.33  |
| Actor C  | 0      |

---

### Interpretation
- **Prefers:** Action (+0.33), Actor A (+0.33)
- **Dislikes:** Drama (−0.33), Actor B (−0.33)
- **Neutral:** Comedy, Actor C
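
The aggregation can be reproduced in a few lines of NumPy. This is a minimal sketch of the calculation above; the variable names are illustrative.

```python
import numpy as np

features = ["Action", "Drama", "Comedy", "Actor A", "Actor B", "Actor C"]

# One row per movie, in the order X, Y, Z.
movie_features = np.array([
    [1, 0, 0, 1, 0, 0],  # Movie X
    [0, 1, 0, 0, 1, 0],  # Movie Y
    [1, 1, 0, 0, 0, 1],  # Movie Z
], dtype=float)

alex_ratings = np.array([5.0, 3.0, 4.0])

# Subtract Alex's average rating: [+1, -1, 0]
centered = alex_ratings - alex_ratings.mean()

# Weighted sum of the movie rows, divided by the number of rated movies
profile = centered @ movie_features / len(alex_ratings)

for name, value in zip(features, profile):
    print(f"{name:8s} {value:+.2f}")
```

Movie Z contributes nothing because its rating equals Alex's average. That is why Actor C ends up neutral even though Alex has watched a film with that actor.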

##  Now we can make use of the profile built for each user

Building user profiles allows us to compare different users to determine how similar their preferences are. Such comparisons provide valuable insights that can enhance the effectiveness of a recommendation system.

---

### Example: Two User Profiles
| User    | Action | Drama  | Comedy | Actor A | Actor B | Actor C |
|---------|--------|--------|--------|---------|---------|---------|
| Alex    | 0.33   | -0.33  | 0.00   | 0.33    | -0.33   | 0.00    |
| Jordan  | -0.33  | 0.00   | 0.33   | -0.33   | 0.00    | -0.33   |

---

**Next Step:**  
How to use Alex’s profile to recommend movies to Jordan (or vice versa) will be discussed in the following section, where we show **how to measure similarity between users**.

**Apply Cosine Similarity (task_1)**

Are Alex and Jordan similar? Verify with cosine similarity.
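
A minimal check for this task, using the two profile rows from the table above and plain NumPy (the variable names are illustrative):

```python
import numpy as np

# Profile order: Action, Drama, Comedy, Actor A, Actor B, Actor C
alex = np.array([0.33, -0.33, 0.00, 0.33, -0.33, 0.00])
jordan = np.array([-0.33, 0.00, 0.33, -0.33, 0.00, -0.33])

similarity = float(alex @ jordan / (np.linalg.norm(alex) * np.linalg.norm(jordan)))
print(f"Cosine similarity between Alex and Jordan: {similarity:.2f}")
```

The result is negative (−0.50): on the only features where both profiles are non-zero, Action and Actor A, the two users have opposite preferences. Because user profiles are mean-centered, their cosine similarity can fall below 0, unlike the non-negative item vectors.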

In [0]:
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# scaled item vectors from the content-based example
bridge_of_spies = np.array([1,1,1,0,0,1,1,1,0,0,1,0,0.86,0.75])
inception       = np.array([0,0,0,1,1,0,0,1,1,1,0,1,0.96,0.5])

# similarity before normalization
sim_before = cosine_similarity([bridge_of_spies], [inception])[0][0]

# divide a vector by its L2 norm so that it has unit length
def normalize_vector(v):
    return v / np.linalg.norm(v)

# normalized vectors
bridge_norm = normalize_vector(bridge_of_spies)
inception_norm = normalize_vector(inception)

# similarity after normalization
sim_after = cosine_similarity([bridge_norm], [inception_norm])[0][0]

# Cosine similarity already divides by both norms, so unit-length
# normalization leaves the score unchanged (up to floating-point error).
print("Similarity BEFORE normalization:", sim_before)
print("Similarity AFTER normalization:", sim_after)


In [0]:
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# scaled item vectors (rating and year are already in [0, 1])
bridge_of_spies = np.array([1,1,1,0,0,1,1,1,0,0,1,0,0.86,0.75])
inception       = np.array([0,0,0,1,1,0,0,1,1,1,0,1,0.96,0.50])

# halve the two numeric features so they carry less weight than a binary feature
def scale(v):
    v = v.copy()        # leave the input vector unchanged
    v[-2] = v[-2] / 2   # down-weight the scaled rating
    v[-1] = v[-1] / 2   # down-weight the scaled release year
    return v

bridge_scaled = scale(bridge_of_spies)
inception_scaled = scale(inception)

# cosine similarity
similarity = cosine_similarity([bridge_scaled], [inception_scaled])[0][0]
print("Similarity:", similarity)
