# Alternating Least Squares (ALS) for Recommendations

ALS is a matrix factorization algorithm commonly used in recommendation systems.  
It works by decomposing a user-item interaction matrix into two smaller matrices:

- **User factors**: latent features representing user preferences
- **Item factors**: latent features representing item characteristics

The goal is to predict missing interactions (e.g., which movies a user might like) by learning these latent factors. A latent factor is a hidden feature that the model learns during training to represent underlying patterns in the data. These factors are not explicitly present in your dataset but are inferred by the algorithm.


## Simple Example

Imagine we have **3 users** and **4 movies**.  
Our interaction matrix (ratings or implicit feedback) looks like this:

|       | Movie A | Movie B | Movie C | Movie D |
|-------|---------|---------|---------|---------|
| User 1|    5    |    0    |    3    |    0    |
| User 2|    0    |    4    |    0    |    2    |
| User 3|    1    |    0    |    5    |    0    |

- `0` means no interaction.
- We want to predict which movies users might like next.

## Factorization Idea

ALS decomposes the matrix into:

**User Factors (U)**: 3 × k  
**Item Factors (V)**: 4 × k  

So that: User-Item Matrix ≈ U × Vᵀ

## ✅ What is happening in the example?

We assume:

- **k = 2** → There are **2 latent factors** (hidden features).
- **U** = user factor matrix (each row = a user, each column = a latent feature).
- **V** = item factor matrix (each row = an item, each column = a latent feature).
- The intial valaues of U and V are random, the least square method is used to get better values
- The algorithm stops after predetermined number of iterations; wiht more iterations, U and V will converge so that Predicted ≈ Original Ratings.
- `k` controls the complexity of the model


ex: <br>
U = [[0.9, 0.1],   # User 1 <br>
    [0.2, 0.8],   # User 2 <br>
     [0.7, 0.3]]   # User 3 <br>

V = [[0.8, 0.2],   # Movie A <br>
     [0.1, 0.9],   # Movie B <br>
     [0.7, 0.3],   # Movie C <br>
     [0.3, 0.7]]   # Movie D <br>

Predict the score that represents how User 1 likes Movie B: <br>
User 1 vector: [0.9, 0.1] <br>
Movie B vector: [0.1, 0.9] <br>
[0.9, 0.1] . [0.1, 0.9] = (0.9 * 0.1) + (0.1 * 0.9) = 0.18


## Visual Intuition

Think of each user and item as a point in a 2D space (because `k = 2`):
- Users who like similar things will be close together.
- Items with similar characteristics will be close together.
- The dot product measures how aligned a user and item are in this space.


**Example of using ALS**

In this example, we use the MovieLens 100k dataset, which is a popular benchmark in recommender system research, to demonstrate:
- How to load data, 
- Explore its structure, 
- Train an ALS model




In [1]:
!pip install implicit

Collecting implicit
  Downloading implicit-0.7.2-cp311-cp311-win_amd64.whl.metadata (6.3 kB)
Downloading implicit-0.7.2-cp311-cp311-win_amd64.whl (750 kB)
   ---------------------------------------- 0.0/750.8 kB ? eta -:--:--
   ---------------------------------------- 750.8/750.8 kB 6.3 MB/s eta 0:00:00
Installing collected packages: implicit
Successfully installed implicit-0.7.2




Imports and download the data

In [None]:
import urllib.request
import zipfile
import os

# Define download URL and output path
url = "http://files.grouplens.org/datasets/movielens/ml-100k.zip"
output_zip = "ml-100k.zip"
extract_folder = "ml-100k"

# Download the ZIP file
urllib.request.urlretrieve(url, output_zip)
print("Download complete.")

# Extract ZIP contents
with zipfile.ZipFile(output_zip, 'r') as zip_ref:
    zip_ref.extractall(extract_folder)
print(f"Files extracted to ./{extract_folder}")


Download complete.
Files extracted to ./ml-100k


**Have a look to the readme file to understand the data**

Get some metadata about this dataset

In [3]:
import pandas as pd
import numpy as np
from scipy.sparse import csr_matrix
from implicit.als import AlternatingLeastSquares

# Load datahttps://dbc-ce85d634-8982.cloud.databricks.com/editor/notebooks/1339708348203032?o=1466187410014757$0
# first add the data file to a volume, how to do that is given in this video "https://www.youtube.com/watch?v=s22KU8sub9s". Or you can copy the path of the downlaoded file
file_path = 'ml-100k/ml-100k/u.data' #change for your path
user_movie_data = pd.read_csv(file_path, sep='\t', names=['userId', 'movieId', 'rating', 'timestamp'])

# Filter for implicit feedback (rating >= 4)
user_movie_data_filtered = user_movie_data[user_movie_data['rating'] >= 4]

user_movie_data_implicit = user_movie_data_filtered.copy()

# Map userId and movieId to zero-based indices, we do that because most recommendation algorithms (like ALS in implicit) require user and item IDs to be zero-based integer indices, not arbitrary IDs like userId or movieId from the dataset.
user_movie_data_implicit['userIndex'] = pd.factorize(user_movie_data_implicit['userId'])[0]
user_movie_data_implicit['movieIndex'] = pd.factorize(user_movie_data_implicit['movieId'])[0]

# Create user-item matrix using actual ratings
from scipy.sparse import csr_matrix
user_item_matrix = csr_matrix((
    user_movie_data_implicit['rating'],  # Use actual rating values
    (user_movie_data_implicit['userIndex'], user_movie_data_implicit['movieIndex'])
))


num_users, num_items = user_item_matrix.shape  # Get the shape of the matrix: number of users and items

# Summary stats
ratings_per_user = user_item_matrix.getnnz(axis=1)  # Count non-zero entries per row (number of items each user interacted with)
ratings_per_item = user_item_matrix.getnnz(axis=0)  # Count non-zero entries per column (number of users interacted with each item)

summary = pd.DataFrame({
    'Total Users': [num_users],  # Number of unique users in the dataset
    'Total Items': [num_items],  # Number of unique items (movies)
    'Avg Ratings per User': [ratings_per_user.mean()],  # Average interactions per user
    'Avg Ratings per Item': [ratings_per_item.mean()],  # Average interactions per item
    'Max Ratings per User': [ratings_per_user.max()],   # Most interactions by a single user
    'Max Ratings per Item': [ratings_per_item.max()],   # Most interactions for a single item
    'Min Ratings per User': [ratings_per_user.min()],   # Least interactions by any user
    'Min Ratings per Item': [ratings_per_item.min()]    # Least interactions for any item
})
display(summary) 


Unnamed: 0,Total Users,Total Items,Avg Ratings per User,Avg Ratings per Item,Max Ratings per User,Max Ratings per Item,Min Ratings per User,Min Ratings per Item
0,942,1447,58.784501,38.268832,378,501,3,1


In [4]:
print(user_item_matrix)

<Compressed Sparse Row sparse matrix of dtype 'int64'
	with 55375 stored elements and shape (942, 1447)>
  Coords	Values
  (0, 0)	4
  (0, 1)	4
  (0, 8)	4
  (0, 9)	5
  (0, 13)	4
  (0, 14)	5
  (0, 15)	5
  (0, 17)	5
  (0, 28)	4
  (0, 29)	4
  (0, 31)	5
  (0, 32)	5
  (0, 37)	5
  (0, 42)	5
  (0, 47)	4
  (0, 53)	5
  (0, 55)	4
  (0, 57)	5
  (0, 59)	5
  (0, 60)	5
  (0, 61)	5
  (0, 62)	4
  (0, 68)	4
  (0, 70)	4
  (0, 71)	4
  :	:
  (940, 531)	4
  (940, 550)	5
  (940, 571)	4
  (940, 585)	5
  (940, 610)	5
  (940, 618)	5
  (940, 645)	4
  (940, 682)	5
  (940, 707)	5
  (940, 793)	5
  (940, 844)	5
  (940, 861)	5
  (940, 985)	4
  (940, 987)	4
  (940, 1066)	4
  (940, 1373)	4
  (941, 73)	4
  (941, 109)	5
  (941, 165)	4
  (941, 172)	4
  (941, 195)	4
  (941, 262)	5
  (941, 480)	4
  (941, 572)	4
  (941, 709)	4


In [8]:
# Train ALS model
model = AlternatingLeastSquares(factors=50, regularization=0.01, iterations=15)
model.fit(user_item_matrix.T)





  0%|          | 0/15 [00:00<?, ?it/s]

**What is the recommneded movies to a specific user?**

In [9]:
user_index= 0 # we are interensted in the first user
user_items = user_item_matrix.T.tocsr()[user_index]  # Convert to CSR, as it should be a sparse matrix
recommendations = model.recommend(user_index, user_items)
print(recommendations) # this prints the results

# Provide better representaion of the recommendeations
item_indices, scores = recommendations
# Create DataFrame
recommendations_df = pd.DataFrame({
    'Movie_Index': item_indices,
    'Score': scores
})

# Sort by Score (optional, though ALS already returns sorted)
recommendations_df = recommendations_df.sort_values(by='Score', ascending=False)

# Display
print(recommendations_df.to_string(index=False))

(array([382, 522, 870,   9, 428,  32, 302, 363, 528, 657], dtype=int32), array([1.07428   , 1.0656908 , 1.0124986 , 0.9670208 , 0.9382111 ,
       0.91306907, 0.90153325, 0.8906461 , 0.8865737 , 0.8803128 ],
      dtype=float32))
 Movie_Index    Score
         382 1.074280
         522 1.065691
         870 1.012499
           9 0.967021
         428 0.938211
          32 0.913069
         302 0.901533
         363 0.890646
         528 0.886574
         657 0.880313


**What it the top N movies similar to a specific one?**


In [10]:

# Get top 5 similar movies to movie index 10
movie_id = 100
ids, scores = model.similar_items(movie_id, N=5)

print(f"Top 5 similar movies to movie {movie_id}")
for idx, score in zip(ids, scores):
    print(f"Movie {idx} with similarity score {score:.4f}")

Top 5 similar movies to movie 100
Movie 100 with similarity score 1.0000
Movie 850 with similarity score 0.7153
Movie 603 with similarity score 0.6803
Movie 774 with similarity score 0.6699
Movie 698 with similarity score 0.6531


**What it the top N users similar to a specific one?**

In [11]:

# Get top 5 similar user to user index 10
user_id = 10
ids, scores = model.similar_items(user_id, N=5)

print(f"Top 5 similar movies to movie {user_id}")
for idx, score in zip(ids, scores):
    print(f"User {idx} with similarity score {score:.4f}")


Top 5 similar movies to movie 10
User 10 with similarity score 1.0000
User 604 with similarity score 0.5439
User 911 with similarity score 0.4997
User 37 with similarity score 0.4139
User 170 with similarity score 0.3817


Now, build another model that is more sphosticated for example let k = 500, and iterations=1500, refind the top 5 similar user to user index 10, is it siginifcanlty different?