## **Recommender Systems Course Project**
## **Collaborative Filtering Using SVD**

<table align="left">
    <tr>
        <td style="text-align:left">Course Code</td>
        <td style="text-align:left">:</td>
        <td style="text-align:left">12S4054</td>
    </tr>
    <tr>
        <td style="text-align:left">Course Name</td>
        <td style="text-align:left">:</td>
        <td style="text-align:left">Sistem Rekomendasi (Recommender Systems)</td>
    </tr>
    <tr>
        <td style="text-align:left">Group 3 Members</td>
        <td style="text-align:left">:</td>
        <td style="text-align:left">
            1. 12S21046 Ruth Marelisa Hutagalung <br>
            2. 12S21052 Griselda<br>
            3. 12S21054 Diah Anastasya
        </td>
    </tr>
</table>


# Data Understanding

In [None]:
import pandas as pd
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics import mean_squared_error, mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

In [None]:
# Load the dataset
toba_tourism_data = pd.read_csv("Tempat-Wisata-Toba-Preprocessing.csv")

# Display the first 5 rows of the data
toba_tourism_data.head()


Unnamed: 0.2,Unnamed: 0.1,Unnamed: 0,address,PlaceID,Nama_tempat_wisata,Category,ReviewerId,Rating,Reviews
0,0,0,"Jl. Sibola Hotang, Sibola Hotangsas, Kec. Bali...",0,PASIR PUTIH LUMBAN BULBUL,Wisata Bahari,1.12603e+20,5.0,
1,1,1,"Jl. Sibola Hotang, Sibola Hotangsas, Kec. Bali...",0,PASIR PUTIH LUMBAN BULBUL,Wisata Bahari,1.11909e+20,5.0,bagus
2,2,2,"Jl. Sibola Hotang, Sibola Hotangsas, Kec. Bali...",0,PASIR PUTIH LUMBAN BULBUL,Wisata Bahari,1.07886e+20,5.0,
3,3,3,"Jl. Sibola Hotang, Sibola Hotangsas, Kec. Bali...",0,PASIR PUTIH LUMBAN BULBUL,Wisata Bahari,1.13072e+20,5.0,sangat menyenagkan
4,4,4,"Jl. Sibola Hotang, Sibola Hotangsas, Kec. Bali...",0,PASIR PUTIH LUMBAN BULBUL,Wisata Bahari,1.06173e+20,5.0,bebas foto dimana aja cuma 2k


# Data Preprocessing

The dataset preview shows the following columns: `Unnamed: 0.1`, `Unnamed: 0`, `address`, `PlaceID`, `Nama_tempat_wisata`, `Category`, `ReviewerId`, `Rating`, and `Reviews`.

The columns `Unnamed: 0.1` and `Unnamed: 0` are leftover index columns and are not needed, so the next step removes them. The `address` and `Reviews` columns are dropped in the same step, which leaves only the place, category, reviewer, and rating columns.

In [None]:
# Drop the columns that are not needed
toba_tourism_data_cleaned = toba_tourism_data.drop(columns=['Unnamed: 0.1', 'Unnamed: 0', 'address', 'Reviews'])

# Count the missing values in each remaining column
missing_data_summary = toba_tourism_data_cleaned.isnull().sum()

# Show a preview of the cleaned data and the missing-value summary
toba_tourism_data_cleaned.head(), missing_data_summary

(   PlaceID         Nama_tempat_wisata       Category    ReviewerId  Rating
 0        0  PASIR PUTIH LUMBAN BULBUL  Wisata Bahari  1.126030e+20     5.0
 1        0  PASIR PUTIH LUMBAN BULBUL  Wisata Bahari  1.119090e+20     5.0
 2        0  PASIR PUTIH LUMBAN BULBUL  Wisata Bahari  1.078860e+20     5.0
 3        0  PASIR PUTIH LUMBAN BULBUL  Wisata Bahari  1.130720e+20     5.0
 4        0  PASIR PUTIH LUMBAN BULBUL  Wisata Bahari  1.061730e+20     5.0,
 PlaceID               0
 Nama_tempat_wisata    0
 Category              0
 ReviewerId            1
 Rating                1
 dtype: int64)

The summary shows one missing value in each of the `ReviewerId` and `Rating` columns. Because so few values are missing, dropping the affected rows should have a negligible effect on the accuracy and performance of the recommender system.

In [None]:
# Drop rows with missing values in the ReviewerId and Rating columns
toba_tourism_data_cleaned.dropna(subset=['ReviewerId', 'Rating'], inplace=True)

# Verify that no missing values remain in the key columns
missing_data_check = toba_tourism_data_cleaned[['ReviewerId', 'Rating']].isnull().sum()


In [None]:
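# Rename the columns to the generic user/item/rating names used in the rest of the notebook
toba_tourism_data_cleaned = toba_tourism_data_cleaned.rename(columns={'ReviewerId': 'user_id', 'PlaceID': 'item_id', 'Rating': 'rating'})
# Cast IDs to strings and ratings to floats.
# ReviewerId is read as a float, so its string form looks like '1.12603e+20'.
toba_tourism_data_cleaned['user_id'] = toba_tourism_data_cleaned['user_id'].astype(str)
toba_tourism_data_cleaned['item_id'] = toba_tourism_data_cleaned['item_id'].astype(str)
toba_tourism_data_cleaned['rating'] = toba_tourism_data_cleaned['rating'].astype(float)
# Drop any rows that still contain missing values
toba_tourism_data_cleaned.dropna(inplace=True)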

In [None]:
# Hold out 15% of the data as the test set
train_val, test = train_test_split(toba_tourism_data_cleaned, test_size=0.15, random_state=42)

# Split the remaining 85% so that about 15% of the full data becomes the validation set, leaving about 70% for training
train, val = train_test_split(train_val, test_size=0.176, random_state=42)  # 0.176 * 0.85 ≈ 0.15 (15%)

# Show the size of each split and confirm that the key columns have no missing values
train_size, val_size, test_size = len(train), len(val), len(test)
missing_data_check, (train_size, val_size, test_size)

(ReviewerId    0
 Rating        0
 dtype: int64,
 (30274, 6467, 6484))

The output above shows that the data has been cleaned and split into:
- Training set: 30,274
- Validation set: 6,467
- Test set: 6,484

There are no missing values in the `PlaceID`, `ReviewerId`, and `Rating` columns.
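
As a quick sanity check on the two-stage split, the short sketch below recomputes the share of ratings in each split from the sizes printed above. The variable names are only illustrative; the sizes are the ones reported in the output.

```python
# Split sizes reported in the output above
train_size, val_size, test_size = 30274, 6467, 6484
total = train_size + val_size + test_size

# Fraction of all ratings that ends up in each split
fractions = {
    "train": train_size / total,
    "val": val_size / total,
    "test": test_size / total,
}
for name, frac in fractions.items():
    print(f"{name}: {frac:.4f}")

# The validation share is 0.176 of the 85% left after removing the test set
print(f"0.85 * 0.176 = {0.85 * 0.176:.4f}")
```

The validation share comes out slightly below 15% because 0.176 is a rounded value of 0.15 / 0.85.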

In [None]:
def create_user_item_matrix(data, user_col='user_id', item_col='item_id', rating_col='rating'):
    """Build a user-item rating matrix; unrated cells are filled with 0."""
    # Average duplicate ratings of the same item by the same user
    data = data.groupby([user_col, item_col])[rating_col].mean().reset_index()
    return data.pivot(index=user_col, columns=item_col, values=rating_col).fillna(0)

# Build a user-item matrix for each split. The validation and test columns are
# aligned to the training items so that the fitted SVD can be applied to them.
train_matrix = create_user_item_matrix(train)
val_matrix = create_user_item_matrix(val).reindex(columns=train_matrix.columns, fill_value=0)
test_matrix = create_user_item_matrix(test).reindex(columns=train_matrix.columns, fill_value=0)

In [None]:
train_matrix

user_id \ item_id,0,1,10,100,101,11,12,13,14,15,...,90,91,92,93,94,95,96,97,98,99
1.00001e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.00003e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4.0,0.0
1.00004e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,5.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.00005e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.00007e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1.18445e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5.0,0.0,0.0
1.18446e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.1844e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.184e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [None]:
val_matrix

user_id \ item_id,0,1,10,100,101,11,12,13,14,15,...,90,91,92,93,94,95,96,97,98,99
1.00009e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5.0
1.00012e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.00016e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.00017e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.0001e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,5.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1.18439e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.1843e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.18441e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.18443e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,5.0,0.0,0.0,0.0,0.0


In [None]:
test_matrix

user_id \ item_id,0,1,10,100,101,11,12,13,14,15,...,90,91,92,93,94,95,96,97,98,99
1.00002e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,5.0,0.0,0.0,0.0,0.0,0.0,0.0
1.00003e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.00012e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,5.0,0.0,0.0,0.0,0.0
1.00015e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.00017e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1.18433e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.18435e+20,0.0,0.0,0.0,0.0,0.0,5.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.18438e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,5.0,0.0,0.0,0.0,0.0,0.0,0.0
1.1843e+20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


# SVD

In [None]:
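n_factors = 20
svd = TruncatedSVD(n_components=n_factors, random_state=42)

# fit_transform returns the user factors already scaled by the singular values (U * Sigma)
U_train = svd.fit_transform(train_matrix)
U_val = svd.transform(val_matrix)
U_test = svd.transform(test_matrix)

# components_ holds the item factors (V^T), not the singular values
Vt = svd.components_

# Rank-n_factors reconstruction of each matrix
train_approx_matrix = np.dot(U_train, Vt)
val_approx_matrix = np.dot(U_val, Vt)
test_approx_matrix = np.dot(U_test, Vt)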
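
To make the roles of the two factor matrices concrete, here is a minimal sketch on a toy matrix. It uses `np.linalg.svd` instead of `TruncatedSVD`, so it illustrates the same idea rather than reproducing the cell above; the matrix values and the names `user_factors` and `item_factors` are hypothetical.

```python
import numpy as np

# Toy 4x3 user-item matrix (hypothetical values, 0 = unrated)
X = np.array([
    [5.0, 0.0, 4.0],
    [0.0, 3.0, 0.0],
    [4.0, 0.0, 5.0],
    [0.0, 2.0, 0.0],
])

k = 2  # number of latent factors to keep

# Full SVD: X = U @ diag(s) @ Vt, singular values sorted in descending order
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep only the k largest singular values
user_factors = U[:, :k] * s[:k]  # plays the role of fit_transform's output (U * Sigma)
item_factors = Vt[:k, :]         # plays the role of components_ (V^T)

# Rank-k reconstruction of the matrix the factors were fitted on
X_approx = user_factors @ item_factors

# Unseen rows are projected onto the same item factors, as transform() does,
# and then mapped back to item space
X_new = np.array([[0.0, 4.0, 0.0]])
X_new_approx = (X_new @ item_factors.T) @ item_factors

print(np.round(X_approx, 2))
print(np.round(X_new_approx, 2))
```

The product of the projected rows and the item factors is the reconstruction that the evaluation cells later compare against the original matrix.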

In [None]:
# Get recommendations for a single user
def get_recommendations(user_id, user_item_matrix, approx_matrix, top_n=5):
    """
    Recommend items for one user based on the SVD-reconstructed matrix.

    Args:
        user_id (str): ID of the user to recommend for.
        user_item_matrix (pd.DataFrame): Original user-item matrix.
        approx_matrix (np.ndarray): Matrix reconstructed from the SVD.
        top_n (int): Number of recommendations to return.

    Returns:
        list: IDs of the recommended items.
    """
    # Look up the user's row in the reconstructed matrix
    user_index = user_item_matrix.index.get_loc(user_id)
    user_approx_ratings = approx_matrix[user_index, :]

    # Indices of the top-N items with the highest predicted ratings
    top_n_idx = np.argsort(user_approx_ratings)[::-1][:top_n]

    # Map the top-N indices back to item IDs
    recommended_items = user_item_matrix.columns[top_n_idx].tolist()

    return recommended_items

# Return recommendations as a DataFrame
def get_recommendations_as_dataframe(user_ids, user_item_matrix, approx_matrix, top_n=5):
    """
    Recommend items for several users and return them as a DataFrame.

    Args:
        user_ids (list): List of user IDs.
        user_item_matrix (pd.DataFrame): Original user-item matrix.
        approx_matrix (np.ndarray): Matrix reconstructed from the SVD.
        top_n (int): Number of recommendations per user.

    Returns:
        pd.DataFrame: One row per user, with a user_id column followed by the recommendation columns.
    """
    recommendations = []
    for user_id in user_ids:
        recommended_items = get_recommendations(user_id, user_item_matrix, approx_matrix, top_n)
        recommendations.append([user_id] + recommended_items)

    # Build the DataFrame from the collected recommendations
    recommendations_df = pd.DataFrame(recommendations, columns=['user_id'] + [f'Recommendation {i+1}' for i in range(top_n)])
    return recommendations_df

# Take the first 5 validation users that also appear in the training matrix.
# get_recommendations looks users up in train_matrix, so an unknown user would raise a KeyError.
sample_users = [u for u in val_matrix.index if u in train_matrix.index][:5]

# Get the recommendations for these users as a DataFrame
recommendations_df = get_recommendations_as_dataframe(sample_users, train_matrix, train_approx_matrix, top_n=5)

# Display the recommendation table
recommendations_df


Unnamed: 0,user_id,Recommendation 1,Recommendation 2,Recommendation 3,Recommendation 4,Recommendation 5
0,1.00009e+20,94,90,87,83,20
1,1.00012e+20,101,52,61,88,68
2,1.00016e+20,52,77,96,27,88
3,1.00017e+20,18,89,27,52,101
4,1.0001e+20,101,95,10,83,47
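
`get_recommendations` ranks all items by reconstructed score, so it may return places the user has already rated. A common refinement is to mask those items before ranking. The sketch below shows one way to do that; `recommend_unseen` is a hypothetical helper, not part of the notebook, and it assumes a rating of 0 means "not rated", as in the zero-filled user-item matrix.

```python
import numpy as np

def recommend_unseen(observed_ratings, predicted_scores, item_ids, top_n=5):
    """Return the top_n item IDs by predicted score, skipping items already rated.

    Hypothetical helper: a rating of 0 is taken to mean "not rated".
    """
    scores = np.asarray(predicted_scores, dtype=float).copy()
    # Push already-rated items to the bottom of the ranking
    scores[np.asarray(observed_ratings) > 0] = -np.inf
    ranked = np.argsort(scores)[::-1][:top_n]
    # Drop masked items in case fewer than top_n unseen items exist
    return [item_ids[i] for i in ranked if np.isfinite(scores[i])]

# Toy example with hypothetical values: the user already rated item "0"
item_ids = ["0", "1", "10", "100"]
observed = [5.0, 0.0, 0.0, 0.0]
predicted = [4.8, 0.3, 1.2, 0.1]
print(recommend_unseen(observed, predicted, item_ids, top_n=2))
```

In the toy example, item "0" has the highest predicted score but is excluded because the user has already rated it.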


# Metric Evaluation

### RMSE

In [None]:
def compute_rmse(true_matrix, approx_matrix):
    return np.sqrt(mean_squared_error(true_matrix, approx_matrix))

In [None]:
val_rmse = compute_rmse(val_matrix.values, val_approx_matrix)
test_rmse = compute_rmse(test_matrix.values, test_approx_matrix)

In [None]:
print(f"Validation RMSE: {val_rmse:.4f}")
print(f"Test RMSE: {test_rmse:.4f}")

Validation RMSE: 0.3086
Test RMSE: 0.3101


The model reaches a Validation RMSE of 0.3086 and a Test RMSE of 0.3101. The two values are nearly identical, which indicates stable behavior across the validation and test splits. No training RMSE is computed in this notebook, so these values cannot be compared against the training error, and the absence of overfitting is not demonstrated directly. The RMSE is also averaged over every cell of the user-item matrix, including the zero-filled cells for places a user did not rate, so the low value partly reflects how sparse the matrices are and is best read as reconstruction error rather than rating-prediction error.
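
`compute_rmse` averages the squared error over every cell of the user-item matrix, including the zero-filled cells for unrated places. As a complementary view, the sketch below restricts the error to cells that hold an observed rating. `compute_rmse_observed` is a hypothetical helper, and treating 0 as "no rating" is an assumption that follows from the `fillna(0)` step.

```python
import numpy as np

def compute_rmse_observed(true_matrix, approx_matrix):
    """RMSE restricted to cells with an observed rating (non-zero in true_matrix)."""
    true_matrix = np.asarray(true_matrix, dtype=float)
    approx_matrix = np.asarray(approx_matrix, dtype=float)
    mask = true_matrix > 0  # assumption: 0 marks "no rating"
    if not mask.any():
        return float("nan")
    errors = true_matrix[mask] - approx_matrix[mask]
    return float(np.sqrt(np.mean(errors ** 2)))

# Toy example with hypothetical values: two observed ratings, four empty cells
true_ratings = np.array([[5.0, 0.0, 0.0], [0.0, 4.0, 0.0]])
approx = np.array([[4.0, 0.1, 0.0], [0.0, 2.0, 0.1]])
print(compute_rmse_observed(true_ratings, approx))
```

On sparse matrices this value is usually much larger than the all-cells RMSE, because the many near-zero errors on empty cells no longer dilute the average.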

In [None]:
# Refit SVD on full training with optimal number of factors
optimal_factors = 20 # replace with chosen optimal factor count
svd = TruncatedSVD(n_components=optimal_factors, random_state=42)
train_svd = svd.fit_transform(train_matrix)
train_approx_matrix = svd.inverse_transform(train_svd)

# Final test evaluation
test_svd = svd.transform(test_matrix)

test_approx_matrix = svd.inverse_transform(test_svd)
final_test_rmse = compute_rmse(test_matrix.values, test_approx_matrix)
print(f"Final Test RMSE: {final_test_rmse:.4f}")

Final Test RMSE: 0.3101


### MAE

In [None]:
def compute_mae(true_matrix, approx_matrix):
    return mean_absolute_error(true_matrix, approx_matrix)

In [None]:
val_mae = compute_mae(val_matrix.values, val_approx_matrix)
test_mae = compute_mae(test_matrix.values, test_approx_matrix)

In [None]:
print(f"Validation MAE: {val_mae:.4f}")
print(f"Test MAE: {test_mae:.4f}")

Validation MAE: 0.0443
Test MAE: 0.0441


The Mean Absolute Error (MAE) is 0.0443 on the validation data and 0.0441 on the test data. The near-identical values show that the model behaves very consistently across the validation and test splits.

### Precision & Recall

In [None]:
# Compute Precision@k and Recall@k
def compute_precision_recall_at_k(true_matrix, approx_matrix, k, threshold=0.5):
    precision_at_k_list = []
    recall_at_k_list = []

    # Iterate over every user (one matrix row per user)
    for i in range(true_matrix.shape[0]):
        true_ratings = true_matrix[i, :]
        approx_ratings = approx_matrix[i, :]

        # Take the top-k recommendations by predicted rating
        top_k_idx = np.argsort(approx_ratings)[::-1][:k]

        # Count the relevant items among the top-k
        relevant_items = true_ratings[top_k_idx] >= threshold
        true_positives = np.sum(relevant_items)

        # Precision@k: share of the top-k recommendations that are relevant
        precision_at_k = true_positives / k if k > 0 else 0

        # Recall@k: share of the user's relevant items found in the top-k
        recall_at_k = true_positives / np.sum(true_ratings >= threshold) if np.sum(true_ratings >= threshold) > 0 else 0

        precision_at_k_list.append(precision_at_k)
        recall_at_k_list.append(recall_at_k)

    # Average Precision@k and Recall@k over all users
    avg_precision_at_k = np.mean(precision_at_k_list)
    avg_recall_at_k = np.mean(recall_at_k_list)

    return avg_precision_at_k, avg_recall_at_k

k = 5

# Compute Precision@k and Recall@k on the test data
precision_at_k, recall_at_k = compute_precision_recall_at_k(test_matrix.values, test_approx_matrix, k=k, threshold=0.5)

# Print the results
print(f"Precision@{k}: {precision_at_k:.4f}")
print(f"Recall@{k}: {recall_at_k:.4f}")


Precision@5: 0.1656
Recall@5: 0.6681
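
The gap between Precision@5 and Recall@5 is easier to read with a single-user example. The plain-Python sketch below mirrors the per-user logic of `compute_precision_recall_at_k`; the function name and the toy ratings are hypothetical. If a user has only one rated place in the test split (an assumption, not verified here), Precision@5 for that user can be at most 1/5 even when the place is ranked first.

```python
def precision_recall_at_k(true_ratings, predicted_scores, k, threshold=0.5):
    """Precision@k and Recall@k for a single user (plain-Python sketch)."""
    # Rank item positions by predicted score, highest first, and keep the top k
    ranked = sorted(range(len(predicted_scores)),
                    key=lambda i: predicted_scores[i], reverse=True)[:k]
    # Positions of the items this user actually rated at or above the threshold
    relevant = {i for i, r in enumerate(true_ratings) if r >= threshold}
    hits = sum(1 for i in ranked if i in relevant)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical user who rated exactly one of six places
true_ratings = [0, 0, 5, 0, 0, 0]
predicted_scores = [0.1, 0.0, 0.9, 0.3, 0.2, 0.05]
print(precision_recall_at_k(true_ratings, predicted_scores, k=5))
```

Here the single rated place is found, so recall is 1.0, while precision is 0.2 because the other four recommended places have no rating in the split.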


### MAP

In [None]:
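def compute_map(true_matrix, approx_matrix, top_n=5):
    # Note: each user contributes a single Precision@top_n value (a rating >= 3.0
    # counts as relevant), so the result is the mean Precision@top_n across users.
    map_score = 0
    num_users = true_matrix.shape[0]

    for user_id in range(num_users):
        true_ratings = true_matrix[user_id]
        approx_ratings = approx_matrix[user_id]

        # Indices of the top_n items by predicted rating
        top_n_indices = np.argsort(approx_ratings)[::-1][:top_n]

        # Share of the top_n items that are relevant for this user
        relevant_items = (true_ratings[top_n_indices] >= 3.0)
        precision_at_k = np.mean(relevant_items)

        map_score += precision_at_k

    return map_score / num_users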

In [None]:
val_map = compute_map(val_matrix.values, val_approx_matrix)
test_map = compute_map(test_matrix.values, test_approx_matrix)

In [None]:
print(f"Validation MAP: {val_map:.4f}")
print(f"Test MAP: {test_map:.4f}")

Validation MAP: 0.1619
Test MAP: 0.1589


In [None]:
def compute_map_at_k(true_matrix, approx_matrix, k=5, threshold=3.0):
    map_at_k = 0
    num_users = true_matrix.shape[0]

    for user_id in range(num_users):
        true_ratings = true_matrix[user_id]
        approx_ratings = approx_matrix[user_id]

        # Get top-k recommendations based on predicted ratings
        top_k_indices = np.argsort(approx_ratings)[::-1][:k]

        # Calculate precision at each rank position
        relevant_items = (true_ratings[top_k_indices] >= threshold)
        precision_at_k = np.cumsum(relevant_items) / (np.arange(1, k + 1))

        map_at_k += np.mean(precision_at_k)

    return map_at_k / num_users

In [None]:
val_map_5 = compute_map_at_k(val_matrix.values, val_approx_matrix, k=5)
test_map_5 = compute_map_at_k(test_matrix.values, test_approx_matrix, k=5)

print(f"Validation MAP@5: {val_map_5:.4f}")
print(f"Test MAP@5: {test_map_5:.4f}")

Validation MAP@5: 0.3357
Test MAP@5: 0.3304
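
`compute_map_at_k` averages the running precision over all k rank positions. Another widely used convention for AP@k counts precision only at the ranks where a relevant item appears and divides by the number of relevant items, capped at k. The sketch below implements that convention in plain Python for comparison; the function names are hypothetical, and the results are not expected to match the values printed above.

```python
def average_precision_at_k(relevance, k, num_relevant=None):
    """AP@k for one user from a ranked list of 0/1 relevance flags.

    num_relevant is the user's total number of relevant items; if omitted,
    it is assumed to equal the number of hits inside the top k.
    """
    relevance = list(relevance)[:k]
    if num_relevant is None:
        num_relevant = sum(relevance)
    denom = min(k, num_relevant)
    if denom == 0:
        return 0.0
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / denom

def mean_average_precision_at_k(relevance_lists, k):
    """Mean of AP@k over users."""
    return sum(average_precision_at_k(r, k) for r in relevance_lists) / len(relevance_lists)

# Hypothetical users: one hit at rank 1, one hit at rank 3
users = [[1, 0, 0, 0, 0], [0, 0, 1, 0, 0]]
print(mean_average_precision_at_k(users, k=5))
```

Under this convention a single hit at rank 1 scores 1.0 and a single hit at rank 3 scores 1/3, so the metric rewards placing relevant places near the top of the list.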
