### Import Libraries and Load Dataset
Import the required libraries.
Load the dataset from the provided file.
Display the first few rows of the dataset to confirm that the data loaded correctly.

In [8]:
# Import necessary libraries
import pandas as pd
from surprise import Dataset, Reader, SVD
from surprise.model_selection import train_test_split
from surprise.accuracy import rmse
import matplotlib.pyplot as plt
import os

# Load the dataset
file_path = '[Dataset]_(Rekomendasi.csv'
data = pd.read_csv(file_path)

# Display the first few rows of the dataset
print(data.head())

           userID   productID  rating   timestamp
0   AKM1MP6P0OYPR  0132793040       5  1365811200
1  A2CX7LUOHB2NDG  0321732944       5  1341100800
2  A2NWSAGRHCP8N5  0439886341       1  1367193600
3  A2WNBOD3WNDNKT  0439886341       3  1374451200
4  A1GI0U4ZRJA8WN  0439886341       1  1334707200


### Define Reader and Load Data
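The Reader object tells Surprise which rating scale to expect; here the bounds are taken from the minimum and maximum of the rating column. The Dataset.load_from_df function then converts the userID, productID, and rating columns, in that order, into a dataset that the Surprise library can work with.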

In [9]:
# Define the reader with the appropriate format
reader = Reader(rating_scale=(data['rating'].min(), data['rating'].max()))

# Load the data into the surprise Dataset
surprise_data = Dataset.load_from_df(data[['userID', 'productID', 'rating']], reader)
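The rating scale matters later on: by default Surprise clips each estimated rating to the scale's bounds when it predicts. The pure-Python sketch below illustrates that clipping. The helper name clip_rating is hypothetical, and the 1 to 5 bounds are an assumption based on the sample rows shown earlier.

```python
def clip_rating(estimate, rating_scale=(1, 5)):
    """Clamp a raw estimate into the (low, high) rating scale.

    Hypothetical helper; the (1, 5) default is assumed from the sample rows.
    """
    low, high = rating_scale
    return max(low, min(high, estimate))


print(clip_rating(5.7))  # above the scale, pulled down to the upper bound
print(clip_rating(0.3))  # below the scale, pulled up to the lower bound
print(clip_rating(3.2))  # already inside the scale, returned unchanged
```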

### Split Data into Training and Test Sets
This code splits the data into training and test sets: 20% of the ratings are held out for testing and the remaining 80% are used for training. Setting random_state=42 makes the split reproducible.

In [10]:
trainset, testset = train_test_split(surprise_data, test_size=0.2, random_state=42)
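To make the split concrete, here is a minimal pure-Python sketch of a shuffled 80/20 hold-out over (user, item, rating) triples. The function split_ratings and the toy data are hypothetical, and the sketch does not reproduce the exact shuffle used by Surprise, so it would not select the same rows as the cell above.

```python
import random


def split_ratings(ratings, test_size=0.2, seed=42):
    """Shuffle rating triples and hold out a fraction for testing."""
    shuffled = list(ratings)               # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


# Toy (user, item, rating) triples; the values are made up.
toy = [(f"u{i}", f"p{i % 3}", 1 + i % 5) for i in range(10)]
train, test = split_ratings(toy)
print(len(train), len(test))
```

Every rating ends up in exactly one of the two lists, which is what lets the test set act as unseen data during evaluation.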

### Model Training
The SVD (Singular Value Decomposition) algorithm is chosen for the recommendation model. The model is then trained using the training set.

In [11]:
# Use SVD (Singular Value Decomposition) algorithm for recommendations
model = SVD()

# Train the model
model.fit(trainset)

<surprise.prediction_algorithms.matrix_factorization.SVD at 0x19660ac9950>
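As background on what the fitted model contains: Surprise's SVD estimates a rating as the global mean plus a user bias, an item bias, and the dot product of a user and an item factor vector, and it learns these by stochastic gradient descent. The sketch below shows that prediction rule and a single update step in plain Python. The function names and toy numbers are hypothetical; the learning rate 0.005 and regularization 0.02 are assumed to match the library defaults. It is an illustration, not the library's implementation.

```python
def predict(mu, b_u, b_i, p_u, q_i):
    """Estimated rating: global mean + biases + factor dot product."""
    return mu + b_u + b_i + sum(p * q for p, q in zip(p_u, q_i))


def sgd_step(r, mu, b_u, b_i, p_u, q_i, lr=0.005, reg=0.02):
    """One regularized SGD update for a single observed rating r."""
    err = r - predict(mu, b_u, b_i, p_u, q_i)
    new_b_u = b_u + lr * (err - reg * b_u)
    new_b_i = b_i + lr * (err - reg * b_i)
    # Both factor updates use the old values of the other vector.
    new_p_u = [p + lr * (err * q - reg * p) for p, q in zip(p_u, q_i)]
    new_q_i = [q + lr * (err * p - reg * q) for p, q in zip(p_u, q_i)]
    return new_b_u, new_b_i, new_p_u, new_q_i


# Toy parameters with two latent factors; all values are made up.
mu, b_u, b_i = 3.5, 0.1, -0.2
p_u, q_i = [0.1, -0.3], [0.2, 0.4]

before = predict(mu, b_u, b_i, p_u, q_i)
b_u, b_i, p_u, q_i = sgd_step(5.0, mu, b_u, b_i, p_u, q_i)
after = predict(mu, b_u, b_i, p_u, q_i)
print(before, after)  # the estimate moves toward the observed rating of 5.0
```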

### Model Prediction and Evaluation
The model makes predictions on the test set. The root mean square error (RMSE) is calculated to evaluate the model's performance.

In [12]:
# Make predictions on the test set
predictions = model.test(testset)

# Evaluate the model (verbose=False stops rmse() from printing its own rounded copy)
rmse_value = rmse(predictions, verbose=False)
print(f'RMSE: {rmse_value}')

RMSE: 1.2954169720603028
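RMSE is the square root of the mean squared difference between true and estimated ratings, so it is expressed in rating points. Assuming the 1 to 5 scale seen in the sample rows, a value near 1.30 suggests that estimates are off by roughly one rating point or more on average. A minimal sketch of the calculation, using a hypothetical helper and made-up pairs:

```python
import math


def rmse_from_pairs(pairs):
    """Root mean square error over (true_rating, estimate) pairs."""
    squared_errors = [(true_r - est) ** 2 for true_r, est in pairs]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy (true, estimated) pairs; the values are made up.
toy_pairs = [(5, 4.0), (1, 3.0), (3, 3.0)]
toy_rmse = rmse_from_pairs(toy_pairs)
print(toy_rmse)
```

Because the errors are squared before averaging, the single miss of two points contributes four times as much as the miss of one point.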


### Generate Recommendations
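A function get_top_n_recommendations groups the test-set predictions by user and keeps the N products with the highest estimated ratings for each user. The cell then looks up a sample user with sample_user_id = 1. Because the user IDs in this dataset are strings such as AKM1MP6P0OYPR, the integer 1 matches no key, and the cell reports that no recommendations are available.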

In [13]:
# Function to get top N recommendations for a given user
def get_top_n_recommendations(predictions, n=10):
    # First map the predictions to each user
    top_n = {}
    for uid, iid, true_r, est, _ in predictions:
        if uid not in top_n:
            top_n[uid] = []
        top_n[uid].append((iid, est))
    
    # Then sort the predictions for each user and retrieve the N highest ones.
    for uid, user_ratings in top_n.items():
        user_ratings.sort(key=lambda x: x[1], reverse=True)
        top_n[uid] = user_ratings[:n]
    
    return top_n

# Get top 10 recommendations for all users
top_n_recommendations = get_top_n_recommendations(predictions, n=10)

# Display top 10 recommendations for a sample user (e.g., userID=1)
sample_user_id = 1
if sample_user_id in top_n_recommendations:
    print(f'Top 10 recommendations for user {sample_user_id}:')
    for product_id, estimated_rating in top_n_recommendations[sample_user_id]:
        print(f'Product ID: {product_id}, Estimated Rating: {estimated_rating}')
else:
    print(f'No recommendations available for user {sample_user_id}')


No recommendations available for user 1
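The lookup finds nothing because the dictionary keys are string user IDs while the sample ID is the integer 1. One way to get a non-empty result, sketched here on made-up prediction tuples, is to take the sample ID from the dictionary itself. The function top_n_by_user is a hypothetical compact variant of the function above, and the estimates are invented for illustration.

```python
from collections import defaultdict


def top_n_by_user(predictions, n=10):
    """Group (uid, iid, true_r, est, details) tuples by user, keep the top n."""
    grouped = defaultdict(list)
    for uid, iid, true_r, est, _ in predictions:
        grouped[uid].append((iid, est))
    return {
        uid: sorted(items, key=lambda pair: pair[1], reverse=True)[:n]
        for uid, items in grouped.items()
    }


# Toy tuples shaped like Surprise predictions; the ratings are made up.
toy_predictions = [
    ("AKM1MP6P0OYPR", "0132793040", 5.0, 4.6, {}),
    ("AKM1MP6P0OYPR", "0439886341", 3.0, 3.9, {}),
    ("A2CX7LUOHB2NDG", "0321732944", 5.0, 4.2, {}),
]

top_n = top_n_by_user(toy_predictions, n=1)
print(1 in top_n)  # the integer 1 is not a key, since keys are strings

# Pick an ID that is guaranteed to exist instead of hard-coding one.
sample_user_id = next(iter(top_n))
print(sample_user_id, top_n[sample_user_id])
```

Note that predictions built from the test set only rank products the user has already rated. To score products a user has not rated, Surprise provides trainset.build_anti_testset() to generate the candidate pairs; treat that as a pointer to explore rather than a tested recipe for this dataset.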
