<a href="https://colab.research.google.com/github/novrian6/ml_product_prediction/blob/main/Complete_Product_Recommendation_using_SVD.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Programmed by Nova Novriansyah
### Created at: 26 December 2023 21:30
### Note:
####1. The following code creates a model from the merged data provided (merged_data.csv). The trained model is saved as collab_filtering_model.pkl. The data is a merge of 3 files (product_details, purchase_history, customer_interactions), flattened for use in ML model training.
####2. ML algorithm: Because only a small amount of data is available, and given the features/columns in the data, the algorithm used is collaborative filtering using SVD (Singular Value Decomposition).
####3. Deep learning is not practical in this case because of the amount of data. More accurate predictions might be possible with deep learning (RNN/LSTM) if more data is provided.


##1. Load Library

In [122]:
#uncomment and run the line below if surprise is not installed
#!pip install surprise

In [174]:
from surprise import Dataset
from surprise import Reader
from surprise import SVD
from surprise.model_selection import train_test_split
from surprise import accuracy
import pandas as pd

##2. Load and preprocess Data as Dataframe

In [175]:
# Load the data as a DataFrame
# The data is a merge of 3 files (product_details, purchase_history, customer_interactions), flattened for use in ML model training.
df = pd.read_csv("merged_data.csv")
df

Unnamed: 0,customer_id,product_id,purchase_date,page_views,time_spent,category,price,ratings
0,1,101,2023-01-01,25,120,Electronics,500,4.5
1,1,105,2023-01-05,25,120,Electronics,800,4.8
2,2,102,2023-01-02,20,90,Clothing,50,3.8
3,3,103,2023-01-03,30,150,Home & Kitchen,200,4.2
4,4,104,2023-01-04,15,80,Beauty,30,4.0
5,5,101,2023-01-05,22,110,Electronics,500,4.5


##3. Preprocess the data and create the train set and test set

In [176]:
# Load the DataFrame into the Surprise library's Dataset. SVD uses three columns: customer id, product id and rating.
# The model learns latent factors from the ratings; these factors represent customer preferences.
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(df[['customer_id', 'product_id', 'ratings']], reader)
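
As a quick sanity check before training, the six rows printed earlier can be inspected for rating range and sparsity. The sketch below hard-codes those rows as plain tuples (an assumption, made so it runs without the CSV file); `rows`, `scale`, and `density` are illustrative names, not part of the notebook.

```python
# (customer_id, product_id, rating) copied from the DataFrame printed above
rows = [
    (1, 101, 4.5),
    (1, 105, 4.8),
    (2, 102, 3.8),
    (3, 103, 4.2),
    (4, 104, 4.0),
    (5, 101, 4.5),
]
scale = (1, 5)  # same bounds as Reader(rating_scale=(1, 5))

# Ratings that fall outside the declared scale (should be empty)
out_of_range = [r for r in rows if not scale[0] <= r[2] <= scale[1]]

# How much of the customer x product matrix is actually observed
n_users = len({u for u, _, _ in rows})
n_items = len({i for _, i, _ in rows})
density = len(rows) / (n_users * n_items)

print(f"ratings outside {scale}: {out_of_range}")
print(f"{n_users} customers x {n_items} products, {len(rows)} ratings, density {density:.2f}")
```

With only 6 of the 25 possible customer-product pairs observed, the model has very little signal to learn from, which is worth keeping in mind when reading the scores further down.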



In [177]:
# Split the dataset into training and testing sets
trainset, testset = train_test_split(data, test_size=0.2, random_state=42)


##4. Create the Model using SVD (Singular Value Decomposition)

In [178]:
# Use SVD algorithm for collaborative filtering
model = SVD()
model.fit(trainset)


<surprise.prediction_algorithms.matrix_factorization.SVD at 0x7d430a6f7640>
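
For intuition, the prediction rule that Surprise documents for its `SVD` class is the global mean plus a user bias, an item bias, and the dot product of the user and item factor vectors; for a user or item not seen in training, the corresponding bias and factors are treated as zero. The sketch below re-implements that rule in plain Python. It is not the library's code: `predict_rating` is a hypothetical helper and all the numbers are made up for illustration.

```python
def predict_rating(mu, user_bias, item_bias, user_factors, item_factors,
                   scale=(1.0, 5.0)):
    """Estimate = mu + b_u + b_i + dot(q_i, p_u), clipped to the rating scale.

    Pass None for the bias/factors of a user or item that was not in the
    training set; those terms are then left out (treated as zero).
    """
    est = mu
    if user_bias is not None:
        est += user_bias
    if item_bias is not None:
        est += item_bias
    if user_factors is not None and item_factors is not None:
        est += sum(p * q for p, q in zip(user_factors, item_factors))
    low, high = scale
    return min(high, max(low, est))


# Hypothetical values, for illustration only
known = predict_rating(4.4, 0.1, -0.05, [0.2, -0.1], [0.5, 0.3])
cold_start = predict_rating(4.4, None, None, None, None)
clipped = predict_rating(4.9, 0.3, None, None, None)

print(round(known, 2), cold_start, clipped)
```

The cold-start case is the one to watch on a dataset this small: when neither the user nor the item was seen in training, the estimate collapses to the global mean.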

##5. Test the model using test data

In [179]:
# Predictions on test set
test_predictions = model.test(testset)

# Evaluation on test set - RMSE
test_rmse = accuracy.rmse(test_predictions)
print(f"Test RMSE: {test_rmse}")

RMSE: 0.3162
Test RMSE: 0.3162277660168382
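
RMSE is the square root of the mean squared difference between predicted and true ratings. The printed value can be reproduced by hand under one reading of this notebook's outputs: that the two held-out rows were (customer 3, product 103, rating 4.2) and (customer 4, product 104, rating 4.0), and that both were predicted as the mean of the remaining four ratings. That reading is an inference from the numbers shown, not something the notebook states.

```python
import math

# Assumed training ratings: the four rows not held out
train_ratings = [4.5, 4.8, 3.8, 4.5]
global_mean = sum(train_ratings) / len(train_ratings)

# Assumed held-out ratings, each predicted as the global mean
held_out = [4.2, 4.0]
predictions = [global_mean] * len(held_out)

# RMSE = sqrt(mean((prediction - truth)^2))
squared_errors = [(p - r) ** 2 for p, r in zip(predictions, held_out)]
rmse = math.sqrt(sum(squared_errors) / len(squared_errors))

print(f"global mean: {global_mean:.2f}, RMSE: {rmse:.4f}")
```

If that reading is right, the score reflects only two test rows predicted from the global mean, so it says little about how well the latent factors generalize.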


##6. Test the model's predictions for a single customer


In [170]:
# Make predictions for a particular user (example: customer_id = 4)
user_id = 4
user_predictions = []
for item_id in df['product_id'].unique():
    pred = model.predict(user_id, item_id)
    user_predictions.append({
        'user_id': user_id,
        'item_id': item_id,
        'predicted_rating': pred.est
    })

# Display predictions for the user
user_predictions_df = pd.DataFrame(user_predictions)
print(user_predictions_df)

   user_id  item_id  predicted_rating
0        4      101          4.416435
1        4      105          4.434552
2        4      102          4.344195
3        4      103          4.400000
4        4      104          4.400000
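
The predicted ratings can be turned into a recommendation list by dropping products the customer already bought and sorting the rest. Below is a minimal sketch using the values printed above; `top_n` is a hypothetical helper, and treating product 104 as already purchased by customer 4 comes from the data table earlier in the notebook.

```python
def top_n(predictions, already_purchased, n=2):
    """Return the n highest-rated item ids the user has not bought yet."""
    candidates = [(item, est) for item, est in predictions.items()
                  if item not in already_purchased]
    # Highest predicted rating first
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in candidates[:n]]


# item_id -> predicted_rating for customer 4, copied from the output above
predicted = {
    101: 4.416435,
    105: 4.434552,
    102: 4.344195,
    103: 4.400000,
    104: 4.400000,
}

recommended = top_n(predicted, already_purchased={104})
print(recommended)  # [105, 101]
```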


##7. Save the Model

In [171]:
#save the model
import pickle
model_filename = 'collab_filtering_model.pkl'
with open(model_filename, 'wb') as file:
    pickle.dump(model, file)

#The code below is used for testing purposes

###1. Load saved Model

In [172]:
# Load the model from the file
import pickle
model_filename = 'collab_filtering_model.pkl'
with open(model_filename, 'rb') as file:
    loaded_model = pickle.load(file)

# Now you can use loaded_model for predictions or other tasks


###2. Use the loaded model to predict


In [173]:
# Make predictions for a particular user (example: customer_id = 4) using the loaded model
user_id = 4
user_predictions = []
for item_id in df['product_id'].unique():
    pred = loaded_model.predict(user_id, item_id)
    user_predictions.append({
        'user_id': user_id,
        'item_id': item_id,
        'predicted_rating': pred.est
    })

# Display predictions for the user
user_predictions_df = pd.DataFrame(user_predictions)
print(user_predictions_df)

   user_id  item_id  predicted_rating
0        4      101          4.416435
1        4      105          4.434552
2        4      102          4.344195
3        4      103          4.400000
4        4      104          4.400000
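
The table above matches the one produced before saving, which is the expected result of a pickle round trip. That comparison can be automated rather than checked by eye. The sketch below shows the pattern with a stand-in for the model: a plain dict of made-up parameters and a toy `predict` function, both hypothetical, so that it runs without Surprise or the saved file.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model: a dict of parameters (hypothetical values)
model_params = {"global_mean": 4.4, "item_bias": {101: 0.016, 105: 0.035}}


def predict(params, item_id):
    """Toy prediction: global mean plus item bias (0 for unseen items)."""
    return params["global_mean"] + params["item_bias"].get(item_id, 0.0)


# Save and reload through a temporary file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    with open(path, "wb") as f:
        pickle.dump(model_params, f)
    with open(path, "rb") as f:  # only unpickle files from a trusted source
        loaded_params = pickle.load(f)

# Predictions before and after the round trip should be identical
items = [101, 105, 102]
before = [predict(model_params, i) for i in items]
after = [predict(loaded_params, i) for i in items]
print("round trip preserved predictions:", before == after)
```

In the notebook itself, the same idea would mean comparing `model.predict(...)` and `loaded_model.predict(...)` estimates for each product.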
