## Demonstration of a Ring Buffer Baseline Recommender System
Here the demonstration of the ring buffer recommender is presented. It recommends items to the user based on which are present in the current *RingBuffer* (see implementation in *ring_buffer_baseline.py*). This way, it captures both recency of the items and the popularity through how many ring buffer entries each item has.

It simply recommends the first item which is not the one the user is currently browsing, found by looking back in the ring buffer.

The file also includes the evaluation of the recommender model using the metrics *Precision and Recall*.

In [7]:
import sys
import os

parent_dir = os.path.abspath(os.path.join(os.getcwd(), ".."))
sys.path.append(parent_dir)
from parquet_data_reader import ParquetDataReader
from models.ring_buffer_baseline import RingBufferBaseline

import polars as pl
pl.Config.set_tbl_cols(-1)
import numpy as np
parquet_reader = ParquetDataReader()

### Data Extraction and Processing

In [8]:
import polars as pl
from utils.baseline_processing import process_behavior_data, random_split, time_based_split

train_behavior_df = parquet_reader.read_data("../../data/train/behaviors.parquet")
test_behaviours_df = parquet_reader.read_data('../../data/validation/behaviors.parquet')

# Processes the data
combined_df = process_behavior_data(train_behavior_df, test_behaviours_df)

# ----- Method 1: Random Split -----
train_random, test_random = random_split(combined_df, test_ratio=0.30)
print("Random Split:")
print("Train shape:", train_random.shape)
print("Test shape:", test_random.shape)

# ----- Method 2: Time-based Split -----
train_time, test_time = time_based_split(combined_df, test_ratio=0.30)
print("\nTime-based Split:")
print("Train shape:", train_time.shape)
print("Test shape:", test_time.shape)


Random Split:
Train shape: (99163, 4)
Test shape: (42560, 4)

Time-based Split:
Train shape: (99207, 4)
Test shape: (42516, 4)


### Method 1: Random Split of Train/Test for Recommendations

In [9]:
# Creates a recommender and fits it to the training data split using the random split method
recommender = RingBufferBaseline(behaviors=train_random)
recommender.fit()

user_id_test = 151570
recommendations = recommender.recommend(user_id=user_id_test, n=5)

print(f"Recommendations for user {user_id_test}:")
print(recommendations)

Recommendations for user 151570:
[9771042, 9771042, 9767697, 9770541, 9769650]


### Method 2: Time-based Split of Train/Test for Recommendations
This methods splits the data into the oldest interactions *(test_ratio percent)*
are used for testing, and the newest interactions are used for training. This happens after the total data (train and test) has been combined. 

In [10]:
# Creates a recommender and fits it to the training data split using the time-based split method
recommender2 = RingBufferBaseline(behaviors=train_time)
recommender2.fit()

user_id_test2 = 151570
recommendations2 = recommender.recommend(user_id=user_id_test2, n=5)

print(f"Recommendations for user {user_id_test2}:")
print(recommendations2)

Recommendations for user 151570:
[9771042, 9771042, 9767697, 9770541, 9769650]


### Comparison: Evaluation of the Ring Buffer Baseline Recommender
Comparing the two different data-splits for this  ring buffer baseline recommender using the metrics *Precision and Recall*.
*FPR* is also printed for reference.

In [11]:
from utils.evaluation import perform_model_evaluation

# Evaluates the recommender using the same random split data
metrics = perform_model_evaluation(recommender, test_data=test_random, k=5)
print("\nEvaluation metrics (precision and recall at k):")
print(metrics)


# Evaluates the recommender using time split data
metrics2 = perform_model_evaluation(recommender2, test_data=test_time, k=5)
print("\nEvaluation metrics (precision and recall at k):")
print(metrics2)



Evaluation metrics (precision and recall at k):
{'precision@k': np.float64(0.0025112897896244083), 'recall@k': np.float64(0.0026844134814096024), 'fpr@k': np.float64(0.002187635351535912)}

Evaluation metrics (precision and recall at k):
{'precision@k': np.float64(0.01882155822395979), 'recall@k': np.float64(0.03013248050153789), 'fpr@k': np.float64(0.004282433544948659)}


### Carbon Footprint
This section creates an emissions.csv file in the "output"-folder
It utilizes the code_carbon (`codecarbon EmissionsTracker`) to record the carbon footprint of the `fit` and the `recommend` methods of the model.

In [12]:
from utils.evaluation import track_model_energy

print("\nCarbon footprint of the recommender:")
footprint = track_model_energy(recommender, "ring_buffer", user_id=user_id_test2, n=5)
footprint

[codecarbon INFO @ 10:00:05] [setup] RAM Tracking...
[codecarbon INFO @ 10:00:05] [setup] CPU Tracking...
 Windows OS detected: Please install Intel Power Gadget to measure CPU




Carbon footprint of the recommender:


[codecarbon INFO @ 10:00:07] CPU Model on constant consumption mode: 13th Gen Intel(R) Core(TM) i7-13700H
[codecarbon INFO @ 10:00:07] [setup] GPU Tracking...
[codecarbon INFO @ 10:00:07] No GPU found.
[codecarbon INFO @ 10:00:07] >>> Tracker's metadata:
[codecarbon INFO @ 10:00:07]   Platform system: Windows-10-10.0.26100-SP0
[codecarbon INFO @ 10:00:07]   Python version: 3.11.9
[codecarbon INFO @ 10:00:07]   CodeCarbon version: 2.8.3
[codecarbon INFO @ 10:00:07]   Available RAM : 15.731 GB
[codecarbon INFO @ 10:00:07]   CPU count: 20
[codecarbon INFO @ 10:00:07]   CPU model: 13th Gen Intel(R) Core(TM) i7-13700H
[codecarbon INFO @ 10:00:07]   GPU count: None
[codecarbon INFO @ 10:00:07]   GPU model: None
[codecarbon INFO @ 10:00:10] Saving emissions data to file c:\Users\magnu\NewDesk\An.sys\TDT4215\recommender_system\demostrations\output\ring_buffer_fit_emission.csv
[codecarbon INFO @ 10:00:10] Energy consumed for RAM : 0.000000 kWh. RAM Power : 5.899243354797363 W
[codecarbon INFO @

{'fit': (None, 4.0160640257330855e-08),
 'recommend': ([9771042, 9771042, 9767697, 9770541, 9769650],
  6.224970916294863e-09)}