## Demonstration of the User-Item Bias Baseline Recommender Model
This notebook demonstrates how the user-item bias baseline model recommends news articles for a user.
It also evaluates the recommender model using the metrics *Precision@K and NDCG@K*.

In [1]:
import os
import sys

# Add the parent directory to the import path so the project modules resolve.
parent_dir = os.path.abspath(os.path.join(os.getcwd(), ".."))
sys.path.append(parent_dir)

import numpy as np
import polars as pl

from utils.process_data import user_item_interaction_scores
from parquet_data_reader import ParquetDataReader
from models.baseline.user_item_bias import UserItemBiasRecommender

# Show all columns when printing polars DataFrames.
pl.Config.set_tbl_cols(-1)

parquet_reader = ParquetDataReader()

### Reading the Data and Preprocessing

In [2]:
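# Load the training behaviors, document embeddings, and article metadata.
train_behavior_df = parquet_reader.read_data("../../data/train/behaviors.parquet")
embeddings_df = parquet_reader.read_data("../../data/document_vector.parquet")
article_df = parquet_reader.read_data("../../data/articles.parquet")
# The validation split serves as the test data in this notebook.
test_behavior_df = parquet_reader.read_data("../../data/validation/behaviors.parquet")
# Derive user-item interaction scores from the training behaviors.
processed_behavior_df = user_item_interaction_scores(train_behavior_df, article=article_df)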

### Creates and Fits the Model to the Data

In [3]:
model = UserItemBiasRecommender(processed_behavior_df)
model.fit()

### Recommendations
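Returns the recommended article ids for the given user id.

The predicted score is *r_hat(u, i) = mu + b_u + b_i*, where *mu* is the global mean score, *b_u* is the bias of user *u*, and *b_i* is the bias of item *i*.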
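
To make the formula concrete, here is a minimal sketch of one common way to estimate the three terms from `(user, item, score)` triples. It is an illustration only and not the code of `UserItemBiasRecommender`; the function names `fit_bias_baseline` and `predict_score` are hypothetical, and estimating the biases as simple mean deviations (without regularization) is an assumption.

```python
from collections import defaultdict


def fit_bias_baseline(interactions):
    """Estimate mu, b_u and b_i from (user, item, score) triples.

    Hypothetical helper: biases are plain mean deviations, no regularization.
    """
    interactions = list(interactions)
    mu = sum(score for _, _, score in interactions) / len(interactions)

    # Item bias: mean deviation of an item's scores from the global mean.
    item_dev = defaultdict(list)
    for _, item, score in interactions:
        item_dev[item].append(score - mu)
    b_i = {item: sum(d) / len(d) for item, d in item_dev.items()}

    # User bias: mean residual once mu and the item bias are removed.
    user_dev = defaultdict(list)
    for user, item, score in interactions:
        user_dev[user].append(score - mu - b_i[item])
    b_u = {user: sum(d) / len(d) for user, d in user_dev.items()}

    return mu, b_u, b_i


def predict_score(mu, b_u, b_i, user, item):
    """r_hat(u, i) = mu + b_u + b_i; unseen users or items get a bias of 0."""
    return mu + b_u.get(user, 0.0) + b_i.get(item, 0.0)
```

One consequence of this form is that, for a fixed user, *mu + b_u* is constant, so the ranking over items depends on *b_i* alone. Unless already-seen items are filtered out, every user then receives the same top items.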

In [4]:
model.recommend(2423448)

[9514727, 9667501, 9714376, 9419945, 9761391]

In [5]:
model.predict(2423448, 9714376)

0.841934084892273

### Evaluation
Computes the ranking metrics used to evaluate the recommender. The returned dictionary reports them as *MAP@K and NDCG@K*.

In [6]:
results = model.evaluate_recommender(test_data=test_behavior_df, k=5, n_jobs=4, user_sample=1000)
print("Results")
results

Results


{'MAP@K': np.float64(0.0002), 'NDCG@K': np.float64(0.0003391602052736161)}
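
The result dictionary above is keyed by `MAP@K` and `NDCG@K`. As a point of reference, the following sketch shows how these two metrics are commonly computed for a single user with binary relevance. It is not the code behind `evaluate_recommender`; the function names are hypothetical, and normalizing average precision by `min(len(relevant), k)` is one of several conventions in use.

```python
import math


def average_precision_at_k(recommended, relevant, k):
    """Average precision over the top-k list, with binary relevance."""
    relevant = set(relevant)
    hits, score = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    denom = min(len(relevant), k)
    return score / denom if denom else 0.0


def ndcg_at_k(recommended, relevant, k):
    """Normalized discounted cumulative gain over the top-k list."""
    relevant = set(relevant)
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(recommended[:k], start=1)
        if item in relevant
    )
    # Ideal ordering places every relevant item at the top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0
```

MAP@K is then the mean of `average_precision_at_k` over the evaluated users, and NDCG@K is likewise averaged per user.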

Computes the metrics shared by all models for comparison: precision@k, recall@k, and fpr@k (false positive rate).

In [7]:
from utils.evaluation import append_model_metrics, perform_model_evaluation

metrics = perform_model_evaluation(model, test_behavior_df, k=5)
print("Metrics")
print(metrics)

append_model_metrics(metrics, "baseline")

Metrics
{'precision@k': np.float64(8.286148322054967e-05), 'recall@k': np.float64(2.2240331527095036e-05), 'fpr@k': np.float64(0.002763542707662782)}
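
For readers unfamiliar with these three metrics, the sketch below shows a common per-user definition. It is an assumption about what such metrics usually mean, not the implementation of `perform_model_evaluation`; the function name and the `catalog` argument (the set of all candidate items) are hypothetical.

```python
def precision_recall_fpr_at_k(recommended, relevant, catalog, k):
    """Per-user precision@k, recall@k and false positive rate@k."""
    top_k = set(recommended[:k])
    relevant = set(relevant)
    negatives = set(catalog) - relevant  # items the user did not interact with

    true_positives = len(top_k & relevant)
    false_positives = len(top_k - relevant)

    precision = true_positives / k if k else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    fpr = false_positives / len(negatives) if negatives else 0.0
    return {"precision@k": precision, "recall@k": recall, "fpr@k": fpr}
```

Averaging each value over the evaluated users gives a single figure per metric.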


### Diversity

In [8]:
diversity = model.aggregate_diversity(item_df=article_df, user_sample=1000)
print("Diversity")
diversity

Diversity


0.00024110328864885718
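
The notebook does not show how `aggregate_diversity` is computed. A common definition, used here as an assumption, is the share of the catalog that appears in at least one user's recommendation list. The function name and arguments in this sketch are hypothetical.

```python
def aggregate_diversity_sketch(recommendation_lists, catalog_size):
    """Fraction of the catalog that is recommended to at least one user."""
    distinct_items = set()
    for recommendations in recommendation_lists:
        distinct_items.update(recommendations)
    return len(distinct_items) / catalog_size if catalog_size else 0.0
```

Under this definition, a recommender that returns the same five items to every user scores five divided by the catalog size, regardless of how many users are sampled.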

### Carbon Footprint
This section writes emissions data as CSV to the "output" folder.
It uses CodeCarbon's `EmissionsTracker` to record the carbon footprint of the model's `fit` and `recommend` methods.

In [9]:
from utils.evaluation import track_model_energy

print("\nCarbon footprint of the recommender:")
footprint = track_model_energy(model, "baseline", user_id=2423448, n=5)
footprint

[codecarbon INFO @ 09:49:11] [setup] RAM Tracking...
[codecarbon INFO @ 09:49:11] [setup] CPU Tracking...
 Linux OS detected: Please ensure RAPL files exist at \sys\class\powercap\intel-rapl to measure CPU

Carbon footprint of the recommender:

[codecarbon INFO @ 09:49:12] CPU Model on constant consumption mode: 13th Gen Intel(R) Core(TM) i7-1370P
[codecarbon INFO @ 09:49:12] [setup] GPU Tracking...
[codecarbon INFO @ 09:49:12] No GPU found.
[codecarbon INFO @ 09:49:12] >>> Tracker's metadata:
[codecarbon INFO @ 09:49:12]   Platform system: Linux-6.12.20-2-MANJARO-x86_64-with-glibc2.41
[codecarbon INFO @ 09:49:12]   Python version: 3.13.2
[codecarbon INFO @ 09:49:12]   CodeCarbon version: 2.8.3
[codecarbon INFO @ 09:49:12]   Available RAM : 62.395 GB
[codecarbon INFO @ 09:49:12]   CPU count: 20
[codecarbon INFO @ 09:49:12]   CPU model: 13th Gen Intel(R) Core(TM) i7-1370P
[codecarbon INFO @ 09:49:12]   GPU count: None
[codecarbon INFO @ 09:49:12]   GPU model: None
[codecarbon INFO @ 09:49:15] Saving emissions data to file /home/pedropca/Documents/Datatek/Recommender systems/TDT4215_recommender_system/recommender_system/demostrations/output/baseline_fit_emission.csv
[codecarbon INFO @ 09:49:16] Energy consumed for RAM : 0.00000

{'fit': (None, 2.113799878957823e-07),
 'recommend': ([9514727, 9667501, 9714376, 9419945, 9761391],
  2.3483815619851475e-09)}
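
The output above pairs each method's return value with an emissions estimate. The sketch below shows one way such a pair could be produced. It is not the code of `track_model_energy`; `track_call` and `make_codecarbon_tracker` are hypothetical names, and the output file naming is an assumption. The only CodeCarbon calls used are the `EmissionsTracker` constructor with `project_name`, `output_dir` and `output_file`, plus `start()` and `stop()`.

```python
def track_call(tracker, fn, *args, **kwargs):
    """Run fn between tracker.start() and tracker.stop().

    `tracker` is any object exposing start() and stop(). CodeCarbon's
    EmissionsTracker has this shape, and its stop() returns the estimated
    emissions in kg of CO2-equivalent.
    """
    tracker.start()
    try:
        result = fn(*args, **kwargs)
    finally:
        # Always stop the tracker, even if fn raises.
        emissions = tracker.stop()
    return result, emissions


def make_codecarbon_tracker(name, output_dir="output"):
    """Build a CodeCarbon tracker (hypothetical helper, file naming assumed)."""
    # Imported lazily so track_call works without codecarbon installed.
    from codecarbon import EmissionsTracker

    return EmissionsTracker(
        project_name=name,
        output_dir=output_dir,
        output_file=f"{name}_emission.csv",
    )
```

With CodeCarbon installed, a call such as `track_call(make_codecarbon_tracker("baseline_fit"), model.fit)` would return a `(result, emissions)` tuple shaped like the entries in the dictionary above.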

### Gini coefficient

In [10]:
gini = model.gini_coefficient(user_sample=1000)
print("Gini Coefficient")
gini

Sampling users
Computing Gini coefficient
[9514727, 9667501, 9714376, 9419945, 9761391, 9514727, 9667501, 9714376, 9419945, 9761391, ...]

np.float64(0.9970980847359258)
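
For context, the Gini coefficient measures how unevenly recommendations are spread across items: 0 means every item is recommended equally often, and values near 1 mean a few items receive almost all recommendations. The sketch below uses the standard sorted-rank formula on per-item recommendation counts. It is an illustration under that assumption, not the code of `model.gini_coefficient`, and the function name is hypothetical.

```python
def gini_from_counts(counts):
    """Gini coefficient of non-negative per-item recommendation counts."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(rank * x) / (n * sum(x)) - (n + 1) / n, ranks starting at 1.
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return (2.0 * weighted) / (n * total) - (n + 1) / n
```

When the same few items are recommended to every sampled user, as in the printed list above, the counts concentrate on those items and the coefficient approaches 1.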