## Demonstration of the User-Item Bias Baseline Recommender Model
This notebook demonstrates how the user-item bias baseline model recommends news articles for a user.
It also evaluates the recommender with the metrics *Precision@K* and *NDCG@K*.

In [11]:
import sys
import os

parent_dir = os.path.abspath(os.path.join(os.getcwd(), ".."))
sys.path.append(parent_dir)
from utils.process_data import user_item_interaction_scores
from parquet_data_reader import ParquetDataReader
from models.baseline.user_item_bias import UserItemBiasRecommender


import polars as pl
pl.Config.set_tbl_cols(-1)
import numpy as np
parquet_reader = ParquetDataReader()

### Reading the Data and Preprocessing

In [12]:
train_behavior_df = parquet_reader.read_data("../../data/train/behaviors.parquet")
embeddings_df = parquet_reader.read_data("../../data/document_vector.parquet")
article_df = parquet_reader.read_data("../../data/articles.parquet")
test_behavior_df = parquet_reader.read_data("../../data/validation/behaviors.parquet")
processed_behavior_df = user_item_interaction_scores(train_behavior_df, article=article_df)

### Creating and Fitting the Model

In [13]:
model = UserItemBiasRecommender(processed_behavior_df)
model.fit()

### Recommendations
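Returns the recommended article IDs for the given user ID.

Each score is computed as *r_hat(u, i) = mu + b_u + b_i*, where *mu* is the global mean interaction score, *b_u* is the bias of user *u*, and *b_i* is the bias of item *i*.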
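The scoring rule above can be illustrated with a minimal sketch. It is not the code of `UserItemBiasRecommender`; the function names, the tuple-based input, and the choice to estimate biases as plain (unregularized) mean deviations are all assumptions made for illustration.

```python
from collections import defaultdict


def fit_bias_baseline(interactions):
    """Estimate mu, b_u and b_i from (user_id, item_id, score) tuples.

    Hypothetical helper: biases are plain mean deviations, no regularization.
    """
    mu = sum(score for _, _, score in interactions) / len(interactions)

    # Item bias: average deviation of an item's scores from the global mean.
    item_dev = defaultdict(list)
    for _, item, score in interactions:
        item_dev[item].append(score - mu)
    b_i = {item: sum(d) / len(d) for item, d in item_dev.items()}

    # User bias: average deviation left after removing mu and the item bias.
    user_dev = defaultdict(list)
    for user, item, score in interactions:
        user_dev[user].append(score - mu - b_i[item])
    b_u = {user: sum(d) / len(d) for user, d in user_dev.items()}
    return mu, b_u, b_i


def predict(mu, b_u, b_i, user, item):
    """r_hat(u, i) = mu + b_u + b_i; unknown users or items get a zero bias."""
    return mu + b_u.get(user, 0.0) + b_i.get(item, 0.0)


def recommend(mu, b_u, b_i, user, seen, n=5):
    """Return the n unseen items with the highest predicted score."""
    candidates = [item for item in b_i if item not in seen]
    candidates.sort(key=lambda item: predict(mu, b_u, b_i, user, item), reverse=True)
    return candidates[:n]
```

In this sketch, *mu* and *b_u* are identical for every candidate item of a given user, so the ranking is determined by *b_i* alone. Two users therefore receive the same top items unless the `seen` filter removes different articles for them.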

In [14]:
model.recommend(2423448)

[9514727, 9667501, 9714376, 9419945, 9761391]

In [15]:
model.predict(2423448, 9714376)

0.841934084892273

### Evaluation
Computes *Precision@K* and *NDCG@K* for the recommender on the validation data, here with K = 5 and a sample of 1000 users.

In [16]:
results = model.evaluate_recommender(test_data=test_behavior_df, k=5, n_jobs=4, user_sample=1000)
print("Results")
results

Results


{'Precision@K': np.float64(0.0004),
 'NDCG@K': np.float64(0.0005087403079104242)}
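For reference, the two metrics can be computed per user as in the sketch below. It assumes binary relevance (an article is relevant if the user clicked it in the validation data) and is not taken from `evaluate_recommender`; the function names and signatures are hypothetical.

```python
import math


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: discounted gain divided by the ideal gain."""
    # A hit at 0-based rank r contributes 1 / log2(r + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    # Ideal ranking places all relevant items (at most k) at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0
```

Averaging these per-user values over the sampled users gives a single score per metric, which is presumably how the numbers above are aggregated.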

Computes the metrics shared by all models for comparison: precision@k, recall@k, and fpr@k (false positive rate).

In [17]:
from utils.evaluation import append_model_metrics
from utils.evaluation import perform_model_evaluation

metrics = perform_model_evaluation(model, test_behavior_df, k=5)
print("Metrics")
print(metrics)

append_model_metrics(metrics, "baseline")

Metrics
{'precision@k': np.float64(8.286148322054967e-05), 'recall@k': np.float64(2.224033152709503e-05), 'fpr@k': np.float64(0.0027635427076627816)}
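As a rough illustration of the two metrics not covered earlier, recall@k and fpr@k can be written as follows. This is a sketch under the assumption that "negatives" are all catalog articles the user did not click; `perform_model_evaluation` may define them differently, and the function names here are hypothetical.

```python
def recall_at_k(recommended, relevant, k):
    """Share of the user's relevant items that appear in the top-k list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant)


def fpr_at_k(recommended, relevant, catalog, k):
    """Share of non-relevant catalog items that were recommended anyway."""
    negatives = set(catalog) - set(relevant)
    if not negatives:
        return 0.0
    false_positives = len(set(recommended[:k]) & negatives)
    return false_positives / len(negatives)
```

For example, with a ten-article catalog, two relevant articles, and one of them in the top five, recall@5 is 0.5 and fpr@5 is 4 / 8 = 0.5.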


### Diversity

In [18]:
from utils.evaluation import aggregate_diversity
from utils.evaluation import append_aggregate_diversity

diversity = aggregate_diversity(model, article_df, user_sample=1000)

print("Diversity")
print(diversity)

append_aggregate_diversity(diversity, "baseline")

Diversity
0.00024110328864885718
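The definition used by `utils.evaluation.aggregate_diversity` is not shown here. A common definition, assumed in the sketch below, is the fraction of the catalog that is recommended to at least one user; the function name and arguments are illustrative only.

```python
def aggregate_diversity_sketch(recommendations, catalog_size):
    """Fraction of catalog items that appear in at least one user's list.

    recommendations: dict mapping user_id -> list of recommended item ids.
    """
    distinct_items = set()
    for items in recommendations.values():
        distinct_items.update(items)
    return len(distinct_items) / catalog_size
```

Under this definition, a model that recommends the same five articles to every user scores 5 divided by the catalog size, no matter how many users are sampled.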


### Carbon Footprint
This section writes emissions data as CSV to the `output` folder.
It uses the `EmissionsTracker` from `codecarbon` to record the carbon footprint of the model's `fit` and `recommend` methods.

In [19]:
from utils.evaluation import track_model_energy

print("\nCarbon footprint of the recommender:")
footprint = track_model_energy(model, "baseline", user_id=2423448, n=5)
footprint

[codecarbon INFO @ 12:17:29] [setup] RAM Tracking...
[codecarbon INFO @ 12:17:29] [setup] CPU Tracking...
 Windows OS detected: Please install Intel Power Gadget to measure CPU




Carbon footprint of the recommender:


[codecarbon INFO @ 12:17:30] CPU Model on constant consumption mode: 13th Gen Intel(R) Core(TM) i7-13700H
[codecarbon INFO @ 12:17:30] [setup] GPU Tracking...
[codecarbon INFO @ 12:17:30] No GPU found.
[codecarbon INFO @ 12:17:30] >>> Tracker's metadata:
[codecarbon INFO @ 12:17:30]   Platform system: Windows-10-10.0.26100-SP0
[codecarbon INFO @ 12:17:30]   Python version: 3.11.9
[codecarbon INFO @ 12:17:30]   CodeCarbon version: 2.8.3
[codecarbon INFO @ 12:17:30]   Available RAM : 15.731 GB
[codecarbon INFO @ 12:17:30]   CPU count: 20
[codecarbon INFO @ 12:17:30]   CPU model: 13th Gen Intel(R) Core(TM) i7-13700H
[codecarbon INFO @ 12:17:30]   GPU count: None
[codecarbon INFO @ 12:17:30]   GPU model: None
[codecarbon INFO @ 12:17:34] Saving emissions data to file c:\Users\magnu\NewDesk\An.sys\TDT4215\recommender_system\demostrations\output\baseline_fit_emission.csv
[codecarbon INFO @ 12:17:34] Energy consumed for RAM : 0.000001 kWh. RAM Power : 5.899243354797363 W
[codecarbon INFO @ 12

{'fit': (None, 1.3030690604999218e-07),
 'recommend': ([9514727, 9667501, 9714376, 9419945, 9761391],
  5.767203056883588e-09)}

### Gini coefficient

In [20]:
gini = model.gini_coefficient(user_sample=1000)
print("Gini Coefficient")
gini

Sampling users
Computing Gini coefficient
[9514727, 9667501, 9714376, 9419945, 9761391, 9514727, 9667501, 9714376, 9419945, 9761391, ...]

np.float64(0.9970980847359258)
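A value close to 1 means that recommendations are concentrated on very few articles. The sketch below shows one standard way to compute a Gini coefficient over recommendation counts; it is an assumption about what `model.gini_coefficient` measures, not its actual code, and `catalog` is expected to contain each article ID once.

```python
from collections import Counter


def gini_of_recommendations(recommended_items, catalog):
    """Gini coefficient of how often each catalog item is recommended.

    0 means every item is recommended equally often; values near 1 mean
    a handful of items receive almost all recommendations.
    """
    counts = Counter(recommended_items)
    # Include never-recommended items as zeros, sorted ascending.
    values = sorted(counts.get(item, 0) for item in catalog)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((idx + 1) * value for idx, value in enumerate(values))
    return (2 * weighted) / (n * total) - (n + 1) / n
```

When all recommendations go to a single item out of `n`, this formula gives `(n - 1) / n`, which approaches 1 as the catalog grows.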