# DX 704 Week 4 Project

This week's project will test the learning speed of linear contextual bandits compared to unoptimized approaches.
You will start by building a preference data set for evaluation, then implement different variations of LinUCB and visualize how fast they learn the preferences.


The full project description, a template notebook and supporting code are available on GitHub: [Project 4 Materials](https://github.com/bu-cds-dx704/dx704-project-04).


## Example Code

You may find it helpful to refer to these GitHub repositories of Jupyter notebooks for example code.

* https://github.com/bu-cds-omds/dx601-examples
* https://github.com/bu-cds-omds/dx602-examples
* https://github.com/bu-cds-omds/dx603-examples
* https://github.com/bu-cds-omds/dx704-examples

Any calculations demonstrated in code examples or videos may be found in these notebooks, and you are allowed to copy this example code in your homework answers.

## Part 1: Collect Rating Data

The file "recipes.tsv" in this repository has information about 100 recipes.
Make a new file "ratings.tsv" with two columns, recipe_slug (from recipes.tsv) and rating.
Populate the rating column with values between 0 and 1 where 0 is the worst and 1 is the best.
You can assign these ratings however you want within that range, but try to make them reflect a consistent set of preferences.
These could be your preferences, or a persona of your choosing (e.g. chocolate lover, bacon-obsessed, or sweet tooth).
Make sure that there are at least 10 ratings of zero and at least 10 ratings of one.


Hint: You may find it more convenient to assign raw ratings from 1 to 5 and then remap them as follows.

`ratings["rating"] = (ratings["rating_raw"] - 1) * 0.25`
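The remapping above, together with the constraints on the rating column, can be sketched as follows. The recipe slugs, the `rating_raw` values, and the helper `check_ratings` are hypothetical names made up for illustration; they are not part of the project template.

```python
import pandas as pd

# Hypothetical raw ratings on a 1-5 scale (illustrative data, not from recipes.tsv).
ratings = pd.DataFrame({
    "recipe_slug": ["recipe-a", "recipe-b", "recipe-c", "recipe-d", "recipe-e"],
    "rating_raw": [1, 2, 3, 4, 5],
})

# Remap 1..5 onto 0..1 in steps of 0.25.
ratings["rating"] = (ratings["rating_raw"] - 1) * 0.25


def check_ratings(df, min_extremes=10):
    """Return a list of problems with the rating column (empty if none)."""
    problems = []
    if not df["rating"].between(0, 1).all():
        problems.append("rating outside [0, 1]")
    if (df["rating"] == 0).sum() < min_extremes:
        problems.append("too few ratings of zero")
    if (df["rating"] == 1).sum() < min_extremes:
        problems.append("too few ratings of one")
    return problems


print(ratings["rating"].tolist())
print(check_ratings(ratings))
```

With only five toy rows, the check reports too few zeros and ones; on a full 100-recipe file it should return an empty list once the requirements are met.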

In [42]:
import pandas as pd
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score


In [43]:
# Read the recipes
recipes = pd.read_csv('recipes.tsv', sep='\t')

# Create ratings DataFrame
ratings = pd.DataFrame({
    'recipe_slug': recipes['recipe_slug'].values,
    'rating': 0.5  # default middle rating
})

# Define bacon-obsessed persona ratings
# Bacon dishes = 1.0 (love them!)
bacon_dishes = ['bacon-fried-rice', 'bacon-chocolate-chip-cookies', 'bacon-wrapped-scallops',
                'bacon-egg-muffins', 'bacon-souffle', 'maple-bacon-donuts', 'maple-bacon-pancakes',
                'bacon-wrapped-dates', 'bacon-and-egg-breakfast-sandwich', 'bacon-wrapped-shrimp-skewers',
                'bacon-wrapped-chicken', 'bacon-wrapped-asparagus', 'bacon-mac-and-cheese']

# Chocolate/sweet treats = 0.85-0.95 (love sweets too!)
chocolate_dishes = ['brownies', 'chocolate-souffle', 'chocolate-peanut-butter-cake', 'pain-au-chocolat',
                    'chocolate-croissants', 'chocolate-babka', 'chocolate-cake', 'peanut-butter-brownies']

# Fried/comfort food = 0.7-0.8
fried_dishes = ['chicken-fingers', 'chicken-nuggets', 'fried-oysters', 'french-toast']

# Healthy vegetarian = 0.0 (hate veggies!)
veggie_dishes = ['falafel', 'asparagus-burger', 'asparagus-quiche', 'vegetable-lasagna',
                 'pickled-green-beans', 'pickled-asparagus', 'vegetarian-mushroom-lasagna',
                 'spinach-and-ricotta-lasagna', 'spinach-quiche', 'spinach-and-feta-quiche',
                 'spinach-and-ricotta-stuffed-shells']

# Apply ratings
for idx, slug in enumerate(ratings['recipe_slug']):
    if slug in bacon_dishes:
        ratings.loc[idx, 'rating'] = 1.0
    elif slug in chocolate_dishes:
        ratings.loc[idx, 'rating'] = np.random.uniform(0.85, 0.95)
    elif slug in fried_dishes:
        ratings.loc[idx, 'rating'] = np.random.uniform(0.7, 0.8)
    elif slug in veggie_dishes:
        ratings.loc[idx, 'rating'] = 0.0
    elif 'crisp' in slug or 'crumble' in slug or 'cobbler' in slug:
        ratings.loc[idx, 'rating'] = np.random.uniform(0.75, 0.85)  # fruit desserts
    elif 'nacho' in slug:
        ratings.loc[idx, 'rating'] = np.random.uniform(0.6, 0.75)  # nacho dishes
    elif 'lasagna' in slug or 'pasta' in slug:
        ratings.loc[idx, 'rating'] = np.random.uniform(0.5, 0.65)  # pasta
    else:
        ratings.loc[idx, 'rating'] = np.random.uniform(0.3, 0.6)  # everything else

# Sort alphabetically by recipe_slug before saving
ratings = ratings.sort_values('recipe_slug').reset_index(drop=True)

# Save to file
ratings.to_csv('ratings.tsv', sep='\t', index=False)

# Display summary
print(f"Total recipes: {len(ratings)}")
print(f"Ratings of 1.0: {(ratings['rating'] == 1.0).sum()}")
print(f"Ratings of 0.0: {(ratings['rating'] == 0.0).sum()}")
print(f"\nRating distribution:")
print(ratings['rating'].describe())


Total recipes: 100
Ratings of 1.0: 13
Ratings of 0.0: 11

Rating distribution:
count    100.000000
mean       0.572074
std        0.293038
min        0.000000
25%        0.399612
50%        0.574401
75%        0.787474
max        1.000000
Name: rating, dtype: float64


Submit "ratings.tsv" in Gradescope.

## Part 2: Construct Model Input

Use your file "ratings.tsv" combined with "recipe-tags.tsv" to create a new file "features.tsv" with a column recipe_slug, a column bias which is hard-coded to one, and a column for each tag that appears in "recipe-tags.tsv".
The tag columns in this file should be a 0-1 encoding of the recipe tags for each recipe.
[Pandas reshaping functions](https://pandas.pydata.org/docs/user_guide/reshaping.html) may be helpful.

The bias column will make later LinUCB calculations easier since it will just be another dimension.

Hint: For later modeling steps, it will be important to have the feature data (inputs) and the rating data (target outputs) in the same order.
It is highly recommended to make sure that "features.tsv" and "ratings.tsv" have the recipe slugs in the same order.
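One possible way to build the 0-1 encoding and keep the row order aligned with the ratings is `pd.crosstab`. This is a minimal sketch on made-up data; it assumes "recipe-tags.tsv" is in long format with one `recipe_slug`/`recipe_tag` pair per row, and the tiny frames below are illustrative only.

```python
import pandas as pd

# Hypothetical long-format tag data in the shape assumed for recipe-tags.tsv.
tags = pd.DataFrame({
    "recipe_slug": ["apple-pie", "apple-pie", "bacon-souffle", "bacon-souffle", "ramen"],
    "recipe_tag": ["apple", "dessert", "bacon", "souffle", "noodles"],
})
# Hypothetical ratings, deliberately in a different order from the tags.
ratings = pd.DataFrame({
    "recipe_slug": ["ramen", "apple-pie", "bacon-souffle"],
    "rating": [0.5, 0.8, 1.0],
})

# Count (slug, tag) pairs, then clip so a repeated pair still encodes as 1.
features = pd.crosstab(tags["recipe_slug"], tags["recipe_tag"]).clip(upper=1)

# Hard-coded bias column as the first feature.
features.insert(0, "bias", 1)

# Reorder rows to match the ratings so inputs and targets line up.
features = (
    features.reindex(ratings["recipe_slug"].tolist())
    .rename_axis("recipe_slug")
    .reset_index()
)
features.columns.name = None

print(features)
```

Reindexing by the slugs in the ratings frame makes the alignment explicit instead of relying on both files happening to be sorted the same way.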

In [44]:
# YOUR CHANGES HERE

# Read recipe tags
tags_df = pd.read_csv('recipe-tags.tsv', sep='\t')

# Create one-hot encoding of tags (pivot to wide format)
tags_pivot = tags_df.pivot_table(
    index='recipe_slug',
    columns='recipe_tag',
    aggfunc=lambda x: 1,
    fill_value=0
)

# Add bias column (all 1s) as the first column
tags_pivot.insert(0, 'bias', 1)

# Reset index to make recipe_slug a column
tags_pivot = tags_pivot.reset_index()

print(f"Feature matrix shape: {tags_pivot.shape}")
print(f"Features (including bias): {tags_pivot.columns.tolist()[:10]}...")  # Show first 10
print(f"\nFirst few rows:")
print(tags_pivot.head())

# Save if needed
tags_pivot.to_csv('features.tsv', sep='\t', index=False)


Feature matrix shape: (100, 298)
Features (including bias): ['recipe_slug', 'bias', 'alfredo', 'almond', 'american', 'appetizer', 'appetizers', 'apple', 'asiancuisine', 'asparagus']...

First few rows:
recipe_tag          recipe_slug  bias  alfredo  almond  american  appetizer  \
0           almond-chip-cookies     1        0       1         0          0   
1             almond-croissants     1        0       1         0          0   
2                   apple-crisp     1        0       0         0          0   
3                 apple-crumble     1        0       0         0          0   
4                     apple-pie     1        0       0         0          0   

recipe_tag  appetizers  apple  asiancuisine  asparagus  ...  udonnoodles  \
0                    0      0             0          0  ...            0   
1                    0      0             0          0  ...            0   
2                    0      1             0          0  ...            0   
3                  

Submit "features.tsv" in Gradescope.

## Part 3: Linear Preference Model

Use your feature and rating files to build a ridge regression model with ridge regression's regularization parameter $\alpha$ set to 1.


Hint: If you are using scikit-learn modeling classes, you should use `fit_intercept=False` since that intercept value will be redundant with the bias coefficient.

Hint for Part 5: The estimate component of the LinUCB bounds should match the score estimates produced by this ridge model, so when computing the bounds you only need to add the variance component.
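To see why the bias column can stand in for the intercept, it may help to write ridge regression in closed form. The sketch below uses NumPy only, on random toy data; the variable names (`ridge_alpha`, `theta`) are illustrative. Under these assumptions it should agree with a ridge fit that has no separate intercept, since the bias coefficient is penalized like any other.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy design matrix: a bias column of ones plus three hypothetical 0/1 tag columns.
X = np.column_stack([np.ones(6), rng.integers(0, 2, size=(6, 3))])
y = rng.uniform(0, 1, size=6)

ridge_alpha = 1.0  # ridge regularization strength

# Closed-form ridge solution with no separate intercept:
#   theta = (X^T X + ridge_alpha * I)^(-1) X^T y
A = X.T @ X + ridge_alpha * np.eye(X.shape[1])
theta = np.linalg.solve(A, X.T @ y)

# Score estimates are plain dot products of features and coefficients.
estimates = X @ theta

print(theta)
print(estimates)
```

The matrix `A` and the vector `X.T @ y` are the same quantities LinUCB maintains, which is why the LinUCB estimate matches the ridge estimate when both use the same data and regularization.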

In [45]:
# YOUR CHANGES HERE

# Load features
features = pd.read_csv('features.tsv', sep='\t' )

# Load ratings
ratings = pd.read_csv('ratings.tsv', sep='\t')

# Merge features with ratings
data = features.merge(ratings, on='recipe_slug')

# Prepare X (features) and y (ratings)
X = data.drop(['recipe_slug', 'rating'], axis=1)
y = data['rating']

# Build ridge regression model with alpha=1
model = Ridge(alpha=1.0, fit_intercept=False)  # No intercept since we have a bias term
model.fit(X, y)

# Make predictions
y_pred = model.predict(X)

# Evaluate the model
mse = mean_squared_error(y, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y, y_pred)

print(f"Ridge Regression Model (α=1)")
print(f"=" * 40)
print(f"Number of features: {X.shape[1]}")
print(f"Number of recipes: {X.shape[0]}")
print(f"\nModel Performance:")
print(f"  RMSE: {rmse:.4f}")
print(f"  R² Score: {r2:.4f}")

# Show the largest positive and most negative coefficients (sorted by signed value)
feature_importance = pd.DataFrame({
    'feature': X.columns,
    'coefficient': model.coef_
}).sort_values('coefficient', ascending=False)

print(f"\nTop 10 Positive Coefficients:")
print(feature_importance.head(10))
print(f"\nTop 10 Negative Coefficients:")
print(feature_importance.tail(10))


Ridge Regression Model (α=1)
Number of features: 297
Number of recipes: 100

Model Performance:
  RMSE: 0.0449
  R² Score: 0.9763

Top 10 Positive Coefficients:
        feature  coefficient
0          bias     0.415975
10        bacon     0.284377
88      dessert     0.194194
51    chocolate     0.181334
256      spring     0.135172
245     souffle     0.108622
291  vegetables     0.107015
71       creamy     0.099675
178      nachos     0.082419
60      cobbler     0.082226

Top 10 Negative Coefficients:
               feature  coefficient
173      middleeastern    -0.078199
29             brioche    -0.094257
198        pastrycrust    -0.103486
12         bakeddishes    -0.103935
260         streetfood    -0.107570
255            spinach    -0.133958
139            healthy    -0.143612
292         vegetarian    -0.150802
204  pickledvegetables    -0.153124
81             custard    -0.158028


Save the coefficients of this model in a file "model.tsv" with columns "recipe_tag" and "coefficient".
Do not add anything for the `intercept_` attribute of a scikit-learn model; this will be covered by the coefficient for the bias column added in part 2.

In [46]:
# YOUR CHANGES HERE

# Save coefficients to TSV file
coefficients = pd.DataFrame({
    'recipe_tag': X.columns,
    'coefficient': model.coef_
})

coefficients.to_csv('model.tsv', sep='\t', index=False)
print(f"\nCoefficients saved to model.tsv")



Coefficients saved to model.tsv


Submit "model.tsv" in Gradescope.

## Part 4: Recipe Estimates

Use the recipe model to estimate the score of every recipe.
Save these estimates to a file "estimates.tsv" with columns recipe_slug and score_estimate.

In [47]:
# YOUR CHANGES HERE

# Save estimates to TSV file
estimates = pd.DataFrame({
    'recipe_slug': data['recipe_slug'],
    'score_estimate': y_pred
})

estimates.to_csv('estimates.tsv', sep='\t', index=False)
print(f"Estimates saved to estimates.tsv")

# Show some example estimates
print(f"\nSample Recipe Estimates:")
print(estimates.sort_values('score_estimate', ascending=False).head(10))


Estimates saved to estimates.tsv

Sample Recipe Estimates:
                     recipe_slug  score_estimate
8   bacon-chocolate-chip-cookies        1.022515
16        bacon-wrapped-scallops        0.996224
15           bacon-wrapped-dates        0.992947
14         bacon-wrapped-chicken        0.989720
10              bacon-fried-rice        0.989183
62            maple-bacon-donuts        0.985852
11          bacon-mac-and-cheese        0.966305
17  bacon-wrapped-shrimp-skewers        0.962450
13       bacon-wrapped-asparagus        0.951419
63          maple-bacon-pancakes        0.944309


Submit "estimates.tsv" in Gradescope.

## Part 5: LinUCB Bounds

Calculate the upper bounds of LinUCB using data corresponding to trying every recipe once and receiving the rating in "ratings.tsv" as the reward.
Keep the ridge regression regularization parameter at 1, and set LinUCB's $\alpha$ parameter to 2.
Save these upper bounds to a file "bounds.tsv" with columns recipe_slug and score_bound.
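The bound for a recipe with feature vector x is the ridge estimate plus a scaled uncertainty term, x·θ + α·sqrt(xᵀ A⁻¹ x), where A = λI + XᵀX. A vectorized sketch of that calculation follows; the function name `linucb_bounds` and the toy arrays are assumptions for illustration, not part of the template.

```python
import numpy as np


def linucb_bounds(X, y, ridge_alpha=1.0, ucb_alpha=2.0):
    """Return (estimates, upper bounds) after observing every row of X once."""
    d = X.shape[1]
    A = ridge_alpha * np.eye(d) + X.T @ X  # regularized Gram matrix
    b = X.T @ y                            # reward-weighted feature sums
    A_inv = np.linalg.inv(A)
    theta = A_inv @ b                      # same as the ridge coefficients
    estimates = X @ theta
    # x^T A^{-1} x for every row at once.
    variances = np.einsum("ij,jk,ik->i", X, A_inv, X)
    return estimates, estimates + ucb_alpha * np.sqrt(variances)


# Toy data: bias column plus two hypothetical tag columns.
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
y = np.array([1.0, 0.0, 0.5])

estimates, bounds = linucb_bounds(X, y)
print(estimates)
print(bounds)
```

Because the variance term is never negative, every bound is at least as large as its estimate, which makes a quick sanity check for "bounds.tsv" against "estimates.tsv".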

In [48]:
# Part 5: LinUCB Bounds

# LinUCB parameters
lambda_param = 1.0  # Regularization parameter (same as ridge regression alpha)
alpha_param = 2.0   # Exploration parameter

# Get feature matrix and ratings as numpy arrays
recipe_slugs = data['recipe_slug'].values
X_array = X.values
y_array = y.values

# Calculate A = λ*I + X^T X
d = X_array.shape[1]  # number of features
A = lambda_param * np.eye(d) + X_array.T @ X_array

# Calculate b = X^T y
b = X_array.T @ y_array

# Calculate theta = A^(-1) b
A_inv = np.linalg.inv(A)
theta = A_inv @ b

# Calculate upper bounds for each recipe
score_bounds = []

for i in range(len(X_array)):
    x = X_array[i]
    
    # Predicted score
    score_estimate = x @ theta
    
    # Confidence radius
    confidence_radius = alpha_param * np.sqrt(x @ A_inv @ x)
    
    # Upper confidence bound
    score_bound = score_estimate + confidence_radius
    
    score_bounds.append(score_bound)

# Save bounds to TSV file
bounds = pd.DataFrame({
    'recipe_slug': recipe_slugs,
    'score_bound': score_bounds
})

bounds.to_csv('bounds.tsv', sep='\t', index=False)

print(f"LinUCB Upper Bounds (λ=1, α=2)")
print(f"Bounds saved to bounds.tsv")
print(f"\nTop 10 Recipes by Upper Confidence Bound:")
print(bounds.sort_values('score_bound', ascending=False).head(10))


LinUCB Upper Bounds (λ=1, α=2)
Bounds saved to bounds.tsv

Top 10 Recipes by Upper Confidence Bound:
                     recipe_slug  score_bound
10              bacon-fried-rice     2.855340
14         bacon-wrapped-chicken     2.829197
11          bacon-mac-and-cheese     2.787860
17  bacon-wrapped-shrimp-skewers     2.787739
70              pain-au-chocolat     2.780581
16        bacon-wrapped-scallops     2.772102
13       bacon-wrapped-asparagus     2.763668
62            maple-bacon-donuts     2.742988
15           bacon-wrapped-dates     2.726076
9              bacon-egg-muffins     2.716313


Submit "bounds.tsv" in Gradescope.

## Part 6: Make Online Recommendations

Implement LinUCB to make 100 recommendations starting with no data and using the same parameters as in part 5.
One recommendation should be made at a time and you can break ties arbitrarily.
After each recommendation, use the rating from part 1 as the reward to update the LinUCB data.
Record the recommendations made in a file "recommendations.tsv" with columns "recipe_slug", "score_bound", and "reward".
The rows in this file should be in the same order as the recommendations were made.

Hint: do not remove recipes after each recommendation.
Repeating recommendations is expected.
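The online procedure can be sketched on toy data as below. Note that this sketch swaps in the Sherman-Morrison rank-one update to maintain A⁻¹ directly, rather than re-inverting A each round; that is an optional optimization, not something the assignment requires. The function name `run_linucb` and the toy arrays are hypothetical.

```python
import numpy as np


def run_linucb(X, rewards, n_rounds, ridge_alpha=1.0, ucb_alpha=2.0):
    """Run LinUCB from no data; return the pick history and the final A^{-1}."""
    d = X.shape[1]
    A_inv = np.eye(d) / ridge_alpha  # inverse of A = ridge_alpha * I
    b = np.zeros(d)
    history = []
    for _ in range(n_rounds):
        theta = A_inv @ b
        variances = np.einsum("ij,jk,ik->i", X, A_inv, X)
        ucb = X @ theta + ucb_alpha * np.sqrt(variances)
        arm = int(np.argmax(ucb))  # ties go to the lowest index
        x, r = X[arm], rewards[arm]
        history.append((arm, float(ucb[arm]), float(r)))
        # Sherman-Morrison: (A + x x^T)^{-1} = A^{-1} - (A^{-1} x)(A^{-1} x)^T / (1 + x^T A^{-1} x)
        Ax = A_inv @ x
        A_inv = A_inv - np.outer(Ax, Ax) / (1.0 + x @ Ax)
        b = b + r * x
    return history, A_inv


# Toy data: bias column plus two hypothetical tag columns, with fixed rewards.
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
rewards = np.array([1.0, 0.0, 0.5])

history, A_inv = run_linucb(X, rewards, n_rounds=5)
for arm, bound, reward in history:
    print(arm, round(bound, 4), reward)
```

With no data, the estimate is zero and the bound is 2·sqrt(x·x), so the first pick is simply the row with the most active features; arms are never removed, so repeats can occur in later rounds.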

In [49]:
# YOUR CHANGES HERE

# LinUCB parameters
lambda_param = 1.0
alpha_param = 2.0

# Get feature matrix and create lookup for rewards
X_full = X.values
recipe_slugs_all = data['recipe_slug'].values

# Create a dictionary for quick reward lookup
reward_lookup = dict(zip(data['recipe_slug'], data['rating']))

# Initialize LinUCB
d = X_full.shape[1]
A = lambda_param * np.eye(d)  # A = λI
b = np.zeros(d)  # b = 0


# Store recommendations
recommendations = []

# Make 100 recommendations
for iteration in range(100):
    # Calculate theta = A^(-1) b
    A_inv = np.linalg.inv(A)
    theta = A_inv @ b
    
    # Calculate UCB for all available recipes
    best_ucb = -np.inf
    best_idx = None
    
    for idx in range(len(recipe_slugs_all)):
        x = X_full[idx]
        
        # Predicted score
        score_estimate = x @ theta
        
        # Confidence radius
        confidence_radius = alpha_param * np.sqrt(x @ A_inv @ x)
        
        # Upper confidence bound
        ucb = score_estimate + confidence_radius
        
        if ucb > best_ucb:
            best_ucb = ucb
            best_idx = idx
    
    # Get the recommended recipe
    recommended_slug = recipe_slugs_all[best_idx]
    recommended_x = X_full[best_idx]
    reward = reward_lookup[recommended_slug]
    
    # Record the recommendation
    recommendations.append({
        'recipe_slug': recommended_slug,
        'score_bound': best_ucb,
        'reward': reward
    })
    
    # Update LinUCB with the observation
    A += np.outer(recommended_x, recommended_x)  # A += x x^T
    b += reward * recommended_x  # b += r * x
    
    
    if (iteration + 1) % 20 == 0:
        print(f"Made {iteration + 1} recommendations...")

# Save recommendations to TSV
recommendations_df = pd.DataFrame(recommendations)
recommendations_df.to_csv('recommendations.tsv', sep='\t', index=False)

print(f"\nLinUCB Online Recommendations Complete")
print(f"Saved {len(recommendations_df)} recommendations to recommendations.tsv")
print(f"\nFirst 10 Recommendations:")
print(recommendations_df.head(10))
print(f"\nCumulative Reward: {recommendations_df['reward'].sum():.2f}")
print(f"Average Reward: {recommendations_df['reward'].mean():.4f}")


Made 20 recommendations...
Made 40 recommendations...
Made 60 recommendations...
Made 80 recommendations...
Made 100 recommendations...

LinUCB Online Recommendations Complete
Saved 100 recommendations to recommendations.tsv

First 10 Recommendations:
        recipe_slug  score_bound    reward
0     apple-crumble     7.483315  0.757073
1     ma-la-chicken     7.243060  0.351308
2       quesadillas     7.209819  0.596724
3             ramen     7.222359  0.569491
4   chocolate-babka     6.980486  0.852636
5  pain-au-chocolat     7.007638  0.930376
6        spamburger     7.019395  0.410221
7  bacon-fried-rice     6.977467  1.000000
8       nacho-fries     6.810237  0.653250
9  cranberry-relish     6.756950  0.456189

Cumulative Reward: 58.21
Average Reward: 0.5821


Submit "recommendations.tsv" in Gradescope.

## Part 7: Acknowledgments

Make a file "acknowledgments.txt" documenting any outside sources or help on this project.
If you discussed this assignment with anyone, please acknowledge them here.
If you used any libraries not mentioned in this module's content, please list them with a brief explanation of what you used them for.
If you used any generative AI tools, please add links to your transcripts below, and any other information that you feel is necessary to comply with the generative AI policy.
If no acknowledgments are appropriate, just write "none" in the file.


Submit "acknowledgments.txt" in Gradescope.

## Part 8: Code

Please submit a Jupyter notebook that can reproduce all your calculations and recreate the previously submitted files.


Submit "project.ipynb" in Gradescope.