# DX 704 Week 4 Project

This week's project will test the learning speed of linear contextual bandits compared to unoptimized approaches.
You will start by building a preference data set for evaluation, then implement different variations of LinUCB and visualize how fast they learn the preferences.


The full project description, a template notebook and supporting code are available on GitHub: [Project 4 Materials](https://github.com/bu-cds-dx704/dx704-project-04).


## Example Code

You may find it helpful to refer to these GitHub repositories of Jupyter notebooks for example code.

* https://github.com/bu-cds-omds/dx601-examples
* https://github.com/bu-cds-omds/dx602-examples
* https://github.com/bu-cds-omds/dx603-examples
* https://github.com/bu-cds-omds/dx704-examples

Any calculations demonstrated in code examples or videos may be found in these notebooks, and you are allowed to copy this example code in your homework answers.

## Part 1: Collect Rating Data

The file "recipes.tsv" in this repository has information about 100 recipes.
Make a new file "ratings.tsv" with two columns, recipe_slug (from recipes.tsv) and rating.
Populate the rating column with values between 0 and 1 where 0 is the worst and 1 is the best.
You can assign these ratings however you want within that range, but try to make it reflect a consistent set of preferences.
These could be your preferences, or a persona of your choosing (e.g. chocolate lover, bacon-obsessed, or sweet tooth).
Make sure that there are at least 10 ratings of zero and at least 10 ratings of one.


Hint: You may find it more convenient to assign raw ratings from 1 to 5 and then remap them as follows.

`ratings["rating"] = (ratings["rating_raw"] - 1) * 0.25`
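As a quick check of the remapping in the hint, the sketch below applies it to five made-up rows. The slugs and raw values are illustrative assumptions, not rows from "recipes.tsv".

```python
import pandas as pd

# Hypothetical raw ratings on a 1-5 scale for five made-up recipe slugs.
ratings = pd.DataFrame({
    "recipe_slug": ["a", "b", "c", "d", "e"],
    "rating_raw": [1, 2, 3, 4, 5],
})

# Linear remap: 1 -> 0.0, 2 -> 0.25, 3 -> 0.5, 4 -> 0.75, 5 -> 1.0
ratings["rating"] = (ratings["rating_raw"] - 1) * 0.25

remapped = ratings["rating"].tolist()
print(remapped)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Only raw ratings of 1 and 5 land on the endpoints, so the "at least 10 zeros and 10 ones" requirement translates to at least 10 raw ratings of each extreme.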

In [3]:
%pip install pandas

import pandas as pd

# Load datasets
recipes = pd.read_csv("recipes.tsv", sep="\t")
tags = pd.read_csv("recipe-tags.tsv", sep="\t")

# Merge tags into a single text field per recipe
tags_agg = tags.groupby("recipe_slug")["recipe_tag"].apply(lambda x: " ".join(x.astype(str))).reset_index()
df = recipes.merge(tags_agg, on="recipe_slug", how="left")

# Combine text fields for keyword scoring
df["text"] = (
    df["recipe_title"].fillna("") + " " +
    df["recipe_introduction"].fillna("") + " " +
    df["recipe_tag"].fillna("")
).str.lower()

# Sweet-tooth persona keywords
loves = [
    "chocolate", "brownie", "cookie", "cake", "cupcake", "muffin",
    "pancake", "waffle", "pie", "tart", "cinnamon", "vanilla",
    "strawberry", "blueberry", "raspberry", "banana", "apple",
    "honey", "caramel", "ice cream", "dessert", "sweet"
]
hates = [
    "anchovy", "sardine", "liver", "brussels", "kale", "beet",
    "spicy", "jalape", "chili", "hot sauce", "curry",
    "tofu", "mushroom", "broccoli"
]

# Score recipes
score = pd.Series(0, index=df.index)

for kw in loves:
    score += df["text"].str.contains(kw).astype(int) * 2

for kw in hates:
    score -= df["text"].str.contains(kw).astype(int) * 2

# Convert score to raw ratings 1–5
rating_raw = pd.cut(
    score,
    bins=[-999, -2, 0, 2, 5, 999],
    labels=[1, 2, 3, 4, 5]
).astype(int)

ratings = pd.DataFrame({
    "recipe_slug": df["recipe_slug"],
    "rating_raw": rating_raw
})

# Force at least 10 zeros (raw=1) and 10 ones (raw=5).
# Raw ratings are monotone in score, so relabeling the 10 lowest (highest)
# scores covers the recipes already rated 1 (5) and tops the count up to 10.
# Relabeling only the lowest `10 - count` scores would re-pick existing 1s.
if (ratings["rating_raw"] == 1).sum() < 10:
    idx = score.sort_values().head(10).index
    ratings.loc[idx, "rating_raw"] = 1

if (ratings["rating_raw"] == 5).sum() < 10:
    idx = score.sort_values(ascending=False).head(10).index
    ratings.loc[idx, "rating_raw"] = 5

# Map raw ratings to [0, 1]
ratings["rating"] = (ratings["rating_raw"] - 1) * 0.25

# Sanity checks
assert ratings["rating"].between(0, 1).all()
assert (ratings["rating"] == 0).sum() >= 10
assert (ratings["rating"] == 1).sum() >= 10

# Save final file
ratings[["recipe_slug", "rating"]].to_csv("ratings.tsv", sep="\t", index=False)

print("Saved ratings.tsv")
print("Zeros:", (ratings["rating"] == 0).sum())
print("Ones:", (ratings["rating"] == 1).sum())



Submit "ratings.tsv" in Gradescope.

## Part 2: Construct Model Input

Use your file "ratings.tsv" combined with "recipe-tags.tsv" to create a new file "features.tsv" with a column recipe_slug, a column bias which is hard-coded to one, and a column for each tag that appears in "recipe-tags.tsv".
The tag column in this file should be a 0-1 encoding of the recipe tags for each recipe.
[Pandas reshaping functions and methods](https://pandas.pydata.org/docs/user_guide/reshaping.html) may be helpful.

The bias column will make later LinUCB calculations easier since it will just be another dimension. 

Hint: For later modeling steps, it will be important to have the feature data (inputs) and the rating data (target outputs) in the same order.
It is highly recommended to make sure that "features.tsv" and "ratings.tsv" have the recipe slugs in the same order.
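The encoding and ordering described above can be sketched on toy data as follows. This is one possible approach using `pd.crosstab`, not the required one; the frames `toy_ratings` and `toy_tags` and their rows are made up for illustration.

```python
import pandas as pd

# Toy stand-ins for ratings.tsv and recipe-tags.tsv (made-up rows).
toy_ratings = pd.DataFrame({
    "recipe_slug": ["pie", "soup", "toast"],
    "rating": [1.0, 0.0, 0.5],
})
toy_tags = pd.DataFrame({
    "recipe_slug": ["soup", "pie", "pie", "soup"],
    "recipe_tag": ["savory", "dessert", "apple", "dinner"],
})

# Count (slug, tag) pairs, then clip so a repeated pair still encodes as 1.
onehot = pd.crosstab(toy_tags["recipe_slug"], toy_tags["recipe_tag"]).clip(upper=1)

# Reorder rows to match the ratings; recipes with no tags get all zeros.
onehot = onehot.reindex(toy_ratings["recipe_slug"].tolist(), fill_value=0)
onehot.index.name = "recipe_slug"
onehot.columns.name = None

toy_features = onehot.reset_index()
toy_features.insert(1, "bias", 1)  # constant feature acting as the intercept
print(toy_features)
```

Reindexing against the ratings order puts row `i` of the feature matrix and row `i` of the ratings on the same recipe, which is what the later modeling steps rely on.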

In [4]:
# YOUR CHANGES HERE

import pandas as pd

# Load files
ratings = pd.read_csv("ratings.tsv", sep="\t")
tags = pd.read_csv("recipe-tags.tsv", sep="\t")

# One-hot encode tags
tag_matrix = (
    tags
    .assign(value=1)
    .pivot_table(
        index="recipe_slug",
        columns="recipe_tag",
        values="value",
        fill_value=0
    )
    .reset_index()
)

# Merge with ratings to guarantee same order
features = ratings[["recipe_slug"]].merge(tag_matrix, on="recipe_slug", how="left")

# Fill any missing tags with 0 (in case a recipe had no tags)
tag_cols = features.columns.drop("recipe_slug")
features[tag_cols] = features[tag_cols].fillna(0).astype(int)

# Add bias column = 1
features.insert(1, "bias", 1)

# Save features.tsv
features.to_csv("features.tsv", sep="\t", index=False)

print("Saved features.tsv")
print("Shape:", features.shape)
print(features.head())


Saved features.tsv
Shape: (100, 298)
        recipe_slug  bias  alfredo  almond  american  appetizer  appetizers  \
0           falafel     1        0       0         0          1           0   
1        spamburger     1        0       0         0          0           0   
2  bacon-fried-rice     1        0       0         0          0           0   
3   chicken-fingers     1        0       0         0          1           0   
4       apple-crisp     1        0       0         0          0           0   

   apple  asiancuisine  asparagus  ...  udonnoodles  vanilla  vanillaicecream  \
0      0             0          0  ...            0        0                0   
1      0             0          0  ...            0        0                0   
2      0             0          0  ...            0        0                0   
3      0             0          0  ...            0        0                0   
4      1             0          0  ...            0        0                0   

 



Submit "features.tsv" in Gradescope.

## Part 3: Linear Preference Model

Use your feature and rating files to build a ridge regression model with ridge regression's regularization parameter $\alpha$ set to 1.


Hint: If you are using scikit-learn modeling classes, you should use `fit_intercept=False` since that intercept value will be redundant with the bias coefficient.
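To see why the bias column can stand in for the intercept, here is a minimal NumPy sketch of closed-form ridge regression without a separate intercept, $\hat\theta = (X^\top X + \alpha I)^{-1} X^\top y$. The data is randomly generated and all `*_toy` names are illustrative assumptions. The result should agree with scikit-learn's `Ridge(alpha=1.0, fit_intercept=False).coef_` up to numerical tolerance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up design matrix: a bias column of ones plus three 0/1 tag columns.
tags_toy = rng.integers(0, 2, size=(8, 3))
X_toy = np.hstack([np.ones((8, 1)), tags_toy])
y_toy = rng.uniform(0, 1, size=8)

ridge_alpha = 1.0  # ridge regularization strength

# Closed-form ridge with no separate intercept:
# theta = (X^T X + alpha * I)^{-1} X^T y
A_toy = X_toy.T @ X_toy + ridge_alpha * np.eye(X_toy.shape[1])
theta_toy = np.linalg.solve(A_toy, X_toy.T @ y_toy)

# theta_toy[0] is the bias coefficient and plays the role of the intercept.
print(theta_toy)
```

One consequence of this setup is that the bias coefficient is regularized like every other coefficient, whereas a fitted intercept in scikit-learn is not penalized.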

Hint: This one applies to the LinUCB bounds in part 5. The estimate component of the bounds should match the estimate from this ridge model, so at that point you should be able to focus on the variance component of the bounds.

In [1]:
# YOUR CHANGES HERE

%pip install scikit-learn

import pandas as pd
import numpy as np
from sklearn.linear_model import Ridge

# Load data
features = pd.read_csv("features.tsv", sep="\t")
ratings = pd.read_csv("ratings.tsv", sep="\t")

# Sanity: ensure same order
assert (features["recipe_slug"].values == ratings["recipe_slug"].values).all()

# Build X and y
X = features.drop(columns=["recipe_slug"]).values   # includes bias column
y = ratings["rating"].values

model = Ridge(alpha=1.0, fit_intercept=False)
model.fit(X, y)

# Coefficients (theta-hat)
theta_hat = model.coef_

# Predicted preferences
y_hat = model.predict(X)

print("X shape:", X.shape)
print("Theta shape:", theta_hat.shape)
print("First 5 predictions:", y_hat[:5])



Save the coefficients of this model in a file "model.tsv" with columns "recipe_tag" and "coefficient".
Do not add anything for the `intercept_` attribute of a scikit-learn model; this will be covered by the coefficient for the bias column added in part 2.

In [2]:
# YOUR CHANGES HERE

import pandas as pd

# Reload features to get column names in the correct order
features = pd.read_csv("features.tsv", sep="\t")

# Get feature names (excluding recipe_slug)
feature_names = features.drop(columns=["recipe_slug"]).columns.tolist()

# Coefficients from Part 3 model
coefs = model.coef_

# Sanity check
assert len(feature_names) == len(coefs)

# Build model.tsv
model_df = pd.DataFrame({
    "recipe_tag": feature_names,
    "coefficient": coefs
})

# Save
model_df.to_csv("model.tsv", sep="\t", index=False)

print("Saved model.tsv")
print(model_df.head())


Saved model.tsv
  recipe_tag  coefficient
0       bias     0.307752
1    alfredo    -0.007444
2     almond     0.057050
3   american     0.017860
4  appetizer     0.048371


Submit "model.tsv" in Gradescope.

## Part 4: Recipe Estimates

Use the recipe model to estimate the score of every recipe.
Save these estimates to a file "estimates.tsv" with columns recipe_slug and score_estimate.
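Because the model has no separate intercept, each estimate is just the dot product of a recipe's feature row with the coefficient vector. The sketch below shows this on made-up stand-ins for "features.tsv" and "model.tsv". The values and the `toy_*` names are assumptions for illustration only.

```python
import pandas as pd

# Made-up stand-ins for features.tsv and model.tsv.
toy_features = pd.DataFrame({
    "recipe_slug": ["pie", "soup"],
    "bias": [1, 1],
    "apple": [1, 0],
    "savory": [0, 1],
})
toy_model = pd.DataFrame({
    "recipe_tag": ["bias", "apple", "savory"],
    "coefficient": [0.5, 0.25, -0.25],
})

# Select feature columns in coefficient order so the two line up by name.
X_toy = toy_features[toy_model["recipe_tag"].tolist()].to_numpy()
theta_toy = toy_model["coefficient"].to_numpy()

toy_estimates = pd.DataFrame({
    "recipe_slug": toy_features["recipe_slug"],
    "score_estimate": X_toy @ theta_toy,  # one dot product per recipe
})
print(toy_estimates)
```

Selecting columns by name rather than by position guards against the feature file and the coefficient file drifting out of column order.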

In [3]:
# YOUR CHANGES HERE

import pandas as pd

# Load features (for slugs + X)
features = pd.read_csv("features.tsv", sep="\t")

# Build X in the same way as training
X = features.drop(columns=["recipe_slug"]).values

# Predict scores using the ridge model from Part 3
score_estimate = model.predict(X)

# Build estimates.tsv
estimates = pd.DataFrame({
    "recipe_slug": features["recipe_slug"],
    "score_estimate": score_estimate
})

# Save
estimates.to_csv("estimates.tsv", sep="\t", index=False)

print("Saved estimates.tsv")
print(estimates.head())


Saved estimates.tsv
        recipe_slug  score_estimate
0           falafel        0.273263
1        spamburger        0.267449
2  bacon-fried-rice        0.247113
3   chicken-fingers        0.469007
4       apple-crisp        0.990936


Submit "estimates.tsv" in Gradescope.

## Part 5: LinUCB Bounds

Calculate the upper bounds of LinUCB using data corresponding to trying every recipe once and receiving the rating in "ratings.tsv" as the reward.
Keep the ridge regression regularization parameter at 1, and set LinUCB's $\alpha$ parameter to 2.
Save these upper bounds to a file "bounds.tsv" with columns recipe_slug and score_bound.
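In the usual LinUCB formulation, the upper bound for a feature vector $x$ is $x^\top\hat\theta + \alpha\sqrt{x^\top A^{-1} x}$, where $A = \lambda I + X^\top X$ and $\hat\theta = A^{-1} X^\top y$. The following is a vectorized sketch on made-up data, assuming that formulation. The names `ridge_lambda` and `ucb_alpha` are illustrative, chosen to keep the two parameters apart.

```python
import numpy as np

# Made-up data: a bias column plus two tag columns, one reward per row.
X_toy = np.array([
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
])
y_toy = np.array([1.0, 0.0, 0.5])

ridge_lambda = 1.0  # ridge regularization
ucb_alpha = 2.0     # LinUCB exploration parameter

# Sufficient statistics after observing every row once.
A_toy = ridge_lambda * np.eye(X_toy.shape[1]) + X_toy.T @ X_toy
b_toy = X_toy.T @ y_toy

A_inv = np.linalg.inv(A_toy)
theta_toy = A_inv @ b_toy

estimates = X_toy @ theta_toy
# Row-wise quadratic form x_i^T A^{-1} x_i without an explicit Python loop.
widths = np.sqrt(np.einsum("ij,jk,ik->i", X_toy, A_inv, X_toy))
bounds_toy = estimates + ucb_alpha * widths
print(bounds_toy)
```

The `estimates` term is the same ridge prediction as in part 4, so only the `widths` term is new at this step.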

In [4]:
# YOUR CHANGES HERE

import pandas as pd
import numpy as np

# Load data
features = pd.read_csv("features.tsv", sep="\t")
ratings = pd.read_csv("ratings.tsv", sep="\t")

# Sanity: same order
assert (features["recipe_slug"].values == ratings["recipe_slug"].values).all()

# Build X and y
X = features.drop(columns=["recipe_slug"]).values
y = ratings["rating"].values

d = X.shape[1]
lambda_ridge = 1.0     # ridge regularization
alpha_ucb = 2.0       # LinUCB exploration parameter

# Initialize A and b (LinUCB / ridge)
A = lambda_ridge * np.eye(d)
b = np.zeros(d)

# "Try every recipe once" and update A, b
for i in range(len(X)):
    x = X[i]
    r = y[i]
    A += np.outer(x, x)
    b += r * x

# Compute A_inv and theta_hat
A_inv = np.linalg.inv(A)
theta_hat = A_inv @ b

# Compute LinUCB upper bounds for each recipe
score_bound = []
for i in range(len(X)):
    x = X[i]
    mean = x @ theta_hat
    # Confidence width sqrt(x^T A^{-1} x): a standard deviation, not a variance
    width = np.sqrt(x @ A_inv @ x)
    ucb = mean + alpha_ucb * width
    score_bound.append(ucb)

bounds = pd.DataFrame({
    "recipe_slug": features["recipe_slug"],
    "score_bound": score_bound
})

# Save
bounds.to_csv("bounds.tsv", sep="\t", index=False)

print("Saved bounds.tsv")
print(bounds.head()) 

Saved bounds.tsv
        recipe_slug  score_bound
0           falafel     2.039518
1        spamburger     2.161785
2  bacon-fried-rice     2.113269
3   chicken-fingers     2.263123
4       apple-crisp     2.767278


Submit "bounds.tsv" in Gradescope.

## Part 6: Make Online Recommendations

Implement LinUCB to make 100 recommendations starting with no data and using the same parameters as in part 5.
One recommendation should be made at a time and you can break ties arbitrarily.
After each recommendation, use the rating from part 1 as the reward to update the LinUCB data.
Record the recommendations made in a file "recommendations.tsv" with columns "recipe_slug", "score_bound", and "reward".
The rows in this file should be in the same order as the recommendations were made.

Hint: do not remove recipes after each recommendation.
Repeating recommendations is expected.
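The online loop can be sketched on a three-arm toy problem as below. This is a minimal illustration under the same parameters, not a reference solution. The arms, rewards, and `n_rounds` are made up.

```python
import numpy as np

# Made-up arms: a bias column plus tag columns; rewards are hypothetical ratings.
X_toy = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
])
y_toy = np.array([0.0, 1.0, 0.5])

ridge_lambda, ucb_alpha, n_rounds = 1.0, 2.0, 10

A_toy = ridge_lambda * np.eye(X_toy.shape[1])  # no data yet
b_toy = np.zeros(X_toy.shape[1])
history = []

for _ in range(n_rounds):
    A_inv = np.linalg.inv(A_toy)
    theta = A_inv @ b_toy
    bounds = X_toy @ theta + ucb_alpha * np.sqrt(
        np.einsum("ij,jk,ik->i", X_toy, A_inv, X_toy)
    )
    arm = int(np.argmax(bounds))  # ties go to the lowest index
    reward = y_toy[arm]
    history.append((arm, float(bounds[arm]), float(reward)))
    # Rank-one update using only the arm that was actually recommended.
    A_toy += np.outer(X_toy[arm], X_toy[arm])
    b_toy += reward * X_toy[arm]

print(history[0])
```

With no data, $\hat\theta = 0$ and $A = \lambda I$, so the first bound reduces to $\alpha \lVert x \rVert / \sqrt{\lambda}$ and the first pick is simply the arm with the most active features. That appears consistent with the first recorded bound in the output below, 7.483315 ≈ 2√14, which would correspond to a recipe with 14 nonzero features counting the bias.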

In [6]:
# YOUR CHANGES HERE

import pandas as pd
import numpy as np

# Load data
features = pd.read_csv("features.tsv", sep="\t")
ratings = pd.read_csv("ratings.tsv", sep="\t")

# Sanity: same order
assert (features["recipe_slug"].values == ratings["recipe_slug"].values).all()

X = features.drop(columns=["recipe_slug"]).values
y = ratings["rating"].values
slugs = features["recipe_slug"].values

d = X.shape[1]
lambda_ridge = 1.0
alpha_ucb = 2.0
T = 100

# Initialize LinUCB state (no data)
A = lambda_ridge * np.eye(d)
b = np.zeros(d)

rows = []

for t in range(T):
    A_inv = np.linalg.inv(A)
    theta_hat = A_inv @ b

    # Compute UCB for all recipes
    bounds = []
    for i in range(len(X)):
        x = X[i]
        mean = x @ theta_hat
        # Confidence width sqrt(x^T A^{-1} x): a standard deviation, not a variance
        width = np.sqrt(x @ A_inv @ x)
        ucb = mean + alpha_ucb * width
        bounds.append(ucb)

    bounds = np.array(bounds)

    # Pick best recipe (ties arbitrary)
    i_star = int(np.argmax(bounds))

    # Observe reward
    reward = y[i_star]
    x_star = X[i_star]

    # Record
    rows.append({
        "recipe_slug": slugs[i_star],
        "score_bound": bounds[i_star],
        "reward": reward
    })

    # Update LinUCB
    A += np.outer(x_star, x_star)
    b += reward * x_star

recommendations = pd.DataFrame(rows)
recommendations.to_csv("recommendations.tsv", sep="\t", index=False)

print("Saved recommendations.tsv")
print(recommendations.head())


Saved recommendations.tsv
        recipe_slug  score_bound  reward
0     apple-crumble     7.483315    1.00
1             ramen     7.270093    0.00
2       quesadillas     7.235617    0.00
3     ma-la-chicken     7.095599    0.00
4  pain-au-chocolat     6.936422    0.75


Submit "recommendations.tsv" in Gradescope.

## Part 7: Acknowledgments

Make a file "acknowledgments.txt" documenting any outside sources or help on this project.
If you discussed this assignment with anyone, please acknowledge them here.
If you used any libraries not mentioned in this module's content, please list them with a brief explanation of what you used them for.
If you used any generative AI tools, please add links to your transcripts below, and any other information that you feel is necessary to comply with the generative AI policy.
If no acknowledgments are appropriate, just write "none" in the file.


In [12]:
from pathlib import Path

content = """I used ChatGPT (OpenAI) as a support tool to help clarify assignment requirements, debug errors in my Python code, and sanity-check intermediate results while implementing ridge regression and LinUCB.

Libraries used beyond standard Python:
- pandas: for loading, reshaping, and saving TSV files
- numpy: for matrix operations and linear algebra
- scikit-learn: for ridge regression (Part 3)

No other outside sources were used.
"""

Path("acknowledgments.txt").write_text(content)
print("Saved acknowledgments.txt") 

Saved acknowledgments.txt


Submit "acknowledgments.txt" in Gradescope.

## Part 8: Code

Please submit a Jupyter notebook that can reproduce all your calculations and recreate the previously submitted files.


Submit "project.ipynb" in Gradescope.