# DX 704 Week 4 Project

This week's project will test the learning speed of linear contextual bandits compared to unoptimized approaches.
You will start with building a preference data set for evaluation, and then implement different variations of LinUCB and visualize how fast they learn the preferences.


The full project description, a template notebook and supporting code are available on GitHub: [Project 4 Materials](https://github.com/bu-cds-dx704/dx704-project-04).


## Example Code

You may find it helpful to refer to these GitHub repositories of Jupyter notebooks for example code.

* https://github.com/bu-cds-omds/dx601-examples
* https://github.com/bu-cds-omds/dx602-examples
* https://github.com/bu-cds-omds/dx603-examples
* https://github.com/bu-cds-omds/dx704-examples

Any calculations demonstrated in code examples or videos may be found in these notebooks, and you are allowed to copy this example code in your homework answers.

## Part 1: Collect Rating Data

The file "recipes.tsv" in this repository has information about 100 recipes.
Make a new file "ratings.tsv" with two columns, recipe_slug (from recipes.tsv) and rating.
Populate the rating column with values between 0 and 1 where 0 is the worst and 1 is the best.
You can assign these ratings however you want within that range, but try to make it reflect a consistent set of preferences.
These could be your preferences, or a persona of your choosing (e.g. chocolate lover, bacon-obsessed, or sweet tooth).
Make sure that there are at least 10 ratings of zero and at least 10 ratings of one.


Hint: You may find it more convenient to assign raw ratings from 1 to 5 and then remap them as follows.

`ratings["rating"] = (ratings["rating_raw"] - 1) * 0.25`

In [1]:
import pandas as pd
import numpy as np

pd.set_option('display.max_rows', None)
df = pd.read_csv("recipes.tsv", sep="\t")

def add_feature(feature_name, multiplier=1):
    df[feature_name] = df['recipe_title'].str.lower().str.contains(feature_name.lower())
    df[feature_name] = df[feature_name].astype(int) * multiplier

# Add features to the dataframe
add_feature("chicken", multiplier=2)
add_feature("bacon", multiplier=10)
add_feature("burrito")
add_feature("salad")
add_feature("soup")
add_feature("burger", multiplier=2)
add_feature("pasta")
add_feature("taco")
add_feature("pizza", multiplier=2)
add_feature("lasagna", multiplier=2)
add_feature("souffle")

In [2]:
# Identify the ingredient columns and sum them to create a raw rating
items = df.select_dtypes(include=['int64', 'float64']).columns
df['rating_raw'] = np.sum(df[items], axis=1)

# Create a rating column that scales between 1 and 0
df['rating'] = df['rating_raw'].apply(lambda x: 1 if x >= 10 else (0 if x <= 0 else (np.random.uniform(0,1))))

# Save the dataframe to a new TSV file called ratings.tsv
df[['recipe_slug', 'rating']].to_csv("ratings.tsv", sep="\t", index=False)

Submit "ratings.tsv" in Gradescope.

## Part 2: Construct Model Input

Use your file "ratings.tsv" combined with "recipe-tags.tsv" to create a new file "features.tsv" with a column recipe_slug, a column bias which is hard-coded to one, and a column for each tag that appears in "recipe-tags.tsv".
The tag column in this file should be a 0-1 encoding of the recipe tags for each recipe.
[Pandas reshaping function methods](https://pandas.pydata.org/docs/user_guide/reshaping.html) may be helpful.

The bias column will make later LinUCB calculations easier since it will just be another dimension.

Hint: For later modeling steps, it will be important to have the feature data (inputs) and the rating data (target outputs) in the same order.
It is highly recommended to make sure that "features.tsv" and "ratings.tsv" have the recipe slugs in the same order.

In [3]:
# Build features.tsv: bias + one-hot tags, aligned to ratings order
ratings = pd.read_csv("ratings.tsv", sep="\t")
tags = pd.read_csv("recipe-tags.tsv", sep="\t")
tags = tags.pivot_table(index='recipe_slug', columns='recipe_tag', aggfunc='size', fill_value=0)

features = pd.merge(ratings, tags, on='recipe_slug', how='left')
features = features.fillna(0)
features = features.drop(columns=['rating'])
features.insert(1, 'bias', 1.0)
features.to_csv("features.tsv", sep="\t", index=False)

features.head()

Unnamed: 0,recipe_slug,bias,alfredo,almond,american,appetizer,appetizers,apple,asiancuisine,asparagus,...,udonnoodles,vanilla,vanillaicecream,vegan,vegetables,vegetarian,warm,whippedcream,winter,yeastdough
0,falafel,1.0,0,0,0,1,0,0,0,0,...,0,0,0,1,0,1,0,0,0,0
1,spamburger,1.0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,bacon-fried-rice,1.0,0,0,0,0,0,0,0,0,...,0,0,0,0,1,0,0,0,0,0
3,chicken-fingers,1.0,0,0,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,apple-crisp,1.0,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,1,0


Submit "features.tsv" in Gradescope.

## Part 3: Linear Preference Model

Use your feature and rating files to build a ridge regression model with ridge regression's regularization parameter $\alpha$ set to 1.


Hint: If you are using scikit-learn modeling classes, you should use `fit_intercept=False` since that intercept value will be redundant with the bias coefficient.

Hint: The estimate component of the bounds should match the previous estimate, so you should be able to just focus on the variance component of the bounds now.

In [4]:
# Run a ridge regression to predict ratings from features
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Split the data
X = features.drop(columns=['recipe_slug', 'bias'])
y = ratings['rating']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the model
model = Ridge(alpha=1.0, fit_intercept=False)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Calculate the mean squared error
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")


Mean Squared Error: 0.05348248630468311


Save the coefficients of this model in a file "model.tsv" with columns "recipe_tag" and "coefficient".
Do not add anything for the `intercept_` attribute of a scikit-learn model; this will be covered by the coefficient for the bias column added in part 2.

In [5]:
# Save the coefficients of the Ridge model
coef_df = pd.DataFrame({
    'feature': X.columns,
    'coefficient': model.coef_
})

coef_df.to_csv("model.tsv", sep="\t", index=False)

Submit "model.tsv" in Gradescope.

## Part 4: Recipe Estimates

Use the recipe model to estimate the score of every recipe.
Save these estimates to a file "estimates.tsv" with columns recipe_slug and score_estimate.

In [6]:
# Estimate the score of every recipe in the dataset
all_X = features.drop(columns=['recipe_slug', 'bias'])
all_y_pred = model.predict(all_X)
all_ratings = pd.DataFrame({
    'recipe_slug': features['recipe_slug'],
    'predicted_rating': all_y_pred
})
all_ratings.to_csv("estimates.tsv", sep="\t", index=False)

Submit "estimates.tsv" in Gradescope.

## Part 5: LinUCB Bounds

Calculate the upper bounds of LinUCB using data corresponding to trying every recipe once and receiving the rating in "ratings.tsv" as the reward.
Keep the ridge regression regularization parameter at 1, and set LinUCB's $\alpha$ parameter to 2.
Save these upper bounds to a file "bounds.tsv" with columns recipe_slug and score_bound.

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mtable columnalign="right center left" columnspacing="1em" rowspacing="4pt">
    <mtr>
      <mtd>
        <mi>&#x3B1;</mi>
      </mtd>
      <mtd>
        <mo>=</mo>
      </mtd>
      <mtd>
        <mn>1</mn>
        <mo>+</mo>
        <msqrt>
          <mi>ln</mi>
          <mo data-mjx-texclass="NONE">&#x2061;</mo>
          <mo stretchy="false">(</mo>
          <mn>2</mn>
          <mrow data-mjx-texclass="ORD">
            <mo>/</mo>
          </mrow>
          <mi>&#x3B4;</mi>
          <mo stretchy="false">)</mo>
          <mrow data-mjx-texclass="ORD">
            <mo>/</mo>
          </mrow>
          <mn>2</mn>
        </msqrt>
      </mtd>
    </mtr>
    <mtr>
      <mtd>
        <mrow data-mjx-texclass="ORD">
          <mi mathvariant="bold">D</mi>
        </mrow>
      </mtd>
      <mtd>
        <mo>=</mo>
      </mtd>
      <mtd>
        <mtext>matrix of rows of previous context vectors</mtext>
      </mtd>
    </mtr>
    <mtr>
      <mtd>
        <mrow data-mjx-texclass="ORD">
          <mover>
            <mi>&#x3B8;</mi>
            <mo stretchy="false">^</mo>
          </mover>
        </mrow>
      </mtd>
      <mtd>
        <mo>=</mo>
      </mtd>
      <mtd>
        <mrow>
          <mtext>estimate for&#xA0;</mtext>
          <mrow data-mjx-texclass="ORD">
            <mi>&#x3B8;</mi>
          </mrow>
          <mtext>&#xA0;using ridge regression</mtext>
        </mrow>
      </mtd>
    </mtr>
    <mtr>
      <mtd>
        <mrow data-mjx-texclass="ORD">
          <mi mathvariant="bold">z</mi>
        </mrow>
      </mtd>
      <mtd>
        <mo>=</mo>
      </mtd>
      <mtd>
        <mtext>new constant vector</mtext>
      </mtd>
    </mtr>
    <mtr>
      <mtd>
        <mtext>upper bound</mtext>
      </mtd>
      <mtd>
        <mo>=</mo>
      </mtd>
      <mtd>
        <msup>
          <mrow data-mjx-texclass="ORD">
            <mi mathvariant="bold">z</mi>
          </mrow>
          <mrow data-mjx-texclass="ORD">
            <mi data-mjx-auto-op="false" mathvariant="normal">T</mi>
          </mrow>
        </msup>
        <mrow data-mjx-texclass="ORD">
          <mover>
            <mi>&#x3B8;</mi>
            <mo stretchy="false">^</mo>
          </mover>
        </mrow>
        <mo>+</mo>
        <mi>&#x3B1;</mi>
        <msqrt>
          <msup>
            <mrow data-mjx-texclass="ORD">
              <mi mathvariant="bold">z</mi>
            </mrow>
            <mrow data-mjx-texclass="ORD">
              <mi data-mjx-auto-op="false" mathvariant="normal">T</mi>
            </mrow>
          </msup>
          <msup>
            <mrow data-mjx-texclass="INNER">
              <mo data-mjx-texclass="OPEN">(</mo>
              <msup>
                <mrow data-mjx-texclass="ORD">
                  <mi mathvariant="bold">D</mi>
                </mrow>
                <mrow data-mjx-texclass="ORD">
                  <mi data-mjx-auto-op="false" mathvariant="normal">T</mi>
                </mrow>
              </msup>
              <mrow data-mjx-texclass="ORD">
                <mi mathvariant="bold">D</mi>
              </mrow>
              <mo>+</mo>
              <msub>
                <mrow data-mjx-texclass="ORD">
                  <mi mathvariant="bold">I</mi>
                </mrow>
                <mi>d</mi>
              </msub>
              <mo data-mjx-texclass="CLOSE">)</mo>
            </mrow>
            <mrow data-mjx-texclass="ORD">
              <mo>&#x2212;</mo>
              <mn>1</mn>
            </mrow>
          </msup>
          <mrow data-mjx-texclass="ORD">
            <mi mathvariant="bold">z</mi>
          </mrow>
        </msqrt>
      </mtd>
    </mtr>
  </mtable>
</math>


In [8]:
# Calculate the upper bounds of LinUCB
alpha = 1.0
lambda_ridge = 1.0

# Use similar features and target from Ridge model
X_ucb = features.drop(columns=['recipe_slug', 'bias']).to_numpy()
y_ucb = ratings['rating'].to_numpy()

# weights for each feature
model_weights = model.coef_
mu = X_ucb @ model_weights

# Ridge covariance matrix
cov_matrix = np.linalg.inv(lambda_ridge * np.eye(X_ucb.shape[1]) + X_ucb.T @ X_ucb)
inv_cov_matrix = np.linalg.inv(cov_matrix)

# Uncertainty term and bounds calculation
std_term = np.sqrt(np.einsum('ij,jk,ik->i', X_ucb, inv_cov_matrix, X_ucb))
score_bounds = mu + alpha * std_term

bounds = pd.DataFrame({
    'recipe_slug': features['recipe_slug'],
    'score_bound': score_bounds
})

bounds.to_csv("bounds.tsv", sep="\t", index=False)

bounds.head()

Unnamed: 0,recipe_slug,score_bound
0,falafel,9.373732
1,spamburger,13.688931
2,bacon-fried-rice,16.517038
3,chicken-fingers,8.901273
4,apple-crisp,17.19157


Submit "bounds.tsv" in Gradescope.

## Part 6: Make Online Recommendations

Implement LinUCB to make 100 recommendations starting with no data and using the same parameters as in part 5.
One recommendation should be made at a time and you can break ties arbitrarily.
After each recommendation, use the rating from part 1 as the reward to update the LinUCB data.
Record the recommendations made in a file "recommendations.tsv" with columns "recipe_slug", "score_bound", and "reward".
The rows in this file should be in the same order as the recommendations were made.

In [9]:
# Implement LinUCB to make recommendations
# Contexts: same features used earlier (exclude slug and bias)
Z = features.drop(columns=['recipe_slug', 'bias']).to_numpy(dtype=float)
slugs = features['recipe_slug'].tolist()

# True rewards from Part 1 ratings
rating_map = dict(zip(ratings['recipe_slug'], ratings['rating']))

d = Z.shape[1]
A_inv = (1.0 / lambda_ridge) * np.eye(d)  # initial (lambda I)^-1
b = np.zeros(d)

def ucb_scores(Z_block, A_inv, theta, alpha):
    mean = Z_block @ theta
    var = np.einsum('ij,jk,ik->i', Z_block, A_inv, Z_block)
    return mean + alpha * np.sqrt(var)

recommendations = []
remaining = list(range(len(slugs)))

for t in range(min(100, len(slugs))):
    theta = A_inv @ b

    # UCB on remaining candidates
    Z_rem = Z[remaining]
    p = ucb_scores(Z_rem, A_inv, theta, alpha)

    # Pick best (break ties by first occurrence)
    pick_pos = int(np.argmax(p))
    i = remaining.pop(pick_pos)

    x = Z[i]
    slug = slugs[i]
    reward = float(rating_map[slug])
    score_bound = float(p[pick_pos])

    recommendations.append((slug, score_bound, reward))

    # Sherman-Morrison update: A_inv <- (A + x x^T)^-1
    Ax = A_inv @ x
    denom = 1.0 + x @ Ax
    A_inv = A_inv - np.outer(Ax, Ax) / denom

    # Update b
    b = b + reward * x

# Save recommendations
recs_df = pd.DataFrame(recommendations, columns=['recipe_slug', 'score_bound', 'reward'])
recs_df.to_csv("recommendations.tsv", sep="\t", index=False)

recs_df.head()
# ...existing code...

Unnamed: 0,recipe_slug,score_bound,reward
0,apple-crumble,3.605551,0.0
1,ma-la-chicken,3.464102,0.599422
2,quesadillas,3.49909,0.0
3,ramen,3.48894,0.0
4,bacon-fried-rice,3.337956,1.0


Submit "recommendations.tsv" in Gradescope.

## Part 7: Acknowledgments

Make a file "acknowledgments.txt" documenting any outside sources or help on this project.
If you discussed this assignment with anyone, please acknowledge them here.
If you used any libraries not mentioned in this module's content, please list them with a brief explanation what you used them for.
If you used any generative AI tools, please add links to your transcripts below, and any other information that you feel is necessary to comply with the generative AI policy.
If no acknowledgements are appropriate, just write none in the file.


Submit "acknowledgments.txt" in Gradescope.

In [10]:
with open("acknowledgments.txt", "w") as f:
    f.write("Acknowledgments:\n")
    f.write("I'd like to acknowledge my peers for their support and engagement on piazza, their conversations were helpful in debugging errors with the autograder\n")
    f.write("I'd also like to acknowledge Professor Considine for his video lectures, content on blackboard, and the example notebooks.\n")
    f.write("Finally, I'd like to acknowledge ChatGPT for helping me debug some of my code and providing suggestions on how to implement certain algorithms.\n")

## Part 8: Code

Please submit a Jupyter notebook that can reproduce all your calculations and recreate the previously submitted files.


Submit "project.ipynb" in Gradescope.