# MMD 2024, Problem Sheet 2: Latent Factor Models

Group: Daniela Fichiu, Aaron Maekel, Manuel Senger

## Exercise 1

**Task:** Consider a web shop that sells furniture and uses a recommendation system. When a
new user creates an account and likes one product, he will be presented with similar
products on his next visit.

How can a competitor - in principle - try to steal the valuable data for recommendation
from this website? Does this work better when the web shop implemented a content-based (CB)
or a collaborative filtering (CF) system? What data would the competitor be able to
infer? Would this technique have an impact on the recommendation system, i.e., would
this attack create a bias on the data? Why is this attack probably not viable in any
case?

**Solution:** By liking a product and recording the list of similar products shown on the next visit, the competitor can infer which products the shop considers similar to the liked one. Repeating this for many products would give an approximation of the shop's item-item similarities.

This works better when the shop implements a CB approach, for two reasons. First, the cold-start problem specific to CF does not arise, since a CB system can recommend from a single like. Second, user-user CF matches a specific user against all other users, so the competitor would have to create accounts with specific rating profiles to receive the recommendations intended for that kind of user.

In either case, the data the competitor can infer is which products are similar to the liked product.

The impact on the recommendation system is negligible irrespective of the approach, although a slight bias is conceivable in the CF case. In the CB setting, the item features are hand-crafted, so likes from fake accounts do not change them. In the user-user CF setting, the competitor would have to create user vectors somewhat similar to existing ones and additionally like items that those existing users would never have considered.

The attack is probably not viable because it requires creating many user accounts, each liking a single item or a very small set of items. Such activity would most likely be detected by the server hosting the web shop.

## Exercise 2

The following table shows a utility matrix with ratings on a 1-5 star scale of eight items, a through h, by three users, A, B and C.

|   | a | b | c | d | e | f | g | h |
|---|---|---|---|---|---|---|---|---|
| A | 4 | 5 |   | 5 | 1 |   | 3 | 2 |
| B |   | 3 | 4 | 3 | 1 | 2 | 1 |   |
| C | 2 |   | 1 | 3 |   | 4 | 5 | 3 |

Perform the following tasks. Submit only your results (no code, even if you have used any).

a) **Task:** Treating each blank entry in the utility matrix as 0, compute the cosine distance between each pair of users.

**Solution:** 
The cosine distance is defined as $1$ minus the cosine similarity. Therefore, we first compute the norms of the user vectors and their pairwise dot products, and then the cosine similarities.

Norms:
- $\Vert A \Vert_2 = (4^2 + 5^2 + 5^2 + 1^2 + 3^2 + 2^2)^{1/2} = \sqrt{80} \approx 8.94$
- $\Vert B \Vert_2 = (3^2 + 4^2 + 3^2 + 1^2 + 2^2 + 1^2)^{1/2} = \sqrt{40} \approx 6.32$
- $\Vert C \Vert_2 = (2^2 + 1^2 + 3^2 + 4^2 + 5^2 + 3^2)^{1/2} = \sqrt{64} = 8$

Dot products:
- $A \cdot B = (4 \cdot 0 + 5 \cdot 3 + 0 \cdot 4 + 5 \cdot 3 + 1 \cdot 1 + 0 \cdot 2 + 3 \cdot 1 + 2 \cdot 0) = 34$
- $A \cdot C = (4 \cdot 2 + 5 \cdot 0 + 0 \cdot 1 + 5 \cdot 3 + 1 \cdot 0 + 0 \cdot 4 + 3 \cdot 5 + 2 \cdot 3) = 44$
- $B \cdot C = (0 \cdot 2 + 3 \cdot 0 + 4 \cdot 1 + 3 \cdot 3 + 1 \cdot 0 + 2 \cdot 4 + 1 \cdot 5 + 0 \cdot 3) = 26$

Cosine similarities:
- $\cos_\text{sim}(A, B) = A \cdot B / (\Vert A \Vert_2 \Vert B \Vert_2) = 34 / (8.94 \cdot 6.32) \approx 0.60$
- $\cos_\text{sim}(A, C) = A \cdot C / (\Vert A \Vert_2 \Vert C \Vert_2) = 44 / (8.94 \cdot 8) \approx 0.61$
- $\cos_\text{sim}(B, C) = B \cdot C / (\Vert B \Vert_2 \Vert C \Vert_2) = 26 / (6.32 \cdot 8) \approx 0.51$

Cosine distances:
- $\cos_\text{dist}(A, B) = 1 - \cos_\text{sim}(A, B) \approx 1 - 0.60 = 0.40$
- $\cos_\text{dist}(A, C) = 1 - \cos_\text{sim}(A, C) \approx 1 - 0.61 = 0.39$
- $\cos_\text{dist}(B, C) = 1 - \cos_\text{sim}(B, C) \approx 1 - 0.51 = 0.49$
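
As a cross-check of the hand calculation, the procedure of Part a) can be sketched in a few lines of Python. This is only a minimal sketch and not part of the submitted results; the helper names `to_vector` and `cosine_distance` are our own choices for illustration, and blank entries are stored as missing dictionary keys that are filled with 0.

```python
import math
from itertools import combinations

ITEMS = "abcdefgh"

# Utility matrix from the exercise; missing keys are blank entries.
RATINGS = {
    "A": {"a": 4, "b": 5, "d": 5, "e": 1, "g": 3, "h": 2},
    "B": {"b": 3, "c": 4, "d": 3, "e": 1, "f": 2, "g": 1},
    "C": {"a": 2, "c": 1, "d": 3, "f": 4, "g": 5, "h": 3},
}


def to_vector(user_ratings):
    """Return the dense rating vector of a user, treating blanks as 0."""
    return [user_ratings.get(item, 0) for item in ITEMS]


def cosine_distance(x, y):
    """Return 1 - cos(x, y) for two equally long, non-zero vectors."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    return 1.0 - dot / (norm_x * norm_y)


# One distance per unordered pair of users: (A, B), (A, C), (B, C).
distances = {
    (u, v): cosine_distance(to_vector(RATINGS[u]), to_vector(RATINGS[v]))
    for u, v in combinations(RATINGS, 2)
}

for (u, v), dist in distances.items():
    print(f"cos_dist({u}, {v}) = {dist:.2f}")
```

Working with the unrounded norms avoids the rounding error that accumulates when the norms are rounded before dividing.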


b) **Task:** Treating ratings of 3, 4, and 5 as 1 and 1, 2, and blank as 0, compute the cosine distance between each pair of users. Compare these results to those obtained in Part a).

**Solution:** Mapping ratings of 3, 4, and 5 to 1, and ratings of 1 and 2 as well as blanks to 0, we obtain the following utility matrix:

|   | a | b | c | d | e | f | g | h |
|---|---|---|---|---|---|---|---|---|
| A | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 |
| B | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
| C | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 |

For the obtained utility matrix, we repeat the steps from Part a).

Norms:
- $\Vert A \Vert_2 = (1 + 1 + 1 + 1)^{1/2} = 2$
- $\Vert B \Vert_2 = (1 + 1 + 1)^{1/2} = \sqrt{3} \approx 1.73$
- $\Vert C \Vert_2 = (1 + 1 + 1 + 1)^{1/2} = 2$

Dot products (only items that both users map to 1 contribute):
- $A \cdot B = 2$ (items b, d)
- $A \cdot C = 2$ (items d, g)
- $B \cdot C = 1$ (item d)

Cosine similarities:
- $\cos_\text{sim}(A, B) = A \cdot B / (\Vert A \Vert_2 \Vert B \Vert_2) = 2 / (2 \cdot 1.73) \approx 0.58$
- $\cos_\text{sim}(A, C) = A \cdot C / (\Vert A \Vert_2 \Vert C \Vert_2) = 2 / (2 \cdot 2) = 0.5$
- $\cos_\text{sim}(B, C) = B \cdot C / (\Vert B \Vert_2 \Vert C \Vert_2) = 1 / (1.73 \cdot 2) \approx 0.29$

Cosine distances:
- $\cos_\text{dist}(A, B) = 1 - \cos_\text{sim}(A, B) \approx 1 - 0.58 = 0.42$
- $\cos_\text{dist}(A, C) = 1 - \cos_\text{sim}(A, C) = 1 - 0.5 = 0.5$
- $\cos_\text{dist}(B, C) = 1 - \cos_\text{sim}(B, C) \approx 1 - 0.29 = 0.71$

Compared to Part a), every distance increases, most strongly for B and C. After thresholding, only the set of items a user rated with at least 3 stars matters, so the similarity is determined by how many such items two users share: A and B share two (b, d), A and C share two (d, g), and B and C share only one (d). B and C remain the most distant pair.

c) **Task:** Normalize the matrix by subtracting from each non-blank entry the average value for its user. Then compute the cosine distance between each pair of users (blank entries are treated as 0).

**Solution:** We compute the averages only over the existing values:

Averages:
- $A = \frac{(4 + 5 + 5 + 1 + 3 + 2)}{6} \approx 3.3$
- $B = \frac{(3 + 4 + 3 + 1 + 2 + 1)}{6} \approx 2.3$
- $C = \frac{(2 + 1 + 3 + 4 + 5 + 3)}{6} = 3$

The normalized matrix is then given by


|   | a | b | c | d | e | f | g | h |
|---|---|---|---|---|---|---|---|---|
| A | 0.7 | 1.7 |   | 1.7 | -2.3 |   | -0.3 | -1.3 |
| B |   | 0.7 | 1.7 | 0.7 | -1.3 | -0.3 | -1.3 |   |
| C | -1 |   | -2 | 0 |   | 1 | 2 | 0 |


All values below are computed with the unrounded averages $10/3$, $7/3$, and $3$.

Norms:
- $\Vert A \Vert_2 \approx 3.65$
- $\Vert B \Vert_2 \approx 2.71$
- $\Vert C \Vert_2 \approx 3.16$

Dot products:
- $A \cdot B \approx 5.78$
- $A \cdot C \approx -1.33$
- $B \cdot C \approx -6.33$

Cosine similarities:
- $\cos_\text{sim}(A, B) = A \cdot B / (\Vert A \Vert_2 \Vert B \Vert_2) = 5.78 / (3.65 \cdot 2.71) \approx 0.58$
- $\cos_\text{sim}(A, C) = A \cdot C / (\Vert A \Vert_2 \Vert C \Vert_2) = -1.33 / (3.65 \cdot 3.16) \approx -0.12$
- $\cos_\text{sim}(B, C) = B \cdot C / (\Vert B \Vert_2 \Vert C \Vert_2) = -6.33 / (2.71 \cdot 3.16) \approx -0.74$

Cosine distances:
- $\cos_\text{dist}(A, B) = 1 - \cos_\text{sim}(A, B) \approx 1 - 0.58 = 0.42$
- $\cos_\text{dist}(A, C) = 1 - \cos_\text{sim}(A, C) \approx 1 + 0.12 = 1.12$
- $\cos_\text{dist}(B, C) = 1 - \cos_\text{sim}(B, C) \approx 1 + 0.74 = 1.74$
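
The normalization and the subsequent cosine computation of Part c) can be cross-checked with the following minimal Python sketch. It is not part of the submitted results; the helper names `center` and `cosine_similarity` are hypothetical, and the sparse dictionaries rely on the fact that blank entries contribute 0 to both the dot product and the norms.

```python
import math
from itertools import combinations

# Utility matrix from the exercise; missing keys are blank entries.
RATINGS = {
    "A": {"a": 4, "b": 5, "d": 5, "e": 1, "g": 3, "h": 2},
    "B": {"b": 3, "c": 4, "d": 3, "e": 1, "f": 2, "g": 1},
    "C": {"a": 2, "c": 1, "d": 3, "f": 4, "g": 5, "h": 3},
}


def center(user_ratings):
    """Subtract the user's mean (over rated items only) from each rating."""
    mean = sum(user_ratings.values()) / len(user_ratings)
    return {item: r - mean for item, r in user_ratings.items()}


def cosine_similarity(x, y):
    """Cosine similarity of two sparse vectors; blanks count as 0."""
    dot = sum(x[i] * y[i] for i in x.keys() & y.keys())
    norm_x = math.sqrt(sum(v * v for v in x.values()))
    norm_y = math.sqrt(sum(v * v for v in y.values()))
    return dot / (norm_x * norm_y)


CENTERED = {user: center(r) for user, r in RATINGS.items()}

centered_distances = {
    (u, v): 1.0 - cosine_similarity(CENTERED[u], CENTERED[v])
    for u, v in combinations(CENTERED, 2)
}

for (u, v), dist in centered_distances.items():
    print(f"cos_dist({u}, {v}) = {dist:.2f}")
```

Because the centered ratings can be negative, the similarities are no longer restricted to $[0, 1]$ and the distances can exceed 1.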

d) **Task:** Compute the Pearson correlation coefficient between each pair of users as defined in Lecture 1 (slide "From Cosine to Pearson"). Compare these results to those obtained in Part c) and state your conclusions.

**Solution:** The only difference between the Pearson coefficient and the cosine similarities computed in Part c) is that the Pearson coefficient computes the norm-like denominators only over the commonly rated items. The numerators are unchanged, since the dot products of the normalized vectors already involve only commonly rated items (blanks contribute 0). We write $\rho$ for the Pearson coefficient.

Denominators for the Pearson coefficient between:
- $A$ and $B$: $d_{AB} \approx 7.03$ (over items b, d, e, g)
- $A$ and $C$: $d_{AC} \approx 5.06$ (over items a, d, g, h)
- $B$ and $C$: $d_{BC} \approx 6.78$ (over items c, d, f, g)

Pearson coefficients:
- $\rho(A, B) = A \cdot B / d_{AB} \approx 5.78 / 7.03 \approx 0.82$
- $\rho(A, C) = A \cdot C / d_{AC} \approx -1.33 / 5.06 \approx -0.26$
- $\rho(B, C) = B \cdot C / d_{BC} \approx -6.33 / 6.78 \approx -0.93$

Corresponding distances ($1 - \rho$):
- $1 - \rho(A, B) \approx 1 - 0.82 = 0.18$
- $1 - \rho(A, C) \approx 1 + 0.26 = 1.26$
- $1 - \rho(B, C) \approx 1 + 0.93 = 1.93$
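
The Pearson computation can be cross-checked with the minimal Python sketch below. It assumes the definition used in this solution (each user's mean is taken over all items the user rated, while the sums run over the commonly rated items only); the lecture slide may state it differently, and the function name `pearson` is our own choice.

```python
import math

# Utility matrix from the exercise; missing keys are blank entries.
RATINGS = {
    "A": {"a": 4, "b": 5, "d": 5, "e": 1, "g": 3, "h": 2},
    "B": {"b": 3, "c": 4, "d": 3, "e": 1, "f": 2, "g": 1},
    "C": {"a": 2, "c": 1, "d": 3, "f": 4, "g": 5, "h": 3},
}


def pearson(x, y):
    """Pearson coefficient over co-rated items, using each user's overall mean."""
    mean_x = sum(x.values()) / len(x)
    mean_y = sum(y.values()) / len(y)
    common = sorted(x.keys() & y.keys())
    dx = [x[i] - mean_x for i in common]
    dy = [y[i] - mean_y for i in common]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    return numerator / denominator


pearson_ab = pearson(RATINGS["A"], RATINGS["B"])
pearson_ac = pearson(RATINGS["A"], RATINGS["C"])
pearson_bc = pearson(RATINGS["B"], RATINGS["C"])

print(f"rho(A, B) = {pearson_ab:.2f}")
print(f"rho(A, C) = {pearson_ac:.2f}")
print(f"rho(B, C) = {pearson_bc:.2f}")
```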

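**Comparison:** The Pearson coefficients have the same signs and the same ordering as the cosine similarities from Part c), but larger magnitudes, because the denominators only cover the commonly rated items and are therefore smaller. In both cases A and B are positively correlated, A and C are weakly negatively correlated, and B and C are strongly negatively correlated. Restricting the normalization to the common items makes these relationships more pronounced.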

## Exercise 3

Consider the function `uv_factorization_vec_no_reg` from `rec_sys/lf_algorithms.py` and the functions utilized there. Perform the following tasks:

a) **Task:** Implement the function `uv_factorization_vec_reg` (along with any necessary helper or sub-functions) which executes the SGD using a loss function with the regularization terms discussed in the lecture. Use one common regularization parameter for both matrices. Test your implementation using the settings from `rec_sys/config.py` (you might need to adjust some hyperparameters). Then, use the function `show_metrics_and_examples` to compare the convergence and the accuracy of your function against the non-regularized version of SGD from the original (i.e., repository) code.

**Solution:** See `uv_factorization_vec_reg` in `rec_sys/lf_algorithms.py`.
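
To illustrate what the regularized update does, here is a minimal pure-Python sketch. It is not the repository's `uv_factorization_vec_reg`: the function `sgd_uv_reg`, its parameters and default values, and the toy data (the utility matrix from Exercise 2) are assumptions made for illustration. It assumes the loss $\sum_{(i,j)} (r_{ij} - u_i^\top v_j)^2 + \lambda (\Vert U \Vert_F^2 + \Vert V \Vert_F^2)$ with one common $\lambda$, applies the penalty to the factors touched by each sample, and absorbs the constant factor 2 of the gradient into the learning rate.

```python
import random


def mse(ratings, U, V):
    """Mean squared error of the predictions U[i] . V[j] on the known ratings."""
    total = 0.0
    for i, j, r in ratings:
        pred = sum(u * v for u, v in zip(U[i], V[j]))
        total += (r - pred) ** 2
    return total / len(ratings)


def sgd_uv_reg(ratings, num_users, num_items, rank=2, lr=0.02, lam=0.1,
               epochs=200, seed=0):
    """Factorize sparse ratings into U (users x rank) and V (items x rank).

    ratings: list of (user_index, item_index, rating) triples.
    lam: common regularization parameter; lam=0 gives plain SGD.
    Returns U, V and the MSE before training and after each epoch.
    """
    rng = random.Random(seed)
    U = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(num_users)]
    V = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(num_items)]
    samples = list(ratings)
    history = [mse(ratings, U, V)]
    for _ in range(epochs):
        rng.shuffle(samples)
        for i, j, r in samples:
            err = r - sum(u * v for u, v in zip(U[i], V[j]))
            for k in range(rank):
                u_ik, v_jk = U[i][k], V[j][k]
                # Gradient step on the squared error plus the L2 penalty.
                U[i][k] += lr * (err * v_jk - lam * u_ik)
                V[j][k] += lr * (err * u_ik - lam * v_jk)
        history.append(mse(ratings, U, V))
    return U, V, history


# Toy data: the utility matrix from Exercise 2 (None marks a blank entry).
ROWS = [
    [4, 5, None, 5, 1, None, 3, 2],
    [None, 3, 4, 3, 1, 2, 1, None],
    [2, None, 1, 3, None, 4, 5, 3],
]
TRIPLES = [(i, j, r) for i, row in enumerate(ROWS)
           for j, r in enumerate(row) if r is not None]

U, V, history = sgd_uv_reg(TRIPLES, num_users=3, num_items=8)
print(f"MSE before training: {history[0]:.3f}, after: {history[-1]:.3f}")
```

Calling `sgd_uv_reg` with `lam=0` gives an unregularized run on the same data, which is the kind of side-by-side comparison the task asks for on the real dataset.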

b) **Task:** TODO

## Exercise 6


### a)

The rating of a professor should be split into multiple categories (gives good lectures, is a good supervisor, etc.), as these can differ a lot. Each time a user rates a lecture, a seminar, or a thesis supervision, the rating should only affect the category it fits best. For example, a bad thesis supervision says little about the quality of the lectures the professor gives.

### b)

If the artworks are the items, then each artwork should carry tags describing its essence, such as {scary, nature, goofy}. These tags can be used to find out which genres the user prefers and can serve as item features in content-based recommender systems.

A better focus than the individual artwork may be the artist. A user who likes an artwork will probably prefer other art made in the same style, which usually means art by the same artist, over artworks that share the topic but differ in style. The algorithm should therefore match users to artists rather than users to individual pictures.

### c)

There is no user-item pairing here but a user-user pairing. This changes the problem dramatically: both users have to rate each other, so the matching has to be bidirectional.

Features could be superficial attributes (hair color, weight, etc.), personal information (age, religion, etc.), and likes and dislikes (music genres, food, etc.).

Another difficulty lies in aggregating non-superficial data: it is hard to put into categories, but it may be the most important data of all.

## Exercise 7

### a)

**System G (global effects)** will probably recommend a newly published, well-rated, and quite popular movie. **System R (regional effects)** would be more aligned with the user's preferences in terms of genre, director, actors, etc., so its recommendations are more likely to match the taste of the specific user.

**System L (local effects)** would be the most personalized, as it takes the user's previous ratings and preferences into account and compares them with those of other users. Depending on the size of the user cluster, its recommendations can include more niche movies.

**System R** probably produces the most niche and exotic items, as it can infer personalized latent factors in user preferences. System L can offer some niche recommendations, and System G mostly sticks to widely popular items.

### b)

Depending on the size of the dataset, System G would probably be the most vulnerable to this attack, because it bases recommendations on global ratings and trends: if many users rate a movie with 5 stars, it will be recommended to many other users regardless of their preferences. In contrast, System L is the most robust against this manipulation. The fake accounts would mainly affect the recommendations within their own cluster of fake accounts, so their impact on the overall system is limited.

### c)

Both System G and System L fail for grey sheep, as they rely on global trends or user clusters. Grey sheep are users who do not fit into any cluster and whose preferences are not aligned with global trends. System R is the only one that can recommend items to grey sheep, as it is based on the user's own preferences rather than on global trends or user clusters, so its recommendations might match their taste from time to time. Black sheep are very hard to capture, and none of the systems can reliably recommend items to them.
