# Mean Normalization
- If we normalize the movie ratings so that each movie has a consistent average value, the recommender system will perform a bit better.
- Without mean normalization, the algorithm predicts that a new user who has not yet rated anything (such as Eve) will rate every movie zero stars, which is not particularly helpful.
- Mean normalization helps the algorithm come up with better predictions of the movie ratings for a new user who has not yet rated any movies.
- ![image.png](attachment:image.png)
- The effect is that the initial guesses for the new user Eve are simply equal to the mean of the ratings that other users gave each of these 5 movies.
- ![image-2.png](attachment:image-2.png)
- One alternative is to instead normalize the columns of the matrix to have zero mean. Normalizing the columns would help if there were a brand-new movie that no one has rated yet.
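
The mean normalization step described above can be sketched in NumPy. This is a minimal illustration, not the course's implementation: the ratings matrix `Y`, the mask `R`, the helper name `mean_normalize`, and the feature matrix `X` are all hypothetical, and the sketch assumes the new user's parameters `w` and `b` end up at zero because that user has no ratings to fit.

```python
import numpy as np

# Hypothetical ratings: rows are movies, columns are users.
# The last column is a new user who has rated nothing.
Y = np.array([
    [5.0, 5.0, 0.0, 0.0, 0.0],
    [5.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 4.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 5.0, 4.0, 0.0],
    [0.0, 0.0, 5.0, 0.0, 0.0],
])
# R[i, j] = 1 if user j rated movie i, else 0.
R = np.array([
    [1, 1, 1, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
])

def mean_normalize(Y, R):
    """Subtract each movie's mean rating, computed over rated entries only."""
    counts = R.sum(axis=1, keepdims=True)
    sums = (Y * R).sum(axis=1, keepdims=True)
    # Movies with no ratings keep a mean of 0 to avoid dividing by zero.
    mu = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    Ynorm = (Y - mu) * R
    return Ynorm, mu

Ynorm, mu = mean_normalize(Y, R)

# Assumption: the new user's parameters are zero, so the prediction
# w . x + b + mu reduces to the per-movie mean.
X = np.ones((5, 2))      # placeholder movie features
w_new = np.zeros(2)
b_new = 0.0
pred_new = X @ w_new + b_new + mu[:, 0]
print(pred_new)
```

The algorithm is trained on `Ynorm` rather than `Y`, and the mean `mu` is added back when making predictions.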

# TensorFlow implementation of collaborative filtering
- One reason TensorFlow is popular is that, for many applications, implementing gradient descent requires the derivatives of the cost function, and TensorFlow can automatically figure out those derivatives for us.
- ![image.png](attachment:image.png)
- With TensorFlow's gradient tape feature, the main work we need to do is tell it how to compute the cost function J.
    ![image-2.png](attachment:image-2.png)
- This feature is called automatic differentiation, often shortened to Auto Diff and sometimes referred to as Auto Grad.
- ![image-3.png](attachment:image-3.png)
- Why couldn't we use a dense layer and then `model.compile` and `model.fit`? The reason we couldn't use that old recipe is that the collaborative filtering algorithm and its cost function don't fit neatly into the dense layer or the other standard neural network layer types in TensorFlow.

```python
import tensorflow as tf

w = tf.Variable(3.0)  # Parameter to learn
x = 1.0               # Input feature
y = 1.0               # Target value
alpha = 0.01          # Learning rate

iterations = 30
for _ in range(iterations):
    # Use TensorFlow's gradient tape to record the steps used to compute
    # the cost J, which enables automatic differentiation.
    with tf.GradientTape() as tape:
        fwb = w * x
        costJ = (fwb - y) ** 2

    # Use the gradient tape to calculate the gradient of the cost
    # with respect to the parameter w.
    [djdw] = tape.gradient(costJ, [w])

    # Run one step of gradient descent by updating the value of w
    # to reduce the cost.
    w.assign_add(-alpha * djdw)
```
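
The same gradient tape pattern can be applied to a collaborative filtering cost. The sketch below is one possible version under stated assumptions: the function name `cofi_cost`, the tiny ratings matrix, the regularization strength, and the choice of the Adam optimizer with a learning rate of 0.1 are all illustrative rather than taken from these notes.

```python
import numpy as np
import tensorflow as tf

def cofi_cost(X, W, b, Y, R, lambda_):
    """Regularized squared-error cost, counting only rated entries (R == 1)."""
    err = (tf.matmul(X, W, transpose_b=True) + b - Y) * R
    reg = tf.reduce_sum(X ** 2) + tf.reduce_sum(W ** 2)
    return 0.5 * tf.reduce_sum(err ** 2) + (lambda_ / 2.0) * reg

# Hypothetical data: 5 movies (rows) rated by 4 users (columns).
Y = tf.constant([[5, 5, 0, 0],
                 [5, 0, 0, 0],
                 [0, 4, 0, 0],
                 [0, 0, 5, 4],
                 [0, 0, 5, 0]], dtype=tf.float32)
R = tf.constant([[1, 1, 1, 1],
                 [1, 0, 0, 1],
                 [0, 1, 1, 0],
                 [1, 1, 1, 1],
                 [1, 1, 1, 0]], dtype=tf.float32)

num_movies, num_users = Y.shape
num_features = 2  # assumed number of learned features per movie

rng = np.random.default_rng(0)
X = tf.Variable(rng.normal(size=(num_movies, num_features)).astype(np.float32))
W = tf.Variable(rng.normal(size=(num_users, num_features)).astype(np.float32))
b = tf.Variable(np.zeros((1, num_users), dtype=np.float32))

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-1)

costs = []
for step in range(200):
    # Record the cost computation so TensorFlow can differentiate it.
    with tf.GradientTape() as tape:
        cost = cofi_cost(X, W, b, Y, R, lambda_=1.0)

    # Gradients of the cost with respect to all trainable parameters.
    grads = tape.gradient(cost, [X, W, b])

    # Let the optimizer apply one update step.
    optimizer.apply_gradients(zip(grads, [X, W, b]))
    costs.append(float(cost))

print(costs[0], costs[-1])
```

Only the cost function is written by hand here. The derivatives with respect to `X`, `W`, and `b` all come from the tape, which is why no dense layer or `model.fit` call is needed.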

# Finding Related Items
- It turns out the collaborative filtering algorithm that we've been talking about gives us a nice way to find related items.
- As part of the collaborative filtering we've discussed, we learned features x^(i) for every item i, that is, for every movie or other type of item being recommended to users. It is hard to interpret the individual learned features x_1, x_2, x_3, and so on, but collectively they do convey something about what that movie is like. It turns out that given the features x^(i) of item i, if we want to find other items (say, other movies) related to movie i, we can look for the item k whose features x^(k) are similar to x^(i).
- Given a feature vector x^(i), we decide whether another item k is similar to it by computing the squared distance between x^(k) and x^(i), $\lVert x^{(k)} - x^{(i)} \rVert^2 = \sum_{l=1}^{n} \left(x_l^{(k)} - x_l^{(i)}\right)^2$. The items with the smallest distance are the most closely related.
    ![image.png](attachment:image.png)
- ![image-2.png](attachment:image-2.png)
- ![image-3.png](attachment:image-3.png)
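
The squared-distance lookup described above can be sketched in NumPy. The feature matrix `X` and the helper name `most_similar_items` are hypothetical. In practice `X` would hold the features learned by collaborative filtering.

```python
import numpy as np

def most_similar_items(X, i, top_k=2):
    """Return the indices of the top_k items closest to item i,
    along with the squared distance from item i to every item."""
    sq_dist = np.sum((X - X[i]) ** 2, axis=1)
    sq_dist[i] = np.inf  # exclude the item itself from the ranking
    return np.argsort(sq_dist)[:top_k], sq_dist

# Hypothetical learned features: one row per movie, two features each.
X = np.array([
    [0.9, 0.10],
    [1.0, 0.05],
    [0.7, 0.30],
    [0.1, 1.00],
    [0.0, 0.90],
])

neighbors, sq_dist = most_similar_items(X, i=0, top_k=2)
print(neighbors)
```

For large catalogs, computing the distance to every item becomes expensive, so an approximate nearest-neighbor index is a common substitute. The brute-force version is used here for clarity.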