In class we tried to predict restaurant ratings using the dot product. One thing we found was that it predicts a lot of nonsensical ratings, such as negative numbers or 6. In this homework, we are going to fix this problem.

The problem with the dot product is that it is unbounded, so we need to bound it between 0 and 5. The most common way to do this is to use the logistic function, which maps $(-\infty, \infty)$ to the bounded interval $(0, 1)$:
$$ \theta(s) = \frac{1}{1 + e^{-s}}$$

1) Given the restaurant attributes $\vec{\rho}^{(r)}$ and the person's preferences $\vec{\pi}^{(p)}$, write down a prediction formula whose output lies in the range $(0,5)$.

Hint: use the dot product and the logistic function, then scale the result properly.
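
The hint can be sketched in a few lines. This is only an illustration: `predict`, the vectors `rho` and `base`, and the choice $a = 5$, $d = 0$ are assumptions made for the example, not values from the class data.

```python
import numpy as np

def theta(s):
    # logistic function: maps any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-s))

def predict(rho, pi, a=5.0, d=0.0):
    # scale the logistic of the dot product from (0, 1) to (d, a + d)
    return a * theta(np.dot(rho, pi)) + d

# made-up attribute and preference vectors
rho = np.array([1.0, -2.0, 0.5])
base = np.array([1.0, -1.0, 2.0])

# the raw dot products are -12, 0 and 12; the bounded predictions stay inside (0, 5)
raw = [np.dot(rho, k * base) for k in (-3.0, 0.0, 3.0)]
bounded = [predict(rho, k * base) for k in (-3.0, 0.0, 3.0)]
print(raw)
print(bounded)
```

A dot product of 0 lands exactly in the middle of the range, at 2.5.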

In [88]:
%matplotlib inline
import numpy as np
from matplotlib import pyplot as plt

In [89]:
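# from exercise 10
a = 5  # scale: maps the logistic output (0, 1) to (0, 5)
def bound(s):
    return (1./(1 + np.exp(-s))) * a
def guess(R, P):
    # R is (nfeature, nrest) and P is (nfeature, npeople),
    # so transpose R to get one row per restaurant
    return bound(np.dot(R.T, P))
def cost(R, P):
    # squared error over every entry, rated or not
    return np.sum((T - guess(R, P))**2)

def score(R, P):
    # squared error over rated entries only (H is 1 where a rating exists)
    return np.sum(H * (T - guess(R, P))**2)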










2) Write down the cost function with your prediction formula above.
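
For reference, one form consistent with the derivatives in part 3 and with `score` in the cell above is the squared error summed over rated entries only. Here $Guess_{r,p}$ is the prediction from part 1, $T_{rp}$ is the actual rating, and $h_{rp}$ is taken to be 1 when person $p$ has rated restaurant $r$ and 0 otherwise (an assumption based on how `H` is built from the ratings file):

$$ c = \sum_{r,p} h_{rp} \left( Guess_{r,p} - T_{rp} \right)^2 $$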

3) (Optional) Do this on paper by hand; it is just the chain rule. Show that if your prediction formula is
$$Guess_{r,p} = a \theta(\vec{\rho}^{(r)} \cdot \vec{\pi}^{(p)}) + d$$



then the derivatives of the cost $c$ are given by
$$
	\frac{\partial{c}}{\partial{\pi^{(p)}_i}} =
	\sum_r 2 h_{rp} \left[ a \cdot \frac{1}{1 + e^{ -\vec{\rho}^{(r)} \cdot \vec{\pi}^{(p)} } }  + d - T_{rp} \right] \cdot \frac{ a e^{ -\vec{\rho}^{(r)} \cdot \vec{\pi}^{(p)}} }{\left( 1 + e^{ -\vec{\rho}^{(r)} \cdot \vec{\pi}^{(p)}} \right)^2} \rho^{(r)}_i
$$
and
$$
	\frac{\partial{c}}{\partial{\rho^{(r)}_i}} =
	\sum_p 2 h_{rp} \left[ a \cdot \frac{1}{1 + e^{ -\vec{\rho}^{(r)} \cdot \vec{\pi}^{(p)} } }  + d - T_{rp} \right] \cdot \frac{ a e^{ -\vec{\rho}^{(r)} \cdot \vec{\pi}^{(p)}} }{\left( 1 + e^{ -\vec{\rho}^{(r)} \cdot \vec{\pi}^{(p)}} \right)^2} \pi^{(p)}_i
$$
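
One way to sanity-check a hand derivation like this is to compare it with a finite-difference estimate. The sketch below does that for the first formula on small made-up data. The helper names (`cost`, `grad_pi`), the array shapes, and the random values are all assumptions for illustration; the cost is assumed to be the squared error summed over rated entries.

```python
import numpy as np

def theta(s):
    return 1.0 / (1.0 + np.exp(-s))

def cost(R, P, T, H, a=5.0, d=0.0):
    # squared error over rated entries; columns of R and P are rho^(r) and pi^(p)
    guess = a * theta(np.dot(R.T, P)) + d
    return np.sum(H * (guess - T) ** 2)

def grad_pi(R, P, T, H, p, i, a=5.0, d=0.0):
    # analytic derivative of the cost with respect to pi^(p)_i: a sum over restaurants r
    total = 0.0
    for r in range(R.shape[1]):
        s = np.dot(R[:, r], P[:, p])
        e = np.exp(-s)
        total += 2 * H[r, p] * (a / (1 + e) + d - T[r, p]) * (a * e / (1 + e) ** 2) * R[i, r]
    return total

# made-up problem: 3 features, 4 restaurants, 5 people
rng = np.random.RandomState(0)
R = rng.randn(3, 4)
P = rng.randn(3, 5)
T = rng.uniform(0, 5, size=(4, 5))
H = (rng.rand(4, 5) > 0.3).astype(float)

# central finite difference in the single coordinate pi^(p)_i
p, i, eps = 2, 1, 1e-6
P_plus, P_minus = P.copy(), P.copy()
P_plus[i, p] += eps
P_minus[i, p] -= eps
numeric = (cost(R, P_plus, T, H) - cost(R, P_minus, T, H)) / (2 * eps)
analytic = grad_pi(R, P, T, H, p, i)
print(analytic, numeric)
```

The two printed numbers should agree to several decimal places; the same check works for the derivative with respect to $\rho^{(r)}_i$ by perturbing an entry of `R` instead.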



4) Write the above two equations in matrix form, given the matrix

$$
    S = 2 h \otimes \left[ a \cdot \frac{1}{1 + e^{ -R^T P } }  + d - T \right] \otimes \frac{ a e^{ -R^T P} }{\left( 1 + e^{ -R^T P} \right)^2}
$$
where $\otimes$ is the element-wise product, and the exponential and the division are element-wise as well (yes, there is such a thing as the exponential of a matrix, but that is not what we want).

The partial derivatives should look very simple in terms of $S$.
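
As a sketch of where this leads, assume $R$ stores the vectors $\vec{\rho}^{(r)}$ as its columns and $P$ stores the vectors $\vec{\pi}^{(p)}$ as its columns, so that $R^T P$ has one row per restaurant and one column per person. Under that assumption the sums over $r$ and $p$ become matrix products:

$$
	\frac{\partial c}{\partial P} = R S, \qquad \frac{\partial c}{\partial R} = P S^T
$$

Entry $(i, p)$ of $R S$ is $\sum_r \rho^{(r)}_i S_{rp}$, which is the first derivative from part 3, and entry $(i, r)$ of $P S^T$ is $\sum_p \pi^{(p)}_i S_{rp}$, which is the second.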

5) Write down the update rule for $R$ and $P$. Use the $a$ and $d$ you found in part 1.
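
A minimal sketch of gradient descent with such an update rule, $P \leftarrow P - \lambda R S$ and $R \leftarrow R - \lambda P S^T$, on made-up data. The names `step`, `masked_cost` and `lam`, the problem sizes, and the random ratings are illustrative assumptions, not the class data. The code uses $\theta'(s) = \theta(s)(1 - \theta(s))$, which is algebraically the same as $e^{-s} / (1 + e^{-s})^2$.

```python
import numpy as np

def theta(s):
    return 1.0 / (1.0 + np.exp(-s))

def masked_cost(R, P, T, H, a=5.0, d=0.0):
    # squared error over rated entries only
    return np.sum(H * (a * theta(np.dot(R.T, P)) + d - T) ** 2)

def step(R, P, T, H, lam=0.001, a=5.0, d=0.0):
    # one gradient-descent update; both gradients use the old R and P
    th = theta(np.dot(R.T, P))
    S = 2 * H * (a * th + d - T) * (a * th * (1 - th))
    return R - lam * np.dot(P, S.T), P - lam * np.dot(R, S)

# made-up problem: 3 features, 6 restaurants, 5 people, integer ratings 1..5
rng = np.random.RandomState(1)
nfeature, nrest, npeople = 3, 6, 5
T = rng.randint(1, 6, size=(nrest, npeople)).astype(float)
H = (rng.rand(nrest, npeople) > 0.3).astype(float)
R = rng.randn(nfeature, nrest)
P = rng.randn(nfeature, npeople)

before = masked_cost(R, P, T, H)
for _ in range(2000):
    R, P = step(R, P, T, H)
after = masked_cost(R, P, T, H)
print(before, after)
```

Because the prediction is strictly below 5, a rating of exactly 5 can only be approached, never matched, so the cost on real data will generally not reach zero.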

6) Given the rating matrix we used in class, use this new prediction function and update rule to find $R$ and $P$.

In [90]:
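# names, rnames, T and H come from read_rating(), defined in a cell further down
npeople = len(names)
nrest = len(rnames)
nfeature = 7
np.random.seed(17)
P = np.random.randn(nfeature, npeople)  # one column of preferences per person
R = np.random.randn(nfeature, nrest)    # one column of attributes per restaurant

a = 5  # scale of the logistic output
d = 0  # offset, so predictions lie in (0, 5)
def theta(s):
    return (1./(1 + np.exp(-s)))

def find_R_P(R, P):
    l = 0.001  # learning rate

    for i in range(100000):
        RP = np.dot(R.T, P)
        E = np.exp(-RP)
        # learning rate times the matrix S from part 4 (all products element-wise)
        HT = (2 * H * l) * (a * theta(RP) + d - T) * (a * E / (1 + E)**2)

        # update both from the same gradient, using the old R and P
        P, R = P - np.dot(R, HT), R - np.dot(P, HT.T)
    return R, P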



7) Use the code we had in the exercise to show the prediction table.

In [91]:
def read_rating():
    with open('rating.csv') as f:
        lines = f.readlines()
    useful_lines = lines[3:]  # one line per restaurant

    # the third line holds the people's names, starting at the third column
    names = [x.strip() for x in lines[2].split(',')[2:]]

    def is_missing(x):
        return x == '' or x == '"'

    all_ratings = []
    all_defined = []
    rnames = []
    for line in useful_lines:
        tokens = [x.strip() for x in line.split(',')]
        rname = tokens[1]
        ratings = tokens[2:]
        # 1 where a rating exists, 0 where the cell is empty
        all_defined.append([0 if is_missing(x) else 1 for x in ratings])
        # missing ratings are stored as 0 and masked out by H
        all_ratings.append([0 if is_missing(x) else float(x) for x in ratings])
        rnames.append(rname)
    T = np.array(all_ratings)  # shape (nrest, npeople)
    H = np.array(all_defined)  # shape (nrest, npeople)
    return T, H, names, rnames
T, H, names, rnames = read_rating()



In [92]:
from IPython.display import HTML

class TableCell:
    
    def __init__(self, text, tc=None, color=None):
        self.text = text
        self.tc = tc
        self.color = color
    
    def to_html(self):
        return '<td>%s</td>'%self.text

# the rating and guess matrices use a different convention from the notes, so be sure to transpose them first
def maketable(rating, has_rating, guess, restaurants, names):
    n_rests = len(restaurants)
    n_names = len(names)
    tab = np.empty((n_rests+1, n_names+1),dtype='object')
    #print tab.shape

    for irest in range(n_rests):
        tab[irest+1,0] = restaurants[irest]

    for iname in range(n_names):
        tab[0,iname+1] = names[iname]

    for irest in range(n_rests):
        for iname in range(n_names):
            if not has_rating[iname, irest]:
                tab[irest+1, iname+1] = TableCell('<span style="color:red">%3.2f</span>'%(guess[iname, irest]))
            else:
                tab[irest+1, iname+1] = TableCell('<span style="color:blue">%3.2f</span><span style="color:red">(%3.2f)</span>'%(rating[iname, irest], guess[iname, irest]))
    #now convert tab array to nice html table
    nrow, ncol = tab.shape
    t = []
    t.append('<table>')
    for irow in range(nrow):
        t.append('<tr>')
        for icol in range(ncol):
            cell = tab[irow,icol]
            if cell is not None:
                if isinstance(cell,TableCell):
                    t.append(tab[irow, icol].to_html())
                else:
                    t.append('<td>')
                    t.append(tab[irow, icol])
                    t.append('</td>')
            else:
                t.append('<td></td>')
        t.append('</tr>')  
    t.append('</table>')
    return '\n'.join(t)

In [93]:
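a = 5  # scale of the logistic output
d = 0  # offset, so predictions lie in (0, 5)
def theta(s):
    return (1./(1 + np.exp(-s)))

def find_R_P(R, P):
    l = 0.001  # learning rate

    for i in range(100000):
        RP = np.dot(R.T, P)
        E = np.exp(-RP)
        HT = (2 * H * l) * (a * theta(RP) + d - T) * (a * E / (1 + E)**2)

        # update both from the same gradient, using the old R and P
        P, R = P - np.dot(R, HT), R - np.dot(P, HT.T)
    return R, P

# random starting point: one column per person in P, one per restaurant in R
nfeature = 7
np.random.seed(17)
P = np.random.randn(nfeature, len(names))
R = np.random.randn(nfeature, len(rnames))
R, P = find_R_P(R, P)

G = a * theta(np.dot(R.T, P)) + d  # predicted ratings, shape (nrest, npeople)

HTML(maketable(T.T, H.T, G.T, rnames, names))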


