## Word Vectors and Sentiment Analysis
- Softmax function
- Simple neural network
- Backpropagation
- Word2vec models

In [3]:
import random
import numpy as np
from cs224d.data_utils import *
import matplotlib.pyplot as plt

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0,8.0)
plt.rcParams['image.interpolation']='nearest'
plt.rcParams['image.cmap']='gray'

%load_ext autoreload
%autoreload 2

## 1. Softmax

In [10]:
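def softmax(x):
    '''Softmax function
    Computes the softmax of each row.
    Input: a 2-D array-like, [[x_11,x_12...,x_1n],[x_21,...,x_2n]...]
    '''
    xarr = np.array(x)
    # Subtract each row's maximum so np.exp cannot overflow;
    # the result is unchanged because softmax is shift-invariant.
    xmax = np.max(xarr,axis=1,keepdims=True)
    e = np.exp(xarr - xmax)
    # Normalize each row so that it sums to 1
    dist = e / np.sum(e,axis=1,keepdims=True)
    return dist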

In [14]:
#test softmax
x = [[1,2],[3,4]]
print(softmax(x))
print(softmax(np.array([[1001,1002],[3,4]])))
print(softmax([[-1001,-1002]]))

[[ 0.26894142  0.73105858]
 [ 0.26894142  0.73105858]]
[[ 0.26894142  0.73105858]
 [ 0.26894142  0.73105858]]
[[ 0.73105858  0.26894142]]
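
The identical rows in the first two outputs reflect a property of softmax: adding the same constant to every entry of a row does not change the result, which is why subtracting the row maximum is safe. The sketch below checks this numerically. `softmax_rows` is a hypothetical standalone helper, written here only so the example is self-contained; it is assumed to behave like the `softmax` above.

```python
import numpy as np

def softmax_rows(x):
    """Row-wise softmax (hypothetical stand-in for the softmax defined above)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    shifted = x - np.max(x, axis=1, keepdims=True)  # largest entry of each row becomes 0
    e = np.exp(shifted)
    return e / np.sum(e, axis=1, keepdims=True)

a = np.array([[1.0, 2.0], [3.0, 4.0]])
p = softmax_rows(a)
p_shifted = softmax_rows(a + 1000.0)  # same rows, shifted by a large constant

# Each row should be a probability distribution, and the shift should not matter.
rows_sum_to_one = np.allclose(p.sum(axis=1), 1.0)
shift_invariant = np.allclose(p, p_shifted)
print(rows_sum_to_one, shift_invariant)
```
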


## 2. Neural network basics

- Sigmoid activation function and its gradient
- Forward propagation and a simple neural network (cross-entropy cost)
- Backpropagation algorithm
- Gradient / derivative check

In [15]:
def sigmoid(x):
    '''Sigmoid function'''
    return 1.0 /( 1  + np.exp(-x))

def sigmoid_grad(f):
    '''Sigmoid gradient function.
    Takes the sigmoid output f = sigmoid(x), not x itself.'''
    return f*(1-f)

In [17]:
#test sigmoid
x = np.array([[1,2],[-1,-2]])
f = sigmoid(x)
g = sigmoid_grad(f)
print(f)
print(g)

[[ 0.73105858  0.88079708]
 [ 0.26894142  0.11920292]]
[[ 0.19661193  0.10499359]
 [ 0.19661193  0.10499359]]
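
The gradient rows above are identical because the sigmoid derivative, $\sigma'(x) = \sigma(x)(1 - \sigma(x))$, takes the same value at $x$ and $-x$. As a quick sanity check, the sketch below compares this analytic form against a central-difference estimate. The step size `h` and the tolerance are assumptions chosen for illustration, and the two functions are redefined only to keep the block self-contained.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(f):
    # f is the sigmoid output, not the raw input
    return f * (1.0 - f)

x = np.array([[1.0, 2.0], [-1.0, -2.0]])
h = 1e-5  # assumed step size for the numerical estimate

# Central-difference estimate of d sigmoid / dx, element by element
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
analytic = sigmoid_grad(sigmoid(x))

max_abs_diff = np.max(np.abs(numeric - analytic))
print(max_abs_diff)
```
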


In [16]:
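def gradchec_naive(f,x):
    '''
    Gradient check for a function f: numerically verifies that the
    analytic gradient computed in code is correct.
    - f should be a function that takes a single argument and returns
      the cost and its gradient
    - x is the point (numpy array) at which to check the gradient
    '''
    rndstate = random.getstate()
    random.setstate(rndstate)
    fx,grad = f(x)
    h =1e-4

    it = np.nditer(x,flags=['multi_index'],op_flags=['readwrite'])
    while not it.finished:
        ix = it.multi_index

        # Evaluate f at x - h and x + h along this dimension, restoring the
        # random state first so that a stochastic f gives comparable values
        old_value = x[ix]
        x[ix] = old_value -h
        random.setstate(rndstate)
        fx_h1,grad_h1 = f(x)

        x[ix] = old_value + h
        random.setstate(rndstate)
        fx_h2,grad_h2 = f(x)

        numgrad = (fx_h2 - fx_h1)/(2*h)
        x[ix] = old_value

        # Compare the gradients
        reldiff = abs(numgrad - grad[ix]) / max(1,abs(numgrad),abs(grad[ix]))
        if reldiff > 1e-5:
            print("Gradient check failed.")
            print("First gradient error found at index %s"%str(ix))
            print("Analytic gradient: %f;\tNumerical gradient: %f"%(grad[ix],numgrad))
            return
        it.iternext() # Step to next dimension
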
    print("Gradient check passed!")



$$\frac{d}{d\theta} J(\theta) = \lim_{\epsilon \rightarrow 0} \frac{J(\theta + \epsilon) - J(\theta - \epsilon)}{2\epsilon}$$

So for any value of $\theta$, the right-hand side, evaluated at a small $\epsilon$, can be used to approximate the gradient.

In [18]:
# Test the gradient check
quad = lambda x: (np.sum(x**2),x*2)
gradchec_naive(quad,np.array(123.456)) # scalar test
gradchec_naive(quad,np.random.randn(3,)) # 1-D test
gradchec_naive(quad,np.random.randn(4,5)) # 2-D test

Gradient check passed!
Gradient check passed!
Gradient check passed!
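
The gradient check above uses a two-sided (central) difference rather than a one-sided one. The sketch below illustrates why this is a reasonable choice. The cubic cost `J`, the point `theta`, and the `eps` values are assumptions made for the example, not part of the assignment code.

```python
def J(theta):
    # Hypothetical cost used only for illustration; its exact derivative is 3 * theta**2
    return theta ** 3

theta = 2.0
exact = 3 * theta ** 2

central_errors = []
forward_errors = []
for eps in (1e-1, 1e-2, 1e-3):
    central = (J(theta + eps) - J(theta - eps)) / (2 * eps)  # two-sided estimate
    forward = (J(theta + eps) - J(theta)) / eps              # one-sided estimate
    central_errors.append(abs(central - exact))
    forward_errors.append(abs(forward - exact))
    print(eps, central_errors[-1], forward_errors[-1])
```

For this cubic, the central-difference error works out to $\epsilon^2$ and the forward-difference error to $6\epsilon + \epsilon^2$, so the two-sided estimate gets closer to the true derivative much faster as $\epsilon$ shrinks.
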
