# CS 224D: Assignment #1

# 1 - Softmax

#### a - Prove that the softmax is invariant to constant offsets in the input, that is, for any input vector $x$ and any constant $c$, 
$softmax(x) = softmax(x + c)$
where $x + c$ means adding the constant $c$ to every dimension of $x$. Remember that: 

$softmax(x)_i = \frac{e^{x_i}}{\sum_je^{x_j}}$

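a - answer: Adding a constant $c$ multiplies both the numerator and the denominator by the same factor $e^c$, which cancels in the division:

$softmax(x + c)_i = \frac{e^{x_i + c}}{\sum_je^{x_j + c}} = \frac{e^c e^{x_i}}{e^c \sum_je^{x_j}} = \frac{e^{x_i}}{\sum_je^{x_j}} = softmax(x)_i$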

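Sums of exponentials are numerically unsafe when the arguments have a large magnitude: large positive values overflow to infinity, and very negative values underflow to zero, leaving a $0/0$ division. Hence, we use this property to normalize the arguments and set $c = -max_i x_i$, i.e. we subtract the maximum element from all elements of $x$. The largest argument is then $0$, so every exponential lies in $(0, 1]$ and the denominator is at least $1$.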
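The effect of the shift can be seen on a small example. The sketch below is only an illustration: `naive_softmax` and `shifted_softmax` are hypothetical helper names, not part of the assignment files, and the input values are chosen to force an overflow in double precision.

```python
import numpy as np

def naive_softmax(v):
    # Direct transcription of the definition, without subtracting the max.
    e = np.exp(v)
    return e / e.sum()

def shifted_softmax(v):
    # Same formula after subtracting the max, i.e. c = -max_i x_i.
    e = np.exp(v - np.max(v))
    return e / e.sum()

v = np.array([1001.0, 1002.0])

# np.exp overflows to inf here, and inf / inf evaluates to nan.
# errstate only silences the floating point warnings for this call.
with np.errstate(over="ignore", invalid="ignore"):
    naive = naive_softmax(v)

shifted = shifted_softmax(v)

print(naive)
print(shifted)
```

The naive version returns only `nan` values, while the shifted version returns the same probabilities as for the input `[1, 2]`, as the invariance property predicts.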

#### b - Given an input matrix of N rows and d columns, compute the softmax predictions for each row. Write your implementation in q1_softmax.py. You may test by executing python q1_softmax.py

In [1]:
import numpy as np
import random

In [57]:
def softmax(x):
    """
    Compute the softmax function for each row of the input x.

    It is crucial that this function is optimized for speed because
    it will be used frequently in later code.
    You might find numpy functions np.exp, np.sum, np.reshape,
    np.max, and numpy broadcasting useful for this task. (numpy
    broadcasting documentation:
    http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

    You should also make sure that your code works for one
    dimensional inputs (treat the vector as a row), you might find
    it helpful for your later problems.

    You must implement the optimization in problem 1(a) of the
    written assignment!
    """

    ### YOUR CODE HERE

    # Work on a float copy: the in-place operations below would otherwise
    # modify the caller's array.
    x = np.array(x, dtype=float)

    if x.ndim > 1:
        # 2D input: N rows and d columns (this also covers a single row
        # of shape (1, d)).
        # Subtract each row's max, as justified in problem 1(a), so that
        # the largest exponent in every row is 0.
        x -= x.max(axis=1, keepdims=True)

        # Exponentiate, then normalize each row by its sum.
        x = np.exp(x)
        x /= x.sum(axis=1, keepdims=True)
    else:
        # 1D input: treat the vector as a single row.
        x -= x.max()
        x = np.exp(x)
        x /= x.sum()

    ### END YOUR CODE

    return x

In [60]:
# Toy example
softmax(np.array([1,2]))

array([ 0.26894142,  0.73105858])
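The row-wise normalization relies on broadcasting a column of shape (N, 1) against the (N, d) input. As a small sketch with made-up values, the snippet below shows two equivalent ways of building that column, an explicit reshape and the `keepdims=True` argument, and the shapes involved.

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0],
              [10.0, 20.0, 30.0]])   # N = 2 rows, d = 3 columns

# Row-wise max as a column vector, built with an explicit reshape.
row_max_reshape = x.max(axis=1).reshape((x.shape[0], 1))

# Same column vector, obtained directly with keepdims=True.
row_max_keepdims = x.max(axis=1, keepdims=True)

print(row_max_reshape.shape)

# (2, 3) - (2, 1): the column is broadcast across the d columns of x.
shifted = x - row_max_keepdims
print(shifted)
```

Without the reshape or `keepdims=True`, the max would have shape (N,) and would be aligned with the columns of `x` instead of its rows, which is either a shape error or a silent bug when N equals d.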

Now let's test this function.

In [61]:
def test_softmax_basic():
    """
    Some simple tests to get you started. 
    Warning: these are not exhaustive.
    """
    print("Running basic tests...")
    test1 = softmax(np.array([1,2]))
    print(test1)
    assert np.amax(np.fabs(test1 - np.array(
        [0.26894142,  0.73105858]))) <= 1e-6

    test2 = softmax(np.array([[1001,1002],[3,4]]))
    print(test2)
    assert np.amax(np.fabs(test2 - np.array(
        [[0.26894142, 0.73105858], [0.26894142, 0.73105858]]))) <= 1e-6

    test3 = softmax(np.array([[-1001,-1002]]))
    print(test3)
    assert np.amax(np.fabs(test3 - np.array(
        [0.73105858, 0.26894142]))) <= 1e-6

    print("You should verify these results!\n")
test_softmax_basic()

Running basic tests...
[ 0.26894142  0.73105858]
[[ 0.26894142  0.73105858]
 [ 0.26894142  0.73105858]]
[[ 0.73105858  0.26894142]]
You should verify these results!
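Since the basic tests are not exhaustive, two further properties can be checked on random input: every row of the output should sum to 1, and adding a constant to the input should leave the output unchanged, as proved in part (a). The snippet below is a sketch under assumptions: `softmax_ref` is a hypothetical stand-in so that the block runs on its own, and the seed, shape and offset are arbitrary.

```python
import numpy as np

def softmax_ref(x):
    # Hypothetical stand-in with the same interface as the notebook's
    # softmax; swap in the notebook's function to check that one instead.
    x = np.asarray(x, dtype=float)
    shifted = x - x.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.RandomState(0)   # fixed seed for reproducibility
x = rng.randn(5, 4)              # N = 5 rows, d = 4 columns
c = 123.0                        # arbitrary constant offset

probs = softmax_ref(x)

# Each row is a probability distribution.
rows_sum_to_one = np.allclose(probs.sum(axis=1), 1.0)

# Invariance to constant offsets, from part (a).
offset_invariant = np.allclose(probs, softmax_ref(x + c))

print(rows_sum_to_one, offset_invariant)
```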

