## Warm-up: NumPy
#### Before introducing PyTorch, we will first implement the network using NumPy.

NumPy provides an n-dimensional array object and many functions for manipulating these arrays. It is a generic framework for scientific computing; it knows nothing about computation graphs, deep learning, or gradients. However, we can still use NumPy to fit a two-layer network to random data by manually implementing the forward and backward passes through the network with NumPy operations.
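The gradients that have to be written by hand follow from the chain rule. As a sketch, using notation that is assumed here rather than taken from the code, write $X$ for `x`, $Y$ for `y`, $W_1$ and $W_2$ for `weight1` and `weight2`, $H$ for `dot_product`, $A$ for `dot_product_relu`, $\hat{Y}$ for `y_pred`, and $L$ for `loss`. The forward pass is then

$$
H = X W_1, \qquad A = \max(H, 0), \qquad \hat{Y} = A W_2, \qquad L = \sum_{i,j} \left(\hat{Y}_{ij} - Y_{ij}\right)^2
$$

and the backward pass works from the loss back toward the inputs:

$$
\begin{aligned}
\frac{\partial L}{\partial \hat{Y}} &= 2\,(\hat{Y} - Y) \\
\frac{\partial L}{\partial W_2} &= A^{\top} \frac{\partial L}{\partial \hat{Y}} \\
\frac{\partial L}{\partial A} &= \frac{\partial L}{\partial \hat{Y}}\, W_2^{\top} \\
\frac{\partial L}{\partial H} &= \frac{\partial L}{\partial A} \odot \mathbf{1}[H \ge 0] \\
\frac{\partial L}{\partial W_1} &= X^{\top} \frac{\partial L}{\partial H}
\end{aligned}
$$

Here $\odot$ is the elementwise product, and the indicator $\mathbf{1}[H \ge 0]$ mirrors the convention in the code, which zeroes the gradient only where `dot_product < 0`. The code that follows computes each of these quantities in the same order: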





import numpy as np

batch_size = 64
input_dimension = 1000 
hidden_dimension = 100 
output_dimension = 10

# Generate random input and output data
x = np.random.randn(batch_size, input_dimension)
y = np.random.randn(batch_size, output_dimension)

# Initialize random weights
weight1 = np.random.randn(input_dimension, hidden_dimension)
weight2 = np.random.randn(hidden_dimension, output_dimension)

learning_rate = 1e-6

for n in range(500):
    # Compute predicted y in a forward pass
    dot_product = x.dot(weight1)
    dot_product_relu = np.maximum(dot_product, 0)
    y_pred = dot_product_relu.dot(weight2)
    
    # Compute and print the sum-of-squares loss
    loss = np.square(y_pred - y).sum()
    print(n, loss)
    
    # Backprop to compute gradients of the loss with respect to the weights
    grad_y_pred = 2.0 * (y_pred - y)
    grad_weight2 = dot_product_relu.T.dot(grad_y_pred)
    grad_dot_product_relu = grad_y_pred.dot(weight2.T)
    grad_dot_product = grad_dot_product_relu.copy()
    grad_dot_product[dot_product < 0] = 0
    grad_weight1 = x.T.dot(grad_dot_product)
    
    # Update weights with a gradient descent step
    weight1 -= learning_rate * grad_weight1
    weight2 -= learning_rate * grad_weight2

0 37845034.629874274
1 34627241.03070473
2 32655704.484900925
3 27316809.63898759
4 19072673.913088143
5 11307269.320243374
6 6222184.290220011
7 3529787.1507394155
8 2217245.955557823
9 1556251.1263445052
10 1189439.6756037702
11 958768.355044817
12 797235.662821632
13 675189.2609308808
14 578564.1438420305
15 499963.82238470856
16 434840.2593742531
17 380242.74458235776
18 334074.11616686214
19 294793.59189050255
20 261123.98490504394
21 232103.3274082265
22 207005.84027004754
23 185185.95452089055
24 166107.30546100345
25 149367.16186894535
26 134624.7509342181
27 121592.4124593137
28 110036.59157502267
29 99764.26217950339
30 90610.67416706437
31 82432.42922971566
32 75108.22534878206
33 68542.51725454224
34 62647.00174122654
35 57336.81092451098
36 52539.5697387961
37 48198.82757174697
38 44263.91354191446
39 40691.86828552211
40 37445.56447657179
41 34489.827183858506
42 31796.92343559618
43 29339.31818143133
44 27092.01700907646
45 25036.334745737186
46 23153.417866915617
47 214
...
377 0.00038718655552408103
378 0.0003687235434394597
379 0.0003511404053084834
380 0.00033439595086117135
381 0.00031844305156515656
382 0.0003032537372648716
383 0.00028879183185950537
384 0.000275019587477439
385 0.0002619087967213051
386 0.0002494247892600361
387 0.00023753807615299068
388 0.00022621621054857142
389 0.0002154356553959668
390 0.0002051732860815496
391 0.00019539534266980048
392 0.0001860901174657919
393 0.00017722879827829712
394 0.0001687877912584345
395 0.00016074912138215532
396 0.0001530977942460862
397 0.0001458110365784273
398 0.0001388715274271851
399 0.00013226118151570012
400 0.00012596609053512907
401 0.00011997252517416165
402 0.0001142633510860908
403 0.00010882762627625635
404 0.00010364988226387395
405 9.872413658141584e-05
406 9.403017427004544e-05
407 8.955789155591537e-05
408 8.529855708210218e-05
409 8.124395975293687e-05
410 7.738069413085301e-05
411 7.370275329332453e-05
412 7.019951924022735e-05
413 6.686355410908152e-05
414 6.368574087467797e-05
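
Because the backward pass in the training loop is written by hand, it is worth checking it against a numerical estimate. The following is a minimal sketch of such a gradient check, not part of the original example: the helper names `forward_loss`, `analytic_grads`, and `numeric_grad` are hypothetical, the network is shrunk to a few units so the check runs quickly, and the step size `eps` is an assumed value. It compares the hand-derived gradients with central finite differences of the loss.

```python
import numpy as np

# Hypothetical helper: same forward pass and loss as the training loop.
def forward_loss(x, y, weight1, weight2):
    dot_product_relu = np.maximum(x.dot(weight1), 0)
    y_pred = dot_product_relu.dot(weight2)
    return np.square(y_pred - y).sum()

# Hypothetical helper: same hand-written backward pass as the training loop.
def analytic_grads(x, y, weight1, weight2):
    dot_product = x.dot(weight1)
    dot_product_relu = np.maximum(dot_product, 0)
    y_pred = dot_product_relu.dot(weight2)
    grad_y_pred = 2.0 * (y_pred - y)
    grad_weight2 = dot_product_relu.T.dot(grad_y_pred)
    grad_dot_product = grad_y_pred.dot(weight2.T)
    grad_dot_product[dot_product < 0] = 0
    grad_weight1 = x.T.dot(grad_dot_product)
    return grad_weight1, grad_weight2

# Hypothetical helper: central finite differences, perturbing one entry of w at a time.
# f is a zero-argument callable that recomputes the loss using the current contents of w.
def numeric_grad(f, w, eps=1e-6):
    grad = np.zeros_like(w)
    for idx in np.ndindex(*w.shape):
        original = w[idx]
        w[idx] = original + eps
        loss_plus = f()
        w[idx] = original - eps
        loss_minus = f()
        w[idx] = original  # restore the entry before moving on
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
    return grad

# Tiny problem sizes (assumed for speed; the example above uses 64, 1000, 100, 10).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5))
y = rng.standard_normal((4, 2))
w1 = rng.standard_normal((5, 3))
w2 = rng.standard_normal((3, 2))

g1, g2 = analytic_grads(x, y, w1, w2)
n1 = numeric_grad(lambda: forward_loss(x, y, w1, w2), w1)
n2 = numeric_grad(lambda: forward_loss(x, y, w1, w2), w2)

# Largest absolute disagreement between analytic and numerical gradients.
max_err1 = np.abs(g1 - n1).max()
max_err2 = np.abs(g2 - n2).max()
print(max_err1, max_err2)
```

If the two printed values are many orders of magnitude smaller than the gradient entries themselves, the manual derivation is very likely correct. A check like this can be misleading when a pre-activation sits almost exactly at zero, because the ReLU is not differentiable there.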