Chapter 6 - Array in array vs use vector #58

Open
oleg-tgn opened this issue Nov 10, 2021 · 1 comment

oleg-tgn commented Nov 10, 2021

Hello, Andrew!

Thank you very much for your book! It is helping me a lot as I change professions.

I am having difficulty understanding the code in Chapter 6, in the "Putting it all together" section. The hardest parts for me were the arrays nested inside arrays and the many matrix transposition operations.

I have reproduced the code myself several times and tried using plain 1-D vectors instead. With vectors, I needed the transpose operation only once, for the final update of weights_1.

I provide the code below. I got the same result, and the code seems easier to read to me. Could you please tell me whether I am on the right track?
Or have I misunderstood something important, and is this bad code style?

I used constant values for the weights so that I could better understand how the algorithm works, and so that I could copy the same weights into the original algorithm and compare the results.

Sorry for my English.

Thank you in advance.

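import numpy as np

# The helpers and the learning rate are not shown in the original post.
# They are assumed here to match the book's Chapter 6 listing.
def relu(x):
    return (x > 0) * x  # x where x > 0, otherwise 0

def relu2deriv(output):
    return output > 0  # True (1) where output > 0, otherwise False (0)

alpha = 0.2  # assumed learning rate

# Fixed initial weights so the run can be compared with the original algorithm.
weights_1 = np.array([[-0.16595599,  0.44064899, -0.99977125, -0.39533485],
                      [-0.70648822, -0.81532281, -0.62747958, -0.30887855],
                      [-0.20646505,  0.07763347, -0.16161097,  0.370439  ]])

weights_2 = np.array([-0.5910955, 0.75623487, -0.94522481, 0.34093502])

street_lights = np.array([[1, 0, 1],
                          [0, 1, 1],
                          [0, 0, 1],
                          [1, 1, 1]])

walk_vs_stop = np.array([1, 1, 0, 0])

for iteration in range(60):
    sum_error = 0
    for i in range(len(street_lights)):
        layer_0 = street_lights[i]                   # input, shape (3,)
        layer_1 = relu(np.dot(layer_0, weights_1))   # hidden layer, shape (4,)
        layer_2 = np.dot(layer_1, weights_2)         # prediction, a scalar

        sum_error += (layer_2 - walk_vs_stop[i]) ** 2

        # Both deltas are computed before either weight array is updated.
        delta_2 = layer_2 - walk_vs_stop[i]
        delta_1 = np.dot(delta_2, weights_2) * relu2deriv(layer_1)

        weights_2 -= alpha * layer_1.dot(delta_2)
        # Column (3, 1) times row (1, 4) gives the (3, 4) update for weights_1.
        weights_1 -= alpha * np.dot(np.array([layer_0]).T, np.array([delta_1]))

    print(sum_error)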
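
One way to check the "same result" claim is to train both variants from the same starting weights and compare the final weights. The sketch below is not the book's listing: `train_vectors` and `train_matrices` are hypothetical names, `relu`, `relu2deriv`, and `alpha = 0.2` are assumed because the post does not show them, and the remaining transpose in the vector version is written as `np.outer`, which computes the same (3, 4) product.

```python
import numpy as np

def relu(x):
    return (x > 0) * x          # assumed helper, not shown in the post

def relu2deriv(output):
    return output > 0           # assumed helper, not shown in the post

alpha = 0.2                     # assumed learning rate

W1_INIT = np.array([[-0.16595599,  0.44064899, -0.99977125, -0.39533485],
                    [-0.70648822, -0.81532281, -0.62747958, -0.30887855],
                    [-0.20646505,  0.07763347, -0.16161097,  0.370439  ]])
W2_INIT = np.array([-0.5910955, 0.75623487, -0.94522481, 0.34093502])

street_lights = np.array([[1, 0, 1], [0, 1, 1], [0, 0, 1], [1, 1, 1]])
walk_vs_stop = np.array([1, 1, 0, 0])

def train_vectors(iterations=60):
    """1-D version: layers are vectors, the prediction is a scalar."""
    w1, w2 = W1_INIT.copy(), W2_INIT.copy()
    for _ in range(iterations):
        for i in range(len(street_lights)):
            layer_0 = street_lights[i]                    # shape (3,)
            layer_1 = relu(layer_0.dot(w1))               # shape (4,)
            layer_2 = layer_1.dot(w2)                     # scalar
            delta_2 = layer_2 - walk_vs_stop[i]
            delta_1 = delta_2 * w2 * relu2deriv(layer_1)  # shape (4,)
            w2 -= alpha * delta_2 * layer_1
            w1 -= alpha * np.outer(layer_0, delta_1)      # shape (3, 4)
    return w1, w2

def train_matrices(iterations=60):
    """2-D version: every layer keeps a leading batch axis of size 1."""
    w1, w2 = W1_INIT.copy(), W2_INIT.copy().reshape(4, 1)
    targets = walk_vs_stop.reshape(-1, 1)                 # shape (4, 1)
    for _ in range(iterations):
        for i in range(len(street_lights)):
            layer_0 = street_lights[i:i + 1]              # shape (1, 3)
            layer_1 = relu(layer_0.dot(w1))               # shape (1, 4)
            layer_2 = layer_1.dot(w2)                     # shape (1, 1)
            delta_2 = layer_2 - targets[i:i + 1]          # shape (1, 1)
            delta_1 = delta_2.dot(w2.T) * relu2deriv(layer_1)  # shape (1, 4)
            w2 -= alpha * layer_1.T.dot(delta_2)          # shape (4, 1)
            w1 -= alpha * layer_0.T.dot(delta_1)          # shape (3, 4)
    return w1, w2

w1_vec, w2_vec = train_vectors()
w1_mat, w2_mat = train_matrices()
print(np.allclose(w1_vec, w1_mat), np.allclose(w2_vec, w2_mat.ravel()))
```

If both comparisons print `True`, the two styles differ only in how the shapes are carried around, not in the arithmetic.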

qiulang commented Dec 4, 2022

Hi @oleg-tgn, I have exactly the same question. It seems that making weights_2 a matrix of shape (4,1) instead of an array of 4 elements introduces unnecessary complexity.

My thought is that because this network outputs a single value, we can use a vector for weights_2. Normally, though, the output of a network is not a single value but a vector, so the author uses a matrix for weights_2.
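
A small sketch of that point, under stated assumptions: the hidden activations, the second weight column, and the two-value target below are made-up illustrative values, not taken from the book. It shows that a 4-element vector and a (4, 1) matrix give the same single prediction, while a (4, 2) matrix handles two outputs with the same matrix code.

```python
import numpy as np

# Hypothetical hidden-layer activations for one example (illustrative values).
layer_1 = np.array([0.5, 0.0, 1.2, 0.3])
layer_1_row = layer_1.reshape(1, 4)          # same data with a batch axis

# Single output: 1-D weight vector versus (4, 1) column matrix.
w_vec = np.array([-0.5910955, 0.75623487, -0.94522481, 0.34093502])
w_col = w_vec.reshape(4, 1)

out_vec = layer_1.dot(w_vec)                 # scalar
out_col = layer_1_row.dot(w_col)             # shape (1, 1), same value

# Two outputs: add a hypothetical second column of weights.
w_multi = np.column_stack([w_vec, [0.1, -0.2, 0.3, -0.4]])   # shape (4, 2)
out_multi = layer_1_row.dot(w_multi)         # shape (1, 2)

# The backward pass keeps the same form as in the single-output matrix case.
target = np.array([[1.0, 0.0]])              # hypothetical two-value target
delta_2 = out_multi - target                 # shape (1, 2)
grad_w_multi = layer_1_row.T.dot(delta_2)    # shape (4, 2), matches w_multi
delta_1 = delta_2.dot(w_multi.T)             # shape (1, 4), before relu2deriv

print(np.ndim(out_vec), out_col.shape, out_multi.shape)
print(grad_w_multi.shape, delta_1.shape)
```

With the vector form there is no axis along which to add a second output, so the matrix form costs a few transposes in the single-output case but carries over unchanged when the output layer grows.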
