
Repeated iterations #16

Closed

naruto678 opened this issue Feb 19, 2019 · 2 comments

naruto678 commented Feb 19, 2019

```python
for k in range(batch_size):
    correct_cnt += int(np.argmax(layer_2[k:k+1]) == np.argmax(labels[batch_start+k:batch_start+k+1]))
    # the deltas and weight updates below are recomputed on every pass through k
    layer_2_delta = (labels[batch_start:batch_end] - layer_2) / batch_size
    layer_1_delta = layer_2_delta.dot(weights_1_2.T) * relu2deriv(layer_1)
    layer_1_delta *= dropout_mask
    weights_1_2 += alpha * layer_1.T.dot(layer_2_delta)
    weights_0_1 += alpha * layer_0.T.dot(layer_1_delta)
```

In the above code, why are we computing layer_1_delta and layer_2_delta again and again inside the loop? Shouldn't one computation per batch suffice? What is the purpose? This is the code from the regularization chapter for MNIST digit classification with mini-batched SGD. I changed it to the following:

```python
layer_2_delta = (labels[batch_start:batch_end] - layer_2) / batch_size
layer_1_delta = layer_2_delta.dot(weights_1_2.T) * relu2deriv(layer_1)
weights_1_2 += (batch_size - 1) * alpha * layer_1.T.dot(layer_2_delta)
weights_0_1 += (batch_size - 1) * alpha * layer_0.T.dot(layer_1_delta)
layer_1_delta *= dropout_mask
for k in range(batch_size):
    correct_cnt += int(np.argmax(layer_2[k:k+1]) == np.argmax(labels[batch_start+k:batch_start+k+1]))
```
This seems much faster and reaches the same benchmarks.
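To sanity-check why a single scaled update behaves like the repeated one, here is a minimal sketch (the shapes and alpha are made up, not the chapter's 784/100/10 network) comparing the two for the weights_1_2 update, whose delta does not change inside the k-loop:

```python
import numpy as np

np.random.seed(0)

# made-up sizes just for the check
batch_size, hidden, out = 8, 5, 3
alpha = 0.01

layer_1 = np.random.rand(batch_size, hidden)      # hidden activations for one batch
layer_2_delta = np.random.rand(batch_size, out)   # fixed inside the k-loop

# book-style: the same update applied batch_size times inside the k-loop
w_repeated = np.zeros((hidden, out))
for k in range(batch_size):
    w_repeated += alpha * layer_1.T.dot(layer_2_delta)

# single update scaled by batch_size (what hoisting it out of the loop amounts to)
w_once = batch_size * alpha * layer_1.T.dot(layer_2_delta)

print(np.allclose(w_repeated, w_once))  # True
```

The match is exact for weights_1_2 because layer_2_delta does not depend on it; for weights_0_1 it is only approximate, since weights_1_2 drifts slightly during the k-loop.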

Bering commented Feb 21, 2019

I also struggled with that part. But if you look at the batching in the next chapter (on GitHub), you'll see it is done differently. Only the correct_cnt calculation is in the loop:

```python
for k in range(batch_size):
    correct_cnt += int(np.argmax(layer_2[k:k+1]) == np.argmax(labels[batch_start+k:batch_start+k+1]))

layer_2_delta = (labels[batch_start:batch_end] - layer_2) / (batch_size * layer_2.shape[0])
layer_1_delta = layer_2_delta.dot(weights_1_2.T) * tanh2deriv(layer_1)
layer_1_delta *= dropout_mask

weights_1_2 += alpha * layer_1.T.dot(layer_2_delta)
weights_0_1 += alpha * layer_0.T.dot(layer_1_delta)
```
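For context, a minimal runnable sketch of how that batched loop fits together, with tiny random data standing in for MNIST and the tanh2deriv/softmax helpers reconstructed from their usual definitions rather than copied from the book:

```python
import numpy as np

np.random.seed(1)

def tanh2deriv(output):
    # derivative of tanh written in terms of its output
    return 1 - (output ** 2)

def softmax(x):
    temp = np.exp(x)
    return temp / np.sum(temp, axis=1, keepdims=True)

# tiny stand-in dataset (the chapter uses 28x28 MNIST images and one-hot labels)
n, pixels, hidden, classes = 128, 16, 8, 4
images = np.random.rand(n, pixels)
labels = np.eye(classes)[np.random.randint(classes, size=n)]

alpha, batch_size, iterations = 0.1, 16, 50
weights_0_1 = 0.02 * np.random.random((pixels, hidden)) - 0.01
weights_1_2 = 0.2 * np.random.random((hidden, classes)) - 0.1

for j in range(iterations):
    correct_cnt = 0
    for i in range(n // batch_size):
        batch_start, batch_end = i * batch_size, (i + 1) * batch_size

        # forward pass for the whole batch at once
        layer_0 = images[batch_start:batch_end]
        layer_1 = np.tanh(np.dot(layer_0, weights_0_1))
        dropout_mask = np.random.randint(2, size=layer_1.shape)
        layer_1 *= dropout_mask * 2
        layer_2 = softmax(np.dot(layer_1, weights_1_2))

        # only the per-example bookkeeping stays inside the k-loop
        for k in range(batch_size):
            correct_cnt += int(np.argmax(layer_2[k:k+1]) ==
                               np.argmax(labels[batch_start+k:batch_start+k+1]))

        # deltas and weight updates happen once per batch
        layer_2_delta = (labels[batch_start:batch_end] - layer_2) / (batch_size * layer_2.shape[0])
        layer_1_delta = layer_2_delta.dot(weights_1_2.T) * tanh2deriv(layer_1)
        layer_1_delta *= dropout_mask

        weights_1_2 += alpha * layer_1.T.dot(layer_2_delta)
        weights_0_1 += alpha * layer_0.T.dot(layer_1_delta)

print("final-epoch training accuracy:", correct_cnt / float(n))
```

The only per-example work left is the accuracy count; the deltas and weight updates are computed once per batch.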

naruto678 commented Feb 21, 2019

Thank you for clearing up my doubt.
