How to isolate a particular hidden layer result in a CNN? #6
David Banas (Oct 5, 2017):

Referring to Lars' *MNIST.hs* example, and given this layout of a CNN:

c = reLULayer
. cArr (Diff toVector)
. (convolution (Proxy :: DP.Proxy 7) 3 reLULayer :: Component (Volume 28 28 1) (Volume 8 8 8))
. cArr (Diff fromMatrix)

what's the best way to make one of the eight 8x8 convolution results available to the reporting pipe?

I'd like to have the reporting pipe dump this "image" every so often, so I can watch the convolution kernels "tuning" themselves to certain image features.

Also, am I correct in assuming that I'll fail if I try to brute-force this by picking apart the model component via pattern matching, due to the existentially hidden shape of the model parameter set?
Lars Brünjes (Oct 15, 2017):

Hi David,

my first intuition for things like this would be to make the 8x8 convolution you're interested in part of the output of the model (just putting it identically next to the "real" output and ignoring it during error computation). You might have to add support for this (I'm not sure how easy it would be), but it's probably nicer than pattern matching on the internals.

Greetings -
Lars
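A minimal sketch of this idea in plain Haskell (the names and types below are illustrative stand-ins, not the neural library's actual Component API; the library would need a fanout-style combinator to express the same thing):

-- Sketch only: duplicate the hidden activation and carry it next to
-- the "real" output, so a reporting pipe can see it.
type Image  = [[Double]]  -- stand-in for Volume 28 28 1
type Hidden = [[Double]]  -- stand-in for one 8x8 feature map
type Scores = [Double]    -- stand-in for the classifier output

convStage :: Image -> Hidden
convStage = id            -- placeholder for the convolution layer

headStage :: Hidden -> Scores
headStage = concat        -- placeholder for the remaining layers

-- Fan the hidden activation out alongside the real output:
model :: Image -> (Scores, Hidden)
model img = let h = convStage img in (headStage h, h)

-- Error computation looks only at the first component, so the extra
-- output never influences training:
squaredError :: Scores -> (Scores, Hidden) -> Double
squaredError target (out, _hidden) =
  sum [ (t - o) ^ (2 :: Int) | (t, o) <- zip target out ] / 2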
David Banas (Oct 19, 2017):

Thanks, Lars! Looking into this now.

I'm wondering how you handle padding. Do you infer what's needed from the types? For instance, given my current CNN definition:

c = reLULayer
. cArr (Diff toVector)
. (maxPool (Proxy :: DP.Proxy 4) 4 :: Component (Volume 4 4 32) (Volume 1 1 32))
. (convolution (Proxy :: DP.Proxy 3) 1 reLULayer :: Component (Volume 6 6 16) (Volume 4 4 32))
. (maxPool (Proxy :: DP.Proxy 4) 4 :: Component (Volume 24 24 16) (Volume 6 6 16))
. (convolution (Proxy :: DP.Proxy 5) 1 reLULayer :: Component (Volume 28 28 1) (Volume 24 24 16))
. cArr (Diff fromMatrix)

Does that first convolution use VALID padding?

Thanks,
-db
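For reference, the Volume annotations above are at least consistent with VALID (no) padding at stride 1, where a convolution's output edge is the input edge minus the kernel edge plus one:

-- Output edge for a stride-1 convolution with no (VALID) padding:
validOut :: Int -> Int -> Int
validOut inEdge kernelEdge = inEdge - kernelEdge + 1

-- Checks against the Volume annotations above:
--   validOut 28 5 == 24   (Volume 28 28 1 -> Volume 24 24 16)
--   validOut  6 3 ==  4   (Volume 6 6 16  -> Volume 4 4 32)
-- With SAME padding the edges would stay 28 and 6 instead.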
David Banas:

Hi Lars,

I've got the code wired up (I think) to dump one of the 5x5 convolution kernels every 5 iterations. It doesn't appear to be changing, which is consistent with the batch error being reported: it wanders around a little but doesn't really decrease steadily. My network doesn't appear to be learning, and I'm trying to figure out why.

I'm wondering if representing the maxPool operation as a differentiable function might be breaking back-propagation. Any thoughts on that?

I've attached a few examples of the dumped kernel from my last run, covering the first 50 iterations. Also, I've pushed all my code to my fork of your neural repository on GitHub:

https://github.com/capn-freako/neural

I was hoping you might have a moment to peruse the new code I've added and see if anything jumps out at you as obviously incorrect.

Thanks!
-db
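For context, max-pooling is usually compatible with back-propagation: it is piecewise linear, and the standard backward pass routes the incoming gradient entirely to the position that attained the maximum in each window. A minimal, library-independent sketch (not code from this repository):

import Data.List (maximumBy)
import Data.Ord (comparing)

-- Forward pass over one (flattened) pooling window: the max value
-- and the index that attained it.
maxPoolFwd :: [Double] -> (Double, Int)
maxPoolFwd window = maximumBy (comparing fst) (zip window [0 ..])

-- Backward pass: the incoming gradient flows only to the argmax
-- position; every other position gets zero.
maxPoolBwd :: Int -> Int -> Double -> [Double]
maxPoolBwd size argmax grad =
  [ if i == argmax then grad else 0 | i <- [0 .. size - 1] ]

-- Example: maxPoolFwd [0.1, 0.9, 0.3, 0.2] == (0.9, 1)
--          maxPoolBwd 4 1 g               == [0, g, 0, 0]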
David Banas:

Hi Lars,

As a test of my idea that the maxPool layers are the culprits, I simplified my model structure:

c = reLULayer
. cArr (Diff toVector)
. (convolution (Proxy :: DP.Proxy 7) 3 reLULayer :: Component (Volume 28 28 1) (Volume 8 8 4))
. cArr (Diff fromMatrix)

However, my kernel still doesn't appear to be changing. I let this run go for 750 iterations and have attached 10 images (every 75th iteration). Note that the images are 8x7 this time; I'd forgotten about the bias column last time.

I'm wondering if you have any suggestions for what to try next.

Thanks,
-db
David Banas (Oct 22, 2017):

Hi Lars,

Great news! I was able to get this simplified version of the CNN model to achieve 90% accuracy by changing the initial learning rate to 0.1. I've pushed the code update and will generate a pull request to you now.

When you get a chance, I'd like to chat with you about how to push the more complex version of the model forward.

Thanks!
-db
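One plausible reading, assuming plain gradient descent (w ← w − η·∇E(w)) and that the previous initial rate was much smaller than 0.1: too small a step size leaves the kernels visually frozen while the batch error merely wanders, which would match the behavior reported earlier in this thread.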
Lars Brünjes (Oct 22, 2017):

Hi David,

great! I'll have a look as soon as I can!

Greetings -
Lars