How to isolate a particular hidden layer result in a CNN? #6
David Banas (Oct 5, 2017):

Referring to Lars' *MNIST.hs* example, and given this layout of a CNN:

c = reLULayer
. cArr (Diff toVector)
. (convolution (Proxy :: DP.Proxy 7) 3 reLULayer :: Component (Volume 28 28 1) (Volume 8 8 8))
. cArr (Diff fromMatrix)

what's the best way to make one of the eight 8x8 convolution results available to the reporting pipe?

I'd like to have the reporting pipe dump this "image" every so often, so I can watch the convolution kernels "tuning" themselves to certain image features.

Also, am I correct in assuming that I'll fail if I try to brute-force this by picking apart the model component via pattern matching, due to the existentially hidden shape of the model parameter set?
Lars Brünjes (Oct 15, 2017):

Hi David,

my first intuition for things like this would be to make the 8x8 convolution you're interested in part of the output of the model (just putting it identically next to the "real" output and ignoring it during error computation). You might have to add support for this (I'm not sure how easy it would be), but it's probably nicer than pattern matching on the internals.

Greetings -
Lars
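A minimal sketch of this idea in plain Haskell (the names and types below are illustrative stand-ins, not the neural library's actual Component API; the library would need a fanout-style combinator to express the same thing):

-- Sketch only: duplicate the hidden activation and carry it next to
-- the "real" output, so a reporting pipe can see it.
type Image  = [[Double]]  -- stand-in for Volume 28 28 1
type Hidden = [[Double]]  -- stand-in for one 8x8 feature map
type Scores = [Double]    -- stand-in for the classifier output

convStage :: Image -> Hidden
convStage = id            -- placeholder for the convolution layer

headStage :: Hidden -> Scores
headStage = concat        -- placeholder for the remaining layers

-- Fan the hidden activation out alongside the real output:
model :: Image -> (Scores, Hidden)
model img = let h = convStage img in (headStage h, h)

-- Error computation looks only at the first component, so the extra
-- output never influences training:
squaredError :: Scores -> (Scores, Hidden) -> Double
squaredError target (out, _hidden) =
  sum [ (t - o) ^ (2 :: Int) | (t, o) <- zip target out ] / 2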
David Banas (Oct 19, 2017):

Thanks, Lars! Looking into this now.

I'm wondering how you handle padding. Do you infer what's needed from the types? For instance, given my current CNN definition:

c = reLULayer
. cArr (Diff toVector)
. (maxPool (Proxy :: DP.Proxy 4) 4 :: Component (Volume 4 4 32) (Volume 1 1 32))
. (convolution (Proxy :: DP.Proxy 3) 1 reLULayer :: Component (Volume 6 6 16) (Volume 4 4 32))
. (maxPool (Proxy :: DP.Proxy 4) 4 :: Component (Volume 24 24 16) (Volume 6 6 16))
. (convolution (Proxy :: DP.Proxy 5) 1 reLULayer :: Component (Volume 28 28 1) (Volume 24 24 16))
. cArr (Diff fromMatrix)

Does that first convolution use VALID padding?

Thanks,
-db
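For reference, the Volume annotations above are at least consistent with VALID (no) padding at stride 1, where a convolution's output edge is the input edge minus the kernel edge plus one:

-- Output edge for a stride-1 convolution with no (VALID) padding:
validOut :: Int -> Int -> Int
validOut inEdge kernelEdge = inEdge - kernelEdge + 1

-- Checks against the Volume annotations above:
--   validOut 28 5 == 24   (Volume 28 28 1 -> Volume 24 24 16)
--   validOut  6 3 ==  4   (Volume 6 6 16  -> Volume 4 4 32)
-- With SAME padding the edges would stay 28 and 6 instead.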
David Banas:

Hi Lars,

I've got the code wired up (I think) to dump one of the 5x5 convolution kernels every 5 iterations. It doesn't appear to be changing, which is consistent with the batch error being reported: it wanders around a little but doesn't really decrease steadily. My network doesn't appear to be learning, and I'm trying to figure out why.

I'm wondering if representing the maxPool operation as a differentiable function might be breaking back-propagation. Any thoughts on that?

I've attached a few examples of the dumped kernel from my last run, covering the first 50 iterations. Also, I've pushed all my code to my fork of your neural repository on GitHub:

https://github.com/capn-freako/neural

I was hoping you might have a moment to peruse the new code I've added and see if anything jumps out at you as obviously incorrect.

Thanks!
-db
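For context, max-pooling is usually compatible with back-propagation: it is piecewise linear, and the standard backward pass routes the incoming gradient entirely to the position that attained the maximum in each window. A minimal, library-independent sketch (not code from this repository):

import Data.List (maximumBy)
import Data.Ord (comparing)

-- Forward pass over one (flattened) pooling window: the max value
-- and the index that attained it.
maxPoolFwd :: [Double] -> (Double, Int)
maxPoolFwd window = maximumBy (comparing fst) (zip window [0 ..])

-- Backward pass: the incoming gradient flows only to the argmax
-- position; every other position gets zero.
maxPoolBwd :: Int -> Int -> Double -> [Double]
maxPoolBwd size argmax grad =
  [ if i == argmax then grad else 0 | i <- [0 .. size - 1] ]

-- Example: maxPoolFwd [0.1, 0.9, 0.3, 0.2] == (0.9, 1)
--          maxPoolBwd 4 1 g               == [0, g, 0, 0]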
David Banas:

Hi Lars,

As a test of my idea that the maxPool layers are the culprits, I simplified my model structure:

c = reLULayer
. cArr (Diff toVector)
. (convolution (Proxy :: DP.Proxy 7) 3 reLULayer :: Component (Volume 28 28 1) (Volume 8 8 4))
. cArr (Diff fromMatrix)

However, my kernel still doesn't appear to be changing. I let this run go for 750 iterations and have attached 10 images (every 75th iteration). Note that the images are 8x7 this time; I'd forgotten about the bias column last time.

I'm wondering if you have any suggestions for what to try next.

Thanks,
-db
David Banas (Oct 22, 2017):

Hi Lars,

Great news! I was able to get this simplified version of the CNN model to achieve 90% accuracy by changing the initial learning rate to 0.1. I've pushed the code update and will generate a pull request to you now.

When you get a chance, I'd like to chat with you about how to push the more complex version of the model forward.

Thanks!
-db
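One plausible reading, assuming plain gradient descent (w ← w − η·∇E(w)) and that the previous initial rate was much smaller than 0.1: too small a step size leaves the kernels visually frozen while the batch error merely wanders, which would match the behavior reported earlier in this thread.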
Lars Brünjes (Oct 22, 2017):

Hi David,

great! I'll have a look as soon as I can!

Greetings -
Lars