Using "kernel" for weight matrix in Dense layer seems confusing #10010
A layer's "weights" include both the weight matrix and the bias vector.
We needed a precise way to distinguish between the matrix and the bias,
one that would be shared across all layer types. We settled on
"kernel/bias", which is canonical for conv layers and was sometimes used
for dense layers even before this decision was made.
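A minimal NumPy sketch of this split, using the `kernel`/`bias` names described above (the shapes here are illustrative, not taken from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)

# A Dense layer's "weights" consist of two variables:
kernel = rng.standard_normal((3, 4))  # kernel: shape (input_dim, units)
bias = np.zeros(4)                    # bias: shape (units,)

# Forward pass of a Dense layer: y = x . kernel + bias
x = rng.standard_normal((2, 3))       # batch of 2 inputs with 3 features
y = x @ kernel + bias                 # output: shape (2, 4)
```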
Also note that a dense layer is a special case of a conv layer whose
window is the size of the input. It is thus as correct to refer to a
"kernel" in one case as in the other.
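The equivalence above can be checked numerically: a 1D convolution whose window spans the whole input yields a single output, equal to the dot product a dense layer would compute (a NumPy sketch with made-up values):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # input
k = np.array([0.5, -1.0, 2.0])  # kernel, same size as the input

# np.convolve flips its second argument, so pass k reversed to get the
# cross-correlation deep learning frameworks call "convolution".
conv_out = np.convolve(x, k[::-1], mode="valid")  # single-element array

# A dense layer with the same weights computes a plain dot product.
dense_out = x @ k

# conv_out[0] == dense_out
```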
This notation is used throughout Keras and TensorFlow, and thus is
recognized by a supermajority of the deep learning community. It's
effectively canonical.
On Sun, Apr 22, 2018, 19:04, Andreas Mueller wrote:
The "Dense" layer has parameters "kernel_initializer",
"kernel_regularizer" and "kernel_constraint" which in my opinion are quite
confusing names. I have not seen the weights in a dense layer being
referred to as "kernel". The docs say "Initializer for the kernel weights
matrix". I assume these parameters are named for consistency with the
convolutional kernels and changing them might be tricky (though I'm not
sure if consistency with the convolutional layers is really a good
criterion here).
But I think the documentation could be a bit more clear. If the docs just
said "weight matrix" instead of "kernel weight matrix" I feel that would be
easier to understand.
Thanks for your reply. If it's used in Keras and TensorFlow, I guess it is now canonical (though I have not seen it in a deep learning paper). Having done deep learning before either existed, it's really foreign to me. In the language I am used to, "weights" did not include the biases; I guess TensorFlow and Keras decided on a different nomenclature. "Kernel" is a very overloaded term in ML, and I feel this usage is not helping.