Can I add a dropout after the input layer? #96
Hey, not sure why you would want to do that, but here is another simple solution:
Dropout on the input layer is actually pretty common, and was used in the original dropout paper IIRC. It would be nice if the following syntax worked (which it currently does not):

```python
model = Sequential()
model.add(Dropout(0.1))
model.add(Dense(784, 20))
```
That would require knowing what tensor type is expected as input (tensor2, tensor3, tensor4, ...). We need this info at compile time, before any input has been seen by the network, so it would have to be specified by the user; it can't be automatically inferred. There would be two potential ways to do that: either have the user pass the tensor type as an argument to Dropout (only used/needed when Dropout is the first layer in a network), or introduce an "input" layer that takes a similar argument and can optionally be used as the first layer in a network (required when the first "real" layer is one that does not have a fixed shape, like Activation or Dropout).
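The two options above can be sketched with toy classes (these are not the real Keras API; the `ndim` argument and `input_ndim` method are hypothetical names for illustration):

```python
# Toy sketch of the two proposed ways to tell the network its input
# tensor rank before any data has been seen.

class Dropout:
    def __init__(self, p, ndim=None):
        # Option A: an optional tensor-rank argument, only needed when
        # Dropout is the first layer in the model.
        self.p = p
        self.ndim = ndim

class Input:
    def __init__(self, ndim):
        # Option B: an explicit input layer carrying the rank, usable as
        # an optional first layer.
        self.ndim = ndim

class Sequential:
    def __init__(self):
        self.layers = []

    def add(self, layer):
        self.layers.append(layer)

    def input_ndim(self):
        # At compile time, all we need is the rank of the first layer.
        first = self.layers[0]
        if first.ndim is None:
            raise ValueError("input tensor rank unknown; "
                             "specify it on the first layer")
        return first.ndim

# Option A: pass the rank to Dropout directly.
m1 = Sequential()
m1.add(Dropout(0.1, ndim=2))
assert m1.input_ndim() == 2

# Option B: use an explicit Input layer.
m2 = Sequential()
m2.add(Input(ndim=2))
m2.add(Dropout(0.1))
assert m2.input_ndim() == 2
```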
I like the input layer approach, especially if we make it optional (i.e., use it only when it's needed by the first "real" layer). Otherwise, all general "adimensional" layers will need to have their dimensions specified, which does not make much sense since it's not really a layer property, but a side effect of the input size.
Here's a different idea that would be even simpler: at compile time, if the first layer is "adimensional", go through the model (list or tree, depending on the model type) until you find the first dimensional parent, and set an input attribute on the first layer based on the tensor type of that first dimensional parent. This would require no change to the interface. It would fail if there is no dimensional parent in the model, in which case the user could still manually set an input attribute on the first layer... I'll have a go at adding this later today.
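The compile-time scan described above can be sketched as follows (toy layer objects and a hypothetical `ndim` attribute, not the real Keras internals): walk the layer list until the first layer with a known rank, then copy that rank back to the first layer.

```python
# Minimal sketch of the proposed scan for the first "dimensional parent".

class Layer:
    def __init__(self, ndim=None):
        # ndim is None for "adimensional" layers like Dropout/Activation.
        self.ndim = ndim

def resolve_input_ndim(layers):
    """Find the first layer with a known rank and propagate it to layer 0."""
    for layer in layers:
        if layer.ndim is not None:
            layers[0].ndim = layer.ndim
            return layer.ndim
    # No dimensional parent found: the user must set it manually.
    raise ValueError("no dimensional layer found; set ndim on the first layer")

dropout = Layer()        # adimensional first layer
dense = Layer(ndim=2)    # a Dense layer expects a rank-2 (matrix) input
assert resolve_input_ndim([dropout, dense]) == 2
assert dropout.ndim == 2  # the first layer now knows its input rank
```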
31ea018 both allows starting models with dimension-agnostic layers like Activation and Dropout, and allows for custom output target shapes (which must be specified by the user).
A note to future readers: because we need to explicitly specify the size of the input, we must use the "input_shape" parameter. That is, where before we would have used

```python
model = Sequential()
model.add(Dropout(0.1))
```

now we would use something like

```python
model = Sequential()
model.add(Dropout(0.1, input_shape=(784,)))
```

(This took me a while to figure out, since I hadn't yet tried anything like convolutions, which use the "input_shape" parameter, and since I hadn't read about the abstract base Layer class in the documentation, which Dropout inherits from.)
In the documentation/docstring of
Maybe add this to