Currently, Ludwig defines separate dropout parameters for different components within a compositional module. For example, the `StackedCNNRNN` module takes 4 different dropout parameters:
- `conv_dropout`: Dropout rate for the convolutional layers
- `recurrent_dropout`: Dropout rate for the recurrent layers
- `dropout`: Dropout rate for sequence embeddings
- `fc_dropout`: Dropout rate for the final fully connected output layer
While this is highly expressive, it seems like configurability overkill: there is not much evidence in the literature that using heterogeneous dropout rates across a model yields significant gains. Unifying all modules to use a single dropout parameter seems to strike a better balance.
From a simplicity standpoint, it would be nice to have one global dropout param. That said, we shouldn't constrain users who have a particular use case (e.g. reproducing some paper result). Maybe we could structure it similarly to preprocessing, where one can choose preprocessing params for each individual feature, or choose global preprocessing params used by all features (of a given type).
In the dropout case, we could still allow people to set the dropout param for each ECD component, but also expose a global dropout param that auto-sets the dropout param for all components.
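To make the proposal concrete, here is a minimal sketch of how a global value with per-component overrides could be resolved. It is not Ludwig's implementation: `resolve_dropout`, `global_dropout`, and `COMPONENT_KEYS` are hypothetical names made up for illustration, and only the four component parameter names come from the `StackedCNNRNN` example above.

```python
from typing import Dict, Optional

# Component-level parameter names from the StackedCNNRNN example in this issue.
# Grouping them in a tuple like this is an assumption of the sketch.
COMPONENT_KEYS = ("conv_dropout", "recurrent_dropout", "dropout", "fc_dropout")


def resolve_dropout(
    global_dropout: float = 0.0,
    overrides: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Return one dropout rate per component (hypothetical helper).

    Every component defaults to `global_dropout`; any key present in
    `overrides` takes precedence for that component only.
    """
    overrides = overrides or {}

    # Reject misspelled or unsupported component names early.
    unknown = set(overrides) - set(COMPONENT_KEYS)
    if unknown:
        raise ValueError(f"Unknown dropout parameters: {sorted(unknown)}")

    resolved = {}
    for key in COMPONENT_KEYS:
        rate = overrides.get(key, global_dropout)
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"{key} must be in [0, 1], got {rate}")
        resolved[key] = rate
    return resolved


# One global rate, with a single override for the output layer.
rates = resolve_dropout(global_dropout=0.1, overrides={"fc_dropout": 0.5})
print(rates)
```

With this shape, users who want simplicity set a single value, while users reproducing a paper can still pin individual components.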