
Consolidate all dropout parameters into a single parameter applied globally across the whole model #2080

Open
justinxzhao opened this issue Jun 1, 2022 · 1 comment

Comments

@justinxzhao
Collaborator

Currently, Ludwig defines separate dropout parameters for different components within a compositional module. For example, the StackedCNNRNN module takes 4 different dropout parameters:

  • conv_dropout: Dropout rate for the convolutional layers
  • recurrent_dropout: Dropout rate for the recurrent layers
  • dropout: Dropout rate for sequence embeddings
  • fc_dropout: Dropout rate for the final fully connected output layer

While this is highly expressive, it seems like configurability overkill: the literature offers little evidence that heterogeneous dropout rates across a model yield meaningful gains. Unifying all modules under a single dropout parameter seems to strike a better balance.
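For illustration, here is a sketch of the two config styles as plain Python dicts (the parameter names come from the StackedCNNRNN example above; the values and dict layout are hypothetical, not Ludwig's actual config schema):

```python
# Hypothetical encoder config fragments illustrating the two styles.
# Parameter names are from the StackedCNNRNN example; values are made up.

# Current: four separate dropout knobs on a single encoder.
per_component = {
    "type": "stacked_cnn_rnn",
    "conv_dropout": 0.1,       # convolutional layers
    "recurrent_dropout": 0.2,  # recurrent layers
    "dropout": 0.1,            # sequence embeddings
    "fc_dropout": 0.5,         # final fully connected output layer
}

# Proposed: one dropout rate applied uniformly across the encoder.
unified = {
    "type": "stacked_cnn_rnn",
    "dropout": 0.1,
}
```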

@geoffreyangus
Collaborator

From a simplicity standpoint, it would be nice to have one global dropout param... that said, we shouldn't constrain users who have a particular use case (e.g., reproducing a paper's results). Maybe we could structure it similarly to preprocessing, where one can set preprocessing params for each individual feature, or set global preprocessing params used by all features (of a given type).

In the dropout case, we would still allow people to set the dropout param for each ECD component, but also expose a global dropout param that auto-sets the dropout param for all components.
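The proposed precedence could be sketched as follows (a minimal illustration, not Ludwig's actual API: `resolve_dropout` and its arguments are hypothetical names; the rule is that an explicitly set per-component value wins, and unset components fall back to the global value):

```python
# Hypothetical sketch of the proposed precedence: a global dropout
# auto-sets every component's dropout key unless the user has set
# that component's value explicitly.

def resolve_dropout(global_dropout, component_params, dropout_keys):
    """Return component params with unset dropout keys filled from the global value."""
    resolved = dict(component_params)
    for key in dropout_keys:
        if key not in resolved and global_dropout is not None:
            resolved[key] = global_dropout
    return resolved

# Usage: the user-set conv_dropout wins; the other three keys
# fall back to the global value of 0.2.
params = resolve_dropout(
    global_dropout=0.2,
    component_params={"conv_dropout": 0.05},
    dropout_keys=["conv_dropout", "recurrent_dropout", "dropout", "fc_dropout"],
)
```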
