@DavidLandup0
Collaborator

Context: MixTransformers use a HierarchicalTransformerEncoder, which applies DropPath in its forward pass.
The drop path rates (dpr) across the layers (depthwise) come from a linspace() running from 0 to a maximum value (hardcoded in the original research codebase, but exposed as an argument in keras_hub).

The refactor seeks to clarify the usage of the argument.
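
For readers unfamiliar with the schedule, here is a minimal NumPy sketch of the idea, not the keras_hub implementation. The names `drop_path_rates`, `drop_path`, `max_rate`, and `depths` are hypothetical, and the depths and maximum rate used in the usage lines are illustrative values only.

```python
import numpy as np


def drop_path_rates(max_rate, depths):
    """Per-block drop path rates, rising linearly from 0 to max_rate.

    `depths` lists the number of blocks in each stage (hypothetical argument).
    Returns one list of rates per stage.
    """
    rates = np.linspace(0.0, max_rate, sum(depths))
    per_stage, start = [], 0
    for depth in depths:
        per_stage.append(rates[start:start + depth].tolist())
        start += depth
    return per_stage


def drop_path(x, rate, rng, training=True):
    """Stochastic depth: zero the whole residual branch for some samples."""
    if not training or rate == 0.0:
        return x
    keep_prob = 1.0 - rate
    # One Bernoulli draw per sample, broadcast over the remaining axes.
    mask_shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    mask = rng.random(mask_shape) < keep_prob
    # Rescale the kept samples so the expected value is unchanged.
    return x * mask / keep_prob


# Illustrative values only (assumed, not taken from the PR).
rates = drop_path_rates(max_rate=0.1, depths=[2, 2, 2, 2])
rng = np.random.default_rng(0)
branch = np.ones((4, 3, 8))
out = drop_path(branch, rates[-1][-1], rng)
```

The first block always gets a rate of 0 and the last block gets `max_rate`, which is why exposing the maximum as a clearly named argument matters: it controls the whole schedule.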

Collaborator

@divyashreepathihalli divyashreepathihalli left a comment

LGTM!

@divyashreepathihalli divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Sep 30, 2024
@divyashreepathihalli divyashreepathihalli merged commit a3026a2 into keras-team:master Sep 30, 2024

2 participants