sharenoise #1
Conversation
Maybe I would prefer to require calling `reset` manually with the (maximum) query length (shape), instead of calling it automatically when the shape doesn't match. Then in `forward` we can just check that we have at least the required length and truncate if needed. That would allow storing the PEs for multiple iterations, in case we find the redrawing is slow or want deterministic behavior.
Yes,

Still, we could store the pure noise
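For concreteness, here is a minimal sketch of the reset-then-truncate idea discussed above, under assumed names (`SharedSPE`, `qbar`); it is not the repo's actual implementation:

```python
import torch
from typing import Tuple

class SharedSPE(torch.nn.Module):  # hypothetical class name
    def __init__(self, num_realizations: int = 256):
        super().__init__()
        self.num_realizations = num_realizations
        self.qbar = None  # stored positional-code draws, reused until reset

    def reset(self, max_shape: Tuple[int, int, int]) -> None:
        # Explicitly redraw the noise for the maximum expected query shape,
        # max_shape = (batch, max_length, in_features).
        self.qbar = torch.randn(*max_shape, self.num_realizations)

    def forward(self, queries: torch.Tensor) -> torch.Tensor:
        # Only check that enough positions were drawn, then truncate.
        if self.qbar is None or self.qbar.shape[1] < queries.shape[1]:
            raise RuntimeError("call reset() with a large enough shape first")
        return self.qbar[:, : queries.shape[1]]
```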
dev/spe/spe.py (Outdated)

```
@@ -188,6 +217,8 @@ def __init__(
    in_features: int = 64,
    num_realizations: int = 256,
    num_sines: int = 10,
    key_shape: Optional[Tuple[int, ...]] = None,
```
Looks a bit redundant to me, what about a `max_length`?
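A sketch of what this proposal could look like; the surrounding signature is copied from the diff above, while the class name is assumed and the `max_length` parameter is only the reviewer's suggestion:

```python
from typing import Optional

import torch

class SineSPE(torch.nn.Module):  # class name assumed for illustration
    def __init__(
        self,
        in_features: int = 64,
        num_realizations: int = 256,
        num_sines: int = 10,
        max_length: Optional[int] = None,  # was: key_shape: Optional[Tuple[int, ...]]
    ):
        super().__init__()
        self.max_length = max_length
```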
Fixed
dev/spe/spe.py (Outdated)

```
        self.reset(key_shape, share_in_batch)

    def reset(self,
              key_shape: Tuple[int, ...],
```
I don't really like that we need to provide the `key_shape` each time; can't we do otherwise?
Fixed
This PR optimizes memory usage and speed by sharing the SPE along all layers. This is done in the following way:

- `qbar` and `kbar` are shared on all layers. The strategy picked is to keep `qbar` and `kbar` untouched as long as their shapes are OK. They must be manually reset if required.
- Regarding `einsum`: it's indeed much nicer, but for some mysterious reason it apparently did not allow saving RAM when reusing `qbar` and `kbar`.
- The notebook tries to apply the SPE many times in a row, to simulate many layers.
- `my_spe.reset()` must now be called explicitly each time a new SPE must be computed (typically at each batch during training); see the usage sketch below.
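A hypothetical usage of this explicit-reset convention, reusing the `SharedSPE` sketch from the earlier comment; all shapes and names here are illustrative only, not the repo's API:

```python
import torch

my_spe = SharedSPE(num_realizations=256)
num_layers = 4

for step in range(3):                  # e.g. three training batches
    my_spe.reset((8, 128, 64))         # one explicit redraw per batch
    queries = torch.randn(8, 100, 64)
    for _ in range(num_layers):
        qbar = my_spe(queries)         # every layer reuses the same draws
```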