
sharenoise #1

Merged

Merged 10 commits into main on Jan 19, 2021

Conversation

@aliutkus (Owner)

Optimizes memory usage and speed by sharing the SPE across all layers.

This is done in the following way:

  • The noise is not redrawn each time, so that qbar and kbar are shared across all layers. The strategy is to keep qbar and kbar untouched as long as their shapes are compatible; they must be reset manually when required.
  • einsum is no longer used. It is indeed much nicer, but for some mysterious reason it apparently did not save RAM when qbar and kbar were reused.

The notebook applies the SPE many times in a row to simulate many layers.

⚠️ Note that my_spe.reset() must now be called explicitly each time a new SPE must be computed (typically at each batch during training); see the usage sketch below.
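To make the new contract concrete, here is a minimal, self-contained sketch of the intended usage pattern. ToySharedSPE and everything inside it are hypothetical stand-ins for the module in dev/spe/spe.py, not the actual API; the point is only the pattern of resetting once per batch and then reusing qbar/kbar across layers.

```python
import torch
import torch.nn as nn


class ToySharedSPE(nn.Module):
    """Hypothetical stand-in for the SPE module (not the repo's API)."""

    def __init__(self, in_features: int = 64, num_realizations: int = 256):
        super().__init__()
        # one trainable parameter, standing in for the real SPE parameters
        self.gains = nn.Parameter(torch.ones(in_features))
        self.num_realizations = num_realizations
        self.qbar = None
        self.kbar = None

    def reset(self, shape):
        # redraw the noise and rebuild qbar/kbar from the current parameters
        batch, length, dim = shape
        noise = torch.randn(batch, length, dim, self.num_realizations)
        self.qbar = noise * self.gains[None, None, :, None]
        self.kbar = noise  # the real module builds kbar differently

    def forward(self, queries, keys):
        # reuse the cached qbar/kbar: every call between two reset() calls
        # shares the same noise realization (broadcast + sum, no einsum)
        qhat = (queries[..., None] * self.qbar).sum(dim=2)
        khat = (keys[..., None] * self.kbar).sum(dim=2)
        return qhat, khat


spe = ToySharedSPE()
for step in range(3):  # stand-in for a training loop over batches
    q, k = torch.randn(8, 32, 64), torch.randn(8, 32, 64)
    spe.reset(q.shape)  # explicit reset, once per batch
    for _ in range(12):  # all "layers" share the same qbar/kbar
        qhat, khat = spe(q, k)
```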

@aliutkus requested a review from cifkao on January 18, 2021
[5 review threads on dev/spe/spe.py, resolved]
@cifkao (Collaborator) left a comment:

Maybe I would prefer to require calling reset manually with the (maximum) query length (shape) instead of calling it automatically if the shape doesn't match. Then in forward we can just check that we have at least the required length and truncate it if needed. That would allow storing the PEs for multiple iterations in case we find the redrawing is slow or want deterministic behavior.
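A minimal sketch of this alternative, with hypothetical names (MaxLengthSPE and its internals are illustrative, not the repo's API): reset() draws encodings once for a maximum length, and forward() only truncates them to the actual sequence length.

```python
import torch
import torch.nn as nn


class MaxLengthSPE(nn.Module):
    """Hypothetical sketch of the reset-with-max-length alternative."""

    def __init__(self, in_features: int = 64, num_realizations: int = 256):
        super().__init__()
        self.in_features = in_features
        self.num_realizations = num_realizations
        self.qbar = None

    def reset(self, max_length: int):
        # draw the positional encodings once, for the longest expected input
        self.qbar = torch.randn(
            1, max_length, self.in_features, self.num_realizations)

    def forward(self, queries: torch.Tensor) -> torch.Tensor:
        length = queries.shape[1]
        if self.qbar is None or self.qbar.shape[1] < length:
            raise RuntimeError("call reset() with a sufficient max_length")
        # truncate the stored encodings instead of redrawing them
        qbar = self.qbar[:, :length]
        return (queries[..., None] * qbar).sum(dim=2)
```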

[4 outdated review threads on dev/spe/spe.py, resolved]
@aliutkus (Owner, Author)

> Maybe I would prefer to require calling reset manually with the (maximum) query length (shape) instead of calling it automatically if the shape doesn't match. Then in forward we can just check that we have at least the required length and truncate it if needed. That would allow storing the PEs for multiple iterations in case we find the redrawing is slow or want deterministic behavior.

Yes, reset should be called manually, at the start of the forward of the hosting module.
But your solution does not work: we really do need to call reset again at each forward pass during training, because if we don't, the parameters of the SPE are never updated.

Still, we could store the pure noise z for reusability, yes.
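To make the distinction explicit, here is a hypothetical sketch (illustrative names, not the repo's API) of what storing z would look like: the parameter-free noise is cached by reset(), while qbar is rebuilt from the current parameters at every forward pass, so gradients still reach them during training.

```python
import torch
import torch.nn as nn


class NoiseCachingSPE(nn.Module):
    """Hypothetical sketch: cache z, rebuild qbar from live parameters."""

    def __init__(self, in_features: int = 64, num_realizations: int = 256):
        super().__init__()
        self.gains = nn.Parameter(torch.ones(in_features))
        self.num_realizations = num_realizations
        self.z = None  # parameter-free noise: safe to reuse across passes

    def reset(self, shape):
        # only redraws the noise; no trainable parameters involved here
        batch, length, dim = shape
        self.z = torch.randn(batch, length, dim, self.num_realizations)

    def forward(self, queries: torch.Tensor) -> torch.Tensor:
        # qbar is rebuilt from the *current* parameters at every call,
        # so gradients still flow into self.gains even though z is cached
        qbar = self.z * self.gains[None, None, :, None]
        return (queries[..., None] * qbar).sum(dim=2)
```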

dev/spe/spe.py (outdated)
@@ -188,6 +217,8 @@ def __init__(
        in_features: int = 64,
        num_realizations: int = 256,
        num_sines: int = 10,
        key_shape: Optional[Tuple[int, ...]] = None,
@aliutkus (Owner, Author):

Looks a bit redundant to me; what about a max_length?

@cifkao (Collaborator):

Fixed

dev/spe/spe.py (outdated)
        self.reset(key_shape, share_in_batch)

    def reset(self,
              key_shape: Tuple[int, ...],
@aliutkus (Owner, Author):

I don't really like that we need to provide the key_shape each time; can't we do otherwise?

@cifkao (Collaborator):

Fixed

[1 outdated review thread on dev/spe/spe.py, resolved]
@cifkao merged commit ea9c2cd into main on Jan 19, 2021
@cifkao added a commit referencing this pull request on May 24, 2021