thecharlieblake
Contributor

I've added the torch.compile analysis notebook to the repo and updated our docs to explain the recommended approach to optimisation/fusion. No need to go too deep on the notebook, but feel free to look at my conclusions there!

Also fyi, my issue flagging this up to the PyTorch people: pytorch/pytorch#101937

Collaborator

@DouglasOrr DouglasOrr left a comment

LGTM; although I guess there are more changes coming. recursive_defaultdict was my favourite 👁️-opener from this one 😄.
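For readers who haven't met the pattern, a recursive defaultdict can be sketched as below. This is a generic illustration of the idea, not the notebook's code: the key names and the numbers are placeholders chosen for the example, and `to_dict` is a hypothetical helper added only to make the result easy to inspect.

```python
from collections import defaultdict


def recursive_defaultdict():
    """Return a defaultdict whose missing values are further recursive defaultdicts."""
    return defaultdict(recursive_defaultdict)


def to_dict(d):
    """Hypothetical helper: convert nested defaultdicts into plain dicts."""
    if isinstance(d, defaultdict):
        return {key: to_dict(value) for key, value in d.items()}
    return d


# Arbitrarily deep keys can be assigned without creating the
# intermediate levels first. The values are placeholders, not measurements.
results = recursive_defaultdict()
results["op_a"]["fwd"]["compiled"] = 1.0
results["op_a"]["bwd"]["compiled"] = 2.0

print(to_dict(results))
```

One thing to watch with this pattern: merely reading a missing key (for example `results["op_b"]`) creates an empty entry, so converting to a plain dict before reporting can avoid surprises.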

@thecharlieblake
Contributor Author

Thanks v much! Changes are here - nothing major, mostly just removing all the caveats about the bugged bwd pass

Collaborator

@DouglasOrr DouglasOrr left a comment

LGTM; grand!

For a thorough analysis of the effect of unit-scaling
(for both individual operations and larger blocks), see the
`benchmarking compiled unit-scaled ops <https://github.com/graphcore-research/unit-scaling/tree/main/analysis/benchmarking_compiled_unit_scaled_ops.ipynb>`_ notebook.
Note that there's a bug in the latest PyTorch version meaning the backward pass
Collaborator

Might be good to give a version number here too (is it 2.0.1)?

@thecharlieblake thecharlieblake merged commit dd1ad2c into main May 23, 2023
@thecharlieblake thecharlieblake linked an issue May 24, 2023 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Fuse scaling factors