I noticed that ZeRO Stage 3 performs this operation, but I'm very confused: won't PyTorch's computation graph still retain references to the full weights?
Answered by tjruwase on Mar 30, 2025
Answer selected by Laiyi97
@Laiyi97, assuming I understand your question correctly, the computation graph would have references to the `tensor` container but not to the `tensor.data` payload, which holds the weight values.
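
To make the container/payload distinction concrete, here is a minimal sketch in plain PyTorch. It is not DeepSpeed's actual code: the names `w`, `x`, and `leaf` are made up for illustration, and the empty placeholder is an assumption about how a payload might be released. It only shows that the graph's leaf node points at the tensor object, so swapping `.data` changes what the graph sees.

```python
import torch

# Hypothetical stand-in for a model weight (16 elements).
w = torch.nn.Parameter(torch.randn(4, 4))
x = torch.randn(2, 4)

# Forward pass: autograd records a graph whose leaf node refers to `w`.
y = x @ w

# The AccumulateGrad leaf node keeps a reference to the tensor container.
leaf = next(fn for fn, _ in y.grad_fn.next_functions if fn is not None)
numel_before = leaf.variable.numel()

# Swap the payload for an empty placeholder (assumption: same dtype/device).
# The container `w` is the same object; only the values it holds are replaced.
w.data = torch.empty(0, dtype=w.dtype, device=w.device)

# The graph's reference now sees the placeholder, not the full weights.
numel_after = leaf.variable.numel()

print(numel_before, numel_after)
```

The graph still reaches `w` through `leaf.variable`, but that reference does not pin the original weight values. The full values would have to be restored into `w.data` before anything that needs them, such as the backward pass, can run.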