Update engine.py to zero grads prior to accumulation #102
Open
tschoell wants to merge 1 commit into karpathy:master from
Conversation
Because each operator accumulates its gradient during backpropagation, it's important to zero each grad prior to accumulation. Without this change, calling backward() twice produces the wrong value for grad, i.e. backward() is not currently idempotent.
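A minimal sketch of the idea, assuming micrograd's usual `Value.backward()` (build a topological order, then apply each node's `_backward` closure in reverse): clear every node's `grad` over the freshly built order before seeding the output gradient, so each call starts from a clean slate. The exact placement of the zeroing loop is an assumption about this PR's diff, not a quote of it.

```python
def backward(self):
    # build a topological ordering of the graph, as in the existing engine.py
    topo, visited = [], set()
    def build_topo(v):
        if v not in visited:
            visited.add(v)
            for child in v._prev:
                build_topo(child)
            topo.append(v)
    build_topo(self)

    # zero grads prior to accumulation (the change this PR proposes);
    # without it, a second backward() call adds the same contributions
    # on top of the stale gradients from the first call
    for v in topo:
        v.grad = 0

    # seed the output gradient and apply the chain rule in reverse order
    self.grad = 1
    for v in reversed(topo):
        v._backward()
```

With this change, calling `backward()` twice on the same output yields the same grads both times; previously the second call re-accumulated the same chain-rule contributions onto the existing gradients.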
IgorTavcar added a commit to IgorTavcar/micrograd that referenced this pull request on Mar 5, 2026:
- Simplify backward pass: replace _backward closures with _local_grads tuples (from karpathy#115)
- Zero grads before backward for idempotent backward() calls (from karpathy#102)
- Add exp, log, tanh, softmax to Value class
- Add transformer components: Linear, Embedding, LayerNorm, Attention, MultiHeadAttention, FeedForward, TransformerBlock, Transformer, cross_entropy
- Move single-output unwrapping from Layer to MLP (from karpathy#111)
- Add input shape assertion in Neuron (from karpathy#107)
- Add MLP test (from karpathy#111)
- Expand .gitignore with standard Python patterns

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>