Update engine.py to zero grads prior to accumulation #102
Open
tschoell wants to merge 1 commit into karpathy:master from
Conversation
Because each operator accumulates its gradient during backpropagation, it's important to zero each grad prior to accumulation. Without this change, calling backward() twice produces the wrong value for grad, i.e. backward() is not currently idempotent.
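A minimal sketch of the idea, assuming micrograd's usual `Value.backward()` (build a topological order, then apply each node's `_backward` closure in reverse): clear every node's `grad` over the freshly built order before seeding the output gradient, so each call starts from a clean slate. The exact placement of the zeroing loop is an assumption about this PR's diff, not a quote of it.

```python
def backward(self):
    # build a topological ordering of the graph, as in the existing engine.py
    topo, visited = [], set()
    def build_topo(v):
        if v not in visited:
            visited.add(v)
            for child in v._prev:
                build_topo(child)
            topo.append(v)
    build_topo(self)

    # zero grads prior to accumulation (the change this PR proposes);
    # without it, a second backward() call adds the same contributions
    # on top of the stale gradients from the first call
    for v in topo:
        v.grad = 0

    # seed the output gradient and apply the chain rule in reverse order
    self.grad = 1
    for v in reversed(topo):
        v._backward()
```

With this change, calling `backward()` twice on the same output yields the same grads both times; previously the second call re-accumulated the same chain-rule contributions onto the existing gradients.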
IgorTavcar added a commit to IgorTavcar/micrograd that referenced this pull request on Mar 5, 2026:
- Simplify backward pass: replace _backward closures with _local_grads tuples (from karpathy#115)
- Zero grads before backward for idempotent backward() calls (from karpathy#102)
- Add exp, log, tanh, softmax to Value class
- Add transformer components: Linear, Embedding, LayerNorm, Attention, MultiHeadAttention, FeedForward, TransformerBlock, Transformer, cross_entropy
- Move single-output unwrapping from Layer to MLP (from karpathy#111)
- Add input shape assertion in Neuron (from karpathy#107)
- Add MLP test (from karpathy#111)
- Expand .gitignore with standard Python patterns

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>