Official implementation of:

> **"Directional Consistency as a Complementary Optimization Signal: The GONO Framework"**
> Victor Daniel Gera, Anurag University
> [arXiv] | [Paper PDF]
We identify the direction-loss decoupling phenomenon: an optimizer can exhibit near-perfect directional consistency (cc_t → 1) while the loss barely decreases.
GONO exploits this by adapting Adam's β₁ using consecutive gradient cosine similarity:
```
cc_t  = ⟨∇L_t, ∇L_{t-1}⟩ / (‖∇L_t‖ · ‖∇L_{t-1}‖)     ← one dot product, O(d)
β₁,t  = clip(β₁ · (1 + λ · cc_t), β₁_min, β₁_max)
```
| cc_t | Meaning | GONO response |
|---|---|---|
| ≈ +1 | Gradients agree (smooth descent) | Increase β₁ → amplify momentum |
| ≈ 0 | Neutral | β₁ unchanged → identical to Adam |
| ≈ -1 | Gradients conflict (oscillation) | Decrease β₁ → suppress oscillation |
Key result: cc_t achieves F1 = 1.00 for oscillation detection vs F1 = 0.45 for gradient norm.
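A toy synthetic check (my own illustration, not the paper's experiment 2a) of why cc_t can separate the two regimes while the gradient norm cannot: sign-flipping gradients have the same norm as repeated gradients, but opposite cosine similarity.

```python
import torch

def cons_cosine(g1, g2, eps=1e-12):
    """Consecutive-gradient cosine similarity cc_t."""
    return float(torch.dot(g1, g2) / (g1.norm() * g2.norm() + eps))

g = torch.tensor([1.0, -2.0, 0.5])

# Smooth descent: consecutive gradients point the same way (cc_t ≈ +1).
# Oscillation: consecutive gradients flip sign (cc_t ≈ -1).
smooth_cc = cons_cosine(g, g.clone())
osc_cc = cons_cosine(g, -g)

# The gradient norm is identical in both regimes, so it carries
# no information about oscillation here.
same_norm = bool(g.norm() == (-g).norm())

print(smooth_cc, osc_cc, same_norm)
```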
```bash
git clone https://github.com/YOUR_USERNAME/gono-optimizer
cd gono-optimizer
pip install -e .
```

Or install dependencies only:

```bash
pip install -r requirements.txt
```

Requirements: Python ≥ 3.8, PyTorch ≥ 2.0, torchvision, numpy, matplotlib, scikit-learn
```python
from gono import GONO

# Drop-in replacement for Adam — same hyperparameters
optimizer = GONO(
    model.parameters(),
    lr=1e-3,
    beta1=0.9,
    lam=0.4,         # sensitivity to cc_t signal
    beta1_min=0.5,
    beta1_max=0.99,
)

for x, y in dataloader:
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```

All experiments must be run from the repository root directory.
```bash
cd gono-optimizer   # make sure you are here
python experiments/exp1_decoupling.py
```

- Runtime: ~10 seconds (CPU only, no GPU needed)
- Output: `results/exp1_decoupling.png`
- Expected: cc_t stabilizes well before loss converges
```bash
python experiments/exp2a_oscillation_detection.py
```

- Runtime: ~5 seconds (CPU only)
- Output: `results/exp2a_oscillation_detection.png`
- Expected: Cons. Cosine F1 = 1.00, Gradient Norm F1 = 0.45
```bash
python experiments/exp3_mnist.py
```

- Runtime: ~15 minutes on GPU, ~2 hours on CPU
- Downloads MNIST automatically to `./data/`
- Output: `results/exp3_mnist.png`
- Expected: GONO ≈ 98.1%, Adam ≈ 97.1%, AdamW ≈ 98.2%
```bash
python experiments/exp4_cifar10.py
```

- Runtime: ~10 minutes on GPU
- Downloads CIFAR-10 automatically to `./data/`
- Output: `results/exp4_cifar10.png`
- Expected: GONO ≈ 43.1%, AdamW ≈ 43.2%, Adam ≈ 42.8%
```bash
python experiments/exp5_resnet18.py
```

- Runtime: ~2 hours on GPU (100 epochs × 3 seeds)
- Downloads CIFAR-10 automatically to `./data/`
- Output: `results/exp5_resnet18.png`
- Expected: GONO ≈ 75.4%, AdamW ≈ 76.9%, SGD-M ≈ 66.2%
| Parameter | Default | Description |
|---|---|---|
| `lr` | `1e-3` | Learning rate (same as Adam) |
| `beta1` | `0.9` | Base momentum coefficient |
| `beta2` | `0.999` | Second-moment coefficient |
| `eps` | `1e-8` | Numerical stability term |
| `lam` | `0.4` | Sensitivity to the cc_t signal |
| `beta1_min` | `0.5` | Minimum allowed beta1 |
| `beta1_max` | `0.99` | Maximum allowed beta1 |
| `weight_decay` | `0.0` | L2 regularization |
```bibtex
@article{gera2026gono,
  title   = {Directional Consistency as a Complementary Optimization Signal:
             The GONO Framework},
  author  = {Gera, Victor Daniel},
  journal = {arXiv preprint},
  year    = {2026},
  url     = {https://arxiv.org/abs/2605.06575}
}
```