# WeightWatcher-Examples

A curated collection of real-world examples, notebooks, and experiments using WeightWatcher, the open-source tool for analyzing layer-wise spectra, heavy-tailed behavior, power-law exponents (α), correlation traps, and model quality throughout training.

These examples span small MLPs, double descent, and billion-parameter LLMs.
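To give a flavor of the spectral analysis these notebooks perform, the sketch below computes the empirical spectral density (ESD) of a single randomly initialized (hence hypothetical) weight matrix and forms a crude Hill-style estimate of its tail exponent. WeightWatcher itself fits α with a maximum-likelihood power-law fit over the ESD; this numpy-only estimate is only an illustration of the idea, not the library's implementation.

```python
import numpy as np

# Hypothetical layer weight matrix (random init, for illustration only)
rng = np.random.default_rng(0)
W = rng.standard_normal((300, 100))

# Empirical spectral density (ESD): eigenvalues of X = W^T W / N
N = W.shape[0]
evals = np.sort(np.linalg.eigvalsh(W.T @ W / N))

# Crude Hill-style tail-exponent estimate over the k largest eigenvalues.
# WeightWatcher instead uses a maximum-likelihood power-law fit.
k = 20
tail = evals[-k:]
alpha = 1.0 + k / np.sum(np.log(tail / tail[0]))
print(f"estimated tail exponent alpha ~ {alpha:.2f}")
```

A Gaussian random matrix like this one is not heavy-tailed (its ESD follows the Marchenko–Pastur law), so the estimate here will be large; trained layers with strong correlations are where small α values appear.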


## 📘 Core Examples

Single layer example

How to analyze fine-tuned models


## 🧠 MLP + MNIST Experiments

How varying the batch size and/or learning rate affects convergence

Explaining Epoch-wise Double Descent

Comparing the inductive biases of AdamW and Muon

MLP3 on CIFAR10: Extreme overfitting in the first layer


## 🧬 LLM + Fine-Tuning Examples

Post Analysis of the paper "Overtrained Language Models Are Harder to Fine-Tune"

The Magic of Mistral Dragon Kings blog

Experiment Method: SVD Smoothing

[The original 1989 Double Descent Experiment](https://calculatedcontent.com/2024/03/01/describing-double-descent-with-weightwatcher/)


## 🧪 Miscellaneous

Comparing BERT, RoBERTa, XLNet

ONNX Format

Old experiments on random labels


## 🚀 What These Examples Demonstrate

- How α < 2 identifies overfitting & correlation traps
- Spectral phase transitions during training
- Epoch-wise double descent behavior
- Optimizer differences (Muon vs. AdamW vs. SGD)
- Fine-tuning shifts between underfit → well-fit → overfit
- Diagnostics for memorization and rank collapse
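The α rule of thumb above (roughly 2 ≤ α ≤ 6 for well-trained layers, per the WeightWatcher papers) amounts to a simple triage over per-layer α values like those in the DataFrame returned by `watcher.analyze()`. A minimal sketch, using invented layer names and α values rather than output from any real model:

```python
# Hypothetical per-layer alpha values, shaped like the `alpha` column
# of WeightWatcher's analyze() output (values invented for illustration)
layer_alphas = {"fc1": 1.6, "fc2": 3.2, "fc3": 7.1}

def diagnose(alpha: float) -> str:
    # Rule of thumb from the WeightWatcher papers: 2 <= alpha <= 6
    # is typical of well-trained layers.
    if alpha < 2:
        return "over-fit / possible correlation trap"
    if alpha > 6:
        return "under-trained"
    return "well-trained"

report = {name: diagnose(a) for name, a in layer_alphas.items()}
print(report)
```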

## 📦 Getting Started

```bash
git clone https://github.com/CalculatedContent/WeightWatcher-Examples.git
cd WeightWatcher-Examples
pip install weightwatcher
jupyter notebook
```

## 📜 License

MIT License; see the LICENSE file.
