Commit ce1df03 — "Add files via upload", authored by KindXiaoming, Feb 24, 2023.

# Omnigrok
This is the code repo for the paper ["Omnigrok: Grokking Beyond Algorithmic Data"](https://openreview.net/forum?id=zDiHoIWa0q1), accepted to ICLR 2023 as a spotlight. We elucidate [the grokking phenomenon](https://arxiv.org/abs/2201.02177) from the perspective of loss landscapes, and show that grokking can happen not only for algorithmic datasets and toy teacher-student setups, but also for standard machine learning datasets (e.g., MNIST handwritten digits, IMDb movie reviews, QM9 molecule property prediction).

The examples used in this paper are relatively small-scale, and we keep the code as minimal as possible: each example is self-contained in a single folder.

| Example | Figure in [paper](https://openreview.net/forum?id=zDiHoIWa0q1) | Folder |
|--|--|--|
| Teacher-student | Figure 2 | ./teacher-student |
| MNIST handwritten digits | Figure 3 | ./mnist |
| IMDb movie reviews | Figure 4 | ./imdb |
| QM9 molecule properties | Figure 5 | ./qm9 |
| Modular addition | Figures 6 & 8 | ./mod-addition |
| MNIST representation | Figure 7 | ./mnist-repr |

For each example, we conduct two kinds of experiments:
1. Reduced landscape analysis: the weight norm is fixed during training.
2. Grokking experiments: the weight norm is not fixed during training (standard training).

Each folder (except for MNIST representation) contains two subfolders, "landscape" and "grokking", corresponding to (1) and (2).
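As a rough illustration of what "the weight norm is fixed during training" means, the sketch below (not code from this repo; all names are illustrative) constrains a weight vector to a sphere of fixed L2 norm by projecting it back after every gradient step:

```python
import numpy as np

def project_to_norm(w, target_norm):
    """Rescale w so that its L2 norm equals target_norm."""
    return w * (target_norm / np.linalg.norm(w))

def train_fixed_norm(w, grad_fn, lr=0.1, steps=100, target_norm=1.0):
    """Gradient descent constrained to the sphere ||w|| = target_norm.

    grad_fn(w) returns the loss gradient at w; after each update,
    w is projected back to the fixed-norm sphere.
    """
    w = project_to_norm(np.asarray(w, dtype=float), target_norm)
    for _ in range(steps):
        w = w - lr * grad_fn(w)          # standard gradient step
        w = project_to_norm(w, target_norm)  # enforce the norm constraint
    return w
```

In the grokking (standard-training) experiments, the projection step is simply omitted, so the weight norm evolves freely.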

