kevinniechen/scalinglaws

Scaling laws

This repository contains tools and analysis for exploring language model scaling laws, focusing on the relationships among compute, dataset size, and model performance described in Gadre et al., 2024.
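As a rough illustration of the kind of relationship involved, the sketch below writes a Chinchilla-style loss in two equivalent ways: in terms of parameters N and training tokens D, and in terms of compute C = 6ND and the token multiplier M = D/N. It assumes a single shared exponent for both terms, which is what makes the rewrite exact. The function names and all constants are hypothetical placeholders chosen for illustration; they are not taken from main.py or from any fitted result.

```python
import math


def loss_from_params_and_tokens(n_params, n_tokens, E, A, B, alpha):
    """Chinchilla-style form with one shared exponent (an assumption here):
    L = E + A * N^-alpha + B * D^-alpha."""
    return E + A * n_params ** (-alpha) + B * n_tokens ** (-alpha)


def loss_from_compute_and_multiplier(compute, token_mult, E, a, b, eta):
    """Same loss rewritten with C = 6*N*D and M = D/N:
    L = E + (a * M^eta + b * M^-eta) * C^-eta."""
    return E + (a * token_mult ** eta + b * token_mult ** (-eta)) * compute ** (-eta)


def to_compute_and_multiplier(n_params, n_tokens):
    """Map (N, D) to (C, M) using the common C = 6*N*D approximation."""
    return 6.0 * n_params * n_tokens, n_tokens / n_params


# Illustrative constants only; NOT fitted values from this repository.
E, A, B, alpha = 1.8, 400.0, 400.0, 0.3

# Substituting N = sqrt(C / (6M)) and D = sqrt(C * M / 6) gives:
eta = alpha / 2.0
a = A * 6.0 ** eta
b = B * 6.0 ** eta

n_params, n_tokens = 1e8, 2e9
compute, token_mult = to_compute_and_multiplier(n_params, n_tokens)

loss_nd = loss_from_params_and_tokens(n_params, n_tokens, E, A, B, alpha)
loss_cm = loss_from_compute_and_multiplier(compute, token_mult, E, a, b, eta)
print(f"L(N, D) = {loss_nd:.6f}")
print(f"L(C, M) = {loss_cm:.6f}")
```

The two printed values should agree up to floating-point error, since the second form is only a change of variables.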

Key Components

  • main.py: Implements core scaling law equations and optimization for compute/dataset relationships
  • scaling_explorer.py: Visualization and analysis tools for model scaling behavior
  • scaling/: Submodule from mlfoundations/scaling containing experimental data and evaluation results
    • exp_data/: Raw experimental data including model configurations and evaluation results
    • scaling_law_dict.pkl: Pre-fitted scaling law parameters
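To see what the pre-fitted parameters file contains, it can be opened with the standard pickle module. This is a minimal sketch: the path scaling/scaling_law_dict.pkl is an assumption based on the layout listed above, and the helper names are hypothetical. Nothing is assumed about the structure of the stored object, so the sketch only reports its top-level keys or type. Unpickling can execute arbitrary code, so load only files from a source you trust.

```python
import pickle
from pathlib import Path

# Assumed location, inferred from the component list; adjust if it differs.
DEFAULT_PATH = Path("scaling") / "scaling_law_dict.pkl"


def load_scaling_law_params(path=DEFAULT_PATH):
    """Load the pickled object stored at `path`."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; has the scaling submodule been initialized?"
        )
    with path.open("rb") as f:
        return pickle.load(f)


def summarize(obj):
    """Return sorted top-level keys for a dict, or the type name otherwise."""
    if isinstance(obj, dict):
        return sorted(str(key) for key in obj)
    return [type(obj).__name__]


if __name__ == "__main__":
    if DEFAULT_PATH.is_file():
        print(summarize(load_scaling_law_params()))
    else:
        print(f"{DEFAULT_PATH} not found; skipping.")
```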

Setup

  1. Clone the repository:
     git clone https://github.com/kevinniechen/scalinglaws.git
     cd scalinglaws
  2. Initialize and update the scaling submodule:
     git submodule add https://github.com/mlfoundations/scaling.git scaling
     git submodule update --init --recursive
  3. Install requirements:
     pip install -r requirements.txt

Requirements

  • Python 3.8+
  • NumPy
  • Matplotlib
  • Pandas
  • SciPy
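After installing, a quick check like the one below can confirm that the interpreter version and the listed packages are available. It is a small sketch using only the standard library; the helper names are hypothetical and not part of this repository.

```python
import importlib.util
import sys

# Import names for the packages listed above.
REQUIRED = ("numpy", "matplotlib", "pandas", "scipy")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(min_version=(3, 8)):
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info >= min_version


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing packages:", missing_packages() or "none")
```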

Contributors

  • Kevin Niechen
  • Justin Rose

References
