
# LitBench Reward Model Training

## Quick Setup Guide

### 1. Environment Setup

```bash
# Clone the repository
git clone <repository-url>
cd LitBench

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

### 2. Configure Credentials

Set up your HuggingFace token:

```bash
export HF_TOKEN="your_huggingface_token"
```
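
Before launching anything long-running, it can be worth confirming the token actually resolves to an account. A minimal check, assuming `huggingface_hub` is installed in your environment (it ships with most transformer training stacks, though its presence in this repo's `requirements.txt` is not guaranteed):

```python
# Sketch: verify HF_TOKEN resolves to an account before training.
# Assumes huggingface_hub is available in the environment.
import os
from huggingface_hub import whoami

info = whoami(token=os.environ["HF_TOKEN"])
print(f"Authenticated to HuggingFace as: {info['name']}")
```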

Set up Reddit API credentials (required for dataset creation):

```bash
export REDDIT_CLIENT_ID="your_reddit_client_id"
export REDDIT_CLIENT_SECRET="your_reddit_client_secret"
```

A guide to creating Reddit API credentials can be found here.
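
The repository's scripts presumably consume these variables through a Reddit API client such as PRAW. The snippet below is a hypothetical smoke test to confirm the credentials authenticate; the user agent string is a placeholder, not a value taken from this repo:

```python
# Sketch: confirm the Reddit credentials authenticate in read-only mode.
# The user_agent string is a placeholder, not the repo's actual value.
import os
import praw

reddit = praw.Reddit(
    client_id=os.environ["REDDIT_CLIENT_ID"],
    client_secret=os.environ["REDDIT_CLIENT_SECRET"],
    user_agent="litbench-credential-check/0.1",
)
print(reddit.read_only)  # True for app-only (no username/password) auth
```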

Optionally, configure Weights & Biases for experiment tracking:

```bash
export WANDB_API_KEY="your_wandb_api_key"
export WANDB_PROJECT="reward_model_training"
```

### 3. Rehydrate Test Set

Rehydrate the test dataset from Reddit:

```bash
python scripts/rehydrate.py
```

Expect this step to take 1-2 hours due to Reddit API rate limits.
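
For context, "rehydration" typically follows the usual pattern for Reddit datasets: the published dataset stores comment IDs rather than text, and the script fetches the bodies back from the live API. Below is a rough sketch of that pattern; the ID handling and dataset layout are assumptions for illustration, and `scripts/rehydrate.py` is the authoritative implementation:

```python
# Sketch of the rehydration pattern: resolve stored comment IDs to text.
# The dataset layout and ID field are assumptions for illustration only.
import os
import praw

reddit = praw.Reddit(
    client_id=os.environ["REDDIT_CLIENT_ID"],
    client_secret=os.environ["REDDIT_CLIENT_SECRET"],
    user_agent="litbench-rehydrate-sketch/0.1",
)

def rehydrate(comment_ids):
    """Fetch comment bodies; PRAW throttles requests to respect rate limits."""
    bodies = {}
    for cid in comment_ids:
        try:
            bodies[cid] = reddit.comment(id=cid).body
        except Exception:
            bodies[cid] = None  # some comments may be unavailable (deleted/removed)
    return bodies
```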

### 4. Train Models

Make the training scripts executable:

```bash
chmod +x training/train_BTRM.sh
chmod +x training/train_GenRM.sh
```

Train the Bradley-Terry Reward Model:

```bash
./training/train_BTRM.sh
```

Or train the Generative Reward Model:

```bash
./training/train_GenRM.sh
```
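
As background, a Bradley-Terry reward model trains a scalar reward head on chosen/rejected pairs. The sketch below is the textbook formulation of that pairwise objective, not code lifted from `train_BTRM.sh`:

```python
# Sketch: the standard Bradley-Terry pairwise loss over scalar rewards.
# r_chosen / r_rejected are reward-head outputs for the preferred and
# non-preferred response in each pair.
import torch
import torch.nn.functional as F

def bradley_terry_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Maximizes P(chosen > rejected) = sigmoid(r_chosen - r_rejected)
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```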

### 5. Configuration (Optional)

Edit `training/train_BTRM.sh` or `training/train_GenRM.sh` to modify:

- Base model (default: `meta-llama/Llama-3.2-1B`)
- Effective batch size (default: 128; see the sketch below)
- Output directory
- Training parameters
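
On the batch-size setting: an "effective" batch size of 128 is normally the product of per-device batch size, gradient accumulation steps, and GPU count. The concrete numbers below are illustrative, not read from the training scripts; check the actual flags in `train_BTRM.sh` / `train_GenRM.sh`:

```python
# Sketch: one way an effective batch size of 128 can be assembled.
# These values are illustrative, not read from the training scripts.
per_device_batch_size = 4
gradient_accumulation_steps = 8
num_gpus = 4

effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus
assert effective_batch_size == 128
```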
