This repository contains a series of reinforcement learning experiments.
Experiment notebooks can be found in the /experiments directory, along with library code and data. The repository also contains some helpful scripts for starting dev machines and other utilities.
Experiment results can be viewed on Weights & Biases.
To learn more about our research, check out our recent blog post at OpenPipe.
If you want to reproduce our work on deductive reasoning, take a look at our streamlined training recipe here.
If you're interested in training your own models with reinforcement learning or just chatting, feel free to reach out or email Kyle directly at kyle@openpipe.ai!
Install project dependencies using uv:
uv sync
This will install all dependencies specified in pyproject.toml and create a virtual environment at .venv.
- Create your environment file by copying the example:
cp .env.example .env
- Edit your .env file with your personal details:
# Git Configuration
GIT_USER_NAME="Your Name"
GIT_USER_EMAIL="your.email@example.com"
# GitHub Authentication
# Generate a personal access token at https://github.com/settings/tokens
# Required scopes: repo
GITHUB_TOKEN="your_github_personal_access_token"
To get your GitHub token:
- Go to https://github.com/settings/tokens
- Click "Generate new token"
- Select the repo scope
- Copy the generated token and paste it into your .env file
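Before launching anything, it can help to confirm that the env file actually defines the three variables shown above. The following is a minimal sketch, not part of the repository: the function name check_env_file is hypothetical, and it assumes the file uses plain KEY="value" lines as in the example.

```shell
# Hypothetical helper (not part of the repository): verify that an env file
# defines every variable listed in the example configuration.
# The body runs in a subshell, so its variables do not leak into the caller.
check_env_file() (
  env_file="${1:-.env}"
  [ -f "$env_file" ] || { echo "missing file: $env_file" >&2; exit 1; }
  status=0
  for key in GIT_USER_NAME GIT_USER_EMAIL GITHUB_TOKEN; do
    # Match KEY= followed by an optional quote and at least one character
    # that is not a quote, so empty values like KEY="" count as missing.
    if ! grep -Eq "^${key}=[\"']?[^\"']" "$env_file"; then
      echo "missing or empty: $key" >&2
      status=1
    fi
  done
  exit "$status"
)
```

One way to use it is as a guard, for example check_env_file .env && ./launch-cluster.sh. It only checks that values are present; it does not detect placeholder values left over from .env.example.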
To launch a cluster with the default configuration:
./launch-cluster.sh
This will:
- Load your environment variables from .env
- Launch a cluster named "openpipe" using the configuration in cluster.yaml
- Set up the development environment on the cluster
Additional sky launch options can be passed to the script. For example, to launch a cluster with 2 A100 GPUs:
./launch-cluster.sh --gpus A100:2
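To make the behaviour described above concrete, here is a rough sketch of how a wrapper of this kind could be written. It is an illustration only: the actual contents of launch-cluster.sh may differ, and the function names load_env and launch_cluster are hypothetical. It assumes .env contains only comments and KEY="value" lines, so it can be sourced as shell.

```shell
# Hypothetical sketch of a launch wrapper; the real launch-cluster.sh may differ.

# Export every KEY=VALUE pair from an env file into the current shell.
# Pass a path containing a slash (for example ./.env) so the file is not
# looked up on PATH by the dot command.
load_env() {
  if [ ! -f "$1" ]; then
    echo "missing env file: $1" >&2
    return 1
  fi
  set -a   # auto-export every variable assigned while sourcing
  . "$1"
  set +a
}

# Launch the "openpipe" cluster from cluster.yaml, forwarding any extra
# options (for example --gpus A100:2) to sky launch unchanged.
launch_cluster() {
  load_env ./.env || return 1
  sky launch --cluster openpipe "$@" cluster.yaml
}
```

Forwarding "$@" is what lets a call such as launch_cluster --gpus A100:2 override the resources without editing cluster.yaml.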
To SSH into your running cluster:
ssh openpipe
To use VSCode with the cluster:
- Press Cmd/Ctrl + Shift + P
- Type Remote-SSH: Connect Current Window to Host
- Select openpipe from the list
- Open the openpipe folder to access the repo
- List all running clusters:
sky status
- Stop your cluster (to pause billing):
sky stop openpipe
- Start a stopped cluster:
sky start openpipe
- Terminate your cluster (to delete all resources):
sky down openpipe
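Since a running cluster keeps billing until it is stopped, one option is to wrap a long-running command so that the cluster is stopped as soon as the command finishes. The sketch below is an assumption about how you might do that, not something the repository provides; run_then_stop is a hypothetical name, and sky stop may still ask for confirmation interactively.

```shell
# Hypothetical helper: run a command, then stop the "openpipe" cluster to
# pause billing, even if the command failed. Returns the command's status.
run_then_stop() {
  status=0
  "$@" || status=$?
  sky stop openpipe
  return "$status"
}
```

For example, run_then_stop ssh openpipe nvidia-smi would run nvidia-smi on the cluster and then stop it.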