Skip to content

david-lindner/ai-safety-debate

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Training classifiers via Debate

Note: This code is a work in progress. It will change, and hopefully get better, during the next few months.

This repository provides code to reproduce the experiments from AI Safety via Debate (blogpost).

On top of that we run additional experiment on MNIST as well as FashionMNIST data and train classifiers from debate results.

Setup

Install the python dependencies by running the following in a python 3.6

pip install -r requirements.txt

Usage

All code is located in the ai-safety-debate folder.

  • To train a judge use train_judge.py
  • To run individual debates use run_debate.py
  • To evaluate the accuracy of a judge combined with debate use amplify_judge_with_debate.py
  • To use debate to train a classifier use train_classifier_via_debate.py

We use sacred for tracking experiments. The results are typically stored in the experiments and amplification_experiments folders. Scripts that use sacred have parameters specified in a config function. To specify values for these parameters, use the with statement, e.g.

python ai-safety-debate/run_debate.py with judge_path=ai-safety-debate/saved_models/mnist4

About

Implementing 'AI Safety via Debate' experiments

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •  

Languages