PonderBayes

Official repository for the paper "PonderBayes: Rationally Pondering Neural Networks". The paper has not been submitted for peer review, but it is available to read in this repository.

Requirements and Setup

Details such as the Python and package versions can be found in the generated pyproject.toml and poetry.lock files.

We recommend using an environment manager such as conda. After setting up your environment with the correct Python version, install the required packages as described below.

For poetry users, getting set up is as easy as running:

poetry install

We also provide a requirements.txt file for pip users who do not wish to use poetry. In this case, simply run:

pip install -r requirements.txt

This requirements.txt file is generated by running:

sh gen_pip_reqs.sh

Additional/Optional Requirements

Some additional requirements are necessary for running notebooks/accuracies.ipynb, which is the notebook we use for generating figures and tables:

- A complete TeXLive/MacTeX installation, since we use a TeX backend for our figure plots.
- Our pretrained and pretested checkpoints and logs, available on The Internet Archive at this link. Please unzip the checkpoints and place them as the models/ directory under the root of the project.
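
The unzip step can also be scripted. The following is a minimal sketch, not part of the repository: the function name extract_checkpoints and the archive name in the usage comment are hypothetical, and it assumes the downloaded file is a standard zip archive whose contents belong directly under models/.

```python
import zipfile
from pathlib import Path


def extract_checkpoints(archive_path, project_root="."):
    """Extract a downloaded checkpoint archive into <project_root>/models/.

    Hypothetical helper: the archive layout is an assumption, so check
    the result and move files if the archive has its own top-level folder.
    """
    models_dir = Path(project_root) / "models"
    # Create models/ if it does not exist yet; keep it if it does.
    models_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(Path(archive_path)) as zf:
        zf.extractall(models_dir)
    return models_dir


# Example usage from the project root (archive name is a placeholder):
# extract_checkpoints("checkpoints.zip")
```

If the archive already contains a top-level models/ folder, extracting into the project root instead would avoid a nested models/models/ directory.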

Project Organization

    ├── LICENSE
    ├── README.md              <- The top-level README for developers using this project.
    ├── data/
    │   ├── interim/           <- Intermediate data that has been transformed.
    │   ├── processed/         <- The final, canonical data sets for modeling.
    │   └── raw/               <- The original, immutable data dump.
    ├── models/                <- Trained and serialized models and logs
    ├── notebooks/             <- Jupyter notebooks.
    │   └── accuracies.ipynb   <- Notebook for generating figures and tables
    ├── reports/               <- Generated analysis as HTML, PDF, LaTeX, etc.
    ├── lisa/                  <- LISA (slurm compute) scripts, jobs, config.
    ├── pyproject.toml         <- project metadata, handled by poetry.
    ├── poetry.lock            <- resolving and locking dependencies, handled by poetry.
    ├── requirements.txt       <- for non-poetry users.
    ├── gen_pip_reqs.sh        <- for generating the pip requirements.txt file
    └── ponderbayes/           <- Source code for use in this project.
        ├── __init__.py        <- Makes ponderbayes a Python module
        ├── data/              <- Scripts to download or generate data
        ├── models/            <- Model definitions
        ├── run/               <- scripts to train, evaluate and use models
        ├── utils.py           <- miscellaneous utils
        └── visualization/     <- Scripts for visualization

The project structure is largely based on the cookiecutter data-science template. It is purposely opinionated so that paths align across collaborators without having to edit config files. Users may find the cookiecutter data-science opinions page relevant.

The top-level data/ and models/ directories are in version control only to show structure. Their contents are not committed and are ignored via .gitignore.

Usage

For training, refer to ponderbayes/run/train.py:

usage: train.py [-h] [-s SEED] [--disable-logging]
                [--model {pondernet,groupthink,RGT,lambdaGT,aRGT}]
                [-c CHECKPOINT] [--n-elems N_ELEMS] [--n-hidden N_HIDDEN]
                [--max-steps MAX_STEPS] [--lambda-p LAMBDA_P] [--beta BETA]
                [--progress-bar] [--n-train-samples N_TRAIN_SAMPLES]
                [--n-eval-samples N_EVAL_SAMPLES] [--mode MODE]
                [--batch-size BATCH_SIZE] [--num-workers NUM_WORKERS]
                [--early_stopping] [--val-check-interval VAL_CHECK_INTERVAL]
                [--n-iter N_ITER] [--ensemble-size ENSEMBLE_SIZE]

Train a model

optional arguments:
  -h, --help            show this help message and exit
  -s SEED, --seed SEED  The seed to use for random number generation
  --disable-logging     Disable logging
  --model {pondernet,groupthink,RGT,lambdaGT,aRGT}
                        What model variant to use
  -c CHECKPOINT, --checkpoint CHECKPOINT
                        path to a checkpoint from which to resume training
                        from
  --n-elems N_ELEMS     Number of elements in the parity vectors
  --n-hidden N_HIDDEN   Number of hidden elements in the reccurent cell
  --max-steps MAX_STEPS
                        Maximum number of pondering steps
  --lambda-p LAMBDA_P   Geometric prior distribution hyperparameter
  --beta BETA           Regularization loss coefficient
  --progress-bar        whether to show the progress bar
  --n-train-samples N_TRAIN_SAMPLES
                        The number of training samples to comprising the
                        dataset
  --n-eval-samples N_EVAL_SAMPLES
                        The number of training samples to comprising the
                        dataset
  --mode MODE           Whether to perform 'interpolation' or 'extrapolation'
  --batch-size BATCH_SIZE
                        Batch size
  --num-workers NUM_WORKERS
                        The number of workers
  --early_stopping      Whether to use early stopping
  --val-check-interval VAL_CHECK_INTERVAL
                        Evaluate every x amount of steps, as opposed to every
                        epoch
  --n-iter N_ITER       Number of training steps to use
  --ensemble-size ENSEMBLE_SIZE
                        Number of models to ensemble
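
As an illustration of how these options combine, here is a minimal sketch that assembles a training command as an argument list. The helper build_train_command is hypothetical and not part of the repository. The flag names and model choices are copied from the help text above; the values in the usage comment are arbitrary examples, and invoking the script by its path from the project root is an assumption.

```python
import sys

# Model names accepted by --model, copied from the help text above.
MODEL_CHOICES = ("pondernet", "groupthink", "RGT", "lambdaGT", "aRGT")


def build_train_command(model, seed, flags=None):
    """Return the argv list for one training run (hypothetical helper).

    `flags` maps literal flag strings to values. A value of True emits a
    bare switch (e.g. --progress-bar); False or None omits the flag.
    """
    if model not in MODEL_CHOICES:
        raise ValueError(f"unknown model {model!r}; expected one of {MODEL_CHOICES}")
    # Assumption: the script is run by path from the project root.
    cmd = [sys.executable, "ponderbayes/run/train.py",
           "--model", model, "--seed", str(seed)]
    for flag, value in (flags or {}).items():
        if value is True:
            cmd.append(flag)
        elif value is not False and value is not None:
            cmd.extend([flag, str(value)])
    return cmd


# Example usage (values are illustrative, not recommended settings):
# import subprocess
# cmd = build_train_command("pondernet", 42, {"--n-elems": 16, "--progress-bar": True})
# subprocess.run(cmd, check=True)
```

Passing flags as literal strings keeps the sketch faithful to the help text, which mixes hyphenated options with the underscore form --early_stopping.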

For testing, refer to ponderbayes/run/test.py:

usage: test.py [-h] [-s SEED] -c CHECKPOINT [--progress-bar]
               [--n-test-samples N_TEST_SAMPLES] [--batch-size BATCH_SIZE]
               [--num-workers NUM_WORKERS]

Test a pondernet checkpoint

optional arguments:
  -h, --help            show this help message and exit
  -s SEED, --seed SEED  The seed to use for random number generation
  -c CHECKPOINT, --checkpoint CHECKPOINT
                        path (relative to root) to a checkpoint to evaluate
  --progress-bar        whether to show the progress bar
  --n-test-samples N_TEST_SAMPLES
                        The number of testing samples to comprise the dataset
  --batch-size BATCH_SIZE
                        Batch size
  --num-workers NUM_WORKERS
                        The number of workers
