Memorization with neural nets: going beyond the worst case

This repository contains code to reproduce the results from the paper "Memorization with neural nets: going beyond the worst case".

Repository structure

These are the core files of the repository:

data.py generates datasets and writes them to datasets/.
run.py runs experiments defined in experiments/ and writes results to results/.
plot.py generates plots and writes them to plots/.
algorithm.py contains the implementation of the algorithm.

The rest are auxiliary files:

requirements.txt contains a list of required packages.
utils.py contains utility functions.
sge.sh is a helper script for executing experiments on clusters running SGE.

All Python code is formatted with black in its default configuration.

Reproducing results

The code requires Python 3.8+ and the packages listed in requirements.txt. To reproduce the results from the paper, follow these steps:

Clone the repository, set up a virtual environment, and install the dependencies.

git clone https://github.com/patrickfinke/memo.git
cd memo

python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements.txt

Generate the datasets. This only needs to be done once.

python data.py

Run the experiments using run.py. Specify the experiment with the --name argument; a valid name is any filename in experiments/ without the .json extension. Select where the computation runs with --backend: joblib (the default) runs locally, and sge submits to a compute cluster running Sun Grid Engine (SGE). The file sge.sh might need some adjustments before using the sge backend.

The --tasks argument expects an integer specifying the number of threads (for joblib) or the number of tasks in an array job (for sge). The default is 1; setting it to -1 uses all available resources. For SGE, arguments after -- are passed to the qsub command. For example:

python run.py --name moons
python run.py --name mnist --backend sge --tasks 20 -- -q all.q

Alternatively, unzip the precomputed results:

unzip results.zip

Generate the plots using plot.py. Specify the name with the --name argument. Plots can be found in a subfolder of plots/. For example:

python plot.py --name moons
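
For readers who want to script around these commands, the following is a minimal sketch of how a command line like the one described for run.py could be parsed. It is not the code in run.py; the function name parse_cli and the way the arguments after -- are split off are assumptions made for illustration. Only the flags --name, --backend, --tasks and their documented defaults come from this README.

```python
import argparse


def parse_cli(argv):
    """Parse run.py-style arguments (hypothetical helper, not taken from run.py)."""
    # Everything after a literal "--" is kept aside for qsub (assumed handling).
    if "--" in argv:
        split = argv.index("--")
        argv, passthrough = argv[:split], argv[split + 1:]
    else:
        passthrough = []

    parser = argparse.ArgumentParser(prog="run.py")
    parser.add_argument("--name", required=True,
                        help="filename in experiments/ without the .json extension")
    parser.add_argument("--backend", choices=["joblib", "sge"], default="joblib")
    parser.add_argument("--tasks", type=int, default=1,
                        help="threads (joblib) or array-job tasks (sge); -1 uses everything")
    return parser.parse_args(argv), passthrough


args, qsub_args = parse_cli(
    ["--name", "mnist", "--backend", "sge", "--tasks", "20", "--", "-q", "all.q"]
)
print(args.name, args.backend, args.tasks, qsub_args)
```

Splitting at -- before calling argparse keeps the qsub options out of the parser entirely, so they never need to be declared.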

Experiment configurations

Experiments are configured via JSON files inside experiments/. These contain a mapping with the following keys and values:

trial maps to the filename (without the .py extension) of a Python script that defines a function called trial. This function is called once for each set of parameters in the parameter grid to produce the results of the experiment.
param_grid contains a mapping or list that is compatible with the ParameterGrid class from scikit-learn. See the existing files for examples.
plots contains a list of mappings that each configures a plot. See the docstrings in plot.py for an explanation and the existing files for examples.
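
To make the three keys concrete, here is a minimal sketch of a configuration and of how a parameter grid expands into individual trial calls. Everything specific in it is an assumption for illustration: the script name my_trial, the parameters n_samples and seed, the empty plots list, and the idea that trial receives the parameters as keyword arguments. The expansion is written with itertools so the sketch runs without scikit-learn; it mimics what ParameterGrid yields for a mapping or a list of mappings, but it is not the repository's code.

```python
import itertools
import json

# Hypothetical configuration; names and values are illustrative only.
CONFIG_JSON = """
{
  "trial": "my_trial",
  "param_grid": {"n_samples": [100, 200], "seed": [0, 1, 2]},
  "plots": []
}
"""


def expand_grid(param_grid):
    """Yield one dict per parameter combination (stand-in for ParameterGrid)."""
    # A single mapping is treated as a list containing one grid.
    grids = [param_grid] if isinstance(param_grid, dict) else param_grid
    for grid in grids:
        keys = sorted(grid)
        for values in itertools.product(*(grid[key] for key in keys)):
            yield dict(zip(keys, values))


def trial(n_samples, seed):
    """Stand-in for the function a trial script would define (assumed signature)."""
    return {"n_samples": n_samples, "seed": seed}


config = json.loads(CONFIG_JSON)
param_sets = list(expand_grid(config["param_grid"]))
results = [trial(**params) for params in param_sets]
print(len(results))  # 6 combinations: 2 values of n_samples times 3 seeds
```

A list of mappings is expanded grid by grid, so parameters that only make sense together can be kept in separate grids instead of one large product.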
