Skip to content

Commit

Permalink
Adding details to README and Sudoku notebook
Browse files Browse the repository at this point in the history
  • Loading branch information
priyald17 committed May 26, 2019
1 parent 0819ab3 commit 419410a
Show file tree
Hide file tree
Showing 2 changed files with 148 additions and 32 deletions.
35 changes: 26 additions & 9 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,16 +11,33 @@

*Bridging deep learning and logical reasoning using a differentiable satisfiability solver.*

This repository contains the source code to reproduce the experiments in the ICML 2019 paper [SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver](https://icml.cc/Conferences/2019/Schedule?showEvent=3947) by [Po-Wei Wang](https://powei.tw/), [Priya L. Donti](https://priyadonti.com/), [Bryan Wilder](http://teamcore.usc.edu/people/bryanwilder/default.htm), and [J. Zico Kolter](http://zicokolter.com/).

## What is a SATnet

## What is SATNet

SATNet is a differentiable (smoothed) maximum satisfiability (MAXSAT) solver that can be integrated into the loop of larger deep learning systems. This (approximate) solver is based upon a fast coordinate descent approach to solving the semidefinite program (SDP) associated with the MAXSAT problem.

#### How SATNet works

A SATNet layer takes as input the discrete or probabilistic assignments of known MAXSAT variables, and outputs guesses for the assignments of unknown variables via a MAXSAT SDP relaxation with weights *S*. A schematic depicting the forward pass of this layer is shown below. To obtain the backward pass, we analytically differentiate through the SDP relaxation (see the paper for more details).

![Forward pass](images/forward_pass.png)

![MNIST Sudoku](images/mnist_sudoku.png)
#### Overview of experiments

We show that by integrating SATNet into end-to-end learning systems, we can learn the logical structure of challenging problems in a minimally supervised fashion. In particular, we show that we can:
* Learn the **parity function** using single-bit supervision (a traditionally hard task for deep networks)
* Learn how to play **9×9 Sudoku (original and permuted)** solely from examples.
* Solve a **"visual Sudoku"** problem that maps images of Sudoku puzzles to their associated logical solutions. (A sample "visual Sudoku" input is shown below.)

<div style="text-align:center"><img src="images/mnist_sudoku.png" /></div>
<!-- ![MNIST Sudoku](images/mnist_sudoku.png) -->


## Installation

### pip
### Via pip
```bash
pip install satnet
```
Expand All @@ -38,14 +55,14 @@ conda install -c pytorch tqdm
```


### Docker Image
### Via Docker image
```bash
cd docker
sh ./build.sh
sh ./run.sh
```

## Experiments:
## Running experiments
### Jupyter Notebook and Google Colab
[Jupyter notebook](https://github.com/locuslab/SATNet/blob/master/notebooks/Learning%20and%20Solving%20Sudoku%20via%20SATNet.ipynb)
and [Google Colab]()
Expand All @@ -59,15 +76,15 @@ The [Sudoku dataset](https://powei.tw/sudoku.zip) and [Parity dataset](https://p
wget -cq powei.tw/sudoku.zip && unzip -qq sudoku.zip
wget -cq powei.tw/parity.zip && unzip -qq parity.zip
```
#### Sudoku Experiments
#### Sudoku experiments (original, permuted, and visual)
```bash
python exps/sudoku.py
python exps/sudoku.py --perm
python exps/sudoku.py --mnist --batchSz=50
```

#### Parity Experiments
#### Parity experiments
```bash
pyhton exps/parity.py --seq=20
pyhton exps/parity.py --seq=40
python exps/parity.py --seq=20
python exps/parity.py --seq=40
```
145 changes: 122 additions & 23 deletions notebooks/Learning and Solving Sudoku via SATNet.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,9 @@
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import os\n",
Expand Down Expand Up @@ -32,10 +34,25 @@
"# Introduction to SATNet"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"SATNet is a differentiable (smoothed) maximum satisfiability (MAXSAT) solver that can be integrated into the loop of larger deep learning systems. Our (approximate) solver is based upon a fast coordinate descent approach to solving the semidefinite program (SDP) associated with the MAXSAT problem.\n",
"\n",
"The code below reproduces the Sudoku experiments from our paper \"SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver.\" These experiments show that by integrating the SATNet solver into end-to-end learning systems, we can learn the logical structure of challenging problems in a minimally supervised fashion. In particular, this notebook shows how we can learn to:\n",
"* Play **9×9 Sudoku (original and permuted)** solely from examples.\n",
"* Solve a **\"visual Sudoku\"** problem that maps images of Sudoku puzzles to their associated logical solutions. \n",
"\n",
"For more details and discussion about these experiments, please see the [SATNet paper](https://icml.cc/Conferences/2019/Schedule?showEvent=3947)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
Expand Down Expand Up @@ -91,10 +108,26 @@
"print('SATNet document\\n', satnet.SATNet.__doc__)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Building SATNet-based Models"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To solve **Sudoku** and a **permuted version of Sudoku**: We construct a SATNet-based SudokuSolver layer that takes as input a logical (bit) representation of the initial Sudoku board along with a mask representing which bits must be learned (i.e. all bits in empty Sudoku cells). This input is vectorized. Given this input, the SudokuSolver layer then outputs a bit representation of the Sudoku board with guesses for the unknown bits."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"class SudokuSolver(nn.Module):\n",
Expand All @@ -109,10 +142,19 @@
" return out"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To solve **\"visual Sudoku\"**: We construct a (standard) convolutional neural network for MNIST digit recognition and train it end-to-end with our SudokuSolver layer. This architecture takes in an image representation of a Sudoku board constructed with MNIST digits. Each MNIST digit is classified by the convolutional network, and the resulting (estimated) logical representation of the initial Sudoku board is then fed as input to the SudokuSolver layer. (As described earlier, the SudokuSolver layer then outputs a bit representation of the Sudoku board with guesses for the unknown bits.)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"class DigitConv(nn.Module):\n",
Expand Down Expand Up @@ -155,10 +197,19 @@
" return solution"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The experimental parameters we use in the paper are below."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from exps.sudoku import train, test, FigLogger, find_unperm\n",
Expand All @@ -181,10 +232,24 @@
"# The Sudoku Datasets"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We use and/or create the following datasets:\n",
"* **Sudoku:** We generate 10K 9x9 Sudoku boards (9K test/1K train) using code available [here](https://github.com/Kyubyong/sudoku) and represent them via bit (one-hot) representations.\n",
"* **Permuted Sudoku:** We apply a fixed permutation to the 10K Sudoku board bit representations generated for the Sudoku experiment.\n",
"* **Visual Sudoku:** We construct versions of the 10K Sudoku boards generated for the Sudoku experiment in which each board cell is represented by a (randomly-selected) MNIST digit. (MNIST digits are also split into train/test sets, with train and test MNIST digits applied only to train and test Sudoku boards, respectively.)\n",
"\n",
"The code below reads and processes these datasets for use with the architectures constructed above. A sample Sudoku board, its associated bit representation, and its associated MNIST representation are displayed below."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def process_inputs(X, Ximg, Y, boardSz):\n",
Expand Down Expand Up @@ -225,7 +290,9 @@
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
Expand Down Expand Up @@ -1156,13 +1223,24 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# The Boolean Sudoku Experiment"
"# The 9x9 Sudoku Experiment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The results for our 9x9 Sudoku experiment are below. In this experiment, we:\n",
"* **Input** a logical (bit) representation of the initial (unsolved) Sudoku board along with a mask representing which bits must be learned (i.e. all bits in empty Sudoku cells). This input is vectorized, which means that our SATNet model cannot exploit the locality structure of the input Sudoku grid when learning to solve puzzles.\n",
"* **Output** a bit representation of the Sudoku board with guesses for the unknown bits."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
Expand Down Expand Up @@ -1972,7 +2050,9 @@
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
Expand Down Expand Up @@ -3600,13 +3680,24 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# The Permutated Sudoku Experiment"
"# The Permuted 9x9 Sudoku Experiment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The results for our permuted 9x9 Sudoku experiment are below. In this experiment, we:\n",
"* **Input** the same inputs as in the original 9x9 Sudoku experiment, but with a fixed permutation applied.\n",
"* **Output** a bit representation of the permuted Sudoku board with guesses for the unknown bits."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
Expand Down Expand Up @@ -4416,7 +4507,9 @@
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
Expand Down Expand Up @@ -6045,13 +6138,24 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# The End-to-End MNIST Sudoku Experiment"
"# The End-to-End MNIST Sudoku (\"Visual Sudoku\") Experiment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The results for our permuted 9x9 Sudoku experiment are below. In this experiment, we:\n",
"* **Input** an image representation of the initial (unsolved) Sudoku board.\n",
"* **Output** a bit representation of the Sudoku board with guesses for the unknown bits."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
Expand Down Expand Up @@ -6864,7 +6968,9 @@
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
Expand Down Expand Up @@ -8487,13 +8593,6 @@
" train(args.boardSz, epoch, mnist_sudoku, optimizer, train_logger, mnist_train, args.mnistBatchSz)\n",
" test(args.boardSz, epoch, mnist_sudoku, optimizer, test_logger, mnist_test, args.mnistBatchSz)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
Expand Down

0 comments on commit 419410a

Please sign in to comment.