Adding details to README and Sudoku notebook

locuslab · May 26, 2019 · 419410a · 419410a
1 parent 0819ab3
commit 419410a
Show file tree

Hide file tree

Showing 2 changed files with 148 additions and 32 deletions.
diff --git a/README.md b/README.md
@@ -11,16 +11,33 @@
 
 *Bridging deep learning and logical reasoning using a differentiable satisfiability solver.*
 
+This repository contains the source code to reproduce the experiments in the ICML 2019 paper [SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver](https://icml.cc/Conferences/2019/Schedule?showEvent=3947) by [Po-Wei Wang](https://powei.tw/), [Priya L. Donti](https://priyadonti.com/), [Bryan Wilder](http://teamcore.usc.edu/people/bryanwilder/default.htm), and [J. Zico Kolter](http://zicokolter.com/).
 
-## What is a SATnet
+
+## What is SATNet
+
+SATNet is a differentiable (smoothed) maximum satisfiability (MAXSAT) solver that can be integrated into the loop of larger deep learning systems. This (approximate) solver is based upon a fast coordinate descent approach to solving the semidefinite program (SDP) associated with the MAXSAT problem.
+
+#### How SATNet works
+
+A SATNet layer takes as input the discrete or probabilistic assignments of known MAXSAT variables, and outputs guesses for the assignments of unknown variables via a MAXSAT SDP relaxation with weights *S*. A schematic depicting the forward pass of this layer is shown below. To obtain the backward pass, we analytically differentiate through the SDP relaxation (see the paper for more details).
 
 ![Forward pass](images/forward_pass.png)
 
-![MNIST Sudoku](images/mnist_sudoku.png)
+#### Overview of experiments
+
+We show that by integrating SATNet into end-to-end learning systems, we can learn the logical structure of challenging problems in a minimally supervised fashion. In particular, we show that we can:
+* Learn the **parity function** using single-bit supervision (a traditionally hard task for deep networks) 
+* Learn how to play **9×9 Sudoku (original and permuted)** solely from examples. 
+* Solve a **"visual Sudoku"** problem that maps images of Sudoku puzzles to their associated logical solutions. (A sample "visual Sudoku" input is shown below.)
+
+<div style="text-align:center"><img src="images/mnist_sudoku.png" /></div>
+<!-- ![MNIST Sudoku](images/mnist_sudoku.png) -->
+
 
 ## Installation
 
-### pip
+### Via pip
 ```bash
 pip install satnet
 ```
@@ -38,14 +55,14 @@ conda install -c pytorch tqdm
 ```
 
 
-### Docker Image
+### Via Docker image
 ```bash
 cd docker
 sh ./build.sh
 sh ./run.sh
 ```
 
-## Experiments:
+## Running experiments
 ### Jupyter Notebook and Google Colab
 [Jupyter notebook](https://github.com/locuslab/SATNet/blob/master/notebooks/Learning%20and%20Solving%20Sudoku%20via%20SATNet.ipynb)
 and [Google Colab]()
@@ -59,15 +76,15 @@ The [Sudoku dataset](https://powei.tw/sudoku.zip) and [Parity dataset](https://p
 wget -cq powei.tw/sudoku.zip && unzip -qq sudoku.zip
 wget -cq powei.tw/parity.zip && unzip -qq parity.zip
 ```
-#### Sudoku Experiments
+#### Sudoku experiments (original, permuted, and visual)
 ```bash
 python exps/sudoku.py
 python exps/sudoku.py --perm
 python exps/sudoku.py --mnist --batchSz=50
 ```
 
-#### Parity Experiments
+#### Parity experiments
 ```bash
-pyhton exps/parity.py --seq=20
-pyhton exps/parity.py --seq=40
+python exps/parity.py --seq=20
+python exps/parity.py --seq=40
 ```
diff --git a/notebooks/Learning and Solving Sudoku via SATNet.ipynb b/notebooks/Learning and Solving Sudoku via SATNet.ipynb
@@ -3,7 +3,9 @@
   {
    "cell_type": "code",
    "execution_count": 1,
-   "metadata": {},
+   "metadata": {
+    "collapsed": true
+   },
    "outputs": [],
    "source": [
     "import os\n",
@@ -32,10 +34,25 @@
     "# Introduction to SATNet"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "SATNet is a differentiable (smoothed) maximum satisfiability (MAXSAT) solver that can be integrated into the loop of larger deep learning systems.  Our (approximate) solver is based upon a fast coordinate descent approach to solving the semidefinite program (SDP) associated with the MAXSAT problem.\n",
+    "\n",
+    "The code below reproduces the Sudoku experiments from our paper \"SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver.\" These experiments show that by integrating the SATNet solver into end-to-end learning systems, we can learn the logical structure of challenging problems in a minimally supervised fashion. In particular, this notebook shows how we can learn to:\n",
+    "* Play **9×9 Sudoku (original and permuted)** solely from examples.\n",
+    "* Solve a **\"visual Sudoku\"** problem that maps images of Sudoku puzzles to their associated logical solutions. \n",
+    "\n",
+    "For more details and discussion about these experiments, please see the [SATNet paper](https://icml.cc/Conferences/2019/Schedule?showEvent=3947)."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 2,
-   "metadata": {},
+   "metadata": {
+    "collapsed": false
+   },
    "outputs": [
     {
      "name": "stdout",
@@ -91,10 +108,26 @@
     "print('SATNet document\\n', satnet.SATNet.__doc__)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Building SATNet-based Models"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To solve **Sudoku** and a **permuted version of Sudoku**: We construct a SATNet-based SudokuSolver layer that takes as input a logical (bit) representation of the initial Sudoku board along with a mask representing which bits must be learned (i.e. all bits in empty Sudoku cells). This input is vectorized. Given this input, the SudokuSolver layer then outputs a bit representation of the Sudoku board with guesses for the unknown bits."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 3,
-   "metadata": {},
+   "metadata": {
+    "collapsed": true
+   },
    "outputs": [],
    "source": [
     "class SudokuSolver(nn.Module):\n",
@@ -109,10 +142,19 @@
     "        return out"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To solve **\"visual Sudoku\"**: We construct a (standard) convolutional neural network for MNIST digit recognition and train it end-to-end with our SudokuSolver layer. This architecture takes in an image representation of a Sudoku board constructed with MNIST digits. Each MNIST digit is classified by the convolutional network, and the resulting (estimated) logical representation of the initial Sudoku board is then fed as input to the SudokuSolver layer. (As described earlier, the SudokuSolver layer then outputs a bit representation of the Sudoku board with guesses for the unknown bits.)"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 4,
-   "metadata": {},
+   "metadata": {
+    "collapsed": true
+   },
    "outputs": [],
    "source": [
     "class DigitConv(nn.Module):\n",
@@ -155,10 +197,19 @@
     "        return solution"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The experimental parameters we use in the paper are below."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 5,
-   "metadata": {},
+   "metadata": {
+    "collapsed": true
+   },
    "outputs": [],
    "source": [
     "from exps.sudoku import train, test, FigLogger, find_unperm\n",
@@ -181,10 +232,24 @@
     "# The Sudoku Datasets"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We use and/or create the following datasets:\n",
+    "* **Sudoku:** We generate 10K 9x9 Sudoku boards (9K test/1K train) using code available [here](https://github.com/Kyubyong/sudoku) and represent them via bit (one-hot) representations.\n",
+    "* **Permuted Sudoku:** We apply a fixed permutation to the 10K Sudoku board bit representations generated for the Sudoku experiment.\n",
+    "* **Visual Sudoku:** We construct versions of the 10K Sudoku boards generated for the Sudoku experiment in which each board cell is represented by a (randomly-selected) MNIST digit. (MNIST digits are also split into train/test sets, with train and test MNIST digits applied only to train and test Sudoku boards, respectively.)\n",
+    "\n",
+    "The code below reads and processes these datasets for use with the architectures constructed above. A sample Sudoku board, its associated bit representation, and its associated MNIST representation are displayed below."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 6,
-   "metadata": {},
+   "metadata": {
+    "collapsed": true
+   },
    "outputs": [],
    "source": [
     "def process_inputs(X, Ximg, Y, boardSz):\n",
@@ -225,7 +290,9 @@
   {
    "cell_type": "code",
    "execution_count": 7,
-   "metadata": {},
+   "metadata": {
+    "collapsed": false
+   },
    "outputs": [
     {
      "data": {
@@ -1156,13 +1223,24 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# The Boolean Sudoku Experiment"
+    "# The 9x9 Sudoku Experiment"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The results for our 9x9 Sudoku experiment are below. In this experiment, we:\n",
+    "* **Input** a logical (bit) representation of the initial (unsolved) Sudoku board along with a mask representing which bits must be learned (i.e. all bits in empty Sudoku cells). This input is vectorized, which means that our SATNet model cannot exploit the locality structure of the input Sudoku grid when learning to solve puzzles.\n",
+    "* **Output** a bit representation of the Sudoku board with guesses for the unknown bits."
    ]
   },
   {
    "cell_type": "code",
    "execution_count": 8,
-   "metadata": {},
+   "metadata": {
+    "collapsed": false
+   },
    "outputs": [
     {
      "data": {
@@ -1972,7 +2050,9 @@
   {
    "cell_type": "code",
    "execution_count": 9,
-   "metadata": {},
+   "metadata": {
+    "collapsed": false
+   },
    "outputs": [
     {
      "name": "stderr",
@@ -3600,13 +3680,24 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# The Permutated Sudoku Experiment"
+    "# The Permuted 9x9 Sudoku Experiment"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The results for our permuted 9x9 Sudoku experiment are below. In this experiment, we:\n",
+    "* **Input** the same inputs as in the original 9x9 Sudoku experiment, but with a fixed permutation applied.\n",
+    "* **Output** a bit representation of the permuted Sudoku board with guesses for the unknown bits."
    ]
   },
   {
    "cell_type": "code",
    "execution_count": 10,
-   "metadata": {},
+   "metadata": {
+    "collapsed": false
+   },
    "outputs": [
     {
      "data": {
@@ -4416,7 +4507,9 @@
   {
    "cell_type": "code",
    "execution_count": 11,
-   "metadata": {},
+   "metadata": {
+    "collapsed": false
+   },
    "outputs": [
     {
      "name": "stderr",
@@ -6045,13 +6138,24 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# The End-to-End MNIST Sudoku Experiment"
+    "# The End-to-End MNIST Sudoku (\"Visual Sudoku\") Experiment"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The results for our permuted 9x9 Sudoku experiment are below. In this experiment, we:\n",
+    "* **Input** an image representation of the initial (unsolved) Sudoku board.\n",
+    "* **Output** a bit representation of the Sudoku board with guesses for the unknown bits."
    ]
   },
   {
    "cell_type": "code",
    "execution_count": 12,
-   "metadata": {},
+   "metadata": {
+    "collapsed": false
+   },
    "outputs": [
     {
      "data": {
@@ -6864,7 +6968,9 @@
   {
    "cell_type": "code",
    "execution_count": 13,
-   "metadata": {},
+   "metadata": {
+    "collapsed": false
+   },
    "outputs": [
     {
      "name": "stderr",
@@ -8487,13 +8593,6 @@
     "    train(args.boardSz, epoch, mnist_sudoku, optimizer, train_logger, mnist_train, args.mnistBatchSz)\n",
     "    test(args.boardSz, epoch, mnist_sudoku, optimizer, test_logger, mnist_test, args.mnistBatchSz)"
    ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
   }
  ],
  "metadata": {