Abstractions #38
Merged
Changes from 106 of 117 commits
670683f billbrod: starts mad_competition
f5cb748 billbrod: Merge branch 'master' of github.com:LabForComputationalVision/plenopt…
c25bf1c billbrod: adds load_images function
4953e10 billbrod: adds folder necessary for load_images tests
e899974 billbrod: removes accidental print statement
236dbed billbrod: starts abstraction for synthesis methods
5145d18 billbrod: adds scikit image dependency
282c5a4 billbrod: Merge branch 'master' of github.com:LabForComputationalVision/plenopt…
d35d316 billbrod: adds MSE class, corrects linewrap for MAD
2e3741b billbrod: Merge branch 'MAD' of github.com:LabForComputationalVision/plenoptic …
3daa103 billbrod: Adds initial MAD competition implementatin
53cfc1d billbrod: adds comment explaining the proposed_img
997ec90 billbrod: removes manual line search for finding nu
541366b billbrod: adds ability to set fix_step_n_iter
33f85d8 billbrod: bugfix: remove the third arg to objective_function
5f04dcc billbrod: adds _init_dict() helper function
e9b0bc9 billbrod: all targets supported
e74a045 billbrod: moves normalized_mse
9930280 billbrod: moves plotting and animate to Synthesis
c268f27 billbrod: plotting and animating now work with mad!
8a513fd billbrod: adds norm_loss option
9c4f13c billbrod: dispaly.plot_representation now uses vrange='indep0' for images
27595ab billbrod: fixes linewrap in mssim
d28df98 billbrod: mad competition now supports metrics as well as models
18b282e billbrod: removes unnecessary print statement
d48ec6b billbrod: bugfixes
eee23aa billbrod: moves shared initialization code to Synthesis.py
35bf51c billbrod: moves metric support to Synthesis.py
0a2f19e billbrod: breaks synthesize into functions and moves some to Synthesis
3f2294f billbrod: updates tests for checking whether matched_rep is a parameter
3def646 billbrod: bugfix: typo
3abdc15 billbrod: bugfix: mad_competition was messing up saved_representation
525e020 billbrod: adds coarse-to-fine to Synthesis, standardizes synthesize() call
4196c37 billbrod: fixes synthesize_all
0e760d5 billbrod: changes device and dtype to DEVICE and DTYPE
5ad962a billbrod: send metric dummy-model to target_image device
241f743 billbrod: scope doesn't need to be modules for this
e4b4af4 billbrod: corrects load so it will work with MADCompetition
70e7ac1 billbrod: bugfix: load needs to return the object
5f9b4ee billbrod: adds tests file
cf6c340 billbrod: fixes broken tests
d8c97a7 billbrod: Merge branch 'master' into abstractions
27264e8 billbrod: adds new capabilities to mad_competition, updates docstrings, tests
510750c billbrod: test_resume_exceptions bugfix
ada09b7 billbrod: more bugfixes for test_mad
d577805 billbrod: bugfix for synthesize_all
1a82a13 billbrod: changes synthesize default args
28f82a9 billbrod: adds metric/classes/NLP
b88c48a billbrod: adds if_existing arg to MADCompetition.synthesize_all()
9be2950 billbrod: starts MADCompetition example notebook
966ae37 billbrod: some quality of life changes
6dfa24e billbrod: adds Metamer tutorial
d29c29b billbrod: adds simple example
0c69693 billbrod: fixes FrontEnd issue
764f7bb billbrod: removes unnecessary input, adds extra documentation
3999599 billbrod: adds link to MAD Competition notebook
18219cf billbrod: adds more info on MAD-specific args/attributes
2cfca2d billbrod: starts Synthesis notebook
0246833 billbrod: updates metamer continue test for new way of continuing
0adc73a billbrod: adds tools/optim.py
00cf4e3 billbrod: overhauls loss_function
b00627b billbrod: bugfix: removes extra arg
a211da2 billbrod: corrects save() and load() for metrics and loss function
12b8049 billbrod: adds verbose flag
bf669a8 billbrod: adds small notes
052fe4c billbrod: adds pytest-timeout
9dca101 billbrod: updates pytest-timeout
0b04cbc billbrod: pytest no longer captures output
ab6692e billbrod: adds more print statements?
30dd960 billbrod: adds more text
c4e31b7 billbrod: removes plotting from the malfunctioning test
aef41d6 billbrod: fixes problem
29e85f7 billbrod: try another way to fix the problem
fe3bf00 billbrod: last fix worked, so removing print and -v -s
a5f2005 billbrod: adds notebook showing simple MAD example
8d65038 billbrod: adds choices for how to do coarse-to-fine optimization
321775d billbrod: adds store/save_progress attributes to save
8402a9b billbrod: moves generate_norm_stats and zscore_stats to po.optim
6eed195 billbrod: Merge branch 'abstractions' of github.com:LabForComputationalVision/p…
863c41c billbrod: bugfix: moves import statements
e7b5d40 billbrod: adds code to view external MAD results
fbaf91e billbrod: removes pandas import
1b354e7 billbrod: updates plot_MAD_results for compatability with new MATLAB code
663f455 billbrod: updates to SSIM
df92b37 billbrod: changes for SSIM and MAD
aa86d8b billbrod: Merge branch 'master' into abstractions
ca564a3 billbrod: bugfix: corrects names/locations of functions
9b7e2ea billbrod: renames min/max to best/worst
b5d56cb billbrod: corrects add_noise
d8ff0d0 billbrod: removes msssim, SSIM, MSSSIM, restructures ssim
13adf0f billbrod: adds SSIM tests
d3dd4e2 billbrod: removes TODO from metamer.py
d7f5933 billbrod: bugfix for mad competition saving
85f8c72 billbrod: corrects failing tests
64c294d billbrod: moves optimizer and scheduler to hidden attributes
58d4b00 billbrod: Merge branch 'master' into abstractions
77d2c72 billbrod: Merge branch 'master' of github.com:LabForComputationalVision/plenopt…
9196906 billbrod: removes normalized_mse
40bf306 billbrod: adds tests for new MAD saving changes
599314a billbrod: renames some Synthesis attributes
b7b7633 billbrod: moves some attributes to hidden
3fc54e6 billbrod: Synthesis no longer a torch.nn.Module
bf5f5e0 billbrod: saves coarse_to_fine
f6d06ae billbrod: starts cleaning up docstring
aaba29d billbrod: Big update on documentation, reverts MAD init noise
8eb2d3c lyndond: Update perceptual_distance.py
9ab9513 billbrod: Adds some additional intro to Simple_MAD
66b7414 billbrod: Reruns with better noise initialization
f41f728 billbrod: changes to docstrings that Lyndon pointed out
b45fbce billbrod: Merge branch 'abstractions' of github.com:LabForComputationalVision/p…
ff4c32a billbrod: Merge branch 'master' of github.com:LabForComputationalVision/plenopt…
f2ec3c1 billbrod: updates docs
3fe5f3d billbrod: adds info on adding new tutorial to docs
b6352b0 billbrod: Merge branch 'master' of github.com:LabForComputationalVision/plenopt…
bd6792c billbrod: cleans up tests
953fa10 billbrod: deletes to_delete/ files
c5698bf billbrod: renames notebooks, adds nblinks
billbrod File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@@ -1,5 +1,7 @@
 language: python
 dist: xenial
+services:
+- xvfb
 env:
 - TEST_SCRIPT=metamers
 - TEST_SCRIPT=models
@@ -0,0 +1,90 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Implementing New Synthesis Methods\n",
    "\n",
    "This notebook assumes you are familiar with the synthesis methods in this package and have interacted with them, and that you would like either to understand a little better what happens under the hood or to implement your own method using the existing API.\n",
    "\n",
    "`Synthesis` is an abstract class that should be inherited by any synthesis method (e.g., `Metamer`, `MADCompetition`). It provides many helper functions which, depending on your method, can be used directly or modified slightly. For the two extremes, see the source code for `Metamer` and `MADCompetition`: `Metamer` uses almost everything exactly as written, whereas `MADCompetition`, because it works with two models, requires extensive modification. Even when you're modifying the methods, however, you should try to:\n",
    "\n",
    "1. Maintain the names and, as much as possible, the call signatures. We want it to be easy to use the different synthesis methods, so the way users interact with them should be as similar as possible.\n",
    "   - The most common reason to modify a call signature is to add arguments. If an argument is a modification or tweak of an existing one, place it next to that existing argument. If it's completely novel (and important), place it near the beginning.\n",
    "   - For example, `MADCompetition` requires two models, instead of one, during initialization, and a new required argument, `synthesis_target`, for `synthesize()`. The standard initialization call signature is `(target_image, model, loss_function, **model_kwargs)`, so `MADCompetition`'s is `(target_image, model_1, model_2, loss_function, model_1_kwargs, model_2_kwargs)`. The new argument for `synthesize()` goes at the beginning.\n",
    "2. Reuse existing methods. The basic idea of many synthesis methods is similar: update the input image based on the gradient (or a function of the gradient) of the model. Therefore, the code for much of what you'll want to do already exists, and you will just need to, e.g., call it with a different argument, specify which model to use, or modify the gradient before updating the image.\n",
    "3. Make sure all the existing public-facing methods either work or raise a `NotImplementedError`. We want people either to be able to use the methods they know from other synthesis methods for better understanding synthesis (for example, plotting the synthesis status or creating an animation of progress), or to know why they cannot. For example, because `MADCompetition` has two models, we want to plot both losses in `plot_loss`. We can make use of the existing `Synthesis.plot_loss()` method, just modifying where it grabs the data from, and call it twice. To the user, there's no difference in how it creates the plot. However, there's no need to do this for the private methods (e.g., `_set_seed()`).\n",
    "4. Add any natural generalizations. `MADCompetition` stimuli come in sets of 4, so it makes sense to provide a function that generalizes `plot_synthesized_image()` to show all 4 of them: `plot_synthesized_image_all()`.\n",
    "\n",
    "## Structure\n",
    "\n",
    "Now, let's walk through the structure of a `Synthesis` class.\n",
    "\n",
    "The two most important functions are `__init__()`, which initializes the class, and `synthesize()`, which synthesizes an image.\n",
    "\n",
    "`Synthesis.__init__()` provides a lot of code that you can use (as well as a basis for the docstring), and should be called unless you have a *really* good reason not to. It automatically supports the use of models and metrics, allows modifying the loss function, and initializes many of the class's attributes. You may want to call it and then do additional setup, e.g., set up a second model or initialize new attributes.\n",
    "\n",
    "`Synthesis.synthesize()` cannot be called, but provides a skeleton of what `synthesize()` should look like (as well as a basis for the docstring). It shows how the various hidden helper methods are used to set up the synthesis call and core loop. You'll probably want to copy this into your new synthesis method's `synthesize()` and then modify it. You'll certainly need to change the initialization of the matched image, which varies from method to method (for instance, `Metamer` uses random noise or a new image, whereas `MADCompetition` uses the reference image plus some noise). You may otherwise be able to use the method as written, just modifying the helper functions.\n",
    "\n",
    "`Synthesis` also contains a variety of plotting and animating functions. You will probably need to think about what to plot, but should be able to adapt the existing display code to your needs:\n",
    "- `Synthesis.plot_representation_error()` calls `po.tools.display.plot_representation` on `Synthesis.representation_error()`, which takes the difference between `Synthesis.saved_representation` and `Synthesis.base_representation`.\n",
    "- `Synthesis.plot_loss()` plots `Synthesis.loss` as a function of iteration.\n",
    "- `Synthesis.plot_synthesized_image()` calls `pt.imshow` on `Synthesis.synthesized_signal`.\n",
    "- `Synthesis.plot_synthesis_status()` combines the three above plots into one figure.\n",
    "- `Synthesis.animate()` animates the above figure over iterations.\n",
    "\n",
    "## Important Attributes\n",
    "\n",
    "In order to mesh with `Synthesis`, you'll need to adopt its naming conventions for its attributes:\n",
    "- At initialization, you should take something like the following arguments, which get stored as attributes:\n",
    "  - `base_signal`: the signal your synthesis is based on.\n",
    "  - `model`: the model (`torch.nn.Module`) or metric (callable) your synthesis is based on.\n",
    "  - `loss_function`: the callable to use for computing distance; must return a scalar. Can be `None`, in which case we use the l2-norm of the difference in representation space.\n",
    "- The model's representation of `base_signal` should be `base_representation`.\n",
    "- During iterative synthesis:\n",
    "  - The synthesis-in-progress is `synthesized_signal` and the model's representation of it is `synthesized_representation`.\n",
    "  - Loss is `loss`, the norm of the gradient is `gradient`, and the learning rate is `learning_rate`.\n",
    "  - If the user wants to store progress, then `store_progress` is either a boolean or an integer specifying how often to update the following attributes, which store the corresponding other attributes:\n",
    "    - `saved_signal` contains `synthesized_signal`\n",
    "    - `saved_representation` contains `synthesized_representation`\n",
    "    - `saved_signal_gradient` contains `synthesized_signal.grad`\n",
    "    - `saved_representation_gradient` contains `synthesized_representation.grad`\n",
    "  - If you want to make use of coarse-to-fine optimization, `_init_ctf_and_randomizer` takes care of initializing the following attributes, which `_optimizer_step` and `_closure` use:\n",
    "    - `scales` is a copy of `model.scales` and is edited over the course of optimization to specify which scale we're working on at the moment\n",
    "    - `scales_loss`: scale-specific loss at each iteration (`loss` contains the loss computed with the whole model)\n",
    "    - `scales_timing`: dictionary containing the iterations at which we started and stopped synthesizing each scale\n",
    "    - `scales_finished`: list of scales we've finished optimizing\n",
    "  - For saving during synthesis (in case of failure or the like), `save_progress` acts like `store_progress`, and `save_path` specifies the path to the `.pt` file for saving.\n",
    "  - The other arguments to `synthesize()`, as documented there, are also set as attributes and used by `_optimizer_step` and `_closure`, but are not necessary for the other functionality.\n",
    "\n",
    "## Required methods\n",
    "\n",
    "The only methods you need to implement are `__init__()`, `save()`, and `load()`:\n",
    "- For `save()`, you just need to tell `super().save()` which attributes you wish to save. It's recommended you include the `save_model_reduced` argument as well (see the `Metamer` tutorial notebook for an explanation).\n",
    "- For `load()`, you need to tell `super().load()` the name of the attribute that contains the model (e.g., `model`)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:plenoptic]",
   "language": "python",
   "name": "conda-env-plenoptic-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
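The naming conventions and workflow the notebook above describes can be sketched in a minimal toy subclass. This is an illustrative sketch only: the stub `Synthesis` base class below is hypothetical and written so the example runs without torch or plenoptic (plain lists stand in for tensors, and a fixed relaxation step stands in for a real gradient step); the real base class provides much more (coarse-to-fine, saving, plotting, animation). Only the attribute names (`base_signal`, `base_representation`, `synthesized_signal`, `loss`, `store_progress`, `saved_signal`, ...) follow the notebook.

```python
# Hypothetical stub standing in for plenoptic's Synthesis base class,
# illustrating the attribute-naming conventions from the notebook.
class Synthesis:
    def __init__(self, base_signal, model, loss_function=None):
        self.base_signal = base_signal
        self.model = model
        if loss_function is None:
            # default: l2-norm of the difference in representation space
            loss_function = lambda x, y: sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5
        self.loss_function = loss_function
        self.base_representation = model(base_signal)
        self.loss = []
        self.saved_signal = []
        self.saved_representation = []

    def synthesize(self):
        # the real base-class synthesize() likewise cannot be called directly
        raise NotImplementedError("subclasses implement the synthesis loop")


class ToySynthesis(Synthesis):
    """Toy 'synthesis': relax the signal toward base_signal (no gradients)."""
    def synthesize(self, max_iter=10, store_progress=True):
        self.synthesized_signal = [0.0 for _ in self.base_signal]
        for _ in range(max_iter):
            # toy update standing in for a gradient step on the image
            self.synthesized_signal = [
                s + 0.5 * (b - s)
                for s, b in zip(self.synthesized_signal, self.base_signal)
            ]
            self.synthesized_representation = self.model(self.synthesized_signal)
            self.loss.append(
                self.loss_function(self.base_representation,
                                   self.synthesized_representation)
            )
            if store_progress:
                self.saved_signal.append(list(self.synthesized_signal))
                self.saved_representation.append(self.synthesized_representation)
        return self.synthesized_signal


identity = lambda x: list(x)  # trivial stand-in for a model
synth = ToySynthesis([1.0, 2.0], identity)
result = synth.synthesize(max_iter=8)
# loss decreases monotonically as the signal approaches base_signal
```

The point of the sketch is structural: `__init__` stores the shared attributes and default loss, `synthesize()` owns the method-specific initialization and loop, and `store_progress` populates the `saved_*` attributes that the plotting helpers would consume.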
@@ -1,2 +1,4 @@
-from .perceptual_distance import ssim, msssim, nlpd, nspd
+from .perceptual_distance import ssim, nlpd, nspd, ssim_map
 from .model_metric import model_metric
+from .naive import mse
+from .classes import NLP
@@ -0,0 +1,47 @@
import torch
from .perceptual_distance import normalized_laplacian_pyramid


class NLP(torch.nn.Module):
    r"""simple class for implementing the normalized Laplacian pyramid

    This class just calls
    ``plenoptic.metric.normalized_laplacian_pyramid`` on the image and
    returns a 3d tensor with the flattened activations.

    NOTE: synthesis using this class will not be exactly the same as
    synthesis using the ``plenoptic.metric.nlpd`` function (by default),
    because the synthesis methods use ``torch.norm(x - y, p=2)`` as the
    distance metric between representations, whereas ``nlpd`` uses the
    root-mean-square of the difference (i.e.,
    ``torch.sqrt(torch.mean((x - y) ** 2))``)

    """
    def __init__(self):
        super().__init__()

    def forward(self, image):
        """returns flattened NLP activations

        WARNING: For now this only supports images with batch and
        channel size 1

        Parameters
        ----------
        image : torch.tensor
            image to pass to normalized_laplacian_pyramid

        Returns
        -------
        representation : torch.tensor
            3d tensor with flattened NLP activations

        """
        if image.shape[0] > 1 or image.shape[1] > 1:
            raise Exception("For now, this only supports batch and channel size 1")
        activations = normalized_laplacian_pyramid(image)
        # activations is a list of tensors, each at a different scale
        # (down-sampled by factors of 2). To combine these into one
        # vector, we need to flatten each of them and then unsqueeze so
        # it is 3d
        return torch.cat([i.flatten() for i in activations]).unsqueeze(0).unsqueeze(0)
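The NOTE in the docstring above says the two distances differ: the l2-norm and the root-mean-square of the difference are the same up to a factor of sqrt(n), where n is the number of elements. A quick sketch in plain Python (plain lists standing in for tensors, so it runs without torch; the helper names here are just for illustration) makes that relationship concrete:

```python
import math

def l2_norm(x, y):
    # plain-list equivalent of torch.norm(x - y, p=2)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def rms(x, y):
    # plain-list equivalent of torch.sqrt(torch.mean((x - y) ** 2))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

x = [0.1, 0.4, 0.3, 0.9]
y = [0.2, 0.2, 0.5, 0.5]
# the two distances differ by a factor of sqrt(n); here n = 4
print(l2_norm(x, y) / rms(x, y))  # ≈ 2.0
```

Because the factor is constant for fixed-size inputs, the two losses share minima, but gradient magnitudes (and hence step sizes) differ, which is why synthesis results are not exactly the same.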
@@ -0,0 +1,33 @@
import torch


def mse(img1, img2):
    r"""return the MSE between img1 and img2

    Our baseline metric to compare two images is often mean-squared
    error, MSE. This is not a good approximation of the human visual
    system, but is handy to compare against.

    For two images, :math:`x` and :math:`y`, with :math:`n` pixels
    each:

    .. math::

        MSE = \frac{1}{n}\sum_{i=1}^n (x_i - y_i)^2

    The two images must have a float dtype.

    Parameters
    ----------
    img1 : torch.tensor
        The first image to compare
    img2 : torch.tensor
        The second image to compare, must be same size as ``img1``

    Returns
    -------
    mse : torch.float
        the mean-squared error between ``img1`` and ``img2``

    """
    return torch.pow(img1 - img2, 2).mean((-1, -2))
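A minimal check of the docstring formula, using plain Python lists standing in for tensors so it runs without torch (note that the torch version above averages only over the last two dimensions, i.e., per image in a batch, while this sketch collapses a single flat image to one number):

```python
def mse_list(img1, img2):
    # MSE = (1/n) * sum_i (x_i - y_i)^2, per the docstring formula
    n = len(img1)
    return sum((x - y) ** 2 for x, y in zip(img1, img2)) / n

a = [0.0, 2.0, 4.0]
b = [1.0, 2.0, 2.0]
print(mse_list(a, b))  # (1 + 0 + 4) / 3 ≈ 1.6667
```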
Should we have an executable example for this ipynb (even if really dumb/simple)?
Yeah, I couldn't come up with what an example would look like here. Any thoughts?
Otherwise, this should probably not be a notebook; make it an rst file in the docs folder instead.