
Commit

Created using Colaboratory
Abhijit Balaji committed Mar 5, 2020
1 parent efe39b9 commit c64ae64
Showing 1 changed file with 74 additions and 39 deletions.
113 changes: 74 additions & 39 deletions filter_visualization.ipynb
@@ -5,7 +5,7 @@
"colab": {
"name": "filter_visualization.ipynb",
"provenance": [],
"authorship_tag": "ABX9TyNY4+312+Q8lmQT4H8p12g2",
"authorship_tag": "ABX9TyPEFcDafzEsz6u6GaFSD9Ai",
"include_colab_link": true
},
"kernelspec": {
@@ -33,26 +33,33 @@
},
"source": [
"# Visualizing the Convolutional Neural Network's filters\n",
"A way to inspect the filters learned by convnets is to display the visual pattern that each filter is meant to respond to. This can be done with gradient ascent in input space : applying gradient descent to the value of the input image of a convnet so as to maximize the response of a specific filter, starting from a random input image. The resulting input image will be one that the chosen filter is maximally responsive to.\n",
"The Learnt CNN filters in the initial layers (I prefer to visualize a top down view of the network: input image on the top and output on the bottom in contrast to the widely used bottom up view of the network: input image in the bottom and output on the top) looks like [Gabor filters](https://en.wikipedia.org/wiki/Gabor_filter). This is to be expected as the initial layers are responsible for extracting primitive image features like edges, textures etc. As we go deeper into the network this feature extraction becomes more abstract and it tries to extract features like faces, body parts etc. So how do we visualize these filters? Directly visualizing them will not make much sense because, they are just a bunch of 3X3, 5x5 etc matrices. So how can we visualize these learnt filters?\n",
"\n",
"This process is simple: you’ll build a loss function that maximizes the value of a given filter in a given convolution layer, and then you’ll use stochastic gradient descent (it's ascent here) to adjust the values of the input image so as to maximize this activation value"
"One way to inspect these filters is to display the visual pattern that each filter is meant to respond to. This can be done with gradient **ascent** (yes I mean ascent! and not descent) in input space: applying gradient ascent to the value of the input image of a convnet so as to maximize the response of a specific filter, starting from a random input image. The resulting input image will be one that the chosen filter is maximally responsive to.\n",
"\n",
"This process is simple. We will build a loss function that maximizes the value of a given filter in a given convolution layer, and then we will use our standard trick **\"stochastic gradient descent\"** to adjust the values of the input image so as to maximize this activation value.\n",
"\n",
"The loss function we will use is just **the mean of that specific filter!**\n",
"\n",
"\n",
"Reference: https://blog.keras.io/how-convolutional-neural-networks-see-the-world.html"
]
},
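Below is a minimal, self-contained sketch of this idea in TensorFlow 2.x. It is not the notebook's exact code; the layer name, filter index, step count, and step size are illustrative choices.

```python
import numpy as np
import tensorflow as tf

# Load VGG16 as in the notebook; the layer and filter below are illustrative.
model = tf.keras.applications.vgg16.VGG16(include_top=True, weights="imagenet")
layer_name = "block3_conv1"   # any convolutional layer name works
filter_index = 0              # which filter in that layer to visualize

# Sub-model that maps the input image to the chosen layer's activations.
feature_extractor = tf.keras.Model(inputs=model.input,
                                   outputs=model.get_layer(layer_name).output)

# Start from a noisy mid-gray image and climb the gradient of the filter's mean activation.
img = tf.Variable(np.random.random((1, 224, 224, 3)) * 20 + 128, dtype=tf.float32)
for _ in range(30):
    with tf.GradientTape() as tape:
        activations = feature_extractor(img)
        loss = tf.reduce_mean(activations[:, :, :, filter_index])  # mean of one filter
    grads = tape.gradient(loss, img)
    grads /= tf.sqrt(tf.reduce_mean(tf.square(grads))) + 1e-5      # keep the step well scaled
    img.assign_add(grads * 10.0)                                   # "+" makes this ascent
```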
{
"cell_type": "code",
"metadata": {
"id": "WJoZGQLavb8k",
"colab_type": "code",
"outputId": "bb13141f-bc83-4ce1-8968-e468b840b390",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "bb13141f-bc83-4ce1-8968-e468b840b390"
}
},
"source": [
"%tensorflow_version 2.x"
],
"execution_count": 1,
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
@@ -109,17 +116,17 @@
"metadata": {
"id": "XV8VmpK1vnBl",
"colab_type": "code",
"outputId": "06a2489a-5808-4364-e17b-8b057db906b5",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 50
},
"outputId": "06a2489a-5808-4364-e17b-8b057db906b5"
}
},
"source": [
"print(tf.__version__)\n",
"print(np.__version__)"
],
"execution_count": 5,
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
@@ -136,16 +143,16 @@
"metadata": {
"id": "JKyODUG1vpop",
"colab_type": "code",
"outputId": "34f764f5-df80-4676-8e50-d154ac0dbc46",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 50
},
"outputId": "34f764f5-df80-4676-8e50-d154ac0dbc46"
}
},
"source": [
"model = tf.keras.applications.vgg16.VGG16(include_top=True, weights=\"imagenet\")"
],
"execution_count": 6,
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
@@ -162,16 +169,16 @@
"metadata": {
"id": "-eBI-6cmvy_F",
"colab_type": "code",
"outputId": "b1e2910f-f22f-4901-fa1f-1af8147e1597",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 924
},
"outputId": "b1e2910f-f22f-4901-fa1f-1af8147e1597"
}
},
"source": [
"model.summary()"
],
"execution_count": 7,
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
@@ -267,16 +274,16 @@
"metadata": {
"id": "sDRsjTn8v5WG",
"colab_type": "code",
"outputId": "5710ab57-b5a7-43b8-9676-cf07e3fe8588",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 420
},
"outputId": "5710ab57-b5a7-43b8-9676-cf07e3fe8588"
}
},
"source": [
"partial_model.summary()"
],
"execution_count": 10,
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
@@ -315,11 +322,11 @@
"metadata": {
"id": "rI79jHOmv636",
"colab_type": "code",
"outputId": "24edf11f-b81b-41e5-9433-6777266ee5e9",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 298
},
"outputId": "24edf11f-b81b-41e5-9433-6777266ee5e9"
}
},
"source": [
"random_image = np.random.random((224, 224, 3)) * 20 + 128\n",
@@ -328,7 +335,7 @@
"plt.show()\n",
"random_image = np.expand_dims(random_image, axis=0) # reshape it to (1,224,224,3)"
],
"execution_count": 11,
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
@@ -371,16 +378,28 @@
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "ccUWQFfLyR9o",
"colab_type": "text"
},
"source": [
"## NOTE:\n",
"\n",
"Here we are using a normalization trick to make the gradient ascent process smooth. We are normalizing the gradients using a scalar that is very similar to it's L2 norm (take the mean before we do the square root). This ensures that our gradients are not too small nor too large"
]
},
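As a rough sketch (assuming `grads` holds the gradient of the loss with respect to the input image), the normalization amounts to:

```python
# Divide by the RMS of the gradient (mean of the squares, then square root);
# the small epsilon avoids division by zero.
grads /= tf.sqrt(tf.reduce_mean(tf.square(grads))) + 1e-5
```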
{
"cell_type": "code",
"metadata": {
"id": "iERT72j6wCKn",
"colab_type": "code",
"outputId": "66a590a6-425b-4dd5-82f5-ccec2e930edd",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "66a590a6-425b-4dd5-82f5-ccec2e930edd"
}
},
"source": [
"step_size = 1\n",
@@ -392,7 +411,7 @@
" random_image += grads * step_size # + is gradient ascent\n",
" progbar.update(i+1)"
],
"execution_count": 13,
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
@@ -445,16 +464,16 @@
"metadata": {
"id": "Z4dblCGNwKcm",
"colab_type": "code",
"outputId": "9237edbe-3ea8-45a8-f0a4-93c9c0f0f539",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "9237edbe-3ea8-45a8-f0a4-93c9c0f0f539"
}
},
"source": [
"filter_image.shape"
],
"execution_count": 16,
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
@@ -475,18 +494,18 @@
"metadata": {
"id": "v8UbQJwhwQJY",
"colab_type": "code",
"outputId": "0fb623fe-208d-450d-ed59-de8dee5e71ab",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 377
},
"outputId": "0fb623fe-208d-450d-ed59-de8dee5e71ab"
}
},
"source": [
"plt.figure(figsize=(6,6))\n",
"plt.imshow(filter_image)\n",
"plt.show()"
],
"execution_count": 17,
"execution_count": 0,
"outputs": [
{
"output_type": "display_data",
@@ -509,9 +528,11 @@
"colab_type": "text"
},
"source": [
"Looks like this filter responds to **POLKA DOT** like patterns.\n",
"Looks like this specific filter responds to **POLKA DOT** like patterns in the input image\n",
"\n",
"## Let's Visualize a bunch of filters now"
"## Let's Visualize a bunch of filters now\n",
"\n",
"The following is coded up in such a way that you can pass any layer name and the number of filters you want to visualize from that layer"
]
},
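Since the actual cells are collapsed in this diff, here is one possible shape such a helper could take. The function names (`visualize_filter`, `deprocess`, `stitch_images`) are illustrative, not necessarily those used in the hidden cells, and `tf`, `np`, and the VGG16 `model` are assumed from the cells above.

```python
def deprocess(img):
    """Scale a float image to displayable 0-255 uint8 values."""
    img = (img - img.mean()) / (img.std() + 1e-5) * 0.15 + 0.5
    return (np.clip(img, 0, 1) * 255).astype("uint8")

def visualize_filter(model, layer_name, filter_index, steps=30, step_size=10.0):
    """Gradient-ascent image that maximally activates one filter of one layer."""
    extractor = tf.keras.Model(inputs=model.input,
                               outputs=model.get_layer(layer_name).output)
    img = tf.Variable(np.random.random((1, 224, 224, 3)) * 20 + 128,
                      dtype=tf.float32)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            loss = tf.reduce_mean(extractor(img)[:, :, :, filter_index])
        grads = tape.gradient(loss, img)
        grads /= tf.sqrt(tf.reduce_mean(tf.square(grads))) + 1e-5
        img.assign_add(grads * step_size)
    return deprocess(img.numpy()[0])

def stitch_images(images, cols=8):
    """Arrange equally sized images into a simple grid for plotting."""
    rows = int(np.ceil(len(images) / cols))
    h, w, c = images[0].shape
    canvas = np.zeros((rows * h, cols * w, c), dtype="uint8")
    for i, im in enumerate(images):
        r, col = divmod(i, cols)
        canvas[r * h:(r + 1) * h, col * w:(col + 1) * w] = im
    return canvas

# Example: first 16 filters of block1_conv1, stitched into a 4x4 grid.
images = [visualize_filter(model, "block1_conv1", i) for i in range(16)]
grid = stitch_images(images, cols=4)
```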
{
@@ -583,11 +604,11 @@
"metadata": {
"id": "5M9656Tdwdxy",
"colab_type": "code",
"outputId": "5d678d3e-29b2-4bc2-e696-090d789a35e5",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "5d678d3e-29b2-4bc2-e696-090d789a35e5"
}
},
"source": [
"layer_name = 'block1_conv1'\n",
@@ -599,7 +620,7 @@
" filter_images.append(filter_image)\n",
" progbar.update(filter_index+1)"
],
"execution_count": 20,
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
@@ -628,19 +649,19 @@
"metadata": {
"id": "u-Mru-FtwmZr",
"colab_type": "code",
"outputId": "92eb26a1-c590-4ea6-fe2b-648c384601b0",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 934
},
"outputId": "92eb26a1-c590-4ea6-fe2b-648c384601b0"
}
},
"source": [
"plt.figure(figsize=(16,16))\n",
"plt.title(\"First {} filters from layer {}\".format(filter_nums, layer_name))\n",
"plt.imshow(stitched_image)\n",
"plt.show()"
],
"execution_count": 22,
"execution_count": 0,
"outputs": [
{
"output_type": "display_data",
@@ -667,9 +688,23 @@
"These filter visualizations tell you a lot about how a convolutional neural network sees the world. Each layer in a ConvNet learns a collection of filters such that their inputs can be expressed as a combination of these filters. The filters in these ConvNet layers get increasingly complex and refined as you go higher in the model:\n",
"\n",
"The filters from the first layer in the model ( block1_conv1 ) encode simple directional edges and colors (or colored edges, in some cases).\n",
"The filters from block2_conv1 encode simple textures made from combinations of edges and colors.\n",
"The filters from block2_conv1 encode simple textures (These are the Gabor like filters as they are responsible for texture extractions) made from combinations of edges and colors.\n",
"\n",
"The filters in higher layers begin to resemble textures found in natural images: feathers, eyes, leaves, and so on."
]
},
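To explore this yourself, one way (assuming `model` is the VGG16 instance loaded above) is to list the convolutional layer names and rerun the visualization for deeper blocks:

```python
# Conv layer names in VGG16, useful for rerunning the visualization
# on deeper blocks (e.g. 'block4_conv2', 'block5_conv3').
print([layer.name for layer in model.layers if "conv" in layer.name])
```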
{
"cell_type": "code",
"metadata": {
"id": "_rWlkM7OxZgA",
"colab_type": "code",
"colab": {}
},
"source": [
""
],
"execution_count": 0,
"outputs": []
}
]
}
