
Commit

Switched to CIFAR10
Lavanya Shukla committed Oct 10, 2019
1 parent 477206c commit 47a51e0
Showing 1 changed file with 51 additions and 37 deletions.
88 changes: 51 additions & 37 deletions pytorch-intro/intro.ipynb
@@ -21,7 +21,7 @@
},
"source": [
"# Welcome!\n",
"In this tutorial we'll walk through a simple convolutional neural network to classify the handwritten digits in MNIST using PyTorch.\n",
"In this tutorial we'll walk through a simple convolutional neural network to classify the images in CIFAR10 using PyTorch.\n",
"\n",
"We’ll also set up Weights & Biases to log models metrics, inspect performance and share findings about the best architecture for the network. In this example we're using Google Colab as a convenient hosted environment, but you can run your own training scripts from anywhere and visualize metrics with W&B's experiment tracking tool.\n",
"\n",
@@ -119,7 +119,7 @@
"metadata": {
"id": "rsSGtmAVfMZk",
"colab_type": "code",
"outputId": "c193bc86-7084-4e86-caf6-13d8d965bea1",
"outputId": "960ba07f-d76b-4d55-bc54-8742678ec320",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 170
@@ -129,7 +129,7 @@
"# WandB – Login to your wandb account so you can log all your metrics\n",
"!wandb login"
],
"execution_count": 18,
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
@@ -172,33 +172,30 @@
" \n",
" # In our constructor, we define our neural network architecture that we'll use in the forward pass.\n",
" # Conv2d() adds a convolution layer that generates 2 dimensional feature maps to learn different aspects of our image\n",
" self.conv1 = nn.Conv2d(1, 10, kernel_size=5)\n",
" self.conv2 = nn.Conv2d(10, 20, kernel_size=5)\n",
" \n",
" # Dropout randomly turns off a percentage of neurons at each training step resulting\n",
" # in a more robust neural network that is resistant to overfitting\n",
" self.conv2_drop = nn.Dropout2d()\n",
" self.conv1 = nn.Conv2d(3, 6, kernel_size=5)\n",
" self.conv2 = nn.Conv2d(6, 16, kernel_size=5)\n",
" \n",
" # Linear(x,y) creates dense, fully connected layers with x inputs and y outputs\n",
" # Linear layers simply output the dot product of our inputs and weights.\n",
" self.fc1 = nn.Linear(320, 50)\n",
" self.fc2 = nn.Linear(50, 10)\n",
" self.fc1 = nn.Linear(16 * 5 * 5, 120)\n",
" self.fc2 = nn.Linear(120, 84)\n",
" self.fc3 = nn.Linear(84, 10)\n",
"\n",
" def forward(self, x):\n",
" # Here we feed the feature maps from the convolutional layers into a max_pool2d layer.\n",
" # The max_pool2d layer reduces the size of the image representation our convolutional layers learnt,\n",
" # and in doing so it reduces the number of parameters and computations the network needs to perform.\n",
" # Finally we apply the relu activation function which gives us max(0, max_pool2d_output)\n",
" x = F.relu(F.max_pool2d(self.conv1(x), 2))\n",
" x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))\n",
" x = F.relu(F.max_pool2d(self.conv2(x), 2))\n",
" \n",
" # Reshapes x into size (-1, 320) so we can feed the convolution layer outputs into our fully connected layer\n",
" x = x.view(-1, 320)\n",
" # Reshapes x into size (-1, 16 * 5 * 5) so we can feed the convolution layer outputs into our fully connected layer\n",
" x = x.view(-1, 16 * 5 * 5)\n",
" \n",
" # We apply the relu activation function and dropout to the output of our fully connected layers\n",
" x = F.relu(self.fc1(x))\n",
" x = F.dropout(x, training=self.training)\n",
" x = self.fc2(x)\n",
" x = F.relu(self.fc2(x))\n",
" x = self.fc3(x)\n",
" \n",
" # Finally we apply the softmax function to squash the probabilities of each class (0-9) and ensure they add to 1.\n",
" return F.log_softmax(x, dim=1)"
@@ -275,7 +272,7 @@
"colab": {}
},
"source": [
"def test(args, model, device, test_loader):\n",
"def test(args, model, device, test_loader, classes):\n",
" # Switch model to evaluation mode. This is necessary for layers like dropout, batchnorm etc which behave differently in training and evaluation mode\n",
" model.eval()\n",
" test_loss = 0\n",
@@ -299,7 +296,7 @@
" \n",
" # WandB – Log images in your test dataset automatically, along with predicted and true labels by passing pytorch tensors with image data into wandb.Image\n",
" example_images.append(wandb.Image(\n",
" data[0], caption=\"Pred: {} Truth: {}\".format(pred[0].item(), target[0])))\n",
" data[0], caption=\"Pred: {} Truth: {}\".format(classes[pred[0].item()], classes[target[0]])))\n",
" \n",
" # WandB – wandb.log(a_dict) logs the keys and values of the dictionary passed in and associates the values with a step.\n",
" # You can log anything by passing it to wandb.log, including histograms, custom matplotlib objects, images, video, text, tables, html, pointclouds and other 3D objects.\n",
@@ -352,10 +349,10 @@
"metadata": {
"id": "bZpt5W2NNl6S",
"colab_type": "code",
"outputId": "7542bd8a-6fe9-494e-d850-0515b72f576f",
"outputId": "14db5304-66aa-41fd-8548-d37a598f8d18",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
"height": 68
}
},
"source": [
@@ -385,20 +382,22 @@
" # numpy.random.seed(config.seed) # numpy random seed\n",
" torch.backends.cudnn.deterministic = True\n",
"\n",
" # Load the dataset: We're training our CNN on MNIST which consists of black and white images of hand-written digits, 0 to 9.\n",
" train_loader = torch.utils.data.DataLoader(\n",
" datasets.MNIST('../data', train=True, download=True,\n",
" transform=transforms.Compose([\n",
" transforms.ToTensor(),\n",
" transforms.Normalize((0.1307,), (0.3081,))\n",
" ])),\n",
" batch_size=config.batch_size, shuffle=True, **kwargs)\n",
" test_loader = torch.utils.data.DataLoader(\n",
" datasets.MNIST('../data', train=False, transform=transforms.Compose([\n",
" transforms.ToTensor(),\n",
" transforms.Normalize((0.1307,), (0.3081,))\n",
" ])),\n",
" batch_size=config.test_batch_size, shuffle=True, **kwargs)\n",
" # Load the dataset: We're training our CNN on CIFAR10 (https://www.cs.toronto.edu/~kriz/cifar.html)\n",
" # First we define the tranformations to apply to our images\n",
" transform = transforms.Compose(\n",
" [transforms.ToTensor(),\n",
" transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])\n",
" \n",
" # Now we load our training and test datasets and apply the transformations defined above\n",
" train_loader = torch.utils.data.DataLoader(datasets.CIFAR10(root='./data', train=True,\n",
" download=True, transform=transform), batch_size=config.batch_size,\n",
" shuffle=True, **kwargs)\n",
" test_loader = torch.utils.data.DataLoader(datasets.CIFAR10(root='./data', train=False,\n",
" download=True, transform=transform), batch_size=config.test_batch_size,\n",
" shuffle=False, **kwargs)\n",
"\n",
" classes = ('plane', 'car', 'bird', 'cat',\n",
" 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')\n",
"\n",
" # Initialize our model, recursively go over all modules and convert their parameters and buffers to CUDA tensors (if device is set to cuda)\n",
" model = Net().to(device)\n",
@@ -411,7 +410,7 @@
"\n",
" for epoch in range(1, config.epochs + 1):\n",
" train(config, model, device, train_loader, optimizer, epoch)\n",
" test(config, model, device, test_loader)\n",
" test(config, model, device, test_loader, classes)\n",
" \n",
" # WandB – Save the model checkpoint. This automatically saves a file to the cloud and associates it with the current run.\n",
" torch.save(model.state_dict(), \"model.h5\")\n",
@@ -427,7 +426,7 @@
"data": {
"text/html": [
"\n",
" Notebook configured with <a href=\"https://wandb.com\" target=\"_blank\">W&B</a>. You can <a href=\"https://app.wandb.ai/wandb/pytorch-intro/runs/4dpea4bq\" target=\"_blank\">open</a> the run page, or call <code>%%wandb</code>\n",
" Notebook configured with <a href=\"https://wandb.com\" target=\"_blank\">W&B</a>. You can <a href=\"https://app.wandb.ai/wandb/pytorch-intro/runs/f8mmkoxt\" target=\"_blank\">open</a> the run page, or call <code>%%wandb</code>\n",
" in a cell containing your training loop to display live results. Learn more in our <a href=\"https://docs.wandb.com/docs/integrations/jupyter.html\" target=\"_blank\">docs</a>.\n",
" "
],
@@ -438,6 +437,14 @@
"metadata": {
"tags": []
}
},
{
"output_type": "stream",
"text": [
"Files already downloaded and verified\n",
"Files already downloaded and verified\n"
],
"name": "stdout"
}
]
},
@@ -461,13 +468,20 @@
"\n",
"\n",
"## Visualize Gradients\n",
"Click through to a single run to see more details about that run. For example, on [this run page](https://app.wandb.ai/wandb/pytorch-intro/runs/i3g6g0eq) you can see the gradients I logged when I ran this script.\n",
"Click through to a single run to see more details about that run. For example, on [this run page](https://app.wandb.ai/wandb/pytorch-intro/runs/f8mmkoxt) you can see the gradients I logged when I ran this script.\n",
"\n",
"![gradients](https://i.imgur.com/za8S6Xv.png)\n",
"\n",
"\n",
"## Visualize Predictions\n",
"You can visualize predictions made at everystep by clicking on the Media tab. Here we can see an example of true labels and predictions made by our model on the CIFAR dataset.\n",
"\n",
"![predictions](https://i.imgur.com/vzye9ei.png)\n",
"\n",
"\n",
"## Review Code\n",
"The overview tab picks up a link to the code. In this case, it's a link to the Google Colab. If you're running a script from a git repo, we'll pick up the SHA of the latest git commit and give you a link to that version of the code in your own GitHub repo.\n",
"\n",
"![overview](https://i.imgur.com/FEBNXcI.png)\n",
"\n",
"## Visualize Relationships\n",
