## Part 1 Chapter 2: Pretrained Models

In [1]:
from torchvision import models
import torch
from torchvision import transforms
from PIL import Image
dir(models);

In [5]:
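alexnet = models.AlexNet()  # architecture only: the weights are randomly initialized
# Load ResNet-101 with weights pretrained on ImageNet.
# Note: newer torchvision releases deprecate `pretrained=True` in favor of the `weights=` argument.
resnet = models.resnet101(pretrained=True)
resnet;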

Printing `resnet` shows many Bottleneck modules repeating one after the other (33 blocks of three convolutions each in ResNet-101), containing convolutions and other modules. That is the anatomy of a typical deep neural network for computer vision: a more or less sequential cascade of filters and nonlinear functions, ending with a layer (fc) that produces scores for each of the 1,000 output classes (out_features).
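As a quick check on where the 101 in the name comes from, the sketch below does the layer arithmetic in plain Python. It assumes the standard ResNet-101 layout of four stages with 3, 4, 23, and 3 Bottleneck blocks and three convolutions per block; the variable names are illustrative only.

```python
# Assumed ResNet-101 layout: Bottleneck blocks per stage (not read from the notebook output).
blocks_per_stage = [3, 4, 23, 3]
convs_per_block = 3  # each Bottleneck block stacks 1x1, 3x3, and 1x1 convolutions

num_blocks = sum(blocks_per_stage)
# Weighted layers: the first convolution, the convolutions inside the blocks, and the final fc layer.
num_layers = 1 + num_blocks * convs_per_block + 1

print(num_blocks, num_layers)
```

Under these assumptions the count of weighted layers is 101, while the number of Bottleneck blocks is 33.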

In [6]:
# Image preprocessing: resize, center-crop to 224x224, convert to a tensor, normalize each channel
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
        )
])

img = Image.open("../data/p1ch2/bobby.jpg")
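To make the last step of the pipeline concrete, here is a plain-Python sketch of what per-channel normalization computes: subtract the channel mean, then divide by the channel standard deviation. The helper `normalize_pixel` and the sample pixels are hypothetical; the mean and std values are the ones passed to `transforms.Normalize` above.

```python
# Per-channel statistics copied from the preprocessing cell.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Hypothetical helper: normalize one RGB pixel whose values are already scaled to [0, 1]."""
    return [(value - m) / s for value, m, s in zip(rgb, MEAN, STD)]

# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel([0.485, 0.456, 0.406]))
# A pure white pixel lands between 2 and 3 standard deviations above zero in each channel.
print(normalize_pixel([1.0, 1.0, 1.0]))
```

The sketch works on a single pixel for readability; the real transform applies the same arithmetic to every pixel of the 3 x 224 x 224 tensor.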

In [7]:
img_t = preprocess(img)
batch_t = torch.unsqueeze(img_t, 0)
img_t.shape, batch_t.shape

(torch.Size([3, 224, 224]), torch.Size([1, 3, 224, 224]))

In [9]:
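resnet.eval()  # inference mode: batch-norm layers use their stored running statistics
with torch.no_grad():  # gradients are not needed for prediction
    out = resnet(batch_t)
with open('../data/p1ch2/imagenet_classes.txt') as f:
    labels = [line.strip() for line in f]

_, index = torch.max(out, 1)  # index of the highest score along the class dimension
percentage = torch.nn.functional.softmax(out, dim=1)[0] * 100  # scores -> percentages
labels[index[0]], percentage[index[0]].item()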

('golden retriever', 96.29336547851562)
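The percentages come from a softmax over the raw scores. The following plain-Python sketch shows the computation on made-up scores for three classes; the function `softmax_percent` and the input values are hypothetical and stand in for the 1,000-element output of the network.

```python
import math

def softmax_percent(scores):
    """Hypothetical helper: turn raw scores into percentages that sum to 100."""
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [100.0 * e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # made-up scores, not real network output
percent = softmax_percent(scores)
best = max(range(len(scores)), key=lambda i: scores[i])  # same role as torch.max(out, 1)

print(best, [round(p, 1) for p in percent])
```

Softmax preserves the ordering of the scores, so the index of the largest score is also the index of the largest percentage.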

In [10]:
_, indices = torch.sort(out, descending=True)
[(labels[idx], percentage[idx].item()) for idx in indices[0][:5]]

[('golden retriever', 96.29336547851562),
 ('Labrador retriever', 2.8081140518188477),
 ('cocker spaniel, English cocker spaniel, cocker', 0.28267380595207214),
 ('redbone', 0.20863007009029388),
 ('tennis ball', 0.11621550470590591)]

Papers describing some of the architectures available in torchvision.models: AlexNet (http://mng.bz/lo6z), ResNet (https://arxiv.org/pdf/1512.03385.pdf), and Inception v3 (https://arxiv.org/pdf/1512.00567.pdf).

## Part 1 Chapter 3

Deep learning really consists of building a system that can transform data from one representation to another. 