<a href="https://colab.research.google.com/github/SUTFutureCoder/LookForJayChou/blob/master/LookForJayChou.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LookForJayChou
寻找周杰伦

用于练习图像检索会很有意思的样子

十年前我也是饭圈男孩！

# 数据集来源
数据集来源于[十个明星的面部数据集](https://aistudio.baidu.com/aistudio/datasetDetail/13959)

# Let's roll

### 第一步直接从Google硬盘解压数据集

In [1]:
!unzip /content/drive/My\ Drive/AIColab/LookForJayChou/images.zip

Archive:  /content/drive/My Drive/AIColab/LookForJayChou/images.zip
   creating: images/
   creating: images/face/
   creating: images/face/fanbingbing/
  inflating: images/face/fanbingbing/1.png  
  inflating: images/face/fanbingbing/10.png  
  inflating: images/face/fanbingbing/101.png  
  inflating: images/face/fanbingbing/102.png  
  inflating: images/face/fanbingbing/103.png  
  inflating: images/face/fanbingbing/104.png  
  inflating: images/face/fanbingbing/105.png  
  inflating: images/face/fanbingbing/106.png  
  inflating: images/face/fanbingbing/108.png  
  inflating: images/face/fanbingbing/109.png  
  inflating: images/face/fanbingbing/11.png  
  inflating: images/face/fanbingbing/110.png  
  inflating: images/face/fanbingbing/111.png  
  inflating: images/face/fanbingbing/112.png  
  inflating: images/face/fanbingbing/114.png  
  inflating: images/face/fanbingbing/115.png  
  inflating: images/face/fanbingbing/118.png  
  inflating: images/face/fanbingbing/119.png  
  inf

### 开写

In [0]:
import torch
import torch.nn as nn
import torchvision
from torchvision import transforms, datasets
import torch.utils.data as data
import torch.optim as optim
from torch.optim import lr_scheduler
import numpy as np
import copy

### 预定义超参数

In [0]:
IMG_SIZE = 256
INPUT_SIZE = 224
BATCH_SIZE = 32
EPOCHS_SIZE = 32
BASE_LR = 0.01
CLASSFIERS = 10
CUDA = torch.cuda.is_available()
DEVICE = torch.device('cuda' if CUDA else 'cpu')

### 预定义图像变换

In [0]:
transform = {
    "train": transforms.Compose([
        transforms.RandomResizedCrop(INPUT_SIZE),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    "val": transforms.Compose([
        transforms.Resize(IMG_SIZE),
        transforms.CenterCrop(INPUT_SIZE),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
}

### 读入图片

In [0]:
class DataSet(data.Dataset):
  def __init__(self, dataset, transform):
    self.dataset = dataset
    self.transform = transform

  def __getitem__(self, index):
    return self.transform(self.dataset[index][0]), self.dataset[index][1]

  def __len__(self):
    return len(self.dataset)

In [0]:
data_dir = "./images/face"


image_datasets = datasets.ImageFolder(data_dir)

train_size = int(len(image_datasets) * 0.8)
val_size = len(image_datasets) - train_size

train_subdataset, val_subdataset = data.random_split(image_datasets, [train_size, val_size])

splited_datasets = {x: DataSet(train_subdataset, transform[x]) for x in ['train', 'val'] }
dataloaders = {x: torch.utils.data.DataLoader(splited_datasets[x], batch_size=BATCH_SIZE, shuffle=True, num_workers=4) for x in ['train', 'val']}

### 定义模型

In [0]:
def ResNet152():
  model = torchvision.models.resnet152(pretrained=True)

  # freeze
  for param in model.parameters():
    param.requires_grad = False
  
  # 放开最后一层
  for param in model.layer4.parameters():
    param.requires_grad = True

  model.fc = nn.Linear(model.fc.in_features, CLASSFIERS)
  model.to(DEVICE)
  return model

### 定义训练函数

In [0]:
def train(model, dataloader, optimizer, criterion):
  best_model = copy.deepcopy(model.state_dict())
  best_acc = 0.0

  for epoch in range(EPOCHS_SIZE):
    print("EPOCH {}/{}".format(epoch, EPOCHS_SIZE - 1))
    print("-" * 10)

    optim.lr_scheduler.StepLR(optimizer, 15, 0.01)
    
    for phase in ['train', 'val']:
      if phase == 'train':
        model.train()
      else:
        model.eval()

      running_loss = 0.0
      running_correct = 0
      
      for x, y in dataloader[phase]:
        x = x.to(DEVICE)
        y = y.to(DEVICE)

        optimizer.zero_grad()

        with torch.set_grad_enabled(phase == 'train'):
          out = model(x)
          loss = criterion(out, y)

        _, pred = torch.max(out, 1)



        if phase == 'train':
          loss.backward()
          optimizer.step()

        running_loss += loss.item() / x.size(0)
        running_correct += torch.sum(pred == y.data)

      epoch_loss = running_loss / len(dataloader[phase].dataset)
      epoch_acc = running_correct.double() / len(dataloader[phase].dataset)
      print("{} Loss: {:.4f} Acc: {:.4f} Best: {:.4f}".format(epoch, epoch_loss, epoch_acc, best_acc))


      if phase == 'val' and epoch_acc > best_acc:
        best_model = copy.deepcopy(model.state_dict())
        best_acc = epoch_acc
  model.load_state_dict(best_model)
  return model

### 执行一波

In [38]:
model = ResNet152()
if not CUDA:
  criterion = nn.CrossEntropyLoss()
else:
  criterion = nn.CrossEntropyLoss().cuda(DEVICE)
optimizer = optim.SGD(model.parameters(), BASE_LR, momentum=0.9, weight_decay=0.0001)

model = train(model, dataloaders, optimizer, criterion)
torch.save(model.state_dict(), './mdl.pkl')


EPOCH 0/31
----------
0 Loss: 0.0020 Acc: 0.3075 Best: 0.0000
0 Loss: 0.0013 Acc: 0.5461 Best: 0.0000
EPOCH 1/31
----------
1 Loss: 0.0014 Acc: 0.5909 Best: 0.5461
1 Loss: 0.0018 Acc: 0.6421 Best: 0.5461
EPOCH 2/31
----------
2 Loss: 0.0011 Acc: 0.6692 Best: 0.6421
2 Loss: 0.0010 Acc: 0.7614 Best: 0.6421
EPOCH 3/31
----------
3 Loss: 0.0008 Acc: 0.7307 Best: 0.7614
3 Loss: 0.0008 Acc: 0.7884 Best: 0.7614
EPOCH 4/31
----------
4 Loss: 0.0007 Acc: 0.7829 Best: 0.7884
4 Loss: 0.0004 Acc: 0.8975 Best: 0.7884
EPOCH 5/31
----------
5 Loss: 0.0006 Acc: 0.7903 Best: 0.8975
5 Loss: 0.0001 Acc: 0.9515 Best: 0.8975
EPOCH 6/31
----------
6 Loss: 0.0005 Acc: 0.8332 Best: 0.9515
6 Loss: 0.0001 Acc: 0.9534 Best: 0.9515
EPOCH 7/31
----------
7 Loss: 0.0004 Acc: 0.8611 Best: 0.9534
7 Loss: 0.0002 Acc: 0.9478 Best: 0.9534
EPOCH 8/31
----------
8 Loss: 0.0003 Acc: 0.8966 Best: 0.9534
8 Loss: 0.0001 Acc: 0.9823 Best: 0.9534
EPOCH 9/31
----------
9 Loss: 0.0004 Acc: 0.8835 Best: 0.9823
9 Loss: 0.0002 Acc: 

### 赶紧保存起来~

In [0]:
!cp ./mdl.pkl /content/drive/My\ Drive/AIColab/LookForJayChou/

### 进行分类验证

### 将原始数据集最后输出保存到文件中，用于检索出最相似图片
