<a href="https://colab.research.google.com/github/mafaldasalomao/pavic_treinamento_ml/blob/main/PAVIC_ML_20_DCGAN_PyTorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Training a DCGAN in PyTorch

## Project Structure


```
# !tree .
.
├── dcgan_mnist.py
├── output
│   ├── epoch_0002.png
│   ├── epoch_0004.png
│   ├── epoch_0006.png
│   ├── epoch_0008.png
│   ├── epoch_0010.png
│   ├── epoch_0012.png
│   ├── epoch_0014.png
│   ├── epoch_0016.png
│   ├── epoch_0018.png
│   └── epoch_0020.png
├── output.gif
└── pyimagesearch
    ├── dcgan.py
    └── __init__.py
```



In the pyimagesearch directory, we have two files:

- `dcgan.py`: Contains the complete DCGAN architecture
- `__init__.py`: Marks pyimagesearch as a Python package

In the parent directory, we have the dcgan_mnist.py script, which trains the DCGAN and generates sample images from it.

Apart from these, we have the output directory, which holds one visualization of the Generator's samples for every second epoch. Finally, output.gif combines these visualizations into an animated GIF.
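
The notebook does not show how output.gif was produced. As one possible approach, the sketch below collects the epoch images in order and writes them out with Pillow. The helper names `list_frames` and `make_gif`, the frame duration, and the choice of Pillow are assumptions made for illustration, not part of the original project.

```python
import glob
import os


def list_frames(output_dir):
    """Return the epoch PNG paths in training order (hypothetical helper)."""
    # zero-padded names such as epoch_0002.png sort correctly as strings
    return sorted(glob.glob(os.path.join(output_dir, "epoch_*.png")))


def make_gif(output_dir, gif_path, ms_per_frame=500):
    """Write an animated GIF from the epoch images (requires Pillow)."""
    # imported lazily so that list_frames works without Pillow installed
    from PIL import Image

    paths = list_frames(output_dir)
    if not paths:
        raise ValueError("no epoch_*.png files found in " + output_dir)
    frames = [Image.open(p) for p in paths]
    # save_all + append_images writes every frame; loop=0 repeats forever
    frames[0].save(gif_path, save_all=True, append_images=frames[1:],
                   duration=ms_per_frame, loop=0)
```

After training, a call such as `make_gif("output", "output.gif")` would write the animation next to the script.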

In [1]:
import os

os.makedirs("pyimagesearch", exist_ok=True)
os.makedirs("output", exist_ok=True)

In [2]:
%%writefile pyimagesearch/dcgan.py
# import the necessary packages
from torch.nn import ConvTranspose2d
from torch.nn import BatchNorm2d
from torch.nn import Conv2d
from torch.nn import Linear
from torch.nn import LeakyReLU
from torch.nn import ReLU
from torch.nn import Tanh
from torch.nn import Sigmoid
from torch import flatten
from torch import nn
class Generator(nn.Module):
    def __init__(self, inputDim=100, outputChannels=1):
        super(Generator, self).__init__()
        # first set of CONVT => RELU => BN
        self.ct1 = ConvTranspose2d(in_channels=inputDim,
          out_channels=128, kernel_size=4, stride=2, padding=0,
          bias=False)
        self.relu1 = ReLU()
        self.batchNorm1 = BatchNorm2d(128)
        # second set of CONVT => RELU => BN
        self.ct2 = ConvTranspose2d(in_channels=128, out_channels=64,
              kernel_size=3, stride=2, padding=1, bias=False)
        self.relu2 = ReLU()
        self.batchNorm2 = BatchNorm2d(64)
        # last set of CONVT => RELU => BN
        self.ct3 = ConvTranspose2d(in_channels=64, out_channels=32,
              kernel_size=4, stride=2, padding=1, bias=False)
        self.relu3 = ReLU()
        self.batchNorm3 = BatchNorm2d(32)
        # apply another upsample and transposed convolution, but
        # this time output the TANH activation
        self.ct4 = ConvTranspose2d(in_channels=32,
          out_channels=outputChannels, kernel_size=4, stride=2,
          padding=1, bias=False)
        self.tanh = Tanh()
    def forward(self, x):
      # pass the input through our first set of CONVT => RELU => BN
      # layers
      x = self.ct1(x)
      x = self.relu1(x)
      x = self.batchNorm1(x)
      # pass the output from previous layer through our second
      # CONVT => RELU => BN layer set
      x = self.ct2(x)
      x = self.relu2(x)
      x = self.batchNorm2(x)
      # pass the output from previous layer through our last set
      # of CONVT => RELU => BN layers
      x = self.ct3(x)
      x = self.relu3(x)
      x = self.batchNorm3(x)
      # pass the output from previous layer through CONVT2D => TANH
      # layers to get our output
      x = self.ct4(x)
      output = self.tanh(x)
      # return the output
      return output
class Discriminator(nn.Module):
  def __init__(self, depth, alpha=0.2):
      super(Discriminator, self).__init__()
      # first set of CONV => RELU layers
      self.conv1 = Conv2d(in_channels=depth, out_channels=32,
          kernel_size=4, stride=2, padding=1)
      self.leakyRelu1 = LeakyReLU(alpha, inplace=True)
      # second set of CONV => RELU layers
      self.conv2 = Conv2d(in_channels=32, out_channels=64, kernel_size=4,
          stride=2, padding=1)
      self.leakyRelu2 = LeakyReLU(alpha, inplace=True)
      # first (and only) set of FC => RELU layers
      self.fc1 = Linear(in_features=3136, out_features=512)
      self.leakyRelu3 = LeakyReLU(alpha, inplace=True)
      # sigmoid layer outputting a single value
      self.fc2 = Linear(in_features=512, out_features=1)
      self.sigmoid = Sigmoid()
  def forward(self, x):
      # pass the input through first set of CONV => RELU layers
      x = self.conv1(x)
      x = self.leakyRelu1(x)
      # pass the output from the previous layer through our second
      # set of CONV => RELU layers
      x = self.conv2(x)
      x = self.leakyRelu2(x)
      # flatten the output from the previous layer and pass it
      # through our first (and only) set of FC => RELU layers
      x = flatten(x, 1)
      x = self.fc1(x)
      x = self.leakyRelu3(x)
      # pass the output from the previous layer through our sigmoid
      # layer outputting a single value
      x = self.fc2(x)
      output = self.sigmoid(x)
      # return the output
      return output

Writing pyimagesearch/dcgan.py
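
The layer hyperparameters above are chosen so that the Generator grows a 1x1 noise input into a 28x28 image, and so that the Discriminator's first fully connected layer receives 3136 features. The sketch below checks that arithmetic with the standard output-size formulas for transposed and regular convolutions, assuming dilation of 1 and no output padding, which are the defaults used here. The helper names are illustrative only.

```python
def conv_transpose_out(size, kernel, stride, padding):
    """Output size of ConvTranspose2d with dilation=1 and output_padding=0."""
    return (size - 1) * stride - 2 * padding + kernel


def conv_out(size, kernel, stride, padding):
    """Output size of Conv2d with dilation=1."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) of ct1..ct4 in Generator
gen_layers = [(4, 2, 0), (3, 2, 1), (4, 2, 1), (4, 2, 1)]
size = 1  # the noise vector is fed in as a 100 x 1 x 1 tensor
gen_sizes = []
for kernel, stride, padding in gen_layers:
    size = conv_transpose_out(size, kernel, stride, padding)
    gen_sizes.append(size)
print(gen_sizes)  # [4, 7, 14, 28]

# conv1 and conv2 in Discriminator, starting from a 28 x 28 image
size = 28
for kernel, stride, padding in [(4, 2, 1), (4, 2, 1)]:
    size = conv_out(size, kernel, stride, padding)
# conv2 has 64 output channels, so flattening gives 64 * 7 * 7 values
fc1_in_features = 64 * size * size
print(size, fc1_in_features)  # 7 3136
```

The 3x3 kernel in `ct2` is what turns 4x4 into 7x7, so that two further doublings land exactly on 28x28.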


## Training The DCGAN
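
The training script in this section rescales MNIST pixels with `Normalize` using a mean and standard deviation of 0.5, which matches the [-1, 1] range of the Generator's Tanh output, and later maps generated images back with `(images * 127.5) + 127.5`. The following plain-Python sketch mirrors both mappings on single values. The function names are made up for illustration.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Mirror Normalize with mean 0.5 and std 0.5 on a ToTensor value in [0, 1]."""
    return (pixel - mean) / std


def to_uint8(value):
    """Mirror the script's (images * 127.5) + 127.5 followed by a uint8 cast."""
    # int() truncates, as astype("uint8") does for values already in [0, 255]
    return int(value * 127.5 + 127.5)


for pixel in (0.0, 0.5, 1.0):
    scaled = normalize(pixel)
    print(pixel, "->", scaled, "->", to_uint8(scaled))
```

Black (0.0) maps to -1 and white (1.0) maps to 1, so real and generated images share the same value range when they reach the Discriminator.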


In [3]:
%%writefile dcgan_mnist.py
# USAGE
# python dcgan_mnist.py --output output
# import the necessary packages
from pyimagesearch.dcgan import Generator
from pyimagesearch.dcgan import Discriminator
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader
from torchvision import transforms
from imutils import build_montages
from torch.optim import Adam
from torch.nn import BCELoss
from torch import nn
import numpy as np
import argparse
import torch
import cv2
import os
# custom weights initialization called on generator and discriminator
def weights_init(model):
	# get the class name
	classname = model.__class__.__name__
	# check if the class name contains the word "Conv"
	if classname.find("Conv") != -1:
		# initialize the weights from a normal distribution
		nn.init.normal_(model.weight.data, 0.0, 0.02)
	# otherwise, check if the name contains the word "BatchNorm"
	elif classname.find("BatchNorm") != -1:
		# initialize the weights from a normal distribution and set the
		# bias to 0
		nn.init.normal_(model.weight.data, 1.0, 0.02)
		nn.init.constant_(model.bias.data, 0)
# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-o", "--output", required=True,
	help="path to output directory")
ap.add_argument("-e", "--epochs", type=int, default=20,
	help="# epochs to train for")
ap.add_argument("-b", "--batch-size", type=int, default=128,
	help="batch size for training")
args = vars(ap.parse_args())
# store the epochs and batch size in convenience variables
NUM_EPOCHS = args["epochs"]
BATCH_SIZE = args["batch_size"]
# set the device we will be using
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# define data transforms
dataTransforms = transforms.Compose([
	transforms.ToTensor(),
	transforms.Normalize((0.5,), (0.5,)),
])
# load the MNIST dataset and stack the training and testing data
# points so we have additional training data
print("[INFO] loading MNIST dataset...")
trainData = MNIST(root="data", train=True, download=True,
	transform=dataTransforms)
testData = MNIST(root="data", train=False, download=True,
	transform=dataTransforms)
data = torch.utils.data.ConcatDataset((trainData, testData))
# initialize our dataloader
dataloader = DataLoader(data, shuffle=True,
	batch_size=BATCH_SIZE)
# number of batches per epoch, including the final partial batch
stepsPerEpoch = len(dataloader)
# build the generator, initialize its weights, and move it to the
# current device
print("[INFO] building generator...")
gen = Generator(inputDim=100, outputChannels=1)
gen.apply(weights_init)
gen.to(DEVICE)
# build the discriminator, initialize its weights, and move it to
# the current device
print("[INFO] building discriminator...")
disc = Discriminator(depth=1)
disc.apply(weights_init)
disc.to(DEVICE)
# initialize optimizer for both generator and discriminator
genOpt = Adam(gen.parameters(), lr=0.0002, betas=(0.5, 0.999),
	weight_decay=0.0002 / NUM_EPOCHS)
discOpt = Adam(disc.parameters(), lr=0.0002, betas=(0.5, 0.999),
	weight_decay=0.0002 / NUM_EPOCHS)
# initialize BCELoss function
criterion = BCELoss()
# randomly generate some benchmark noise so we can consistently
# visualize how the generator is learning
print("[INFO] starting training...")
benchmarkNoise = torch.randn(256, 100, 1, 1, device=DEVICE)
# define real and fake label values
realLabel = 1
fakeLabel = 0
# loop over the epochs
for epoch in range(NUM_EPOCHS):
    # show epoch information
    print("[INFO] starting epoch {} of {}...".format(epoch + 1,
      NUM_EPOCHS))
    # initialize current epoch loss for generator and discriminator
    epochLossG = 0
    epochLossD = 0
    for x in dataloader:
      # zero out the discriminator gradients
      disc.zero_grad()
      # grab the images and send them to the device
      images = x[0]
      images = images.to(DEVICE)
      # get the batch size and create a labels tensor
      bs =  images.size(0)
      labels = torch.full((bs,), realLabel, dtype=torch.float,
        device=DEVICE)
      # forward pass through discriminator
      output = disc(images).view(-1)
      # calculate the loss on all-real batch
      errorReal = criterion(output, labels)
      # calculate gradients by performing a backward pass
      errorReal.backward()
      # randomly generate noise for the generator to predict on
      noise = torch.randn(bs, 100, 1, 1, device=DEVICE)
      # generate a fake image batch using the generator
      fake = gen(noise)
      labels.fill_(fakeLabel)
      # perform a forward pass through discriminator using fake
      # batch data
      output = disc(fake.detach()).view(-1)
      errorFake = criterion(output, labels)
      # calculate gradients by performing a backward pass
      errorFake.backward()
      # compute the error for discriminator and update it
      errorD = errorReal + errorFake
      discOpt.step()
      # set all generator gradients to zero
      gen.zero_grad()
      # the generator wants its fakes to be classified as real, so
      # relabel the batch as real and pass the fake images through
      # the updated discriminator
      labels.fill_(realLabel)
      output = disc(fake).view(-1)
      # calculate generator's loss based on output from
      # discriminator and calculate gradients for generator
      errorG = criterion(output, labels)
      errorG.backward()
      # update the generator
      genOpt.step()
      # accumulate the current iteration's losses as Python floats so
      # the autograd graphs are not kept alive
      epochLossD += errorD.item()
      epochLossG += errorG.item()
    # display the average losses for the epoch
    print("[INFO] Generator Loss: {:.4f}, Discriminator Loss: {:.4f}".format(
      epochLossG / stepsPerEpoch, epochLossD / stepsPerEpoch))
    # check to see if we should visualize the output of the
    # generator model on our benchmark data
    if (epoch + 1) % 2 == 0:
      # set the generator in evaluation phase, make predictions on
      # the benchmark noise, scale it back to the range [0, 255],
      # and generate the montage
      gen.eval()
      images = gen(benchmarkNoise)
      images = images.detach().cpu().numpy().transpose((0, 2, 3, 1))
      images = ((images * 127.5) + 127.5).astype("uint8")
      images = np.repeat(images, 3, axis=-1)
      vis = build_montages(images, (28, 28), (16, 16))[0]
      # build the output path and write the visualization to disk
      p = os.path.join(args["output"], "epoch_{}.png".format(
        str(epoch + 1).zfill(4)))
      cv2.imwrite(p, vis)
      # set the generator to training mode
      gen.train()

Writing dcgan_mnist.py
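
The training loop reuses one `labels` tensor and flips its contents between the Discriminator and Generator updates. The plain-Python sketch below illustrates why: binary cross-entropy on a single sigmoid output rewards the Discriminator for scoring fakes near 0, while the Generator is scored on the same fakes with the label set to real. The two Discriminator outputs are hypothetical values picked for the example, not results from this notebook.

```python
import math


def bce(prediction, label):
    """Binary cross-entropy for one sigmoid output, as BCELoss computes per element."""
    return -(label * math.log(prediction) + (1 - label) * math.log(1 - prediction))


REAL, FAKE = 1, 0

# hypothetical discriminator outputs, chosen only for illustration
d_on_real = 0.9   # D(x) for a real image
d_on_fake = 0.2   # D(G(z)) for a generated image

# discriminator step: real images labelled 1, generated images labelled 0
loss_d = bce(d_on_real, REAL) + bce(d_on_fake, FAKE)

# generator step: the same generated image, but labelled as real
loss_g = bce(d_on_fake, REAL)

print(round(loss_d, 4), round(loss_g, 4))
```

With these values the Discriminator is doing well, so its loss is small while the Generator's loss is large. The Generator's gradient step then pushes D(G(z)) toward 1.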


In [4]:
!python dcgan_mnist.py --output output

[INFO] loading MNIST dataset...
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz
  0% 0/9912422 [00:00<?, ?it/s]100% 9912422/9912422 [00:00<00:00, 172443202.53it/s]
Extracting data/MNIST/raw/train-images-idx3-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz
100% 28881/28881 [00:00<00:00, 168010671.05it/s]
Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
100% 1648877/1648877 [00:00<00:00, 38770771.20it/s]
Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz to data/MNIST/raw


In [5]:
# DCGAN Training Results and Visualizations

[Source code](https://pyimagesearch.com/2021/10/25/training-a-dcgan-in-pytorch/)