<hr style="border: solid 3px blue;">

# Introduction

![](https://miro.medium.com/max/1313/1*MIJZKbsCLYuNp5yV26OH8g.gif)

Picture Credit: https://miro.medium.com

**Semi-supervised learning**

> Semi-supervised learning is an approach to machine learning that combines a small amount of labeled data with a large amount of unlabeled data during training. Semi-supervised learning falls between unsupervised learning (with no labeled training data) and supervised learning (with only labeled training data). It is a special instance of weak supervision.
> 
> Unlabeled data, when used in conjunction with a small amount of labeled data, can produce considerable improvement in learning accuracy. The acquisition of labeled data for a learning problem often requires a skilled human agent (e.g. to transcribe an audio segment) or a physical experiment (e.g. determining the 3D structure of a protein or determining whether there is oil at a particular location). The cost associated with the labeling process thus may render large, fully labeled training sets infeasible, whereas acquisition of unlabeled data is relatively inexpensive. In such situations, semi-supervised learning can be of great practical value. Semi-supervised learning is also of theoretical interest in machine learning and as a model for human learning.

Ref: https://en.wikipedia.org/wiki/Semi-supervised_learning


**Labeled data is expensive!**
Collecting and labeling datasets is laborious and difficult.
Yet supervised models generally need a large labeled dataset to perform well.

This notebook explores one way to cope with a small labeled dataset: the Semi-Supervised GAN (SGAN).

![](https://drek4537l1klr.cloudfront.net/langr/Figures/07fig03_alt.jpg)

Picture Credit: https://drek4537l1klr.cloudfront.net

A GAN consists of a generator and a discriminator. In a typical GAN, training focuses on the generator so that its fakes look real, and the generator is what gets used after training.
In an SGAN, by contrast, training aims to improve the discriminator, and the discriminator is what gets used after training.

**This notebook proceeds in the following order.**
1. Build and train an SGAN, then check the result.
2. Train the same CNN architecture with purely supervised learning, then check the result.
3. Compare the two models to see whether the SGAN is more effective when few labels are available.

----------------------------------------------
# Setting up

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Import everything from tensorflow.keras so the layers, models and optimizers
# all come from the same Keras implementation.
from tensorflow.keras import backend as K
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import (Activation, BatchNormalization, Conv2D,
                                     Conv2DTranspose, Dense, Dropout, Flatten,
                                     Lambda, LeakyReLU, Reshape)
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical

In [None]:
'''The dimensions of the target images'''
img_rows = 28
img_cols = 28
channels = 1

img_shape = (img_rows , img_cols , channels)

z_dim = 100 #random noise input for generator 

num_classes = 10 #no. of classes to predict

The random noise vector fed to the generator has 100 dimensions, and there are 10 classes to predict.

In [None]:
class Dataset:
    def __init__(self, num_labeled):

        self.num_labeled = num_labeled                               
        
        (self.x_train, self.y_train), (self.x_test,self.y_test) = mnist.load_data()

        def preprocess_imgs(x):
            x = (x.astype(np.float32) - 127.5) / 127.5                   
            x = np.expand_dims(x, axis=3)                                
            return x

        def preprocess_labels(y):
            return y.reshape(-1, 1)

        self.x_train = preprocess_imgs(self.x_train)                     
        self.y_train = preprocess_labels(self.y_train)

        self.x_test = preprocess_imgs(self.x_test)                       
        self.y_test = preprocess_labels(self.y_test)

    def batch_labeled(self, batch_size):
        idx = np.random.randint(0, self.num_labeled, batch_size)         
        imgs = self.x_train[idx]
        labels = self.y_train[idx]
        return imgs, labels

    def batch_unlabeled(self, batch_size):
        idx = np.random.randint(self.num_labeled, self.x_train.shape[0], batch_size)
        imgs = self.x_train[idx]
        return imgs

    def training_set(self):
        x_train = self.x_train[range(self.num_labeled)]
        y_train = self.y_train[range(self.num_labeled)]
        return x_train, y_train

    def test_set(self):
        return self.x_test, self.y_test

In [None]:
num_labeled = 100

dataset = Dataset(num_labeled)

For SGAN, we assume that only 100 images in the MNIST training dataset are labeled.

The remaining 59,900 training images are treated as unlabeled.
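
The labeled/unlabeled split used by the `Dataset` class can be illustrated with plain index arithmetic. This is a minimal sketch, assuming the 60,000-image MNIST training set that `mnist.load_data()` returns; the helper names `split_indices` and `sample_batch` are hypothetical and not part of the notebook.

```python
import numpy as np

def split_indices(n_total, num_labeled):
    """Mimic the Dataset convention: the first num_labeled rows are
    labeled, every later row is treated as unlabeled."""
    labeled_idx = np.arange(0, num_labeled)
    unlabeled_idx = np.arange(num_labeled, n_total)
    return labeled_idx, unlabeled_idx

def sample_batch(rng, low, high, batch_size):
    # Sample with replacement from [low, high), as np.random.randint does in Dataset
    return rng.integers(low, high, size=batch_size)

rng = np.random.default_rng(0)
labeled_idx, unlabeled_idx = split_indices(n_total=60000, num_labeled=100)
labeled_batch = sample_batch(rng, 0, 100, 32)
unlabeled_batch = sample_batch(rng, 100, 60000, 32)

print(len(labeled_idx), len(unlabeled_idx))
```

Because the labeled set is simply the first 100 rows, it is not guaranteed to be balanced across the ten digit classes.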

----------------------------------
# Generator

In [None]:
def build_generator(z_dim):
    model = Sequential()
    model.add(Dense(256 * 7 * 7, input_dim=z_dim))                           
    model.add(Reshape((7, 7, 256)))
    model.add(Conv2DTranspose(128, kernel_size=3, strides=2, padding='same'))
    model.add(BatchNormalization())                                       
    model.add(LeakyReLU(alpha=0.01))                                        
    model.add(Conv2DTranspose(64, kernel_size=3, strides=1, padding='same')) 
    model.add(BatchNormalization())                                          
    model.add(LeakyReLU(alpha=0.01))                                         
    model.add(Conv2DTranspose(1, kernel_size=3, strides=2, padding='same'))  
    model.add(Activation('tanh'))                                            
    return model
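
To see why the generator ends at 28 x 28, it can help to trace the spatial size through the three `Conv2DTranspose` layers. The sketch below relies on the rule that, with `padding='same'`, a transposed convolution multiplies the spatial size by its stride; the helper name is hypothetical.

```python
def conv_transpose_same(size, stride):
    # With padding='same', a transposed convolution scales the size by the stride
    return size * stride

size = 7                      # spatial size after Reshape((7, 7, 256))
sizes = [size]
for stride in (2, 1, 2):      # strides of the three Conv2DTranspose layers
    size = conv_transpose_same(size, stride)
    sizes.append(size)

print(sizes)
```

The final size matches `img_rows` and `img_cols`, and the closing `tanh` keeps pixel values in [-1, 1], the same range produced by `preprocess_imgs`.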

-----------------------------------------------
# Discriminator

![](https://i.gifer.com/origin/a4/a4f7d2ea5f8cdce902f9a510dbe1a141.gif)

Picture Credit: https://i.gifer.com

In an SGAN, the discriminator is very busy, because it must perform supervised and unsupervised learning at the same time.

Through this process, however, the discriminator learns to classify effectively even when labeled data is scarce.

In [None]:
def build_discriminator_net(img_shape):
    model = Sequential()
    
    model.add(Conv2D(32,
                     kernel_size = 3,
                     strides=3,
                    input_shape=img_shape,
                    padding='same'))
    
    model.add(LeakyReLU(alpha=0.01))
    
    model.add(Conv2D(64,
                     kernel_size=3,
                     strides=3,
                     padding='same'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.01))
    
    model.add(Conv2D(128,
                     kernel_size=3,
                     strides=3,
                     padding='same'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.01))
    model.add(Dropout(0.5))
    
    model.add(Flatten())
    model.add(Dense(num_classes))    
    return model    
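
The effect of `strides=3` on the feature-map size can be checked by hand. This is a small sketch, assuming the rule for `padding='same'` convolutions that the output size is `ceil(input / stride)`; the helper name is hypothetical.

```python
import math

def conv_same(size, stride):
    # With padding='same', the output size is ceil(input size / stride)
    return math.ceil(size / stride)

size = 28                     # MNIST images are 28 x 28
sizes = [size]
for stride in (3, 3, 3):      # strides of the three Conv2D layers
    size = conv_same(size, stride)
    sizes.append(size)

flattened = size * size * 128  # 128 filters in the last Conv2D layer
print(sizes, flattened)
```

Under these assumptions, the final `Dense(num_classes)` layer receives a fairly small flattened vector, since each stride-3 convolution shrinks the feature map aggressively.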

## Supervised learning discriminator

In [None]:
def build_discriminator_supervised(discriminator_net):
    model = Sequential()
    model.add(discriminator_net)
    model.add(Activation('softmax'))    
    return model

## Unsupervised learning discriminator

In [None]:
def build_discriminator_unsupervised(discriminator_net):
    model = Sequential()
    model.add(discriminator_net)
    
    def predict(x):
        # Transform distribution over real classes into a binary real-vs-fake probability
        prediction = 1.0 - (1.0 / (K.sum(K.exp(x), axis=-1, keepdims=True) + 1.0))
        return prediction

    model.add(Lambda(predict))    
    return model
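
The formula inside `predict` can be read as a sigmoid applied to the log-sum-exp of the class logits, since `1 - 1/(Z + 1) = Z/(Z + 1)` with `Z = sum(exp(x))`. The NumPy sketch below checks that equivalence on made-up logits; the function names are hypothetical, and the second form is only an illustration of an algebraically equal expression, not the notebook's implementation.

```python
import numpy as np

def real_probability_naive(logits):
    # Same formula as the Lambda layer: 1 - 1 / (sum(exp(logits)) + 1)
    z = np.sum(np.exp(logits), axis=-1, keepdims=True)
    return 1.0 - 1.0 / (z + 1.0)

def real_probability_logsumexp(logits):
    # Equivalent form: sigmoid(logsumexp(logits)); subtracting the max
    # keeps the intermediate exponentials bounded for large logits
    m = np.max(logits, axis=-1, keepdims=True)
    lse = m + np.log(np.sum(np.exp(logits - m), axis=-1, keepdims=True))
    return 1.0 / (1.0 + np.exp(-lse))

# Made-up logits for two samples and three classes
logits = np.array([[2.0, -1.0, 0.5],
                   [-5.0, -6.0, -4.0]])
p_naive = real_probability_naive(logits)
p_lse = real_probability_logsumexp(logits)
print(p_naive.ravel(), p_lse.ravel())
```

Confident class logits (first row) push the "real" probability toward 1, while uniformly low logits (second row) push it toward 0.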

In [None]:
def build_gan(generator, discriminator):
    model = Sequential()
    model.add(generator)                                                    
    model.add(discriminator)
    return model                 

----------------------------------------------------------------------------------------------
## Building Discriminator Network

In [None]:
# These layers are shared during supervised and unsupervised training
discriminator_net = build_discriminator_net(img_shape)

# Build & compile the Discriminator for supervised training
discriminator_supervised = build_discriminator_supervised(discriminator_net)
discriminator_supervised.compile(loss='categorical_crossentropy',
                                 metrics=['accuracy'],
                                 optimizer=Adam())

# Build & compile the Discriminator for unsupervised training
discriminator_unsupervised = build_discriminator_unsupervised(discriminator_net)
discriminator_unsupervised.compile(loss='binary_crossentropy',
                                   optimizer=Adam())

--------------------------------------------------------------------
## Building Generator Network

In [None]:
generator = build_generator(z_dim)

# Keep Discriminator’s parameters constant for Generator training
discriminator_unsupervised.trainable = False

# Build and compile GAN model with fixed Discriminator to train the Generator
# Note that we are using the Discriminator version with unsupervised output
gan = build_gan(generator, discriminator_unsupervised)
gan.compile(loss='binary_crossentropy', optimizer=Adam())

----------------------------------------------------------
# Training

![](https://drek4537l1klr.cloudfront.net/langr/Figures/07fig02_alt.jpg)

Picture Credit: https://drek4537l1klr.cloudfront.net

As shown in the figure above, the discriminator uses three types of images.
* Labeled images
* Unlabeled images
* Fake images

In SGAN, the discriminator is the most complex component. It has two goals:
* Distinguish real from fake. For this, the discriminator outputs a single probability for binary classification; in this notebook that probability is computed from the class logits, which is equivalent to applying a sigmoid to their log-sum-exp.
* If the sample is real, classify its label correctly. For this, the discriminator uses the softmax function to output one probability per target class.
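
The two goals correspond to two losses: categorical cross-entropy on labeled images and binary cross-entropy on real-versus-fake predictions. The NumPy sketch below computes both on made-up numbers to show how they combine; it is an illustration under those assumptions rather than the Keras implementation, and all values are hypothetical.

```python
import numpy as np

def softmax(logits):
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=-1, keepdims=True)

def categorical_crossentropy(one_hot, probs, eps=1e-7):
    clipped = np.clip(probs, eps, 1.0)
    return float(np.mean(-np.sum(one_hot * np.log(clipped), axis=-1)))

def binary_crossentropy(targets, probs, eps=1e-7):
    p = np.clip(probs, eps, 1.0 - eps)
    return float(np.mean(-(targets * np.log(p) + (1 - targets) * np.log(1 - p))))

# Goal 2: classify a labeled real image (hypothetical logits, 3 classes)
class_probs = softmax(np.array([[3.0, 0.0, 0.0]]))
d_loss_supervised = categorical_crossentropy(np.array([[1.0, 0.0, 0.0]]), class_probs)

# Goal 1: real vs. fake (hypothetical discriminator outputs)
d_loss_real = binary_crossentropy(np.ones((1, 1)), np.array([[0.9]]))
d_loss_fake = binary_crossentropy(np.zeros((1, 1)), np.array([[0.2]]))

# Same averaging as in the training loop
d_loss_unsupervised = 0.5 * (d_loss_real + d_loss_fake)
print(d_loss_supervised, d_loss_unsupervised)
```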

In [None]:
supervised_losses = []
iteration_checkpoints = []
image_grid_rows = 2
image_grid_columns = 2

def display_fake_images(gen_imgs):
    gen_imgs_scale = 0.5 * gen_imgs + 0.5
    fig, axs = plt.subplots(image_grid_rows,
                        image_grid_columns,
                        figsize=(7, 7),
                        sharey=True,
                        sharex=True)
    fig.suptitle('Fake Images', fontsize=20)

    cnt = 0
    for i in range(image_grid_rows):
        for j in range(image_grid_columns):
            axs[i, j].imshow(gen_imgs_scale[cnt, :, :, 0], cmap='Blues_r')
            axs[i, j].axis('off')
            cnt += 1            
    plt.show()

In [None]:
def display_label_images(imgs):
    imgs_scale = 0.5 * imgs + 0.5
    fig, axs = plt.subplots(image_grid_rows,
                        image_grid_columns,
                        figsize=(7, 7),
                        sharey=True,
                        sharex=True)

    fig.suptitle('Labeled Images', fontsize=20)
    cnt = 0
    for i in range(image_grid_rows):
        for j in range(image_grid_columns):
            axs[i, j].imshow(imgs_scale[cnt, :, :, 0], cmap='Blues_r')
            axs[i, j].axis('off')
            cnt += 1
    plt.show()    
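
The `0.5 * x + 0.5` step in the display helpers undoes the scaling applied in `preprocess_imgs`. A minimal NumPy sketch of the round trip, using hypothetical helper names and three sample pixel values:

```python
import numpy as np

def to_model_range(pixels):
    # Same scaling as preprocess_imgs: uint8 [0, 255] -> float32 [-1, 1]
    return (pixels.astype(np.float32) - 127.5) / 127.5

def to_display_range(x):
    # Same scaling as the display helpers: [-1, 1] -> [0, 1]
    return 0.5 * x + 0.5

pixels = np.array([0, 128, 255], dtype=np.uint8)
scaled = to_model_range(pixels)
shown = to_display_range(scaled)
print(scaled, shown)
```

The displayed values equal `pixels / 255`, so real and generated images are plotted on the same [0, 1] intensity scale.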

In [None]:
def train(iterations, batch_size, sample_interval):

    # Labels for real images: all 1
    real = np.ones((batch_size, 1))

    # Labels for fake images: all 0
    fake = np.zeros((batch_size, 1))

    for iteration in range(iterations):

        # Take labeled samples.
        imgs, labels = dataset.batch_labeled(batch_size)

        # One-hot encode the labels.
        labels = to_categorical(labels, num_classes=num_classes)

        # Take unlabeled samples.
        imgs_unlabeled = dataset.batch_unlabeled(batch_size)

        # Creates a batch of fake images.
        z = np.random.normal(0, 1, (batch_size, z_dim))
        gen_imgs = generator.predict(z)

        # Train on labeled real samples.
        d_loss_supervised, accuracy = discriminator_supervised.train_on_batch(imgs, labels)

        # train on real, unlabeled samples.
        d_loss_real = discriminator_unsupervised.train_on_batch(
            imgs_unlabeled, real)

        # Train on fake samples.
        d_loss_fake = discriminator_unsupervised.train_on_batch(gen_imgs, fake)

        d_loss_unsupervised = 0.5 * np.add(d_loss_real, d_loss_fake)

        # Creates a batch of fake images.
        z = np.random.normal(0, 1, (batch_size, z_dim))
        gen_imgs = generator.predict(z)

        # Train the generator.
        g_loss = gan.train_on_batch(z, np.ones((batch_size, 1)))

        if (iteration + 1) % sample_interval == 0:
            display_fake_images(gen_imgs)
            display_label_images(imgs)
            # Record the supervised classification loss of the discriminator to plot the graph after training.
            supervised_losses.append(d_loss_supervised)
            iteration_checkpoints.append(iteration + 1)

            # Print the training process.
            print(
                "%d [D loss supervised: %.4f, acc.: %.2f%%] [D loss unsupervised: %.4f] [G loss: %f]"
                % (iteration + 1, d_loss_supervised, 100 * accuracy,
                   d_loss_unsupervised, g_loss))

In [None]:
iterations = 8000
batch_size = 32
sample_interval = 800

# Train the SGAN for the specified number of iterations
train(iterations, batch_size, sample_interval)

The figures above show the labeled images and fake images fed to the discriminator at specific iterations. From these inputs, the discriminator learns both to distinguish real from fake and to classify real images by class.
In this way, the SGAN can be trained effectively even with a small labeled dataset.

## Plotting the supervised learning loss of the discriminator

In [None]:
losses = np.array(supervised_losses)

plt.figure(figsize=(15, 5))
plt.plot(iteration_checkpoints, losses, label="Discriminator loss")
plt.xticks(iteration_checkpoints, rotation=90)
plt.title("Discriminator – Supervised Loss")
plt.xlabel("Iteration")
plt.ylabel("Loss")
plt.legend()
plt.show()

## Calculating classification accuracy on the training set

In [None]:
x, y = dataset.training_set()
y = to_categorical(y, num_classes=num_classes)

_, accuracy = discriminator_supervised.evaluate(x, y)
print("Training Accuracy: %.2f%%" % (100 * accuracy))

Wow! Our model completely memorized all 100 labeled examples.

## Calculating classification accuracy on the test set

In [None]:
x, y = dataset.test_set()
y = to_categorical(y, num_classes=num_classes)

_, accuracy = discriminator_supervised.evaluate(x, y)
print("Test Accuracy: %.2f%%" % (100 * accuracy))

<hr style="border: solid 3px blue;">

# Comparing with Supervised learning classifier

To put the SGAN's performance in context, we train a conventional supervised model on the same labeled data and evaluate it on the test set. This shows which model is more effective when the labeled dataset is small.

In [None]:
# Supervised learning classifier with network structure like SGAN discriminator
mnist_classifier = build_discriminator_supervised(build_discriminator_net(img_shape))
mnist_classifier.compile(loss='categorical_crossentropy',
                         metrics=['accuracy'],
                         optimizer=Adam())

In [None]:
imgs, labels = dataset.training_set()

labels = to_categorical(labels, num_classes=num_classes)

training = mnist_classifier.fit(x=imgs,
                                y=labels,
                                batch_size=32,
                                epochs=30,
                                verbose=1)
losses = training.history['loss']
accuracies = training.history['accuracy']

## Plotting Classification Loss

In [None]:
plt.figure(figsize=(10, 5))
plt.plot(np.array(losses), label="Loss")
plt.title("Classification Loss")
plt.legend()
plt.show()

## Plotting Classification Accuracy

In [None]:
plt.figure(figsize=(10, 5))
plt.plot(np.array(accuracies), label="Accuracy")
plt.title("Classification Accuracy")
plt.legend()
plt.show()

## Calculating classification accuracy on the training set

In [None]:
x, y = dataset.training_set()
y = to_categorical(y, num_classes=num_classes)

_, accuracy = mnist_classifier.evaluate(x, y)
print("Training Accuracy: %.2f%%" % (100 * accuracy))

## Calculating classification accuracy on the test set

In [None]:
x, y = dataset.test_set()
y = to_categorical(y, num_classes=num_classes)

_, accuracy = mnist_classifier.evaluate(x, y)
print("Test Accuracy: %.2f%%" % (100 * accuracy))

<hr style="border: solid 3px blue;">

# Conclusion

We used labels for only 100 images in the MNIST training set and ignored the labels of the rest.
This mimics a situation where we collected 60,000 samples but could label only 100 of them.

In this situation, one option is an SGAN, which trains the discriminator in a semi-supervised way.

Test-set performance is summarized as follows.
* Accuracy of the model trained with semi-supervised learning (SGAN): 69.95%
* Accuracy of the model trained with supervised learning only: 57.83%

With semi-supervised learning, test accuracy is about 12.12 percentage points higher.
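
The gap between the two accuracies can be expressed either as an absolute difference in percentage points or as a relative improvement over the supervised baseline. A short sketch of both calculations, using only the two accuracies reported above (variable names are hypothetical):

```python
sgan_accuracy = 69.95        # test accuracy of the SGAN discriminator, in %
supervised_accuracy = 57.83  # test accuracy of the supervised baseline, in %

# Absolute difference, in percentage points
absolute_gain = sgan_accuracy - supervised_accuracy

# Relative improvement over the supervised baseline, in %
relative_gain = 100.0 * absolute_gain / supervised_accuracy

print(f"{absolute_gain:.2f} percentage points, {relative_gain:.2f}% relative")
```

Because training is stochastic, these figures come from a single run and will vary between runs.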