<a name='0'></a>

# Introduction to Deep Generative Networks

Deep generative networks are neural network architectures used to generate, or synthesize, data samples such as images and text. Although most applications of generative models are still in research, they are also used in real-world applications, so they are worth studying. The same models that generate text can often be modified slightly to generate images, but in this notebook we will focus on the algorithms and applications of generative models in the computer vision domain.

***Outline***:

- [1. Introducing generative models: supervised and unsupervised learning, generative and discriminative models](#1)
- [2. Types of generative models](#2)
- [3. Applications of Generative Models](#3)
- [4. Recent Breakthroughs in Generative Modelling](#4)
- [5. Final Notes](#5)
- [6. Further Learning](#6)

<a name='1'></a>

## 1. Introducing Generative Networks: Supervised and Unsupervised Learning, Generative and Discriminative Models

Before we unpack generative models, it's important to understand some foundational terms. Broadly, there are two main types of machine learning algorithms: supervised and unsupervised learning. There are also two kinds of models: discriminative models and generative models. Let's see what those terms mean!

### 1.1 Supervised Learning and Unsupervised Learning

Most machine learning problems are supervised learning problems. In supervised learning, we have input data and labels, and we want a learning algorithm that can map the data to the labels. If our input data and labels are represented by `X` and `y` respectively, a supervised learning algorithm maps `X` to `y`. Examples of popular supervised learning tasks are image classification (image → category label), object detection (image → category label + bounding box), and semantic segmentation (image → segmentation mask).

[[IMAGE CLASSIFICATION]]

Supervised learning works when you have labels. But in the real world, most large-scale datasets don't have labels (or, framed differently, labels are hard to get). What do you do when you don't have labels? Unsupervised learning is a type of machine learning that deals with unlabelled data. Unsupervised learning algorithms learn the underlying structure of the data. Examples of unsupervised learning tasks are [clustering](https://nyandwi.com/machine_learning_complete/22_intro_to_unsupervised_learning_with_kmeans_clustering/), [dimension reduction](https://nyandwi.com/machine_learning_complete/23_a_practical_intro_to_principal_components_analysis/), and feature learning.
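To make the contrast concrete, here is a minimal sketch on made-up 1-D data. The helper names (`fit_supervised`, `predict`, `fit_two_means`) are hypothetical and chosen only for illustration: the supervised learner uses the labels `y` to map `X` to `y`, while the unsupervised one (a tiny two-cluster k-means) only ever sees `X`.

```python
import random

random.seed(0)

# Made-up 1-D data: two groups of points centred at 0.0 and 5.0.
X = [random.gauss(0.0, 0.5) for _ in range(50)] + [random.gauss(5.0, 0.5) for _ in range(50)]
y = [0] * 50 + [1] * 50  # labels, seen only by the supervised learner


def fit_supervised(X, y):
    """Supervised: use the labels to compute one mean per class."""
    means = {}
    for label in set(y):
        points = [x for x, t in zip(X, y) if t == label]
        means[label] = sum(points) / len(points)
    return means


def predict(means, x):
    """Map an input x to the label whose class mean is closest."""
    return min(means, key=lambda label: abs(x - means[label]))


def fit_two_means(X, steps=10):
    """Unsupervised: find two cluster centres without ever seeing labels."""
    centres = [min(X), max(X)]  # simple deterministic initialisation
    for _ in range(steps):
        clusters = [[], []]
        for x in X:
            nearest = 0 if abs(x - centres[0]) <= abs(x - centres[1]) else 1
            clusters[nearest].append(x)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centres = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
    return centres


class_means = fit_supervised(X, y)
centres = fit_two_means(X)
print("supervised prediction for x=4.7:", predict(class_means, 4.7))
print("unsupervised cluster centres:", [round(c, 2) for c in centres])
```

Both learners end up discovering the same two groups here, but only the supervised one knows which group is called `0` and which is called `1`.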

Most generative models are unsupervised, but classifying them strictly as supervised or unsupervised can be tricky. For example, generative adversarial networks (GANs) don't need labelled data to generate new data samples, but recent successes in generative modelling revolve around [text-to-image mapping](https://openai.com/blog/dall-e/), which pairs images with text. Thankfully, there is a term for generative models that use labels to guide generation: conditional generative models. We will cover those models in detail in the next notebooks.

### 1.2 Discriminative and Generative Models

Discriminative models learn the probability distribution of the labels given the input data, P(y|x), which is read as "the probability of y given x". Most supervised learning tasks can be viewed through the discriminative lens.

Generative models, on the other hand, learn the distribution of the input data, P(x). Generative models typically don't use labels. They just take input data and learn its distribution in order to generate new data samples that follow the same distribution. As we saw previously, generative models that do make use of labels are called conditional generative models. An example of a conditional generative task is text-to-image generation, the task of generating image samples conditioned on text.
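The difference can be sketched in a few lines of plain Python on made-up 1-D data. This is only an illustration under simplifying assumptions, not how deep generative networks are built: the discriminative model is a logistic regression for P(y|x), and the generative side is one Gaussian per class plus a class prior, so it can sample from P(x) or, when given a label, from P(x|y) like a conditional generative model. All function names are hypothetical.

```python
import math
import random
import statistics

random.seed(0)

# Made-up 1-D data: class 0 centred at -2, class 1 centred at +2.
X = [random.gauss(-2.0, 1.0) for _ in range(200)] + [random.gauss(2.0, 1.0) for _ in range(200)]
y = [0] * 200 + [1] * 200


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_discriminative(X, y, lr=0.1, epochs=200):
    """Logistic regression: models P(y=1 | x) = sigmoid(w * x + b)."""
    w, b = 0.0, 0.0
    n = len(X)
    for _ in range(epochs):
        # Gradient of the average negative log-likelihood.
        grad_w = sum((sigmoid(w * x + b) - t) * x for x, t in zip(X, y)) / n
        grad_b = sum((sigmoid(w * x + b) - t) for x, t in zip(X, y)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def fit_generative(X, y):
    """Models the data itself: one Gaussian per class plus a class prior."""
    model = {}
    for label in sorted(set(y)):
        points = [x for x, t in zip(X, y) if t == label]
        model[label] = {
            "prior": len(points) / len(X),
            "mean": statistics.fmean(points),
            "std": statistics.pstdev(points),
        }
    return model


def sample(model, label=None):
    """Draw a new x from P(x) if label is None, otherwise from P(x | y=label)."""
    if label is None:
        labels = list(model)
        weights = [model[k]["prior"] for k in labels]
        label = random.choices(labels, weights=weights)[0]
    return random.gauss(model[label]["mean"], model[label]["std"])


w, b = fit_discriminative(X, y)
model = fit_generative(X, y)
print(f"P(y=1 | x=3.0) = {sigmoid(w * 3.0 + b):.3f}")
print(f"new sample from P(x): {sample(model):.2f}")
print(f"new sample from P(x | y=1): {sample(model, label=1):.2f}")
```

The discriminative model can only answer questions about labels, whereas the generative model can produce new data points that were never in the training set.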

<a name='2'></a>

## 2. Types of Generative Models

Many generative models are based on maximum likelihood estimation. As the [famous tutorial on generative adversarial networks](https://arxiv.org/abs/1701.00160) stated, "the basic idea of maximum likelihood is to define a model that provides an estimate of a probability distribution, parameterized by parameters θ. The principle of maximum likelihood simply says to choose the parameters for the model that maximize the likelihood of the training data."
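As a small worked example of that principle, the sketch below assumes the model is a 1-D Gaussian whose parameters θ are a mean and a standard deviation, and uses made-up training data. For this particular model the maximum likelihood parameters have a closed form (the sample mean and the population standard deviation), so we can check that moving away from them lowers the log-likelihood of the training data.

```python
import math
import random
import statistics

random.seed(0)

# Made-up training data drawn from a Gaussian with mean 3 and std 2.
data = [random.gauss(3.0, 2.0) for _ in range(1000)]


def log_likelihood(data, mean, std):
    """Sum of log p(x; mean, std) over the training data for a Gaussian model."""
    const = -0.5 * math.log(2.0 * math.pi) - math.log(std)
    return sum(const - 0.5 * ((x - mean) / std) ** 2 for x in data)


# Closed-form maximum likelihood estimates for a Gaussian.
mle_mean = statistics.fmean(data)
mle_std = statistics.pstdev(data)

best = log_likelihood(data, mle_mean, mle_std)
shifted = log_likelihood(data, mle_mean + 0.5, mle_std)
wider = log_likelihood(data, mle_mean, mle_std * 1.2)

print(f"MLE parameters: mean={mle_mean:.2f}, std={mle_std:.2f}")
print(f"log-likelihood at the MLE: {best:.1f}")
print(f"with a shifted mean: {shifted:.1f}, with a wider std: {wider:.1f}")
```

Deep generative models rarely have a closed-form solution like this one, so in practice the log-likelihood (or a bound on it) is maximized with gradient-based optimization instead.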

Broadly, there are two main types of generative models: explicit density models and implicit density models. Explicit density models can compute the density function explicitly, whereas implicit density models don't compute the density function or likelihood value explicitly. In other words, explicit models evaluate the likelihood directly, while implicit models only offer a way to sample from a distribution that approximates the training data distribution. The image below shows the taxonomy of generative models.


![image](https://drive.google.com/uc?export=view&id=1Qz5Ymeon1qmnnSmnAe1jDLkr2qHGxPxt)
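The split between explicit and implicit density models can be sketched as a difference in what each model lets you do. In the toy example below (both classes are hypothetical and only for illustration), the explicit model exposes a `log_prob` method, while the implicit one only turns random noise into samples, which is the pattern GAN generators follow. The transformation inside `ImplicitSampler` is an arbitrary stand-in for a learned network.

```python
import math
import random

random.seed(0)


class ExplicitGaussian:
    """Explicit density model: we can evaluate log p(x) and also sample."""

    def __init__(self, mean, std):
        self.mean, self.std = mean, std

    def log_prob(self, x):
        z = (x - self.mean) / self.std
        return -0.5 * z * z - math.log(self.std) - 0.5 * math.log(2.0 * math.pi)

    def sample(self):
        return random.gauss(self.mean, self.std)


class ImplicitSampler:
    """Implicit density model: noise goes in, a sample comes out.

    No density value is ever evaluated; the model is only a sampling procedure.
    """

    def sample(self):
        z = random.gauss(0.0, 1.0)  # latent noise
        # Arbitrary fixed transformation standing in for a learned generator.
        return math.tanh(2.0 * z) + 0.1 * z ** 3


explicit = ExplicitGaussian(mean=0.0, std=1.0)
implicit = ImplicitSampler()

print(f"explicit model, log p(0.0) = {explicit.log_prob(0.0):.3f}")
print(f"explicit model sample: {explicit.sample():.3f}")
print(f"implicit model sample: {implicit.sample():.3f}")
```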

Generative modelling is a very broad topic, so we won't cover every model shown in the above taxonomy. Instead, we will cover models that are used in practice today: autoregressive models, variational autoencoders (VAEs), generative adversarial networks (GANs), and diffusion models. Those four types of generative models will be covered one by one in subsequent notebooks.


<a name='3'></a>

## 3. Applications of Generative Models

Generative models have been the highlight of AI research in 2022 and prior years. The idea of generating new things, sometimes things that never existed, is fascinating. But why should we care about generative models at all?

Generative models have numerous applications. Let's see some of them:

* Creating new image datasets: Generative models can be used to synthesize image datasets for training machine learning models, and there are many examples of work that attempts to create new datasets using generative networks. Synthetic datasets are most popular in fields like medicine, where it is hard to get datasets due to cost (time and expertise) and privacy concerns. Here is a [paper](https://arxiv.org/abs/2104.11797) that discusses how to generate new datasets using generative adversarial networks (GANs). As another recent example, a group of researchers used diffusion models (a kind of generative model) to [generate synthetic images from high-resolution 3D brain images](https://arxiv.org/abs/2209.07162). The [dataset](https://academictorrents.com/details/63aeb864bbe2115ded0aa0d7d36334c026f0660b) they generated contains 100K samples. Another related use case of generative models is data augmentation. Rather than generating entirely new datasets, we can use generative models to augment existing datasets.

![image](https://drive.google.com/uc?export=view&id=1mrdV5LBDElse6ntUNfLqRFgDERN-std5)

* Creating photo-realistic images: Generative models can be used to generate images that look real but never existed. Image generation is probably one of the areas that advanced the most in 2022. We have witnessed AI systems (such as [DALL·E 2](https://openai.com/dall-e-2/), [Stable Diffusion](https://github.com/CompVis/stable-diffusion), and [Imagen](https://imagen.research.google), to mention a few) that can generate images with striking photorealism. Modern generative models are typically conditioned on text (or prompted by text), which gives them the ability to generate almost any image you can think of. In upcoming notebooks, we will dive deep into those models. To see how good image generative models have become, browse some fun and amusing images and art on [LEXICA, a search engine for Stable Diffusion](https://lexica.art).

* Image-to-image translation: Image-to-image translation is one of the fascinating applications of generative networks. The idea is to map images from one domain to images of another domain. It is used in various applications such as style transfer, photo enhancement, and image colorization. If you would like to learn more about image-to-image translation, check [this](https://paperswithcode.com/task/image-to-image-translation), [this](https://arxiv.org/abs/2101.08629), and [this](https://phillipi.github.io/pix2pix/).

* Image super-resolution: Image super-resolution is another interesting application of generative networks that involves reconstructing a high-resolution image from a low-resolution image. It is often treated as a form of image restoration. There are areas, such as surveillance and medical imaging, where image resolution is low and needs to be increased. For a comprehensive study of image super-resolution, check these [papers with code](https://paperswithcode.com/task/image-super-resolution) and the recent [Real-ESRGAN](https://www.youtube.com/watch?v=fxHWoDSSvSc). You can also play with Real-ESRGAN on this [Gradio demo](https://huggingface.co/spaces/akhaliq/Real-ESRGAN). This [article](https://blog.paperspace.com/image-super-resolution/) also covers image super-resolution in great detail.

![image](https://drive.google.com/uc?export=view&id=1wLSs2Kz7sSMr-ZeprgHgjufE35UCemCa)

The applications listed above are not meant to be exhaustive. There are many other applications of generative models, and people are getting increasingly creative in using AI for artistic generation. Let's see some of the recent breakthroughs in generative modelling.

<a name='4'></a>

## 4. Recent Breakthroughs in Generative Modelling

Generative modelling is probably one of the fields that has advanced most significantly in recent times. When [generative adversarial networks (GANs)](https://arxiv.org/abs/1406.2661) were introduced in 2014, they were revolutionary. People could hardly believe that AI could generate new data samples. GANs made it possible to generate [things that never existed](https://thisxdoesnotexist.com) before, such as [faces](https://thispersondoesnotexist.com), [cats](https://thiscatdoesnotexist.com), [dogs](https://mv-lab.github.io/thisdogdoesnotexist/), [rental rooms](https://thisrentaldoesnotexist.com), and [chairs](https://thischairdoesnotexist.com).

The idea of generating things that don't exist is very intriguing. Generative models have gone from generating passable images to generating remarkably photorealistic ones. Recently, there has been significant progress in models that generate images conditioned on text. One of the earliest models to show remarkable photorealism is [DALL·E 2](https://openai.com/dall-e-2/) from OpenAI. Within a few months, other powerful text-to-image models such as [Imagen](https://imagen.research.google) and [Stable Diffusion](https://github.com/CompVis/stable-diffusion) were introduced. Those are just the most notable recent text-to-image generative models. There are many others.

The specific kind of models largely used in recent generative networks are diffusion models. Diffusion models have been shown to work well not only in image generation but in video generation as well. Recently, Meta AI researchers released [Make-A-Video](https://makeavideo.studio), an AI system that can generate videos from text. A few days later, Google researchers extended their existing Imagen model into [Imagen Video](https://imagen.research.google/video/), a system that can generate high-resolution videos from text prompts. Diffusion models are also being applied beyond image and video generation, in areas such as NLP and graph data.

Things are advancing fast in generative modelling. In less than a year (2022), we saw DALL·E 2, Imagen, Stable Diffusion, Make-A-Video, and Imagen Video, to name a few. GANs, which were once the go-to architecture in image generation, are now [generally antiquated networks](https://twitter.com/rasbt/status/1578374879229145088?s=20&t=os3aSU65FUzVN1hWRFqZYg)! Diffusion models, which were little known a year before 2022, are now mainstream.

<a name='5'></a>

## 5. Final Notes

This was a short introduction to generative modelling. We saw the difference between supervised and unsupervised learning, the difference between discriminative and generative models, the types of generative models, and their applications and recent breakthroughs. In the next notebooks, we will look at the four most popular types of generative models: autoregressive models, variational autoencoders (VAEs), generative adversarial networks (GANs), and the latest diffusion models.

<a name='6'></a>

## 6. Further Learning

Below are some resources on generative modelling if you would like to learn more:

* [Lecture 19 - Generative Models I - Justin Johnson Michigan](https://www.youtube.com/watch?v=Q3HU2vEhD5Y&list=PL5-TkQAfAZFbzxjBHtzdVCWE0Zbhomg7r&index=19)

* [Overview of Generative Models on Papers with Code](https://paperswithcode.com/methods/category/generative-models)

### [BACK TO TOP](#0)