## Presentation Assignment - Neural Networks, GANs, Diffusion Models, and Transformers

## Neural Networks

**Define Neural Networks**<br>

A neural network is a machine learning model inspired by the structure and function of the human brain. It's a complex system of interconnected nodes or "neurons" that process and transmit information.<br>

**How Neural Networks Work:**<br>

Imagine a neural network as a network of nodes (neurons) connected by edges (synapses). Each node receives one or more inputs, performs a computation on those inputs, and then sends the output to other nodes. This process allows the network to learn and represent complex relationships between inputs and outputs.<br>

Here's a step-by-step breakdown:<br>

**1. Input Layer:**<br>
The network receives input data, which could be images, sound waves, or any other type of data.<br>

**2. Hidden Layers:**<br>
The input data is fed into one or more hidden layers, where complex representations of the data are built. Each node in these layers applies an activation function to the weighted sum of its inputs, producing an output.<br>

**3. Output Layer:**<br>
The output of the hidden layers is fed into the output layer, which produces the final prediction or classification.<br>

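**4. Backpropagation:**<br>
During training, the network adjusts the weights and biases of the connections between nodes to minimize the error between its predictions and the actual outputs. Backpropagation computes how much each weight and bias contributed to that error (the gradient), and an optimizer such as gradient descent uses those gradients to update the parameters.<br>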
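
The steps above can be sketched with the smallest possible network: a single sigmoid neuron learning the logical OR function. This is an illustrative sketch rather than a full multi-layer implementation. The names `forward` and `train`, the learning rate, and the epoch count are assumptions chosen for the example. With only one neuron, backpropagation reduces to a single application of the chain rule.

```python
import math


def sigmoid(z):
    """Activation function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights, bias, inputs):
    """Forward pass: weighted sum of the inputs plus bias, then activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def train(samples, targets, lr=1.0, epochs=2000):
    """Full-batch gradient descent on the cross-entropy error."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, targets):
            prediction = forward(weights, bias, x)
            # For sigmoid + cross-entropy the chain rule gives
            # d(error)/d(weighted sum) = prediction - target.
            error = prediction - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi / n
            grad_b += error / n
        # Update step: move each parameter against its gradient.
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


OR_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
OR_TARGETS = [0, 1, 1, 1]

weights, bias = train(OR_INPUTS, OR_TARGETS)
predictions = [round(forward(weights, bias, x)) for x in OR_INPUTS]
print(predictions)
```

A network with hidden layers follows the same pattern, except that the error signal is passed backward through each layer in turn.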

**Key Concepts:**<br>

* **Artificial neurons (nodes):** The basic computing units of the network.<br>

* **Activation functions:** Introduce non-linearity into the network, allowing it to learn complex relationships.<br>

* **Weights and biases:** Adjustable parameters learned during training. Weights determine the strength of connections between nodes, and biases shift the point at which each node activates.<br>

* **Forward pass:** The process of propagating input data through the network to produce an output.<br>

* **Backward pass:** The process of propagating the error backward through the network to compute gradients, which are then used to adjust the weights and biases during training.<br>


**Types of Neural Networks:**<br>

* **Feedforward Networks:** Data flows only in one direction, from input to output.<br>

* **Recurrent Neural Networks (RNNs):** Data can flow in a loop, allowing the network to keep track of state over time.<br>

* **Convolutional Neural Networks (CNNs):** Designed for image and signal processing, using convolutional and pooling layers.<br>
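
To make the difference between these types concrete, here is a minimal sketch of the core operation behind a convolutional layer and behind a recurrent layer. The function names and the weight values (`w_x`, `w_h`, `b`) are hypothetical, picked only for illustration; real layers learn these values and work on vectors or tensors rather than single numbers.

```python
import math


def conv1d(signal, kernel):
    """Slide a small kernel over the signal (no padding), as a CNN layer does."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: the new state depends on the input AND the old state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the recurrent unit, carrying state forward."""
    h = 0.0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# The kernel [-1, 1] responds where the signal rises or falls (a simple edge detector).
edges = conv1d([0, 0, 1, 1, 1, 0], [-1, 1])
print(edges)

# The input is non-zero only at the first step, yet the state stays non-zero
# afterwards: the loop lets the network remember earlier inputs.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```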


**Real-World Applications:**<br>

* **Image recognition and classification**<br>

* **Natural language processing (NLP) tasks**<br>

* **Speech recognition**<br>

* **Game playing (e.g., AlphaGo)**<br>

* **Autonomous vehicles**<br>

## Generative Adversarial Networks (GANs)

**Define GANs**<br>

A Generative Adversarial Network (GAN) is a class of neural network used for unsupervised learning. A GAN is made up of two neural networks, **a discriminator and a generator.** They use adversarial training to produce artificial data that closely resembles real data.<br>

* The Generator turns random noise samples into synthetic data and attempts to fool the Discriminator, which is tasked with accurately distinguishing between generated and genuine data.<br>

* This competitive interaction drives both networks to improve, and the result is realistic, high-quality samples.<br>

**Architecture of GANs**<br>

The name GAN can be broken down into three parts:<br>

* **Generative:** The model learns a generative model, which describes how data is generated in terms of a probabilistic model.<br>

* **Adversarial:** The word adversarial refers to setting one thing up against another. In the context of GANs, the generated result is compared with the actual images in the data set. A model known as a discriminator attempts to distinguish between real and fake images.<br>

* **Networks:** Both the generator and the discriminator are implemented as deep neural networks, which are trained against each other.<br>

**How does a GAN work?**<br>

A GAN consists of two neural networks: a **Generator G(z)**, which maps a random noise sample z to a data sample, and a **Discriminator D(x)**, which estimates whether a sample x is real. The two play an adversarial game. The generator's aim is to fool the discriminator by producing data similar to the data in the training set. The discriminator tries not to be fooled by distinguishing fake data from real data. Both are trained simultaneously, which lets the pair model complex data such as audio, video, or image files.<br>

The Generator network takes a random noise sample and generates a fake sample of data. The Generator is trained to increase the probability that the Discriminator makes a mistake.<br>
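
The adversarial game can be sketched as two loss functions computed from the Discriminator's outputs. This is a minimal sketch assuming the commonly used binary cross-entropy formulation with the "non-saturating" generator loss. The function names and the example scores are hypothetical, and a real GAN would compute these scores with trained networks.

```python
import math


def discriminator_loss(d_real, d_fake):
    """D is rewarded for scoring real samples near 1 and fake samples near 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """G is rewarded when D scores its fake samples near 1 (D is fooled)."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Hypothetical Discriminator outputs: probabilities that a sample is real.
d_real = [0.9, 0.8]           # scores on genuine samples
d_fake_caught = [0.1, 0.2]    # D correctly rejects the fakes
d_fake_fooling = [0.9, 0.8]   # D is fooled by the fakes

print(discriminator_loss(d_real, d_fake_caught))   # low: D is doing well
print(discriminator_loss(d_real, d_fake_fooling))  # high: D is being fooled
print(generator_loss(d_fake_caught))               # high: G is failing
print(generator_loss(d_fake_fooling))              # low: G is succeeding
```

Each network's gain is the other's loss, which is why training alternates between updating the Discriminator and updating the Generator.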

**Types of GANs**<br>

There are several types of GANs, including:<br>

* **Vanilla GAN:** This is the simplest type of GAN. Here, the Generator and the Discriminator are basic multi-layer perceptrons.<br>

* **Conditional GAN (CGAN):** A GAN in which extra conditioning information, such as a class label, is fed to both the Generator and the Discriminator so that the kind of output can be controlled.<br>

* **Deep Convolutional GAN (DCGAN):** DCGANs use convolutional neural networks instead of multi-layer perceptrons for both the Discriminator and the Generator.<br>

**Application Of Generative Adversarial Networks (GANs)**<br>

GANs have many uses in many different fields. Here are some of the widely recognized uses of GANs:<br>

* **Image Synthesis and Generation:** GANs are often used for image synthesis and generation tasks. By learning the distribution that underlies the dataset, they can create new, lifelike images that resemble the training data.<br>

* **Image-to-Image Translation:** GANs may be used for problems involving image-to-image translation, where the objective is to convert an input picture from one domain to another while maintaining its key features.<br>

* **Text-to-Image Synthesis:** GANs have been used to create visuals from descriptions in text. Given a text input, such as a phrase or a caption, a GAN can produce images that match the description.<br>

* **Data Augmentation:** GANs can augment existing data and increase the robustness and generalizability of machine-learning models by creating synthetic data samples.<br>

* **Super-Resolution:** GANs can enhance the resolution and quality of low-resolution images.<br>

**Advantages of GAN**<br>

* **Flexibility:** GANs can be used for a wide range of applications, including image synthesis, data augmentation, and text-to-image synthesis.<br>

* **High-quality results:** GANs are capable of producing high-quality, realistic results that are often indistinguishable from real data.<br>

* **Improved performance:** GANs can improve the performance of machine learning models by providing them with more diverse and realistic training data.<br>


**Disadvantages of GAN**<br>

* **Training instability:** GANs can be difficult to train, and it's not uncommon for the training process to become unstable.<br>

* **Mode collapse:** GANs can suffer from mode collapse, where the generator produces limited variations of the same output.<br>

* **Lack of interpretability:** GANs can be difficult to interpret, making it challenging to understand why the generator is producing certain outputs.<br>

## Diffusion Models

**Define Diffusion Models**<br>

Diffusion models are a class of generative models in artificial intelligence that have revolutionized how we create and manipulate digital data. They are inspired by non-equilibrium thermodynamics and define a Markov chain of diffusion steps to slowly add random noise to the data and then learn to reverse the diffusion process to construct desired data samples from the noise.<br>

**Forward Diffusion Process**<br>

The forward diffusion process is a process of turning an image into noise. It consists of multiple steps, where at each step, a small amount of noise is added to the input image using a schedule. The schedule decides how much noise is added at the given step t.<br>
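
The forward process can be sketched in a few lines. This is a minimal sketch under stated assumptions: a linear noise schedule with 1000 steps and values from 0.0001 to 0.02 (a common choice in DDPM-style models, not something this document specifies), Gaussian noise, and a toy four-value "image". The function names are hypothetical. A useful property of this formulation is that the noisy sample at any step t can be drawn directly from the clean input.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """The schedule: how much noise variance is added at each step."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar[t] = product of (1 - beta) up to step t: the signal that survives."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal_scale * v + noise_scale * e for v, e in zip(x0, noise)]
    return x_t, noise


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

rng = random.Random(0)
pixels = [0.2, -0.5, 0.9, 0.0]  # toy "image" of four pixel values
slightly_noisy, _ = add_noise(pixels, 10, alpha_bars, rng)
almost_pure_noise, _ = add_noise(pixels, 999, alpha_bars, rng)

print(alpha_bars[10], alpha_bars[999])
```

Early in the schedule almost all of the original signal survives, while at the final step almost none does, so the sample is close to pure noise.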

**Reverse Diffusion Process**<br>

The reverse diffusion process turns noise into an image. It consists of multiple steps, and at each step a small amount of noise is removed from the current noisy input. The model learns this process by predicting the total noise present at step t, rather than only the noise added between step t-1 and step t.<br>
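
Why predicting the noise is enough can be shown with a small sketch. It assumes the common DDPM-style relation `x_t = sqrt(a) * x0 + sqrt(1 - a) * noise`, where `a` is the fraction of signal remaining at step t. The value `alpha_bar = 0.3` and the function names are hypothetical. The true noise stands in for a perfect noise-prediction network, which no trained model actually is.

```python
import math
import random


def noisy_sample(x0, alpha_bar, rng):
    """Forward process: mix the clean signal with Gaussian noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [
        math.sqrt(alpha_bar) * v + math.sqrt(1.0 - alpha_bar) * e
        for v, e in zip(x0, noise)
    ]
    return x_t, noise


def estimate_clean(x_t, predicted_noise, alpha_bar):
    """Solve the forward relation for x0, given an estimate of the noise."""
    return [
        (v - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
        for v, e in zip(x_t, predicted_noise)
    ]


rng = random.Random(0)
x0 = [0.2, -0.5, 0.9]
alpha_bar = 0.3  # hypothetical signal fraction at some step t

x_t, true_noise = noisy_sample(x0, alpha_bar, rng)

# A trained network would supply the noise estimate. Using the true noise
# simulates a perfect predictor, so the clean signal comes back exactly.
recovered = estimate_clean(x_t, true_noise, alpha_bar)
print(recovered)
```

In practice the prediction is imperfect, which is one reason the reverse process removes noise gradually over many steps instead of jumping to the clean image at once.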

**How Diffusion Models Work**<br>

Diffusion models work by learning the parameters of a denoising network that reverses the noising steps. Training optimizes a loss function that measures how well the model's prediction matches the noise that was actually added. At generation time, this lets the model transform samples from a simple distribution (random noise) into ones that closely resemble the complex data distribution.<br>
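
One widely used form of this loss function, the simplified objective from DDPM-style models, is shown below as an illustration. The notation is an assumption, since this document does not define symbols: x_0 is a clean training sample, ε is Gaussian noise, ᾱ_t is the fraction of signal remaining at step t, and ε_θ is the network that predicts the noise.

```latex
L_{\text{simple}}(\theta)
  = \mathbb{E}_{t,\, x_0,\, \epsilon}
    \left[
      \left\| \epsilon - \epsilon_\theta\!\left(
        \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,\; t
      \right) \right\|^2
    \right]
```

In words: pick a random step, add a known amount of noise to a training sample, ask the network to guess that noise, and penalize the squared difference.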

**Applications of Diffusion Models**<br>

Diffusion models have diverse applications across several domains, such as:<br>

* Text-to-video synthesis<br>

* Image-to-image translation<br>

* Image search<br>

* Reverse image search<br>

Popular models include Stable Diffusion, DALL-E 2, and Imagen.<br>

**Benefits of Using Diffusion Models**<br>

Diffusion models offer advantages over other generative models such as GANs (Generative Adversarial Networks) and VAEs (Variational Autoencoders). These benefits stem from their approach to data generation: because they are trained with a denoising objective rather than an adversarial game, they tend to avoid GAN problems such as training instability and mode collapse. They are adept at generating high-quality images with fine details and realistic textures.<br>

## Transformers

**Define Transformers**<br>

Transformers in Generative AI (GEN AI) are a type of neural network architecture that has revolutionized the field of natural language processing and other machine learning tasks. They are designed to track relationships between chunks of data, such as words in a sentence, and use this information to generate new data that is coherent and natural-sounding.<br>

**What is GEN AI?**<br>

GEN AI is a subset of Deep learning (a subset of Machine learning). It uses artificial neural networks and can process labeled and unlabeled data using supervised, unsupervised, and semi-supervised methods.<br>

**What is a Transformer Model?**<br>

A transformer model is a neural network that learns context, and thus meaning, by tracking relationships in sequential data like the words in a sentence. Transformer models apply an evolving set of mathematical techniques, called attention or self-attention, to detect the subtle ways in which even distant data elements in a series influence and depend on each other.<br>

**How does the Transformer work?**<br>

At a high level, a transformer model consists of an encoder and a decoder. The encoder converts input text into an intermediate representation and passes it to the decoder, which converts that intermediate representation into useful text. Earlier word-embedding models could represent words as vectors, but each word received the same vector regardless of the surrounding context. The Transformer captures context through its self-attention mechanism, which allows the model to weigh the importance of different positions in the input sequence when processing each position. In other words, the model can focus on the most relevant parts of the input, similar to how we pay attention to specific details when reading or listening.<br>
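
The weighing described above can be sketched as scaled dot-product attention. This is a simplified sketch, not a full transformer layer: the function names and the three toy token vectors are hypothetical, and the same vectors are reused as queries, keys, and values, whereas a real model derives each from learned projections.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(queries, keys, values):
    """For each position, mix all value vectors according to query-key similarity."""
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        mixed = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Three toy token vectors standing in for word embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)

for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` shows how strongly one position attends to every position in the sequence, and each output vector is the correspondingly weighted blend of the inputs.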

**What Can Transformer Models Do?**<br>

Transformer models are translating text and speech in near real-time, opening meetings and classrooms to diverse and hearing-impaired attendees. They’re helping researchers understand the chains of genes in DNA and amino acids in proteins in ways that can speed drug design. They can detect trends and anomalies to prevent fraud, streamline manufacturing, make online recommendations or improve healthcare. People use transformers every time they search on Google or Microsoft Bing.<br>

**The Virtuous Cycle of Transformer AI**<br>

Any application using sequential text, image or video data is a candidate for transformer models. That enables these models to ride a virtuous cycle in transformer AI. Created with large datasets, transformers make accurate predictions that drive their wider use, generating more data that can be used to create even better models.<br>

**Transformers Replace CNNs, RNNs**<br>

Transformers are in many cases replacing convolutional and recurrent neural networks (CNNs and RNNs), the most popular types of deep learning models just five years ago. Indeed, 70 percent of arXiv papers on AI posted in the last two years mention transformers. That’s a radical shift from a 2017 IEEE study that reported RNNs and CNNs were the most popular models for pattern recognition.<br>

**No Labels, More Performance**<br>

Before transformers arrived, users had to train neural networks with large, labeled datasets that were costly and time-consuming to produce. By finding patterns between elements mathematically, transformers eliminate that need, making available the trillions of images and petabytes of text data on the web and in corporate databases. In addition, the math that transformers use lends itself to parallel processing, so these models can run fast.<br>

**How Transformers Pay Attention**<br>

Like most neural networks, transformer models are basically large encoder/decoder blocks that process data. Small but strategic additions to these blocks make transformers uniquely powerful. Transformers use positional encoders to tag data elements coming in and out of the network. Attention units follow these tags, calculating a kind of algebraic map of how each element relates to the others. Attention queries are typically executed in parallel by calculating a matrix of equations in what’s called multi-headed attention. With these tools, computers can see the same patterns humans see.<br>
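
One way to "tag" positions is the sinusoidal scheme used in the original transformer paper, sketched below. Treat it as one illustrative option: many models learn their position tags instead, and the function name and the small sizes used here are assumptions for the example.

```python
import math


def positional_encoding(num_positions, d_model):
    """Build one vector per position from sines and cosines of varying frequency."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(d_model):
            # Each pair of dimensions (2k, 2k+1) shares one frequency.
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


encodings = positional_encoding(num_positions=4, d_model=6)

for pos, row in enumerate(encodings):
    print(pos, [round(v, 3) for v in row])
```

Every position gets a distinct vector, which is added to the token's embedding so that attention can tell "first word" from "fifth word" even though it processes all positions in parallel.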

**Self-Attention Finds Meaning**<br>

For example, in the sentence “She poured water from the pitcher to the cup until it was full,” we know “it” refers to the cup, while in the sentence “She poured water from the pitcher to the cup until it was empty,” we know “it” refers to the pitcher. Self-attention is what lets a transformer resolve this kind of reference from the surrounding words.<br>

**Beyond the Horizon**<br>

Ashish Vaswani, a co-author of the 2017 paper “Attention Is All You Need” that introduced the transformer, imagines a future where self-learning, attention-powered transformers approach the holy grail of AI. “We have a chance of achieving some of the goals people talked about when they coined the term ‘general artificial intelligence’ and I find that north star very inspiring,” he said. “We are in a time where simple methods like neural networks are giving us an explosion of new capabilities.”<br>