# Comparison of Different Types of Neural Networks
In this notebook, we'll explore and compare several neural network architectures and training paradigms: dense neural networks (DNNs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), residual neural networks (ResNets), autoencoders, transformers, generative adversarial networks (GANs), and reinforcement learning (RL).

## 1. Dense Neural Network (DNN)

### Description
Dense Neural Networks, also known as fully connected networks, consist of layers where each neuron is connected to every neuron in the previous and subsequent layers. They are simple yet powerful for many types of tasks.

### Type of Data
DNNs are best suited for tabular or otherwise structured data whose features have no spatial or temporal ordering for the network to exploit.

### Use Cases
- Classification tasks (e.g., predicting the category of an iris plant)
- Regression tasks (e.g., predicting house prices)
- General-purpose machine learning tasks

### Training Process
The training process involves feeding input data through the network, calculating the loss using a loss function, and optimizing the network weights using backpropagation and gradient descent.
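
To make that loop concrete, here is a minimal pure-Python sketch that trains a single sigmoid neuron (the smallest possible dense layer) with gradient descent. The toy data, the learning rate, and the `train_neuron` helper are illustrative assumptions and are not part of the Keras model used in this notebook.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(samples, labels, lr=0.5, epochs=200):
    """Train one sigmoid neuron with batch gradient descent on binary cross-entropy."""
    n_features = len(samples[0])
    n = len(samples)
    weights = [0.0] * n_features
    bias = 0.0
    losses = []
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        loss = 0.0
        for x, y in zip(samples, labels):
            # Forward pass: weighted sum followed by the activation
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            # Backward pass: for sigmoid + cross-entropy, d(loss)/dz = p - y
            error = p - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        # Gradient descent step on the averaged gradients
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
        losses.append(loss / n)
    return weights, bias, losses

# Hypothetical, linearly separable toy data: the label is 1 when the first feature is large
toy_X = [[0.0, 1.0], [0.2, 0.8], [0.9, 0.1], [1.0, 0.3]]
toy_y = [0, 0, 1, 1]
weights, bias, losses = train_neuron(toy_X, toy_y)
print(f"first loss: {losses[0]:.3f}, last loss: {losses[-1]:.3f}")
```

Keras performs the same three steps (forward pass, loss, gradient update) for every batch inside `model.fit`, using automatic differentiation instead of a hand-derived gradient.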


In [2]:
# Importing necessary libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load and preprocess the data
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

input_shape = X_train.shape[1]
num_classes = len(set(y))

# Define the model
model = Sequential([
    Dense(64, activation='relu', input_shape=(input_shape,)),
    Dense(64, activation='relu'),
    Dense(num_classes, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)





## 2. Convolutional Neural Network (CNN)
### Description
Convolutional Neural Networks are specialized for processing grid-like data, such as images. They use convolutional layers to automatically detect spatial hierarchies in data.

### Type of Data
CNNs are ideal for image data, where spatial relationships between pixels are significant.

### Use Cases
- Image classification (e.g., classifying objects in CIFAR-10 images)
- Object detection and segmentation
- Image generation

### Training Process
In the forward pass, the input is convolved with learned filters, passed through activation functions, and downsampled by pooling layers to reduce dimensionality. The filter weights are trained with backpropagation, so the network learns to detect edges, textures, and higher-level patterns.
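
The convolution, activation, and pooling steps can be sketched in a few lines of pure Python. The example below is only an illustration: the 5x5 image and the vertical-edge filter are made up by hand, whereas a trained CNN would learn its filters from data.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the operation used by convolutional layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j] for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Hypothetical 5x5 image with a vertical edge between columns 1 and 2
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-written vertical-edge filter (an assumption for illustration)
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

features = relu(conv2d(image, kernel))  # 3x3 feature map
pooled = max_pool2d(features)           # 1x1 map after 2x2 pooling
print(features)  # [[3, 3, 0], [3, 3, 0], [3, 3, 0]]
print(pooled)    # [[3]]
```

The filter responds strongly where the window straddles the edge and not at all inside the uniform region, which is the kind of local pattern detection the `Conv2D` layers in the model below learn on their own.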

In [3]:
# Importing necessary libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Load and preprocess the data
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.cifar10.load_data()
X_train, X_test = X_train / 255.0, X_test / 255.0

img_height, img_width = X_train.shape[1], X_train.shape[2]
num_classes = len(set(y_train.flatten()))

# Define the model
model = Sequential([
    Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(img_height, img_width, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(num_classes, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_test, y_test))



## 3. Recurrent Neural Network (RNN)
### Description
Recurrent Neural Networks are designed to handle sequential data by maintaining a hidden state that captures information from previous time steps.

### Type of Data
RNNs are suited for sequential data such as time series, text, and any data where order matters.

### Use Cases
- Text generation and classification
- Language translation
- Time series forecasting

### Training Process
RNNs process data one step at a time, updating their hidden state. Training involves backpropagation through time (BPTT) to handle dependencies across different time steps.
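
The hidden-state update can be illustrated with a one-unit vanilla RNN in pure Python. This is a simplified sketch: the weights are picked by hand rather than learned, and the notebook's model uses an LSTM, which adds gating on top of the same recurrence idea.

```python
import math

def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a one-unit vanilla RNN over a sequence and return every hidden state."""
    h = h0
    states = []
    for x_t in inputs:
        # The new state mixes the current input with the previous hidden state
        h = math.tanh(w_x * x_t + w_h * h + b)
        states.append(h)
    return states

# Hypothetical hand-picked weights; training would learn them via BPTT
sequence = [1.0, 0.0, 0.0]
states = rnn_forward(sequence, w_x=1.0, w_h=0.5, b=0.0)
print([round(h, 4) for h in states])
```

Only the first input is non-zero, yet the later hidden states are still non-zero: the state carries a fading memory of earlier time steps forward through the sequence.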

In [4]:
# Importing necessary libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Load and preprocess the data
vocab_size = 10000
max_length = 500
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=vocab_size)
X_train = pad_sequences(X_train, maxlen=max_length)
X_test = pad_sequences(X_test, maxlen=max_length)

embedding_dim = 32
num_classes = 2  # Binary classification

# Define the model
model = Sequential([
    Embedding(vocab_size, embedding_dim, input_length=max_length),
    LSTM(64),
    Dense(64, activation='relu'),
    Dense(num_classes, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)



## 4. Residual Neural Network (ResNet)
### Description
Residual Networks (ResNets) are deep neural networks that use skip connections to mitigate the vanishing gradient problem, which makes it practical to train much deeper networks.

### Type of Data
ResNets are typically used for image data, similar to CNNs.

### Use Cases
- Image classification (e.g., CIFAR-10)
- Object detection and segmentation
- Deep feature extraction

### Training Process
ResNets use residual blocks where the input to a block is added to its output, helping gradients flow through the network. Training involves standard convolutional layers and residual connections.
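
A residual block computes `F(x) + x`. The sketch below shows this with a single dense layer standing in for `F`; this is an assumption made to keep the example short, since real ResNets use convolutional layers as in the model that follows.

```python
def dense_relu(x, weights, bias):
    """One fully connected layer with ReLU; weights holds one row per output unit."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def residual_block(x, weights, bias):
    """Return F(x) + x, where F is a dense+ReLU layer that preserves the size of x."""
    fx = dense_relu(x, weights, bias)
    return [f + xi for f, xi in zip(fx, x)]

x = [1.0, -2.0, 3.0]

# With all-zero weights the learned branch F(x) outputs zeros,
# so the block reduces to the identity mapping
zero_w = [[0.0] * 3 for _ in range(3)]
zero_b = [0.0] * 3
print(residual_block(x, zero_w, zero_b))  # [1.0, -2.0, 3.0]
```

Because the skip path passes the input through unchanged, a block only has to learn a correction to the identity, and gradients have a direct route back to earlier layers.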

In [5]:
# Importing necessary libraries
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Add, Input
from tensorflow.keras.models import Model

# Load and preprocess the data
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.cifar10.load_data()
X_train, X_test = X_train / 255.0, X_test / 255.0

input_shape = (32, 32, 3)
num_classes = 10

# Define the model
inputs = Input(shape=input_shape)
x = Conv2D(32, kernel_size=(3, 3), activation='relu')(inputs)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Conv2D(64, kernel_size=(3, 3), activation='relu')(x)
block_1_output = MaxPooling2D(pool_size=(2, 2))(x)

x = Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same')(block_1_output)
x = Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same')(x)
block_2_output = Add()([x, block_1_output])

x = Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same')(block_2_output)
x = Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same')(x)
block_3_output = Add()([x, block_2_output])

x = Flatten()(block_3_output)
x = Dense(64, activation='relu')(x)
outputs = Dense(num_classes, activation='softmax')(x)

model = Model(inputs, outputs)

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)



## 5. Autoencoders
### Description
Autoencoders are neural networks used for unsupervised learning, particularly for dimensionality reduction and feature learning. They consist of an encoder and a decoder.

### Type of Data
Autoencoders can be used on any type of data but are particularly useful for image data and high-dimensional data.

### Use Cases
- Anomaly detection
- Data compression
- Noise reduction

### Training Process
Training involves learning to encode the input data into a lower-dimensional space and then reconstructing the original data from this encoding. The network is trained to minimize reconstruction error.
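
The encode, decode, and reconstruction-error steps can be sketched with a tiny linear autoencoder. The weights here are hand-picked for illustration rather than learned, and the two sample vectors are hypothetical.

```python
def linear(x, weights):
    """Multiply vector x by a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def mse(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)

# Hand-picked (not learned) weights: the encoder averages feature pairs,
# and the decoder copies each average back to both positions
encoder_w = [[0.5, 0.5, 0.0, 0.0],
             [0.0, 0.0, 0.5, 0.5]]
decoder_w = [[1.0, 0.0],
             [1.0, 0.0],
             [0.0, 1.0],
             [0.0, 1.0]]

def reconstruct(x):
    code = linear(x, encoder_w)            # 4 features -> 2-dimensional code
    return code, linear(code, decoder_w)   # 2-dimensional code -> 4 features

typical = [1.0, 1.0, 3.0, 3.0]  # fits the structure the code can represent
unusual = [1.0, 5.0, 3.0, 3.0]  # breaks that structure

for name, x in [("typical", typical), ("unusual", unusual)]:
    code, x_hat = reconstruct(x)
    print(name, "code:", code, "reconstruction error:", mse(x, x_hat))
```

The typical sample is reconstructed exactly while the unusual one is not, which is the idea behind using reconstruction error for anomaly detection.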

In [6]:
# Importing necessary libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Reshape
from tensorflow.keras.datasets import mnist

# Load and preprocess the data
(X_train, _), (X_test, _) = mnist.load_data()
X_train = X_train / 255.0
X_test = X_test / 255.0
X_train = X_train.reshape(-1, 784)
X_test = X_test.reshape(-1, 784)

input_shape = 784

# Define the model
model = Sequential([
    Dense(128, activation='relu', input_shape=(input_shape,)),
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(64, activation='relu'),
    Dense(128, activation='relu'),
    Dense(input_shape, activation='sigmoid')
])

# Compile the model
model.compile(optimizer='adam', loss='mse')

# Train the model
model.fit(X_train, X_train, epochs=10, batch_size=32, validation_split=0.2)



## 6. Transformers
### Description
Transformers are a type of neural network architecture designed to handle sequential data but without relying on recurrence (unlike RNNs). They use self-attention mechanisms to weigh the influence of different parts of the input data, allowing them to capture long-range dependencies effectively.

### Type of Data
Transformers are used primarily for text and sequential data but have been adapted for other types of data, including images (e.g., Vision Transformers).

### Use Cases
- Natural language processing (NLP) tasks like translation, summarization, and sentiment analysis
- Text generation (e.g., GPT-3) and language understanding (e.g., BERT)
- Image classification and generation (Vision Transformers)

### Training Process
Training involves learning attention weights that highlight the importance of different words (or tokens) in a sequence. Transformers are trained using large datasets and optimized using gradient descent methods.
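
As a rough illustration of how attention weights are computed, here is scaled dot-product self-attention in pure Python. To keep it short, the sketch assumes queries, keys, and values are all the raw token vectors; real transformer layers first apply learned projections to obtain them. The three token embeddings are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with queries = keys = values = tokens."""
    d = len(tokens[0])
    outputs, all_weights = [], []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Each output is a weighted average of all token vectors
        out = [sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Three hypothetical 2-dimensional token embeddings
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, attn = self_attention(tokens)
for row in attn:
    print([round(w, 3) for w in row])
```

Each row of weights sums to 1, and every token can attend to every other token in a single step, regardless of how far apart they are in the sequence.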

In [8]:
# Importing necessary libraries
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Embedding, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.datasets import imdb

# Load and preprocess the data
vocab_size = 10000
max_length = 500
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=vocab_size)
X_train = pad_sequences(X_train, maxlen=max_length)
X_test = pad_sequences(X_test, maxlen=max_length)

embedding_dim = 32
num_classes = 2

# Define the model
inputs = Input(shape=(max_length,))
embedding_layer = Embedding(input_dim=vocab_size, output_dim=embedding_dim)(inputs)
transformer_block = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=embedding_dim)(embedding_layer, embedding_layer)
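# Note: this is a single self-attention layer, not a full transformer block
# (no positional encoding, residual connections, or feed-forward sublayer).
# Flatten the per-token outputs so they can feed the dense classifier.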
flattened_layer = tf.keras.layers.Flatten()(transformer_block)
dropout_layer = Dropout(0.1)(flattened_layer)
dense_layer = Dense(64, activation='relu')(dropout_layer)
outputs = Dense(num_classes, activation='softmax')(dense_layer)

model = Model(inputs=inputs, outputs=outputs)

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)



## 7. Generative Adversarial Networks (GANs)
### Description
Generative Adversarial Networks consist of two neural networks, a generator and a discriminator, that compete against each other. The generator creates data, while the discriminator tries to distinguish between real and generated data.

### Type of Data
GANs are primarily used with image data but can be applied to any data type where generating realistic data is valuable.

### Use Cases
- Image generation
- Data augmentation
- Creative applications (e.g., art and music generation)

### Training Process
The training process alternates between training the discriminator to distinguish real from fake data and training the generator to produce data that can fool the discriminator. This adversarial process helps the generator produce realistic data.
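
The two alternating objectives can be made concrete by computing the losses for one step by hand. The discriminator scores below are hypothetical numbers chosen for illustration; in the real training loop they come from the discriminator network.

```python
import math

def binary_cross_entropy(predictions, targets, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    total = 0.0
    for p, t in zip(predictions, targets):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(predictions)

# Hypothetical discriminator outputs (probability that a sample is real)
d_on_real = [0.9, 0.8]  # scores for real samples
d_on_fake = [0.2, 0.1]  # scores for generated samples

# Discriminator step: real samples are labeled 1, generated samples 0
d_loss = 0.5 * (binary_cross_entropy(d_on_real, [1, 1]) +
                binary_cross_entropy(d_on_fake, [0, 0]))

# Generator step: the same fake scores are compared against the label 1,
# so the generator's loss is low only when the discriminator is fooled
g_loss = binary_cross_entropy(d_on_fake, [1, 1])

print(f"d_loss={d_loss:.3f}, g_loss={g_loss:.3f}")
```

Here the discriminator is doing well, so its loss is low and the generator's loss is high. Training pushes these two quantities against each other.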

In [11]:
# Importing necessary libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LeakyReLU, Reshape, Flatten, Input
from tensorflow.keras.models import Model
import numpy as np
import os

# Suppress TensorFlow logs
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

# Load and preprocess the data
(X_train, _), (_, _) = tf.keras.datasets.mnist.load_data()
X_train = X_train / 127.5 - 1.0  # Scale to [-1, 1] to match the generator's tanh output
X_train = X_train.reshape(-1, 28, 28)  # Ensure the shape is (number of samples, 28, 28)

noise_dim = 100

# Define the generator model
generator = Sequential([
    Dense(128, input_shape=(noise_dim,)),
    LeakyReLU(alpha=0.2),
    Dense(784, activation='tanh'),
    Reshape((28, 28))  # Output shape should be (28, 28)
])

# Define the discriminator model
discriminator = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(128),
    LeakyReLU(alpha=0.2),
    Dense(1, activation='sigmoid')
])

# Compile the discriminator
discriminator.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Combine generator and discriminator to make GAN
discriminator.trainable = False
gan_input = Input(shape=(noise_dim,))
x = generator(gan_input)
gan_output = discriminator(x)
gan = Model(gan_input, gan_output)

# Compile GAN
gan.compile(optimizer='adam', loss='binary_crossentropy')

# Training GAN
batch_size = 32
epochs = 10000
for epoch in range(epochs):
    # Train discriminator
    idx = np.random.randint(0, X_train.shape[0], batch_size)
    real_images = X_train[idx]
    noise = np.random.normal(0, 1, (batch_size, noise_dim))
    fake_images = generator.predict(noise, verbose=0)  # Suppress verbose output
    
    real_labels = np.ones((batch_size, 1))
    fake_labels = np.zeros((batch_size, 1))
    
    d_loss_real = discriminator.train_on_batch(real_images, real_labels)
    d_loss_fake = discriminator.train_on_batch(fake_images, fake_labels)
    
    # Train generator
    noise = np.random.normal(0, 1, (batch_size, noise_dim))
    valid_y = np.ones((batch_size, 1))
    g_loss = gan.train_on_batch(noise, valid_y)
    
    if epoch % 1000 == 0:
        print(f"Epoch {epoch}, D Loss: {0.5 * np.add(d_loss_real, d_loss_fake)}, G Loss: {g_loss}")

print("Training complete.")


Epoch 0, D Loss: [0.71170622 0.46875   ], G Loss: 0.9469071626663208
Epoch 1000, D Loss: [0.27582335 0.921875  ], G Loss: 2.6114182472229004
Epoch 2000, D Loss: [0.27922937 0.90625   ], G Loss: 2.7924535274505615
Epoch 3000, D Loss: [0.44835685 0.765625  ], G Loss: 3.3652374744415283
Epoch 4000, D Loss: [0.27148907 0.890625  ], G Loss: 3.258960247039795
Epoch 5000, D Loss: [0.30840922 0.859375  ], G Loss: 3.7118265628814697
Epoch 6000, D Loss: [0.25601084 0.90625   ], G Loss: 3.7888922691345215
Epoch 7000, D Loss: [0.17177273 0.9375    ], G Loss: 4.360065937042236
Epoch 8000, D Loss: [0.24589257 0.90625   ], G Loss: 3.502284526824951
Epoch 9000, D Loss: [0.41012922 0.875     ], G Loss: 4.5679121017456055
Training complete.


## 8. Reinforcement Learning (RL)
### Description
Reinforcement Learning (RL) is a type of machine learning where an agent learns to make decisions by interacting with an environment. The agent takes actions to maximize cumulative rewards over time. Unlike supervised learning, RL does not require labeled input/output pairs and relies on trial and error to discover the best actions.

### Type of Data
- Data Type: Interaction data between the agent and the environment.
- Data Shape: Typically involves state-action-reward sequences, where each sequence represents the state of the environment, the action taken by the agent, and the reward received.

### Use Cases
- Game playing (e.g., AlphaGo, OpenAI's Dota 2 bot)
- Robotics (e.g., robot navigation, manipulation tasks)
- Self-driving cars (e.g., decision-making, path planning)
- Optimization problems (e.g., resource management, traffic control)

### Training Process
1. Initialization: The agent initializes its policy (its strategy for choosing actions) and its value function (its estimate of expected cumulative reward).
2. Interaction: The agent interacts with the environment by taking actions and observing the resulting state and reward.
3. Policy Update: Based on the observed reward and state transitions, the agent updates its policy to improve future actions.
4. Exploration vs. Exploitation: Throughout training, the agent balances exploration (trying new actions) and exploitation (using known actions that yield high rewards).

Training continues until the agent's performance converges or reaches a satisfactory level.
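
One common way to balance exploration and exploitation is an epsilon-greedy rule. The sketch below is a minimal illustration: the action values are hypothetical, and the `epsilon_greedy` helper name is ours.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, otherwise the highest-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                      # explore
    return max(range(len(q_values)), key=q_values.__getitem__)   # exploit

# Hypothetical value estimates for four actions; action 1 currently looks best
q_values = [0.1, 0.7, 0.3, 0.2]
rng = random.Random(0)  # seeded so the run is repeatable
choices = [epsilon_greedy(q_values, epsilon=0.1, rng=rng) for _ in range(1000)]
print("share of greedy choices:", choices.count(1) / len(choices))
```

With `epsilon=0.1` the agent picks the currently best action most of the time but still samples the others occasionally, so a poor early estimate can be corrected later.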

## 8.1 Environment Setup
First, let's define a simple environment. We'll create a grid world where an agent has to navigate from a starting position to a goal position while avoiding obstacles. The agent can move up, down, left, or right.

In [21]:
class GridWorld:
    def __init__(self, width, height, start, goal, obstacles):
        self.width = width
        self.height = height
        self.start = start
        self.goal = goal
        self.obstacles = obstacles
        self.agent_position = start

    def move_agent(self, action):
        x, y = self.agent_position
        if action == "UP" and y > 0:
            self.agent_position = (x, y - 1)
        elif action == "DOWN" and y < self.height - 1:
            self.agent_position = (x, y + 1)
        elif action == "LEFT" and x > 0:
            self.agent_position = (x - 1, y)
        elif action == "RIGHT" and x < self.width - 1:
            self.agent_position = (x + 1, y)

    def is_goal_reached(self):
        return self.agent_position == self.goal

    def is_obstacle(self, position):
        return position in self.obstacles

    def reset(self):
        self.agent_position = self.start


## 8.2 Q-Learning Algorithm
Now, let's implement the Q-learning algorithm to train an agent to navigate the grid world.
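
Before looking at the full agent, here is the Q-learning update rule applied once by hand. The sketch uses a plain dictionary as the Q-table and a made-up two-state problem, so the state names and values are assumptions for illustration only.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one Q-learning update in place and return the new Q(state, action)."""
    # Target: immediate reward plus the discounted value of the best next action
    td_target = reward + gamma * max(q[next_state])
    td_error = td_target - q[state][action]
    # Move the current estimate a fraction alpha toward the target
    q[state][action] += alpha * td_error
    return q[state][action]

# Hypothetical two-state table with two actions per state
q = {"A": [0.0, 0.0], "B": [1.0, 2.0]}
new_value = q_update(q, state="A", action=0, reward=-1.0, next_state="B")
print(round(new_value, 4))  # 0.08
```

The target is `-1 + 0.9 * 2 = 0.8`, and with `alpha=0.1` the estimate moves a tenth of the way from 0 toward it.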

In [22]:
import numpy as np

class QLearningAgent:
    def __init__(self, num_actions, epsilon=0.1, alpha=0.1, gamma=0.9, grid_shape=(5, 5)):
        self.num_actions = num_actions
        self.epsilon = epsilon  # Exploration rate
        self.alpha = alpha      # Learning rate
        self.gamma = gamma      # Discount factor
        # One vector of action values per grid cell, indexed as q_table[x, y, action].
        # grid_shape defaults to the 5x5 grid used in this notebook.
        self.q_table = np.zeros(tuple(grid_shape) + (num_actions,))

    def choose_action(self, state):
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit
        if np.random.uniform(0, 1) < self.epsilon:
            return np.random.randint(self.num_actions)
        else:
            return int(np.argmax(self.q_table[state]))

    def update_q_table(self, state, action, reward, next_state):
        # state and next_state are (x, y) tuples, so q_table[state] is a vector of action values
        td_target = reward + self.gamma * np.max(self.q_table[next_state])
        td_error = td_target - self.q_table[state][action]
        self.q_table[state][action] += self.alpha * td_error



## 8.3 Training the Agent
Now, let's train the agent to navigate the grid world using Q-learning.

In [24]:
# Define the environment
width = 5
height = 5
start = (0, 0)
goal = (4, 4)
obstacles = [(1, 1), (2, 2), (3, 3)]
env = GridWorld(width, height, start, goal, obstacles)

# Define the agent
actions = ["UP", "DOWN", "LEFT", "RIGHT"]
num_actions = len(actions)
agent = QLearningAgent(num_actions, grid_shape=(width, height))

# Training loop
# The reward values, episode count, and step cap below are illustrative choices.
num_episodes = 500
max_steps = 200  # Cap episode length so training always terminates
for episode in range(num_episodes):
    env.reset()
    state = env.agent_position
    total_reward = 0

    for _ in range(max_steps):
        action = agent.choose_action(state)
        env.move_agent(actions[action])  # Apply the action so the agent actually moves
        next_state = env.agent_position
        done = env.is_goal_reached()

        if done:
            reward = 10   # Reaching the goal
        elif env.is_obstacle(next_state):
            reward = -5   # Stepping onto an obstacle
        else:
            reward = -1   # Small step cost to encourage short paths

        agent.update_q_table(state, action, reward, next_state)
        total_reward += reward
        state = next_state
        if done:
            break

    if (episode + 1) % 50 == 0:
        print(f"Episode {episode + 1}: Total Reward = {total_reward}")

print("Training complete.")

## 8.4 Testing the Trained Agent
Now, let's test the trained agent in the environment.

In [None]:
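# Testing loop
env.reset()
state = env.agent_position
agent.epsilon = 0.0  # Act greedily during evaluation
max_steps = 50       # Safety cap in case the learned policy loops

for _ in range(max_steps):
    action = agent.choose_action(state)
    env.move_agent(["UP", "DOWN", "LEFT", "RIGHT"][action])
    state = env.agent_position  # Keep the agent's view of the state in sync with the environment
    print(f"Agent moved to position: {env.agent_position}")
    if env.is_goal_reached():
        print("Goal reached!")
        break
else:
    print("Goal not reached within the step limit.")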
