# 🧠 Chapter 10: Introduction to Artificial Neural Networks (ANNs) with Keras

This notebook is a practical, hands-on guide to understanding and building neural networks with Keras. It moves from the biological inspiration for artificial neurons to implementing models for classification and regression tasks.

Let's get started!

## I. From Biological to Artificial Neurons

### 🧬 Biological Neurons

- Biological neurons receive signals, process them, and transmit signals to other neurons.
- They fire when input signals exceed a certain threshold.

### 🤖 Logical Computations with Perceptrons

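A perceptron is a simple model of a neuron: it computes a weighted sum of its inputs, adds a bias, and outputs 1 if the result is positive and 0 otherwise. With suitable weights and bias it can implement logical operations such as AND and OR.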

In [1]:
# Import numpy for numerical operations
import numpy as np

def perceptron(X, weights, bias):
    """Simple perceptron function for binary output"""
    return (np.dot(X, weights) + bias > 0).astype(int)

# Define input data for AND gate
X = np.array([[0,0], [0,1], [1,0], [1,1]])
weights = np.array([1, 1])  # weights for inputs
bias = -1.5  # bias term

# Test perceptron for AND gate
output = perceptron(X, weights, bias)
print("AND gate outputs:", output)  # Should be [0, 0, 0, 1]


AND gate outputs: [0 0 0 1]
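
The same thresholded weighted sum can represent other gates by changing only the weights and bias. The sketch below repeats the `perceptron` function from the cell above so that it runs on its own; the OR and NAND weights are illustrative choices, not the only valid ones.

```python
import numpy as np

def perceptron(X, weights, bias):
    """Output 1 where the weighted sum plus bias is positive, else 0."""
    return (np.dot(X, weights) + bias > 0).astype(int)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# OR: a single active input is enough to cross the threshold
or_out = perceptron(X, np.array([1, 1]), -0.5)

# NAND: negative weights, so only [1, 1] pushes the sum below zero
nand_out = perceptron(X, np.array([-1, -1]), 1.5)

print("OR gate outputs:  ", or_out)    # [0 1 1 1]
print("NAND gate outputs:", nand_out)  # [1 1 1 0]
```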


### 🔁 From Perceptron to Multilayer Perceptron (MLP)

- **Perceptron**: Single-layer model; linear decision boundary.
- **MLP**: Multiple layers with nonlinear activations; capable of learning complex patterns.
- Trained with **backpropagation** and gradient descent.
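
To make the difference concrete: XOR is not linearly separable, so no single perceptron can compute it, but a two-layer network can. The sketch below uses hand-set weights (chosen for illustration, not learned) in which the hidden layer computes OR and NAND and the output unit ANDs them together.

```python
import numpy as np

def step(z):
    """Heaviside step activation: 1 where z > 0, else 0."""
    return (z > 0).astype(int)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hidden layer: column 0 computes OR, column 1 computes NAND
W_hidden = np.array([[1, -1],
                     [1, -1]])
b_hidden = np.array([-0.5, 1.5])

# Output layer: AND of the two hidden units
w_out = np.array([1, 1])
b_out = -1.5

hidden = step(X @ W_hidden + b_hidden)
xor_out = step(hidden @ w_out + b_out)
print("XOR outputs:", xor_out)  # [0 1 1 0]
```

The step function is used here only because the weights are fixed by hand. Its gradient is zero almost everywhere, which is why networks trained with backpropagation use differentiable activations such as ReLU instead.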

## II. Implementing MLPs with Keras

### A. Installing TensorFlow

Make sure you have TensorFlow installed:

```bash
pip install tensorflow
```

### B. Building an Image Classifier (MNIST) using the Sequential API

We'll load the MNIST dataset and build a simple neural network to classify handwritten digits.

In [2]:
# Import necessary libraries
import tensorflow as tf
from tensorflow import keras

# Load MNIST dataset
(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()

# Normalize pixel values to [0,1]
X_train = X_train / 255.0
X_test = X_test / 255.0

# Build the model
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),  # Flatten images
    keras.layers.Dense(300, activation="relu"),  # First hidden layer
    keras.layers.Dense(100, activation="relu"),  # Second hidden layer
    keras.layers.Dense(10, activation="softmax")  # Output layer
])

# Compile the model
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Train the model
model.fit(X_train, y_train, epochs=5, validation_split=0.1)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
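
The classifier above pairs a softmax output layer with the `sparse_categorical_crossentropy` loss. As a rough illustration of what those two pieces compute for integer labels, here is a small NumPy sketch. The logits are made-up values, and the function names are only for this example; Keras's own implementation differs in its numerical details.

```python
import numpy as np

def softmax(logits):
    """Convert each row of raw scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sparse_categorical_crossentropy(y_true, probs):
    """Mean negative log-probability assigned to the true class of each sample."""
    picked = probs[np.arange(len(y_true)), y_true]
    return -np.log(picked).mean()

# Hypothetical raw scores for 2 samples and 3 classes
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
y_true = np.array([0, 1])  # integer class labels, like MNIST's y_train

probs = softmax(logits)
loss = sparse_categorical_crossentropy(y_true, probs)
print("Probabilities:\n", probs.round(3))
print("Loss:", round(float(loss), 4))
```

The "sparse" variant takes class indices directly, which is why `y_train` did not need to be one-hot encoded before calling `fit`.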


### C. Building a Regression MLP (Sequential API)

Let's create a simple regression model to fit synthetic data y = 3x + 5 + noise.

In [3]:
# Generate synthetic data
np.random.seed(42)
X = np.random.rand(1000, 1)
y = 3 * X + 5 + 0.1 * np.random.randn(1000, 1)

# Define the model
reg_model = keras.models.Sequential([
    keras.layers.Dense(20, activation="relu", input_shape=[1]),
    keras.layers.Dense(1)
])

# Compile with MSE loss
reg_model.compile(loss="mse", optimizer="sgd")

# Train the model
reg_model.fit(X, y, epochs=30)

Epoch 1/30
Epoch 2/30
... (outputs truncated for brevity) ...
Epoch 30/30
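
Because this synthetic data is linear, a closed-form least-squares fit gives a useful reference for what the network should approximate. The sketch below regenerates the same data with the same seed and recovers the slope and intercept with `np.linalg.lstsq`; it is a sanity check, not part of the Keras workflow.

```python
import numpy as np

# Same synthetic data as the regression cell above
np.random.seed(42)
X = np.random.rand(1000, 1)
y = 3 * X + 5 + 0.1 * np.random.randn(1000, 1)

# Append a column of ones so the intercept is fitted alongside the slope
X_b = np.hstack([X, np.ones_like(X)])
coef, *_ = np.linalg.lstsq(X_b, y, rcond=None)
slope, intercept = coef.ravel()

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

The recovered values should land close to 3 and 5. If the MLP's predictions are far from this line, the learning rate or the number of epochs is the first thing to revisit.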


### D. Functional API for Complex Models

The Functional API allows building models with multiple inputs, outputs, or complex architectures.

In [7]:
# Example: Model with multiple layers using Functional API
from tensorflow import keras

# Define Input layer
input_ = keras.layers.Input(shape=[8])
# Hidden layers
hidden1 = keras.layers.Dense(30, activation="relu")(input_)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
# Output layer
output = keras.layers.Dense(1)(hidden2)

# Instantiate model
model_func = keras.Model(inputs=[input_], outputs=[output])

# Compile
model_func.compile(loss="mse", optimizer="adam")
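
The model above is still a plain stack of layers, so it does not yet show what the Functional API adds. A minimal sketch of a two-input model follows, assuming a hypothetical split of the features into a 5-feature "wide" input and a 6-feature "deep" input; the sizes and the random data are placeholders.

```python
import numpy as np
from tensorflow import keras

# Two inputs: a "wide" path that skips the hidden layers and a "deep" path
input_wide = keras.layers.Input(shape=(5,))
input_deep = keras.layers.Input(shape=(6,))

hidden1 = keras.layers.Dense(30, activation="relu")(input_deep)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)

# Merge the wide input with the output of the deep path
concat = keras.layers.Concatenate()([input_wide, hidden2])
output = keras.layers.Dense(1)(concat)

wide_deep_model = keras.Model(inputs=[input_wide, input_deep], outputs=[output])
wide_deep_model.compile(loss="mse", optimizer="adam")

# Random placeholder data, only to check that the shapes line up
X_wide = np.random.rand(4, 5)
X_deep = np.random.rand(4, 6)
preds = wide_deep_model.predict([X_wide, X_deep], verbose=0)
print(preds.shape)  # (4, 1)
```

When training such a model, `fit` takes the inputs in the same order as `inputs=[...]`, for example `fit([X_wide, X_deep], y, ...)`.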

### E. Subclassing API (Dynamic Models)

Subclassing `keras.Model` lets you create the layers in `__init__()` and define the forward pass imperatively in `call()`, which is useful for models with custom or dynamic behavior.

In [34]:
# Define a custom model by subclassing keras.Model
@keras.utils.register_keras_serializable()
class MyModel(keras.Model):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)  # Handle base model arguments like 'trainable'
        self.hidden1 = keras.layers.Dense(30, activation="relu")
        self.hidden2 = keras.layers.Dense(30, activation="relu")
        self.output_layer = keras.layers.Dense(1)

    def call(self, inputs):
        x = self.hidden1(inputs)
        x = self.hidden2(x)
        return self.output_layer(x)

    def get_config(self):
        config = super().get_config()
        return config

    @classmethod
    def from_config(cls, config):
        return cls(**config)

# Instantiate and compile
subclassed_model = MyModel()
subclassed_model.compile(loss="mse", optimizer="adam")

# Create appropriate dummy data (regression problem)
X_train = np.random.random((1000, 8))  # 8 input features
y_train = np.random.random((1000, 1))  # Continuous output for MSE loss

# Train
subclassed_model.fit(X_train, y_train, epochs=2)

Epoch 1/2
32/32 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - loss: 0.2347
Epoch 2/2
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0883


<keras.src.callbacks.history.History at 0x7d20d1b973a0>

### F. Saving and Restoring Models

In [35]:
# Save the trained model
subclassed_model.save("my_mnist_model.keras")

# Load the model later
restored_model = keras.models.load_model("my_mnist_model.keras")

### G. Using Callbacks for Improved Training

- **ModelCheckpoint**: saves best model during training
- **EarlyStopping**: stops training when validation performance stops improving

In [36]:
# Define callbacks
checkpoint_cb = keras.callbacks.ModelCheckpoint("best_model.keras", save_best_only=True)
early_stop_cb = keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True)

# Train with callbacks
restored_model.fit(X_train, y_train, epochs=20,
          validation_split=0.1,
          callbacks=[checkpoint_cb, early_stop_cb])

Epoch 1/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - loss: 0.0948 - val_loss: 0.0730
Epoch 2/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 0.0881 - val_loss: 0.0725
Epoch 3/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0836 - val_loss: 0.0727
Epoch 4/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 0.0841 - val_loss: 0.0722
Epoch 5/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0830 - val_loss: 0.0722
Epoch 6/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 0.0832 - val_loss: 0.0711
Epoch 7/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 1s 8ms/step - loss: 0.0772 - val_loss: 0.0730
Epoch 8/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0850 - val_loss: 0.0713
Epoch 9/20
29/29 ━━━━━━━━━━━━━━━━━━━━

<keras.src.callbacks.history.History at 0x7d20d19b0160>
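
To see how `patience` decides when training stops, here is a simplified pure-Python sketch of the rule. It is not Keras's implementation (the real callback also supports options such as `min_delta` and `mode`), and the validation losses are hypothetical values, not taken from the run above.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    stop_epoch is None if training would run through every recorded epoch.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

# Hypothetical validation losses for illustration
val_losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.36, 0.38, 0.39]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(f"stop after epoch {stop_epoch}, best epoch {best_epoch}")
# stop after epoch 6, best epoch 3
```

With `restore_best_weights=True`, the model ends up with the weights from the best epoch rather than the last one, which is what makes a generous `patience` safe to use.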

### H. TensorBoard for Visualization

- Use TensorBoard to visualize training metrics.
- Run the command in terminal after training:
  
```bash
tensorboard --logdir=logs/mnist
```

In [37]:
# Set up logs for TensorBoard
import os
logdir = os.path.join("logs", "mnist")
tensorboard_cb = keras.callbacks.TensorBoard(log_dir=logdir)

# Train with TensorBoard callback
restored_model.fit(X_train, y_train, epochs=10,
          validation_split=0.1, callbacks=[tensorboard_cb])

Epoch 1/10
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step - loss: 0.0819 - val_loss: 0.0722
Epoch 2/10
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0843 - val_loss: 0.0719
Epoch 3/10
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0836 - val_loss: 0.0721
Epoch 4/10
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0822 - val_loss: 0.0717
Epoch 5/10
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 0.0789 - val_loss: 0.0719
Epoch 6/10
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 0.0769 - val_loss: 0.0729
Epoch 7/10
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 0.0833 - val_loss: 0.0713
Epoch 8/10
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 0.0810 - val_loss: 0.0726
Epoch 9/10
29/29 ━━━━━━━━━━━━━━━━━━━━

<keras.src.callbacks.history.History at 0x7d20d19b2fe0>
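
Every run that logs to `logs/mnist` writes into the same directory, which makes separate experiments hard to tell apart in TensorBoard. One common option is a timestamped subdirectory per run. The helper below, `get_run_logdir`, is a hypothetical name introduced for this sketch, not part of Keras.

```python
import os
import time

def get_run_logdir(root_logdir):
    """Return a subdirectory of root_logdir named after the current time."""
    run_id = time.strftime("run_%Y_%m_%d-%H_%M_%S")
    return os.path.join(root_logdir, run_id)

root_logdir = os.path.join("logs", "mnist")
run_logdir = get_run_logdir(root_logdir)
print(run_logdir)
```

Passing `run_logdir` as `log_dir` to `keras.callbacks.TensorBoard` while still pointing `tensorboard --logdir` at the root directory should show each run as a separate curve.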

## III. Fine-Tuning Neural Network Hyperparameters

| Hyperparameter           | Effect                                                                     |
|--------------------------|----------------------------------------------------------------------------|
| Number of layers         | More layers = higher capacity, but a higher risk of overfitting            |
| Neurons per layer        | More neurons = more capacity per layer                                     |
| Learning rate            | Too small = slow learning; too large = divergence                          |
| Batch size               | Smaller = noisier updates; larger = smoother updates, more memory per step |
| Regularization, Dropout  | Reduce overfitting                                                         |

Experiment with these parameters to optimize your model.
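
The learning-rate row is easy to verify on a toy problem. The sketch below runs plain gradient descent on f(w) = w², a function chosen only for illustration, with three learning rates.

```python
def gradient_descent(lr, steps=20, w0=1.0):
    """Minimize f(w) = w**2 with plain gradient descent; return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2 * w       # derivative of w**2
        w = w - lr * grad
    return w

for lr in (0.01, 0.1, 1.1):
    print(f"lr={lr}: final w = {gradient_descent(lr):.4f}")
```

Each step multiplies `w` by `1 - 2 * lr`. With `lr=0.01` the factor is 0.98, so `w` creeps toward the minimum at 0; with `lr=0.1` it is 0.8 and `w` converges quickly; with `lr=1.1` it is -1.2, so every step overshoots and `|w|` grows.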

## IV. Exercises to Practice

1. Build an MLP for the Boston Housing dataset (available as `keras.datasets.boston_housing`) to predict house prices.
2. Explore different learning rates and batch sizes to see their effects.
3. Add Dropout layers to your existing models.
4. Create models with multiple inputs/outputs using the Functional API.
5. Visualize training curves with TensorBoard and fine-tune accordingly.