# 🔹 Phase 3: Implementing a Neural Network in TensorFlow & PyTorch

**Concepts to Cover**

- **Why use TensorFlow & PyTorch?** – Higher-level APIs for defining models.
- **Defining Layers** – Using tf.keras (TensorFlow) and torch.nn (PyTorch).
- **Training Loops** – Using built-in optimizers & loss functions.
- **Batch Processing** – Efficient training using mini-batches.
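
As a rough illustration of the mini-batch idea, the sketch below splits the four XOR samples into shuffled batches using only the standard library. The helper name `iterate_minibatches` and its `seed` parameter are hypothetical, chosen for this example. Both frameworks ship their own batching utilities, which this does not try to reproduce.

```python
import random


def iterate_minibatches(inputs, targets, batch_size, seed=0):
    """Yield (inputs, targets) mini-batches in a shuffled order.

    Hypothetical helper for illustration only.
    """
    indices = list(range(len(inputs)))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the order repeatable
    for start in range(0, len(indices), batch_size):
        chosen = indices[start:start + batch_size]
        yield [inputs[i] for i in chosen], [targets[i] for i in chosen]


# XOR samples as plain lists
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

batches = list(iterate_minibatches(X, y, batch_size=2))
for xb, yb in batches:
    print(xb, yb)
```

Each pass over `batches` corresponds to one epoch, and a training loop would make one weight update per batch.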


## 📌 Exercise 3: Implement the same 3-layer XOR network in TensorFlow & PyTorch

**🔹 Task**

- Implement a **3-layer neural network** using:
    - TensorFlow (`tf.keras`)
    - PyTorch (`torch.nn.Module`)
- Train on the **XOR dataset**.
- Use the **Binary Cross-Entropy Loss (`BCE`)**.
- Compare both frameworks.
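
To make the loss concrete, here is a minimal sketch of binary cross-entropy computed by hand in plain Python. The function name `binary_cross_entropy`, the clipping constant `eps`, and the two sets of sample predictions are assumptions made for illustration. The framework implementations differ in detail, for example in how they clip.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so log() never receives 0
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


y_true = [0, 1, 1, 0]              # XOR targets
confident = [0.1, 0.9, 0.9, 0.1]   # hypothetical predictions from a trained network
undecided = [0.5, 0.5, 0.5, 0.5]   # hypothetical predictions from an untrained network

loss_confident = binary_cross_entropy(y_true, confident)
loss_undecided = binary_cross_entropy(y_true, undecided)
print(f"confident: {loss_confident:.4f}")  # 0.1054
print(f"undecided: {loss_undecided:.4f}")  # 0.6931
```

An output of 0.5 for every input gives a loss of ln 2 ≈ 0.6931, so a loss well below that indicates the network has started to separate the XOR classes.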

# ✅ Implementation in TensorFlow

In [3]:
import tensorflow as tf
import numpy as np

# XOR dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y = np.array([[0], [1], [1], [0]], dtype=np.float32)

# Define the model: 2 inputs -> 4 hidden units -> 1 output
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2,)),  # Explicit Input Layer
    tf.keras.layers.Dense(4, activation='sigmoid'),  # Hidden Layer (4 neurons)
    tf.keras.layers.Dense(1, activation='sigmoid')  # Output Layer (1 neuron)
])

# Compile the model
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.1),
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(X, y, epochs=5000, verbose=0)  # Train silently

# Final Predictions
predictions = model.predict(X)
print("\nTensorFlow Predictions:")
for i, p in enumerate(predictions):
    print(f"Input: {X[i]}, Predicted Output: {p[0]:.4f}")


1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 43ms/step

TensorFlow Predictions:
Input: [0. 0.], Predicted Output: 0.0000
Input: [0. 1.], Predicted Output: 1.0000
Input: [1. 0.], Predicted Output: 1.0000
Input: [1. 1.], Predicted Output: 0.0000


# ✅ Implementation in TensorFlow (Optimized)

In [4]:
import tensorflow as tf
import numpy as np

# XOR dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y = np.array([[0], [1], [1], [0]], dtype=np.float32)

# Check for GPU availability
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    print("✅ GPU Available! Training on GPU...")
else:
    print("⚠️ No GPU detected, training on CPU...")

# Define the model (same architecture as before)
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2,)),  # Explicit Input Layer
    tf.keras.layers.Dense(4, activation='sigmoid'),  # Hidden Layer (4 neurons)
    tf.keras.layers.Dense(1, activation='sigmoid')  # Output Layer (1 neuron)
])

# Compile with optimized settings
model.compile(
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.1),  # RMSprop in place of Adam
    loss='binary_crossentropy',
    metrics=['accuracy']
)

# Train with mini-batch gradient descent (batch_size=2)
with tf.device('/GPU:0' if gpus else '/CPU:0'):
    history = model.fit(X, y, epochs=1000, batch_size=2, verbose=0)  # Silent training

# Final Predictions
predictions = model.predict(X)
print("\nOptimized TensorFlow Predictions:")
for i, p in enumerate(predictions):
    print(f"Input: {X[i]}, Predicted Output: {p[0]:.4f}")


✅ GPU Available! Training on GPU...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 45ms/step

Optimized TensorFlow Predictions:
Input: [0. 0.], Predicted Output: 0.0000
Input: [0. 1.], Predicted Output: 1.0000
Input: [1. 0.], Predicted Output: 1.0000
Input: [1. 1.], Predicted Output: 0.0000


# 🔹 Optimizations Applied

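1. ✅ Reduced Epochs: Down from 5000 to 1000. Whether the network still converges depends on the random weight initialization, so check the final loss rather than assuming it.
2. ✅ Used RMSprop Optimizer: Swapped in for Adam. Neither is reliably faster on a four-sample dataset; compare the values in `history.history['loss']` to see which converges sooner on a given run.
3. ✅ Batch Training (`batch_size=2`): The first version used the Keras default `batch_size=32`, so all four samples formed a single batch and each epoch made one weight update. With `batch_size=2`, each epoch makes two updates.
4. ✅ Explicit Device Placement: `with tf.device('/GPU:0')` pins training to the GPU if one is available. TensorFlow already prefers a visible GPU by default, and for a model this small the GPU is unlikely to beat the CPU.

Most of the time saved comes from running fewer epochs; accuracy stays high only if the network still converges. 🚀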
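
The trade-off between epochs and batch size can be worked out with simple arithmetic. The sketch below counts weight updates for both runs, assuming the first run falls back to the Keras default `batch_size=32` because none is passed to `fit()`. The helper name `total_updates` is hypothetical.

```python
import math


def total_updates(num_samples, batch_size, epochs):
    """Number of optimizer steps: batches per epoch times epochs."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch * epochs


# First run: 4 samples fit in one batch of up to 32, trained for 5000 epochs
baseline = total_updates(num_samples=4, batch_size=32, epochs=5000)

# Second run: two batches of 2 per epoch, trained for 1000 epochs
optimized = total_updates(num_samples=4, batch_size=2, epochs=1000)

print(baseline, optimized)  # 5000 2000
```

The second run therefore performs fewer updates overall, even though it makes twice as many per epoch.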

# ✅ Implementation in PyTorch

In [2]:
import torch
import torch.nn as nn
import torch.optim as optim

# XOR dataset
X_torch = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=torch.float32)
y_torch = torch.tensor([[0], [1], [1], [0]], dtype=torch.float32)

# Define the model
class XORNeuralNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(2, 4)  # Hidden Layer (4 neurons)
        self.output = nn.Linear(4, 1)  # Output Layer (1 neuron)
    
    def forward(self, x):
        x = torch.sigmoid(self.hidden(x))  # Hidden Activation
        x = torch.sigmoid(self.output(x))  # Output Activation
        return x

# Initialize model
model = XORNeuralNet()
criterion = nn.BCELoss()  # Binary Cross-Entropy Loss
optimizer = optim.Adam(model.parameters(), lr=0.1)

# Training loop
epochs = 5000
for epoch in range(epochs):
    optimizer.zero_grad()  # Reset gradients
    outputs = model(X_torch)  # Forward pass
    loss = criterion(outputs, y_torch)  # Compute loss
    loss.backward()  # Backpropagation
    optimizer.step()  # Update weights

# Final Predictions
with torch.no_grad():
    predictions = model(X_torch)

print("\nPyTorch Predictions:")
for i, p in enumerate(predictions):
    print(f"Input: {X_torch[i].numpy()}, Predicted Output: {p.item():.4f}")



PyTorch Predictions:
Input: [0. 0.], Predicted Output: 0.0000
Input: [0. 1.], Predicted Output: 1.0000
Input: [1. 0.], Predicted Output: 1.0000
Input: [1. 1.], Predicted Output: 0.0000


## 📌 Significance of This Exercise

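- **TensorFlow vs PyTorch** – Both frameworks implement the same 2-4-1 sigmoid network and reach the same predictions on XOR.
- **Higher-Level Abstraction** – `tf.keras.Sequential` with `compile()` and `fit()` hides the training loop. PyTorch also offers `nn.Sequential`, but subclassing `nn.Module` and writing the loop by hand gives more control over each step.
- **Backpropagation Handling** – PyTorch requires an explicit `loss.backward()` call, while Keras runs backpropagation inside `model.fit()`.
- **Optimization** – Adam adapts the step size per parameter, which helped both models fit XOR within 5000 epochs at a learning rate of 0.1.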
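
To see what `loss.backward()` and `model.fit()` automate, the sketch below computes the gradients of the same 2-4-1 sigmoid network by hand in NumPy and checks one entry against a finite difference. It assumes NumPy is installed. The function names and the normal-distribution weight initialization are choices made for this example, not what either framework does internally.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(params, X):
    """Forward pass; returns hidden activations and output probabilities."""
    W1, b1, W2, b2 = params
    h = sigmoid(X @ W1 + b1)   # shape (n, 4)
    p = sigmoid(h @ W2 + b2)   # shape (n, 1)
    return h, p


def bce(p, y):
    """Mean binary cross-entropy."""
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))


def gradients(params, X, y):
    """Manual backpropagation for sigmoid layers with mean BCE loss."""
    W1, b1, W2, b2 = params
    h, p = forward(params, X)
    dz2 = (p - y) / X.shape[0]          # dL/d(output pre-activation)
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * h * (1 - h)    # chain rule through the hidden sigmoid
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)
    return dW1, db1, dW2, db2


X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float64)
y = np.array([[0], [1], [1], [0]], dtype=np.float64)

rng = np.random.default_rng(0)
params = [rng.normal(size=(2, 4)), np.zeros(4), rng.normal(size=(4, 1)), np.zeros(1)]

dW1, db1, dW2, db2 = gradients(params, X, y)

# Compare one analytic gradient entry with a central finite difference
eps = 1e-6
shifted_up = [params[0].copy(), *params[1:]]
shifted_up[0][0, 1] += eps
shifted_down = [params[0].copy(), *params[1:]]
shifted_down[0][0, 1] -= eps
numeric = (bce(forward(shifted_up, X)[1], y) - bce(forward(shifted_down, X)[1], y)) / (2 * eps)

print(f"analytic: {dW1[0, 1]:.6f}, numeric: {numeric:.6f}")
```

The two printed values should agree closely. An optimizer step then subtracts a scaled version of each gradient from the matching parameter, which is the part `optimizer.step()` handles in the PyTorch loop.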