# Build a Single-Layer Custom ANN

## Assignment: Manual Implementation of a Binary Classification Neural Network

This notebook demonstrates building a single-layer artificial neural network from scratch using only basic PyTorch operations.

**Model Architecture:**
- Linear layer: Y = w^T * x + b
- Activation function: Sigmoid
- Loss function: Binary Cross Entropy
- Optimizer: Manual gradient descent

**Dataset:**
- Binary classification with 2 features
- Generated using sklearn.datasets.make_classification

## Step 1: Import Required Libraries and Setup

In [None]:
import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import os

# Set random seeds for reproducibility
torch.manual_seed(42)
np.random.seed(42)

# Check device availability
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

# Set plot style
plt.style.use('default')
plt.rcParams['figure.figsize'] = (10, 6)

## Step 2: Generate and Save Dataset

Create a binary classification dataset with 2 features and save it to CSV.

In [None]:
# Generate binary classification dataset
X, y = make_classification(
    n_samples=1000,        # Increased sample size for better training
    n_features=2,          # 2 features as specified
    n_classes=2,           # Binary classification
    n_redundant=0,         # No redundant features
    n_informative=2,       # Both features are informative
    n_clusters_per_class=1,
    random_state=42        # For reproducibility
)

# Create DataFrame
df = pd.DataFrame(X, columns=['f1', 'f2'])
df['label'] = y

# Save to CSV
csv_path = 'binary_data.csv'
df.to_csv(csv_path, index=False)
print(f"Dataset saved to {csv_path}")

# Display dataset info
print(f"\nDataset Shape: {df.shape}")
print(f"Features: {df.columns.tolist()}")
print(f"Class distribution:")
print(df['label'].value_counts())
print(f"\nFirst 5 rows:")
print(df.head())

## Step 3: Data Visualization

Visualize the generated dataset to understand the data distribution.

In [None]:
# Visualize the dataset
plt.figure(figsize=(12, 5))

# Plot 1: Scatter plot of features colored by class
plt.subplot(1, 2, 1)
colors = ['red', 'blue']
for i, label in enumerate([0, 1]):
    mask = df['label'] == label
    plt.scatter(df[mask]['f1'], df[mask]['f2'], 
               c=colors[i], label=f'Class {label}', alpha=0.6)
plt.xlabel('Feature 1 (f1)')
plt.ylabel('Feature 2 (f2)')
plt.title('Dataset Visualization')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot 2: Feature distributions, drawn into the current subplot.
# (DataFrame.hist would open separate figures instead of using this axis.)
plt.subplot(1, 2, 2)
for label in [0, 1]:
    for feature in ['f1', 'f2']:
        plt.hist(df.loc[df['label'] == label, feature], bins=20, alpha=0.5,
                 label=f'Class {label} - {feature}')
plt.xlabel('Feature Value')
plt.ylabel('Frequency')
plt.title('Feature Distributions by Class')
plt.legend()

plt.tight_layout()
plt.show()

# Display statistics
print("Dataset Statistics:")
print(df.describe())

## Step 4: Data Preprocessing

Load data from CSV, split into train/test sets, and normalize features.
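
The scaler is fitted on the training split only, and the same statistics are then reused for the test split, so no information from the test set leaks into preprocessing. The sketch below reproduces that arithmetic with plain NumPy. The arrays `train` and `test` are made-up toy values, not the notebook's data, and the population standard deviation (`ddof=0`) is used because that is what `StandardScaler` computes.

```python
import numpy as np

# Hypothetical toy data: 4 training rows and 2 test rows, 2 features each.
train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
test = np.array([[2.5, 25.0], [5.0, 50.0]])

# Statistics come from the training split only (population std, ddof=0).
mean = train.mean(axis=0)
std = train.std(axis=0)

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuse the training statistics

print(train_scaled.mean(axis=0))  # approximately [0, 0]
print(train_scaled.std(axis=0))   # approximately [1, 1]
print(test_scaled.mean(axis=0))   # generally not 0: test data is not re-centered
```

This is why the test-set mean and standard deviation printed in the cell below are close to, but not exactly, 0 and 1.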

In [None]:
# Load data from CSV
df_loaded = pd.read_csv(csv_path)
print(f"Loaded dataset shape: {df_loaded.shape}")

# Separate features and labels
X = df_loaded[['f1', 'f2']].values
y = df_loaded['label'].values

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"Training set size: {X_train.shape[0]}")
print(f"Test set size: {X_test.shape[0]}")
print(f"Training class distribution: {np.bincount(y_train)}")
print(f"Test class distribution: {np.bincount(y_test)}")

# Normalize features using StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(f"\nFeature scaling completed")
print(f"Training features - Mean: {X_train_scaled.mean(axis=0)}, Std: {X_train_scaled.std(axis=0)}")
print(f"Test features - Mean: {X_test_scaled.mean(axis=0)}, Std: {X_test_scaled.std(axis=0)}")

# Convert to PyTorch tensors (float32) on the selected device
X_train_tensor = torch.tensor(X_train_scaled, dtype=torch.float32, device=device)
y_train_tensor = torch.tensor(y_train, dtype=torch.float32, device=device)
X_test_tensor = torch.tensor(X_test_scaled, dtype=torch.float32, device=device)
y_test_tensor = torch.tensor(y_test, dtype=torch.float32, device=device)

print(f"\nTensor shapes:")
print(f"X_train: {X_train_tensor.shape}, y_train: {y_train_tensor.shape}")
print(f"X_test: {X_test_tensor.shape}, y_test: {y_test_tensor.shape}")
print(f"Tensors are on device: {X_train_tensor.device}")

## Step 5: Define Custom Single-Layer Neural Network

Implement the neural network components manually:
- Linear transformation: Y = w^T * x + b
- Sigmoid activation function
- Binary cross-entropy loss
- Manual gradient computation
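
The manual gradient computation rests on one simplification: combining the sigmoid with binary cross-entropy makes the gradient with respect to the pre-activation collapse to `p - y`. For a single sample with label y, the chain rule can be written out as follows (the batch gradients are the averages of these per-sample terms):

```latex
\begin{aligned}
z &= w^{T}x + b, \qquad p = \sigma(z) = \frac{1}{1 + e^{-z}}, \\
L &= -\bigl[\,y \log p + (1 - y)\log(1 - p)\,\bigr], \\
\frac{\partial L}{\partial p} &= \frac{p - y}{p(1 - p)}, \qquad
\frac{\partial p}{\partial z} = p(1 - p), \\
\frac{\partial L}{\partial z} &= \frac{\partial L}{\partial p}\,\frac{\partial p}{\partial z} = p - y, \\
\frac{\partial L}{\partial w} &= (p - y)\,x, \qquad
\frac{\partial L}{\partial b} = p - y.
\end{aligned}
```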

In [None]:
class SingleLayerANN:
    def __init__(self, input_size, device='cpu'):
        """
        Initialize the single-layer neural network
        
        Args:
            input_size (int): Number of input features
            device (str): Device to run computations on
        """
        self.device = device
        self.input_size = input_size
        
        # Initialize weights and bias with small random values
        # Weight matrix: (input_size, 1) for single output neuron
        self.weights = torch.randn(input_size, 1, device=device, requires_grad=False) * 0.1
        self.bias = torch.zeros(1, device=device, requires_grad=False)
        
        print(f"Initialized ANN with:")
        print(f"  Input size: {input_size}")
        print(f"  Weights shape: {self.weights.shape}")
        print(f"  Bias shape: {self.bias.shape}")
        print(f"  Device: {device}")
    
    def sigmoid(self, z):
        """
        Sigmoid activation function: σ(z) = 1 / (1 + e^(-z))
        
        Args:
            z (torch.Tensor): Input tensor
        
        Returns:
            torch.Tensor: Sigmoid output
        """
        # Clamp z so exp(-z) stays finite in float32 (exp overflows near |z| = 88)
        z = torch.clamp(z, -88.0, 88.0)
        return 1 / (1 + torch.exp(-z))
    
    def forward(self, X):
        """
        Forward pass: Y = σ(w^T * x + b)
        
        Args:
            X (torch.Tensor): Input features (batch_size, input_size)
        
        Returns:
            torch.Tensor: Predicted probabilities (batch_size,)
        """
        # Linear transformation: z = X @ w + b
        z = torch.matmul(X, self.weights) + self.bias
        
        # Apply sigmoid activation
        predictions = self.sigmoid(z)
        
        return predictions.squeeze(1)  # (batch_size, 1) -> (batch_size,), safe for batch_size == 1
    
    def binary_cross_entropy(self, predictions, targets):
        """
        Binary Cross Entropy Loss: -[y*log(p) + (1-y)*log(1-p)]
        
        Args:
            predictions (torch.Tensor): Predicted probabilities
            targets (torch.Tensor): True labels (0 or 1)
        
        Returns:
            torch.Tensor: Average loss
        """
        # Clip predictions to prevent log(0)
        epsilon = 1e-7
        predictions = torch.clamp(predictions, epsilon, 1 - epsilon)
        
        # Compute binary cross entropy
        loss = -(targets * torch.log(predictions) + 
                (1 - targets) * torch.log(1 - predictions))
        
        return torch.mean(loss)
    
    def compute_gradients(self, X, predictions, targets):
        """
        Manually compute gradients for weights and bias
        
        Args:
            X (torch.Tensor): Input features
            predictions (torch.Tensor): Model predictions
            targets (torch.Tensor): True labels
        
        Returns:
            tuple: (weight_gradients, bias_gradients)
        """
        batch_size = X.shape[0]
        
        # Gradient of loss w.r.t predictions: dL/dp = (p - y) / [p(1-p)]
        # But for sigmoid + BCE, this simplifies to: dL/dz = p - y
        dL_dz = predictions - targets
        
        # Gradient w.r.t weights, averaged over the batch: dL/dw = X^T @ dL_dz / batch_size
        dL_dw = torch.matmul(X.T, dL_dz.unsqueeze(1)) / batch_size
        
        # Gradient w.r.t bias: dL/db = mean(dL_dz)
        dL_db = torch.mean(dL_dz)
        
        return dL_dw, dL_db
    
    def update_parameters(self, weight_grad, bias_grad, learning_rate):
        """
        Update weights and bias using gradient descent
        
        Args:
            weight_grad (torch.Tensor): Weight gradients
            bias_grad (torch.Tensor): Bias gradients
            learning_rate (float): Learning rate
        """
        self.weights -= learning_rate * weight_grad
        self.bias -= learning_rate * bias_grad
    
    def predict(self, X, threshold=0.5):
        """
        Make binary predictions
        
        Args:
            X (torch.Tensor): Input features
            threshold (float): Classification threshold
        
        Returns:
            torch.Tensor: Binary predictions (0 or 1)
        """
        probabilities = self.forward(X)
        return (probabilities >= threshold).float()
    
    def accuracy(self, X, y):
        """
        Compute classification accuracy
        
        Args:
            X (torch.Tensor): Input features
            y (torch.Tensor): True labels
        
        Returns:
            float: Accuracy percentage
        """
        predictions = self.predict(X)
        correct = (predictions == y).float()
        return torch.mean(correct).item() * 100

# Initialize the model
input_size = X_train_tensor.shape[1]  # Number of features
model = SingleLayerANN(input_size, device=device)

print(f"\nModel initialized successfully!")
print(f"Initial weights: {model.weights.squeeze().detach().cpu().numpy()}")
print(f"Initial bias: {model.bias.item():.4f}")

## Step 6: Training the Neural Network

Train the single-layer neural network using manual gradient descent.
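
Before trusting a hand-written gradient in a training loop, it can help to compare it against a numerical estimate. The following is a minimal NumPy sketch of such a check, not part of the assignment code: the helper names (`bce_loss`, `analytic_grads`, `numeric_grads`) and the toy batch are hypothetical, and the analytic formulas are the same ones used in `compute_gradients`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(w, b, X, y):
    """Mean binary cross-entropy for weights w (d,), bias b, inputs X (n, d)."""
    p = sigmoid(X @ w + b)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def analytic_grads(w, b, X, y):
    """Closed-form gradients: dL/dw = X^T (p - y) / n, dL/db = mean(p - y)."""
    p = sigmoid(X @ w + b)
    return X.T @ (p - y) / len(y), np.mean(p - y)

def numeric_grads(w, b, X, y, eps=1e-6):
    """Central finite differences, perturbing one parameter at a time."""
    grad_w = np.zeros_like(w)
    for i in range(len(w)):
        step = np.zeros_like(w)
        step[i] = eps
        grad_w[i] = (bce_loss(w + step, b, X, y) - bce_loss(w - step, b, X, y)) / (2 * eps)
    grad_b = (bce_loss(w, b + eps, X, y) - bce_loss(w, b - eps, X, y)) / (2 * eps)
    return grad_w, grad_b

# Hypothetical toy batch, unrelated to the notebook's dataset.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(8, 2))
y_toy = (X_toy[:, 0] + X_toy[:, 1] > 0).astype(float)
w_toy = np.array([0.3, -0.2])
b_toy = 0.1

gw_a, gb_a = analytic_grads(w_toy, b_toy, X_toy, y_toy)
gw_n, gb_n = numeric_grads(w_toy, b_toy, X_toy, y_toy)
print("max weight gradient difference:", np.max(np.abs(gw_a - gw_n)))
print("bias gradient difference:", abs(gb_a - gb_n))
```

If the two estimates agree to several decimal places, the update rule in the training loop is at least using the right direction; a large mismatch usually points to a sign error or a missing division by the batch size.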

In [None]:
# Training hyperparameters
learning_rate = 0.1
epochs = 100
print_every = 10

# Lists to store training history
train_losses = []
train_accuracies = []
test_accuracies = []

print(f"Starting training for {epochs} epochs...")
print(f"Learning rate: {learning_rate}")
print("-" * 60)

# Training loop
for epoch in range(epochs):
    # Forward pass
    predictions = model.forward(X_train_tensor)
    
    # Compute loss
    loss = model.binary_cross_entropy(predictions, y_train_tensor)
    
    # Compute gradients
    weight_grad, bias_grad = model.compute_gradients(
        X_train_tensor, predictions, y_train_tensor
    )
    
    # Update parameters
    model.update_parameters(weight_grad, bias_grad, learning_rate)
    
    # Store training metrics (loss is measured before this epoch's update, accuracies after it)
    train_losses.append(loss.item())
    train_acc = model.accuracy(X_train_tensor, y_train_tensor)
    test_acc = model.accuracy(X_test_tensor, y_test_tensor)
    train_accuracies.append(train_acc)
    test_accuracies.append(test_acc)
    
    # Print progress
    if (epoch + 1) % print_every == 0 or epoch == 0:
        print(f"Epoch {epoch + 1:3d}: Loss = {loss.item():.4f}, "
              f"Train Acc = {train_acc:.1f}%, Test Acc = {test_acc:.1f}%")

print("-" * 60)
print("Training completed!")

# Final results
final_train_acc = model.accuracy(X_train_tensor, y_train_tensor)
final_test_acc = model.accuracy(X_test_tensor, y_test_tensor)
final_loss = train_losses[-1]

print(f"\nFinal Results:")
print(f"Final Loss: {final_loss:.4f}")
print(f"Final Training Accuracy: {final_train_acc:.1f}%")
print(f"Final Test Accuracy: {final_test_acc:.1f}%")
print(f"\nLearned Parameters:")
print(f"Weights: {model.weights.squeeze().detach().cpu().numpy()}")
print(f"Bias: {model.bias.item():.4f}")

## Step 7: Visualize Training Progress

Plot the training loss and accuracy curves to analyze the learning process.

In [None]:
# Create training plots
fig, axes = plt.subplots(1, 2, figsize=(15, 5))

# Plot 1: Training Loss
axes[0].plot(range(1, epochs + 1), train_losses, 'b-', linewidth=2, label='Training Loss')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Binary Cross Entropy Loss')
axes[0].set_title('Training Loss Over Time')
axes[0].grid(True, alpha=0.3)
axes[0].legend()

# Plot 2: Accuracy Curves
axes[1].plot(range(1, epochs + 1), train_accuracies, 'g-', linewidth=2, label='Training Accuracy')
axes[1].plot(range(1, epochs + 1), test_accuracies, 'r-', linewidth=2, label='Test Accuracy')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Accuracy (%)')
axes[1].set_title('Accuracy Over Time')
axes[1].grid(True, alpha=0.3)
axes[1].legend()
axes[1].set_ylim(0, 100)

plt.tight_layout()
plt.show()

# Print key training milestones
print("Training Milestones:")
print(f"Epoch 1: Loss = {train_losses[0]:.2f}")
if len(train_losses) >= 30:
    print(f"Epoch 30: Loss = {train_losses[29]:.2f}")
print(f"Final Epoch {epochs}: Loss = {train_losses[-1]:.2f}")
print(f"Accuracy on test set = {final_test_acc:.1f}%")

## Step 8: Decision Boundary Visualization

Visualize the decision boundary learned by the neural network.
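
Because the sigmoid outputs 0.5 exactly when its input is 0, the 0.5 threshold corresponds to the straight line `w1*s1 + w2*s2 + b = 0` in the scaled feature space. The sketch below maps that line back to the original feature units. The function name `boundary_x2` and all parameter values are hypothetical and chosen only for illustration; `mean` and `scale` stand in for the `mean_` and `scale_` attributes of a fitted `StandardScaler`.

```python
def boundary_x2(x1, w, b, mean, scale):
    """Return x2 on the p = 0.5 boundary for a given x1, in original units.

    w, mean and scale are length-2 sequences. Assumes w[1] != 0.
    """
    s1 = (x1 - mean[0]) / scale[0]   # scaled first feature
    s2 = -(w[0] * s1 + b) / w[1]     # solve w1*s1 + w2*s2 + b = 0 for s2
    return mean[1] + scale[1] * s2   # undo the scaling of the second feature

# Hypothetical parameters, not the values learned in this notebook.
w = [2.0, -1.0]
b = 0.5
mean = [0.0, 1.0]
scale = [1.0, 2.0]

x1 = 0.5
x2 = boundary_x2(x1, w, b, mean, scale)

# Check: the logit at (x1, x2) should be zero, so the sigmoid gives 0.5 there.
z = w[0] * (x1 - mean[0]) / scale[0] + w[1] * (x2 - mean[1]) / scale[1] + b
print(x2, z)  # prints 4.0 0.0
```

In the contour plot produced by the cell below, this line is where the predicted probability crosses 0.5.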

In [None]:
def plot_decision_boundary(model, X, y, scaler, title="Decision Boundary"):
    """
    Plot the decision boundary of the trained model
    """
    plt.figure(figsize=(10, 8))
    
    # Create a mesh of points
    h = 0.02  # Step size in the mesh
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    
    # Make predictions on the mesh
    mesh_points = np.c_[xx.ravel(), yy.ravel()]
    mesh_points_scaled = scaler.transform(mesh_points)
    mesh_tensor = torch.FloatTensor(mesh_points_scaled).to(model.device)  # use the model's device, not a global
    
    with torch.no_grad():
        Z = model.forward(mesh_tensor).cpu().numpy()
    Z = Z.reshape(xx.shape)
    
    # Plot the decision boundary
    plt.contourf(xx, yy, Z, levels=50, alpha=0.6, cmap='RdYlBu')
    plt.colorbar(label='Prediction Probability')
    
    # Plot the data points
    colors = ['red', 'blue']
    for i, label in enumerate([0, 1]):
        mask = y == label
        plt.scatter(X[mask, 0], X[mask, 1], c=colors[i], 
                   label=f'Class {label}', alpha=0.8, s=50, edgecolors='black')
    
    plt.xlabel('Feature 1 (f1)')
    plt.ylabel('Feature 2 (f2)')
    plt.title(title)
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.show()

# Plot decision boundary for training data
plot_decision_boundary(model, X_train, y_train, scaler, 
                      "Decision Boundary - Training Data")

# Plot decision boundary for test data
plot_decision_boundary(model, X_test, y_test, scaler, 
                      "Decision Boundary - Test Data")

## Step 9: Model Evaluation and Analysis

Perform detailed evaluation of the trained model.

In [None]:
# Detailed predictions analysis
with torch.no_grad():
    train_probs = model.forward(X_train_tensor).cpu().numpy()
    test_probs = model.forward(X_test_tensor).cpu().numpy()
    train_preds = model.predict(X_train_tensor).cpu().numpy()
    test_preds = model.predict(X_test_tensor).cpu().numpy()

# Compute confusion matrix manually
def compute_confusion_matrix(y_true, y_pred):
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return np.array([[tn, fp], [fn, tp]])

train_cm = compute_confusion_matrix(y_train, train_preds)
test_cm = compute_confusion_matrix(y_test, test_preds)

print("Model Evaluation Results:")
print("=" * 50)
print(f"Training Set:")
print(f"  Accuracy: {final_train_acc:.2f}%")
print(f"  Confusion Matrix:")
print(f"    TN: {train_cm[0,0]}, FP: {train_cm[0,1]}")
print(f"    FN: {train_cm[1,0]}, TP: {train_cm[1,1]}")

print(f"\nTest Set:")
print(f"  Accuracy: {final_test_acc:.2f}%")
print(f"  Confusion Matrix:")
print(f"    TN: {test_cm[0,0]}, FP: {test_cm[0,1]}")
print(f"    FN: {test_cm[1,0]}, TP: {test_cm[1,1]}")

# Calculate additional metrics
def calculate_metrics(cm):
    tn, fp, fn, tp = cm.ravel()
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0
    f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0
    return precision, recall, f1

test_precision, test_recall, test_f1 = calculate_metrics(test_cm)
print(f"\nTest Set Metrics:")
print(f"  Precision: {test_precision:.3f}")
print(f"  Recall: {test_recall:.3f}")
print(f"  F1-Score: {test_f1:.3f}")

# Prediction confidence analysis
print(f"\nPrediction Confidence Analysis:")
print(f"  Test probabilities - Min: {test_probs.min():.3f}, Max: {test_probs.max():.3f}")
print(f"  Test probabilities - Mean: {test_probs.mean():.3f}, Std: {test_probs.std():.3f}")

# Show some example predictions
print(f"\nExample Predictions (first 10 test samples):")
print("True | Pred | Prob")
print("-" * 20)
for i in range(min(10, len(y_test))):
    print(f"  {int(y_test[i])}  |  {int(test_preds[i])}   | {test_probs[i]:.3f}")

## Step 10: Sample Output Summary

Display results in the format requested in the assignment.

In [None]:
print("=" * 60)
print("ASSIGNMENT SAMPLE OUTPUT")
print("=" * 60)
print()

# Display key training epochs as requested
print(f"Epoch 1: Loss = {train_losses[0]:.2f}")
if len(train_losses) >= 30:
    print(f"Epoch 30: Loss = {train_losses[29]:.2f}")
else:
    print(f"Epoch {min(30, epochs)}: Loss = {train_losses[min(29, epochs-1)]:.2f}")

print(f"Accuracy on test set = {final_test_acc:.1f}%")

print()
print("=" * 60)
print("ADDITIONAL MODEL INFORMATION")
print("=" * 60)
print()
print(f"Model Architecture: Single-layer ANN")
print(f"Input features: {input_size}")
print(f"Activation function: Sigmoid")
print(f"Loss function: Binary Cross Entropy")
print(f"Optimizer: Manual Gradient Descent")
print(f"Learning rate: {learning_rate}")
print(f"Training epochs: {epochs}")
print(f"Dataset size: {len(df)} samples")
print(f"Train/Test split: {len(X_train)}/{len(X_test)}")
print(f"Device used: {device}")
print()
print(f"Final model parameters:")
print(f"  Weights: [{model.weights[0,0].item():.4f}, {model.weights[1,0].item():.4f}]")
print(f"  Bias: {model.bias.item():.4f}")
print()
print(f"Model successfully trained using manual implementation!")
print(f"No torch.nn or torch.nn.Module was used as required.")

## Conclusion

This notebook successfully demonstrates:

### ✅ **Assignment Requirements Met:**
1. **Manual Implementation**: Built ANN using only basic PyTorch operations (no `torch.nn` or `torch.nn.Module`)
2. **Dataset Generation**: Created binary classification dataset using `sklearn.datasets.make_classification`
3. **Model Architecture**: Implemented Y = w^T * x + b with sigmoid activation
4. **Loss Function**: Used Binary Cross Entropy loss
5. **Manual Optimization**: Implemented gradient descent with manual gradient computation
6. **Device Support**: Automatically uses GPU if available, otherwise CPU

### 📊 **Key Results:**
- Successfully trained a single-layer neural network for binary classification
- Reported final test accuracy, precision, recall, and F1-score for the model trained with manual gradient descent
- Visualized training progress and decision boundaries
- Demonstrated understanding of fundamental neural network concepts

### 🧠 **Learning Outcomes:**
- Understanding of forward propagation in neural networks
- Manual computation of gradients for backpropagation
- Implementation of activation functions and loss functions
- Parameter updates using gradient descent
- Data preprocessing and visualization techniques

This implementation provides a solid foundation for understanding how neural networks work at a fundamental level before moving to higher-level frameworks.