# Neural Network Performance Analysis on the Pima Indians Diabetes Dataset
**Objective:** Compare different neural network configurations for diabetes prediction.

### Experiment Setup
- **Dataset:** Pima Indians Diabetes (768 samples, 8 features)
- **Evaluation Metric:** Classification Accuracy
- **Models Compared:**
  1. Single hidden layer with SGD optimizer
  2. Single hidden layer with Adam optimizer
  3. Two hidden layers with SGD optimizer
- **Parameters Tested:**
  - Learning rates: 0.01 to 0.09 (step 0.02)
  - Hidden layer neurons: 6 to 12 (step 2)
- **Robustness Measure:** 5 independent runs per configuration
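
The setup above fixes the total number of model fits. A minimal sketch that enumerates the grid, assuming the learning rates and neuron counts listed (the variable names are illustrative, not taken from the notebook cells):

```python
from itertools import product

# Grid values taken from the setup list above
learning_rates = [0.01, 0.03, 0.05, 0.07, 0.09]
hidden_units = [6, 8, 10, 12]
model_types = [0, 1, 2]  # the three model variants being compared
num_runs = 5             # independent runs per configuration

# Every (model, learning rate, hidden units) combination
configs = list(product(model_types, learning_rates, hidden_units))
total_fits = len(configs) * num_runs

print(f"{len(configs)} configurations, {total_fits} model fits in total")
```

Counting the fits up front gives a rough idea of runtime before launching the full experiment.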

## 1. Import Required Packages

In [None]:
# Core computation and data handling
import numpy as np
import pandas as pd

# Visualization
import matplotlib.pyplot as plt
%matplotlib inline

# Machine learning components
from sklearn import datasets
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Experiment reproducibility
import random
random.seed(42)
np.random.seed(42)

## 2. Data Loading and Preprocessing

### Dataset Characteristics:
- **Source:** [Pima Indians Diabetes Database](https://www.kaggle.com/kumargh/pimaindiansdiabetescsv)
- **Features:** 8 medical predictors
  - Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI, DiabetesPedigreeFunction, Age
- **Target:** Diabetes diagnosis (0 = negative, 1 = positive)

In [None]:
def load_and_split_data(run_num, test_size=0.4):
    """
    Load diabetes dataset and create train/test splits
    
    Parameters:
    run_num (int): Random seed for reproducible splitting
    test_size (float): Proportion of data for testing
    
    Returns:
    Tuple of (x_train, x_test, y_train, y_test)
    """
    # Load dataset
    data = pd.read_csv("../datasets/pima-indians-diabetes.csv", header=None)
    
    # Optional preprocessing (e.g., handling missing values, feature scaling)
    # would go here
    
    # Separate features (all columns except last) and target (last column)
    X = data.iloc[:, :-1].values
    y = data.iloc[:, -1].values
    
    # Create train/test split
    return train_test_split(X, y, test_size=test_size, random_state=run_num)
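
The loader's comment mentions handling missing values. One possible approach is sketched below, under the assumption that zeros in the Glucose, BloodPressure, SkinThickness, Insulin, and BMI columns (indices 1 to 5 in the feature order listed above) stand for missing measurements. The helper name `impute_zeros_with_train_median` and the toy rows are hypothetical and not part of the notebook.

```python
import numpy as np

# Columns where a zero is assumed to mean "missing":
# Glucose, BloodPressure, SkinThickness, Insulin, BMI
ZERO_AS_MISSING_COLS = [1, 2, 3, 4, 5]

def impute_zeros_with_train_median(X_train, X_test, cols=ZERO_AS_MISSING_COLS):
    """Replace zeros in `cols` with the median of the non-zero training values."""
    X_train = X_train.astype(float)  # astype returns a copy, inputs stay untouched
    X_test = X_test.astype(float)
    for c in cols:
        nonzero = X_train[X_train[:, c] != 0, c]
        if nonzero.size == 0:
            continue  # no observed values to compute a median from
        median = np.median(nonzero)
        X_train[X_train[:, c] == 0, c] = median
        X_test[X_test[:, c] == 0, c] = median  # reuse the training median
    return X_train, X_test

# Toy rows (made-up values, not taken from the dataset) to show the behaviour
toy_train = np.array([[1, 100, 70, 20, 80, 30.0, 0.5, 30],
                      [2,   0, 80,  0, 90, 25.0, 0.3, 40],
                      [0, 120,  0, 30,  0, 35.0, 0.2, 50]])
toy_test = np.array([[3, 0, 0, 0, 0, 0.0, 0.4, 60]])

imputed_train, imputed_test = impute_zeros_with_train_median(toy_train, toy_test)
print(imputed_test)
```

The medians come from the training split only, so no information from the test rows leaks into preprocessing. Zeros in the Pregnancies column are left alone because they are plausible values.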

## 3. Neural Network Model Configuration

### Model Architectures:
1. **Model 0**: 1 hidden layer + SGD optimizer
2. **Model 1**: 1 hidden layer + Adam optimizer
3. **Model 2**: 2 hidden layers + SGD optimizer

### Key Hyperparameters:
- `hidden_layer_sizes`: Neurons per layer
- `learning_rate_init`: Starting learning rate
- `max_iter`: Training epochs
- `solver`: Optimization algorithm
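
To make these hyperparameters concrete, here is a small sketch that fits one `MLPClassifier` on synthetic stand-in data (generated with `make_classification`, not the diabetes CSV) and inspects what was trained. The specific values are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data with the same shape as the diabetes set (768 x 8)
X_demo, y_demo = make_classification(n_samples=768, n_features=8, random_state=0)

clf = MLPClassifier(
    hidden_layer_sizes=(8,),   # one hidden layer with 8 neurons
    learning_rate_init=0.01,   # starting step size for the solver
    max_iter=100,              # upper bound on training epochs
    solver='sgd',              # optimization algorithm
    early_stopping=True,       # hold out part of the training data for validation
    random_state=0,
)
clf.fit(X_demo, y_demo)

print("Epochs actually run:", clf.n_iter_)
print("Weight matrix shapes:", [w.shape for w in clf.coefs_])
```

With `early_stopping=True`, scikit-learn sets aside a validation fraction of the training data (10% by default) and stops once the validation score stops improving, so `n_iter_` can be well below `max_iter`.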

In [None]:
def create_model(model_type, hidden_units, learning_rate, random_seed):
    """
    Initialize MLP classifier based on specified configuration
    
    Parameters:
    model_type (int): 0, 1, or 2 specifying architecture
    hidden_units (int): Number of neurons per hidden layer
    learning_rate (float): Initial learning rate
    random_seed (int): Random state for reproducibility
    
    Returns:
    Configured MLPClassifier instance
    """
    common_params = {
        'random_state': random_seed,
        'max_iter': 100,
        'learning_rate_init': learning_rate,
        'early_stopping': True
    }
    
    if model_type == 0:  # One hidden layer, SGD optimizer
        return MLPClassifier(
            solver='sgd', 
            hidden_layer_sizes = (hidden_units,),
            **common_params
        )
    
    elif model_type == 1:  # One hidden layer, Adam optimizer
        return MLPClassifier(
            solver='adam', 
            hidden_layer_sizes = (hidden_units,),
            **common_params
        )
    
    elif model_type == 2:  # Two hidden layers, SGD optimizer
        return MLPClassifier(
            solver='sgd',
            hidden_layer_sizes=(hidden_units, hidden_units),
            **common_params
        )
    
    else:
        raise ValueError("Invalid model_type. Choose 0, 1, or 2")

## 4. Experiment Execution Framework

### Experimental Design:
1. **Parameter Grid**:
   - Learning rates: [0.01, 0.03, 0.05, 0.07, 0.09]
   - Hidden units: [6, 8, 10, 12]
2. **Robustness Testing**:
   - 5 independent runs per configuration
3. **Evaluation**:
   - Report mean ± std of test accuracy
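
With only five runs per configuration, the choice of standard deviation estimator matters. The sketch below uses hypothetical accuracy values (for illustration only, not results from this experiment) to compare NumPy's default population estimate with the sample estimate.

```python
import numpy as np

# Hypothetical accuracies from five runs (illustrative values only)
accuracies = [0.70, 0.72, 0.74, 0.76, 0.78]

mean_acc = np.mean(accuracies)
std_population = np.std(accuracies)      # ddof=0, NumPy's default
std_sample = np.std(accuracies, ddof=1)  # ddof=1, sample estimate

print(f"mean = {mean_acc:.3f}")
print(f"population std = {std_population:.3f}, sample std = {std_sample:.3f}")
```

`np.std` divides by the number of runs unless `ddof=1` is passed, so the reported spread is somewhat smaller than the sample standard deviation when the number of runs is small.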

In [None]:
def run_experiment(model_type, hidden_units_range, learning_rates, num_runs=5):
    """
    Execute full experiment for given model type
    
    Parameters:
    model_type (int): Model configuration (0,1,2)
    hidden_units_range (iterable of int): Hidden-unit counts to test
    learning_rates (iterable of float): Learning rates to evaluate
    num_runs (int): Number of repetitions per configuration
    
    Outputs:
    Prints performance metrics for each configuration
    """
    # Model type labels for reporting
    model_names = {
        0: '1 Hidden Layer (SGD)',
        1: '1 Hidden Layer (Adam)',
        2: '2 Hidden Layers (SGD)'
    }
    
    print(f"\n{'='*50}")
    print(f"Experimenting with Model: {model_names[model_type]}")
    print(f"Learning Rates: {learning_rates}")
    print(f"Hidden Units Range: {list(hidden_units_range)}")
    print(f"Number of Runs per Config: {num_runs}")
    print(f"{'='*50}\n")
    
    # Main experiment loop
    for lr in learning_rates:
        print(f"\n{'='*30}")
        print(f"LEARNING RATE: {lr:.3f}")
        print(f"{'='*30}")
        
        for n_units in hidden_units_range:
            accuracies = []
            
            for run in range(num_runs):
                # Data splitting with unique seed
                X_train, X_test, y_train, y_test = load_and_split_data(run)
                
                # Initialize model
                model = create_model(model_type, n_units, lr, run)
                
                # Train model
                model.fit(X_train, y_train)
                
                # Evaluate performance
                y_pred = model.predict(X_test)
                acc = accuracy_score(y_test, y_pred)
                accuracies.append(acc)
                
                # Optional: Add confusion matrix/ROC analysis here
            
            # Report summary statistics
            mean_acc = np.mean(accuracies)
            std_acc = np.std(accuracies)
            
            print(f"Hidden Units: {n_units} | "
                  f"Accuracy: {mean_acc:.3f} ± {std_acc:.3f} | "
                  f"Min: {min(accuracies):.3f}, Max: {max(accuracies):.3f}")
            
            # Recommended: store results (in a list defined beforehand) for later visualization
            # results.append({'model_type': model_type, 'lr': lr,
            #                 'hidden_units': n_units,
            #                 'mean_acc': mean_acc, 'std_acc': std_acc})
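
The commented-out `results.append` line at the end of the loop suggests collecting summary statistics for later plotting. One way this could look is sketched below. The helper `record_result` and the column names are assumptions chosen for illustration, and the accuracies in the example call are made up.

```python
import pandas as pd

results = []  # one dict per (model, learning rate, hidden units) configuration

def record_result(results, model_type, lr, n_units, accuracies):
    """Append summary statistics for one configuration (hypothetical helper)."""
    acc = pd.Series(accuracies, dtype=float)
    results.append({
        'model_type': model_type,
        'lr': lr,
        'hidden_units': n_units,
        'mean_acc': acc.mean(),
        'std_acc': acc.std(ddof=0),  # ddof=0 matches np.std's default
    })

# Illustrative call with made-up accuracies
record_result(results, model_type=0, lr=0.01, n_units=6,
              accuracies=[0.70, 0.74, 0.78])

results_df = pd.DataFrame(results)
print(results_df)
```

Storing each configuration as a dict keeps the column names explicit, so the resulting DataFrame can be filtered by `model_type` or `hidden_units` when plotting.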

## 5. Execute All Experiments

### Experiment Parameters:
- **Learning Rates:** 0.01 to 0.09 (step 0.02)
- **Hidden Units:** 6 to 12 (step 2)
- **Model Types:** 0 (1-layer SGD), 1 (1-layer Adam), 2 (2-layer SGD)

In [None]:
# Configure experiment parameters
learning_rates = np.arange(0.01, 0.10, 0.02)  # 0.01, 0.03, ..., 0.09
hidden_units_range = range(6, 13, 2)  # 6, 8, 10, 12
num_runs = 5

# Run the experiment for all three model types
for model_type in [0, 1, 2]:
    run_experiment(model_type, hidden_units_range, learning_rates, num_runs)
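
The learning-rate grid above is built with `np.arange` and a floating-point step. NumPy's documentation suggests `np.linspace` for non-integer steps, because it fixes the number of points explicitly. A possible alternative, offered as a sketch rather than a required change:

```python
import numpy as np

# Five evenly spaced values from 0.01 to 0.09 inclusive;
# rounding removes floating-point noise from the printed values
learning_rates = np.round(np.linspace(0.01, 0.09, num=5), 2)

for lr in learning_rates:
    print(f"{lr:.2f}")
```

Both approaches yield five learning rates here. The `linspace` form simply makes the endpoint and the count explicit.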

## 6. Homework: Recommended Enhancements

### Data Preprocessing:
```python
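# Inside load_and_split_data(), replace the final return statement with:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=test_size, random_state=run_num)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # fit on training data only
X_test = scaler.transform(X_test)        # reuse the training statistics
return X_train, X_test, y_train, y_test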
```
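
An alternative to scaling inside the loader is to bundle the scaler and the classifier in a scikit-learn pipeline. The sketch below runs on synthetic stand-in data (not the diabetes CSV), and the hyperparameter values are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (768 samples, 8 features), not the real dataset
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0)

# The pipeline fits the scaler on X_train only and reuses it at predict time
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(8,), solver='adam',
                  learning_rate_init=0.01, max_iter=100,
                  early_stopping=True, random_state=0),
)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"Test accuracy on synthetic data: {test_accuracy:.3f}")
```

Because the scaler lives inside the pipeline, calling `fit` never sees the test split, which avoids leaking test statistics into training.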

### Visualization:
```python
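# Add after the experiment loop. Assumes `results` is a pandas DataFrame for a
# single model type with columns 'lr', 'hidden_units', 'mean_acc', 'std_acc'.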
plt.figure(figsize=(10, 6))
for n_units in hidden_units_range:
    subset = results[results['hidden_units'] == n_units]
    plt.errorbar(subset['lr'], subset['mean_acc'], 
                 yerr=subset['std_acc'], 
                 label=f'{n_units} units',
                 capsize=5)
plt.xlabel('Learning Rate')
plt.ylabel('Test Accuracy')
plt.title('Model Performance by Configuration')
plt.legend()
plt.grid(True)
plt.show()
```

### Advanced Analysis:
```python
# Add to the evaluation section, after y_pred is computed:
from sklearn.metrics import classification_report, confusion_matrix
print("Classification Report:")
print(classification_report(y_test, y_pred))
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))
```