# Artificial Neural Network (ANN) Implementation

This notebook demonstrates a complete implementation of an Artificial Neural Network for binary classification.

## Dataset
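The workflow targets churn-style prediction problems such as the Churn Modeling dataset. For demonstration, the code below generates a synthetic binary classification dataset with `make_classification`, which you can replace with your own data.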

## Topics Covered:
1. Data Preprocessing
2. Feature Scaling
3. Building ANN Architecture
4. Training the Model
5. Evaluating Performance
6. Making Predictions

## 1. Import Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

print(f"TensorFlow Version: {tf.__version__}")
print(f"Keras Version: {keras.__version__}")

## 2. Load and Explore Data

In [None]:
# For demonstration, we'll create a synthetic dataset
# You can replace this with your own dataset

from sklearn.datasets import make_classification

# Generate synthetic data
X, y = make_classification(n_samples=10000, n_features=20, n_informative=15, 
                          n_redundant=5, random_state=42)

# Create DataFrame
feature_names = [f'Feature_{i+1}' for i in range(20)]
df = pd.DataFrame(X, columns=feature_names)
df['Target'] = y

print("Dataset Shape:", df.shape)
print("\nFirst few rows:")
print(df.head())
print("\nDataset Info:")
df.info()  # info() prints its report directly and returns None
print("\nClass Distribution:")
print(df['Target'].value_counts())

## 3. Data Visualization

In [None]:
# Visualize class distribution
plt.figure(figsize=(8, 5))
sns.countplot(x='Target', data=df)
plt.title('Class Distribution')
plt.xlabel('Target Class')
plt.ylabel('Count')
plt.show()

# Correlation heatmap
plt.figure(figsize=(12, 10))
correlation = df.corr()
sns.heatmap(correlation, cmap='coolwarm', center=0, linewidths=1, annot=False)
plt.title('Feature Correlation Heatmap')
plt.show()

## 4. Data Preprocessing

In [None]:
# Split features and target
X = df.drop('Target', axis=1).values
y = df['Target'].values

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print(f"Training set size: {X_train.shape[0]}")
print(f"Testing set size: {X_test.shape[0]}")
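
The split sizes can be checked with plain arithmetic. The sketch below works through how many rows end up in each subset, including the further 20% that `validation_split` holds out during training later in the notebook. It assumes the libraries round these sizes the way the arithmetic suggests, which holds for the round numbers used here.

```python
import math

n_samples = 10_000    # rows generated by make_classification
test_fraction = 0.2   # test_size in train_test_split
val_fraction = 0.2    # validation_split used later in model.fit
batch_size = 32

n_test = round(n_samples * test_fraction)
n_train = n_samples - n_test
n_val = round(n_train * val_fraction)  # held out from the training set
n_fit = n_train - n_val                # rows that actually update the weights
steps_per_epoch = math.ceil(n_fit / batch_size)

print(n_train, n_test)   # 8000 2000
print(n_fit, n_val)      # 6400 1600
print(steps_per_epoch)   # 200
```

So only 6,400 of the 10,000 rows are used to fit the weights; the rest are reserved for validation and testing.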

## 5. Feature Scaling

In [None]:
# Standardize features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print("Feature Scaling Complete")
print(f"Mean of first feature in training: {X_train[:, 0].mean():.4f}")
print(f"Std of first feature in training: {X_train[:, 0].std():.4f}")
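
For intuition, standardization subtracts each feature's training mean and divides by its training standard deviation. The sketch below reproduces that arithmetic for a single feature in plain Python. The values and the `standardize` helper are made up for illustration, and it assumes the population standard deviation, which is what `StandardScaler` uses.

```python
import statistics

def standardize(train_values, new_values):
    """Scale new_values using the mean and std of train_values only."""
    mean = statistics.mean(train_values)
    std = statistics.pstdev(train_values)  # population std (ddof=0)
    return [(v - mean) / std for v in new_values]

# Hypothetical feature column, split into train and test parts
train_col = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
test_col = [5.0, 11.0]

train_scaled = standardize(train_col, train_col)
test_scaled = standardize(train_col, test_col)  # reuse training statistics

print(train_scaled)
print(test_scaled)
```

Reusing the training statistics for the test values mirrors calling `fit_transform` on the training set and only `transform` on the test set, so no information from the test set leaks into the scaling.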

## 6. Build ANN Architecture

In [None]:
# Initialize the ANN
model = Sequential()

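# Input layer and first hidden layer
# (an explicit Input layer replaces the `input_dim` argument, which newer Keras versions warn about)
model.add(keras.Input(shape=(X_train.shape[1],)))
model.add(Dense(units=64, activation='relu'))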
model.add(Dropout(0.2))

# Second hidden layer
model.add(Dense(units=32, activation='relu'))
model.add(Dropout(0.2))

# Third hidden layer
model.add(Dense(units=16, activation='relu'))

# Output layer
model.add(Dense(units=1, activation='sigmoid'))

# Display model architecture
model.summary()
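
The parameter counts reported by `model.summary()` can be derived by hand: a dense layer has one weight per input-unit pair plus one bias per unit, and dropout layers add no parameters. The sketch below assumes 20 input features, matching the synthetic dataset; `dense_params` is a helper introduced only for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Layer widths from the model above; 20 inputs is an assumption that
# matches the synthetic dataset (X_train.shape[1] == 20).
layer_sizes = [20, 64, 32, 16, 1]

per_layer = [dense_params(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer)     # [1344, 2080, 528, 17]
print(total_params)  # 3969
```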

## 7. Compile the Model

In [None]:
# Compile the model
model.compile(optimizer='adam', 
              loss='binary_crossentropy', 
              metrics=['accuracy'])

print("Model Compiled Successfully")
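
To make the loss concrete, binary cross-entropy averages `-[y*log(p) + (1-y)*log(1-p)]` over the samples, so confident correct predictions cost little and hesitant ones cost more. The sketch below is a plain-Python version of that formula, not the Keras implementation; the labels, probabilities, and the clipping constant `eps` are assumptions for illustration.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Hypothetical labels and predicted probabilities
labels = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.3]
hesitant = [0.6, 0.4, 0.6, 0.4]

loss_confident = binary_crossentropy(labels, confident)
loss_hesitant = binary_crossentropy(labels, hesitant)
print(f"{loss_confident:.4f} {loss_hesitant:.4f}")
```

Both sets of predictions give the same accuracy at a 0.5 threshold, but the confident set has the lower loss, which is why loss is the more sensitive quantity to monitor during training.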

## 8. Train the Model

In [None]:
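# Early stopping: halt once val_loss has not improved for 10 consecutive epochs,
# then restore the weights from the best epoch
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)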

# Train the model
history = model.fit(X_train, y_train, 
                    validation_split=0.2,
                    epochs=100, 
                    batch_size=32,
                    callbacks=[early_stop],
                    verbose=1)

print("\nTraining Complete!")
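
The patience rule behind early stopping can be sketched in a few lines. This is a simplified illustration of the idea rather than the Keras callback itself: `early_stopping_epochs` and the validation losses are hypothetical, and the sketch treats any strict decrease as an improvement.

```python
def early_stopping_epochs(val_losses, patience):
    """Return (stop_epoch, best_epoch) as 0-based indices.

    Training stops once `patience` consecutive epochs fail to improve on
    the best validation loss seen so far.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch  # ran out of epochs first

# Hypothetical validation losses, one per epoch
val_losses = [0.90, 0.80, 0.70, 0.75, 0.72, 0.71, 0.74, 0.69]
stop_epoch, best_epoch = early_stopping_epochs(val_losses, patience=3)
print(stop_epoch, best_epoch)  # 5 2
```

With a patience of 3, training stops at epoch 5 and the weights from epoch 2 would be restored. The later improvement at epoch 7 is never reached, which is the trade-off a larger patience (such as the 10 used above) is meant to soften.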

## 9. Visualize Training History

In [None]:
# Plot training & validation accuracy
plt.figure(figsize=(14, 5))

plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend()
plt.grid(True)

# Plot training & validation loss
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend()
plt.grid(True)

plt.tight_layout()
plt.show()

## 10. Evaluate the Model

In [None]:
# Predict probabilities, then threshold at 0.5 to get class labels
y_pred_prob = model.predict(X_test)
y_pred = (y_pred_prob > 0.5).astype(int).flatten()

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Test Accuracy: {accuracy:.4f}")

# Confusion Matrix
cm = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.title('Confusion Matrix')
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()

# Classification Report
print("\nClassification Report:")
print(classification_report(y_test, y_pred))
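
The figures in the classification report follow directly from the confusion matrix. The sketch below computes them for the positive class from hypothetical counts, laid out the way `confusion_matrix` arranges a binary problem (rows are actual classes, columns are predicted classes); `metrics_from_confusion` is a helper made up for this example.

```python
def metrics_from_confusion(tn, fp, fn, tp):
    """Accuracy, precision, recall and F1 for the positive class."""
    acc = (tp + tn) / (tn + fp + fn + tp)
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * prec * rec / (prec + rec)
    return acc, prec, rec, f1

# Hypothetical counts:
# [[TN, FP],
#  [FN, TP]]
cm_example = [[90, 10],
              [20, 80]]
(tn, fp), (fn, tp) = cm_example
acc, prec, rec, f1 = metrics_from_confusion(tn, fp, fn, tp)
print(f"{acc:.3f} {prec:.3f} {rec:.3f} {f1:.3f}")
```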

## 11. Make Predictions on New Data

In [None]:
# Example: Predict for a single sample
# X_test is already standardized; raw new data must first go through scaler.transform()
sample = X_test[0].reshape(1, -1)
prediction_prob = model.predict(sample)
prediction_class = (prediction_prob > 0.5).astype(int)[0][0]

print(f"Prediction Probability: {prediction_prob[0][0]:.4f}")
print(f"Predicted Class: {prediction_class}")
print(f"Actual Class: {y_test[0]}")
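
The 0.5 cut-off used above is a choice, not a requirement. As a small sketch with made-up probabilities, the hypothetical helper `to_classes` below shows how raising the threshold yields fewer positive predictions, which typically trades recall for precision.

```python
def to_classes(probabilities, threshold=0.5):
    """Label a sample 1 when its probability exceeds the threshold."""
    return [int(p > threshold) for p in probabilities]

# Hypothetical predicted probabilities for five samples
probs = [0.05, 0.35, 0.50, 0.65, 0.95]

default_labels = to_classes(probs)                # threshold 0.5
strict_labels = to_classes(probs, threshold=0.7)  # fewer positives

print(default_labels)  # [0, 0, 0, 1, 1]
print(strict_labels)   # [0, 0, 0, 0, 1]
```

Note that a probability of exactly 0.5 is labeled 0 here, consistent with the strict `>` comparison in the notebook.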

## 12. Save the Model

In [None]:
# Save the model in the native Keras format
# (recent Keras versions treat the HDF5 '.h5' format as legacy)
model.save('ann_model.keras')
print("Model saved as 'ann_model.keras'")

# To load the model later:
# from tensorflow.keras.models import load_model
# loaded_model = load_model('ann_model.keras')

## Summary

### Key Takeaways:
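1. **ANN Architecture**: Multi-layer perceptron with three dense hidden layers (64, 32, and 16 units) and a single-unit output layer
2. **Activation Functions**: ReLU for hidden layers, sigmoid for the output
3. **Regularization**: Dropout (rate 0.2) after the first two hidden layers to reduce overfitting
4. **Optimization**: Adam optimizer with binary cross-entropy loss
5. **Early Stopping**: Stops training when validation loss has not improved for 10 epochs and restores the best weights, which limits overfitting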
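
The building blocks named in these takeaways are small functions. The sketch below writes ReLU, sigmoid, and inverted dropout in plain Python for illustration; it is not how Keras implements them internally, and the input values and random seed are arbitrary.

```python
import math
import random

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate` and
    scale the survivors by 1 / (1 - rate) to keep the expected sum."""
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]

rng = random.Random(42)  # fixed seed, chosen arbitrarily
activations = [relu(x) for x in [-2.0, -0.5, 0.0, 1.5, 3.0]]
dropped = dropout(activations, rate=0.2, rng=rng)

print(activations)   # [0.0, 0.0, 0.0, 1.5, 3.0]
print(sigmoid(0.0))  # 0.5
print(dropped)       # each entry is 0.0 or the activation divided by 0.8
```

Dropout is only active during training; at prediction time the layer passes values through unchanged, which is why the survivors are rescaled during training.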

### When to Use ANN:
- Binary or multi-class classification
- Regression problems
- Pattern recognition in structured/tabular data
- When you have sufficient training data

### Advantages:
- Can learn complex non-linear relationships
- Flexible architecture
- Performs well on a wide range of classification and regression tasks

### Limitations:
- Typically needs a large amount of training data to generalize well
- Training can be computationally expensive, especially as the network grows
- Can overfit without proper regularization
- Acts as a "black box" (less interpretable)