# Advanced Machine Learning and AI in Python

This notebook covers advanced machine learning and AI concepts using Python. You'll learn about ensemble methods, model selection, deep learning basics, natural language processing (NLP), and model deployment.

## Topics Covered:
1. Ensemble Methods (Bagging, Boosting, Random Forest, Gradient Boosting)
2. Model Selection and Hyperparameter Tuning
3. Introduction to Deep Learning (Neural Networks)
4. Natural Language Processing (NLP) Basics
5. Model Deployment and Serving
6. Real-Life Use Cases and Best Practices

## 1. Ensemble Methods

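Ensemble methods combine the predictions of multiple models to improve accuracy and robustness. Bagging (e.g., Random Forest) trains models independently on bootstrap samples and aggregates their predictions; boosting (e.g., Gradient Boosting, AdaBoost) trains models sequentially, with each one concentrating on the errors of its predecessors.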

**Real-life use case:** Credit card fraud detection systems use ensemble models to reduce false positives and improve accuracy.
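
To make the bagging idea concrete, here is a minimal hand-rolled sketch: each decision tree is fit on a bootstrap sample, and the final prediction is a majority vote. The tree count and seeds are illustrative assumptions, and this is not how `RandomForestClassifier` is implemented internally (a random forest also subsamples features at each split).

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
n_trees = 25                    # illustrative choice, not a tuned value
n_classes = len(np.unique(y_train))

# Step 1: fit each tree on a bootstrap sample (rows drawn with replacement)
trees = []
for _ in range(n_trees):
    idx = rng.integers(0, len(X_train), size=len(X_train))
    tree = DecisionTreeClassifier(random_state=0).fit(X_train[idx], y_train[idx])
    trees.append(tree)

# Step 2: collect every tree's prediction and take a majority vote per test row
all_preds = np.stack([t.predict(X_test) for t in trees])  # shape: (n_trees, n_test)
votes = np.apply_along_axis(np.bincount, 0, all_preds, minlength=n_classes)
bagged_pred = votes.argmax(axis=0)

print("Single tree accuracy:", (trees[0].predict(X_test) == y_test).mean())
print("Bagged trees accuracy:", (bagged_pred == y_test).mean())
```

Comparing the two printed accuracies shows how much the vote helps over a single tree on this split.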

In [1]:
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, AdaBoostClassifier
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report

# Load dataset
wine = load_wine(as_frame=True)  # Load the wine dataset with Pandas DataFrame
X, y = wine.data, wine.target  # Features and target variable

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(f"Training data shape: {X_train.shape}, Test data shape: {X_test.shape}")

# Random Forest - An ensemble of decision trees using bagging
rf = RandomForestClassifier(n_estimators=100, random_state=42)  # 100 trees in the forest
rf.fit(X_train, y_train)  # Train the random forest
rf_pred = rf.predict(X_test)  # Make predictions
print('Random Forest Accuracy:', accuracy_score(y_test, rf_pred))

# Gradient Boosting - Trees are built sequentially to correct errors
gb = GradientBoostingClassifier(n_estimators=100, random_state=42)  # 100 boosting stages
gb.fit(X_train, y_train)  # Train the gradient boosting model
gb_pred = gb.predict(X_test)  # Make predictions
print('Gradient Boosting Accuracy:', accuracy_score(y_test, gb_pred))

# AdaBoost - Adaptive Boosting that focuses on misclassified samples
ada = AdaBoostClassifier(n_estimators=100, random_state=42)  # 100 boosting stages
ada.fit(X_train, y_train)  # Train the AdaBoost model
ada_pred = ada.predict(X_test)  # Make predictions
print('AdaBoost Accuracy:', accuracy_score(y_test, ada_pred))

# Find the best model and print detailed metrics
accuracies = [accuracy_score(y_test, rf_pred), 
             accuracy_score(y_test, gb_pred), 
             accuracy_score(y_test, ada_pred)]
best_idx = np.argmax(accuracies)
model_names = ['Random Forest', 'Gradient Boosting', 'AdaBoost']
best_model = model_names[best_idx]

print(f"\nBest model: {best_model} with accuracy: {accuracies[best_idx]:.4f}")
print(f"\nClassification report for {best_model}:")
if best_idx == 0:
    print(classification_report(y_test, rf_pred))
elif best_idx == 1:
    print(classification_report(y_test, gb_pred))
else:
    print(classification_report(y_test, ada_pred))

# Output (example):
# Training data shape: (142, 13), Test data shape: (36, 13)
# Random Forest Accuracy: 1.0
# Gradient Boosting Accuracy: 0.9444444444444444
# AdaBoost Accuracy: 0.9444444444444444
#
# Best model: Random Forest with accuracy: 1.0000
#
# Classification report for Random Forest:
#               precision    recall  f1-score   support
#            0       1.00      1.00      1.00        14
#            1       1.00      1.00      1.00        14
#            2       1.00      1.00      1.00         8
#     accuracy                           1.00        36
#    macro avg       1.00      1.00      1.00        36
# weighted avg       1.00      1.00      1.00        36

Training data shape: (142, 13), Test data shape: (36, 13)
Random Forest Accuracy: 1.0
Gradient Boosting Accuracy: 0.9444444444444444
AdaBoost Accuracy: 0.9444444444444444

Best model: Random Forest with accuracy: 1.0000

Classification report for Random Forest:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        14
           1       1.00      1.00      1.00        14
           2       1.00      1.00      1.00         8

    accuracy                           1.00        36
   macro avg       1.00      1.00      1.00        36
weighted avg       1.00      1.00      1.00        36


## 2. Model Selection and Hyperparameter Tuning

Selecting the best model and tuning its hyperparameters is crucial for optimal performance. Techniques include cross-validation and grid/randomized search.

**Real-life use case:** Tuning a model for predicting loan defaults to maximize recall and minimize false negatives.
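
Randomized search and a recall-oriented objective can be sketched as follows. `RandomizedSearchCV` samples a fixed number of candidates instead of trying every combination, and `scoring` controls which metric ranks them. The search space below is a hypothetical example, and `recall_macro` is used because the wine data has three classes; for a binary problem such as loan defaults, `scoring="recall"` would be the natural choice.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hypothetical search space; the values are illustrative, not recommendations
param_distributions = {
    "n_estimators": [25, 50, 100, 200],
    "max_depth": [None, 3, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions,
    n_iter=8,                # sample 8 of the 48 possible combinations
    scoring="recall_macro",  # rank candidates by recall averaged over classes
    cv=3,
    random_state=42,
)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Best cross-validated macro recall:", search.best_score_)
```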

In [2]:
from sklearn.model_selection import GridSearchCV, cross_val_score

# Grid search for Random Forest hyperparameters
param_grid = {
    'n_estimators': [50, 100],  # Number of trees in the forest
    'max_depth': [None, 5, 10]   # Maximum depth of each tree
}

# Perform grid search with 3-fold cross-validation
grid = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=3)
grid.fit(X_train, y_train)  # Find the best parameters

# Print the results
print('Best parameters:', grid.best_params_)
print('Best cross-validated score:', grid.best_score_)

# Re-score the best parameters with 5-fold cross-validation on the full dataset
# (this includes rows already used by the grid search, so treat the scores as optimistic)
cv_scores = cross_val_score(RandomForestClassifier(**grid.best_params_, random_state=42), 
                           X, y, cv=5)  # 5-fold cross-validation
print('Cross-validation scores:', cv_scores)
print('Mean CV score:', cv_scores.mean())

# Output (example):
# Best parameters: {'max_depth': None, 'n_estimators': 100}
# Best cross-validated score: 0.9858156028368793
# Cross-validation scores: [0.97222222 0.94444444 0.97222222 0.97142857 1.        ]
# Mean CV score: 0.9720634920634922
#
# Interpretation:
# - The grid search tested 6 combinations of hyperparameters (2 n_estimators × 3 max_depth options)
# - The best model uses 100 trees with unlimited depth, which are also scikit-learn's defaults
# - 5-fold cross-validation shows consistently high accuracy (94% to 100%) across different data splits
# - Average accuracy is about 97%, which is very good for this wine classification task
# - Because the 5-fold scores reuse rows seen during the grid search, they are somewhat optimistic;
#   the held-out test set gives a cleaner estimate of how the model generalizes

Best parameters: {'max_depth': None, 'n_estimators': 100}
Best cross-validated score: 0.9858156028368793
Cross-validation scores: [0.97222222 0.94444444 0.97222222 0.97142857 1.        ]
Mean CV score: 0.9720634920634922

## 3. Introduction to Deep Learning (Neural Networks)

Deep learning uses neural networks with multiple layers to learn complex patterns. Keras (with TensorFlow backend) is a popular library for building neural networks in Python.

**Real-life use case:** Image recognition in self-driving cars.

In [None]:
import tensorflow as tf
from tensorflow import keras
from sklearn.preprocessing import LabelBinarizer

# Prepare data for neural network - convert target to one-hot encoding
lb = LabelBinarizer()  # Initialize the label binarizer
y_train_bin = lb.fit_transform(y_train)  # Transform training labels to one-hot encoding
y_test_bin = lb.transform(y_test)  # Transform test labels using the same encoding

# Print the shape of the transformed targets
print(f"Original y_train shape: {y_train.shape}, One-hot encoded shape: {y_train_bin.shape}")

# Build a simple neural network with two hidden layers
model = keras.Sequential([
    # First hidden layer with 32 neurons and ReLU activation; input_shape declares the 13 input features
    keras.layers.Dense(32, activation='relu', input_shape=(X_train.shape[1],)),
    # Second hidden layer with 16 neurons and ReLU activation
    keras.layers.Dense(16, activation='relu'),
    # Output layer with softmax activation (one neuron per class)
    keras.layers.Dense(len(lb.classes_), activation='softmax')
])

# Display model architecture
model.summary()

# Compile the model with appropriate loss function and optimizer
model.compile(optimizer='adam',  # Adam optimizer - adaptive learning rate
              loss='categorical_crossentropy',  # Cross entropy loss for multi-class classification
              metrics=['accuracy'])  # Track accuracy during training

# Train the model
history = model.fit(X_train, y_train_bin,  # Training data
          epochs=20,  # Number of complete passes through the dataset
          batch_size=8,  # Number of samples per gradient update
          verbose=0)  # Don't print progress

# Evaluate on test data
loss, acc = model.evaluate(X_test, y_test_bin, verbose=0)
print('Neural Network Test Accuracy:', acc)

# Output (example; exact accuracy varies between runs because the weights are initialized randomly):
# Original y_train shape: (142,), One-hot encoded shape: (142, 3)
# Model: "sequential"
# _________________________________________________________________
#  Layer (type)                Output Shape              Param #   
# =================================================================
#  dense (Dense)               (None, 32)                448       
#                                                                 
#  dense_1 (Dense)             (None, 16)                528       
#                                                                 
#  dense_2 (Dense)             (None, 3)                 51        
#                                                                 
# =================================================================
# Total params: 1027 (4.01 KB)
# Trainable params: 1027 (4.01 KB)
# Non-trainable params: 0 (0.00 Byte)
# _________________________________________________________________
#
# Neural Network Test Accuracy: 0.9722222089767456
#
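# Interpretation:
# - The network has 3 Dense layers (two hidden, one output) with 1,027 trainable parameters in total
# - In this example run the model reached ~97% accuracy on the test set without much tuning
# - The features are fed in unscaled; standardizing them (e.g., with StandardScaler) usually makes
#   neural network training more stable
# - This demonstrates that even simple neural networks can perform well on structured data
# - For complex tasks (images, text, etc.), deeper or more specialized architectures are usually needed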
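
The parameter counts in the model summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The short sketch below reproduces the arithmetic for the 13 → 32 → 16 → 3 architecture; `dense_param_count` is a helper name introduced here for illustration.

```python
def dense_param_count(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


layer_sizes = [13, 32, 16, 3]  # 13 wine features -> 32 -> 16 -> 3 classes
counts = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
print(counts)       # [448, 528, 51]
print(sum(counts))  # 1027
```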

## 4. Natural Language Processing (NLP) Basics

NLP enables computers to understand and process human language. Common tasks include text classification, sentiment analysis, and topic modeling.

**Real-life use case:** Sentiment analysis of customer reviews for brand monitoring.

In [None]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Example text data - simple sentiment analysis task
texts = ['I love this product', 'Worst experience ever', 'Amazing quality', 
        'Not good', 'Will buy again', 'Terrible support']
labels = [1, 0, 1, 0, 1, 0]  # 1=positive, 0=negative

# Convert text to numerical features (bag of words)
vectorizer = CountVectorizer()  # Initialize the vectorizer
X_text = vectorizer.fit_transform(texts)  # Transform text to word count matrix

# Show the feature names (words)
print("Feature names (vocabulary):", vectorizer.get_feature_names_out())

# Show the document-term matrix shape
print("Document-term matrix shape:", X_text.shape)

# Train a Naive Bayes classifier
clf = MultinomialNB()  # Initialize the classifier
clf.fit(X_text, labels)  # Train on our labeled data
pred = clf.predict(X_text)  # Predict on the same data
print('Text Classification Accuracy:', accuracy_score(labels, pred))

# Test on new examples
new_texts = ['This is excellent', 'I am disappointed']
new_X = vectorizer.transform(new_texts)  # Transform using the same vocabulary
new_pred = clf.predict(new_X)  # Predict sentiment
print('\nPredictions for new texts:')
for text, sentiment in zip(new_texts, new_pred):
    sentiment_label = 'Positive' if sentiment == 1 else 'Negative'
    print(f'"{text}" -> {sentiment_label}')

# Output (example):
# Feature names (vocabulary): ['again' 'amazing' 'buy' 'ever' 'experience' 'good' 'love' 'not' 
#  'product' 'quality' 'support' 'terrible' 'this' 'will' 'worst']
# Document-term matrix shape: (6, 15)
# Text Classification Accuracy: 1.0
#
# Predictions for new texts:
# "This is excellent" -> Positive
# "I am disappointed" -> Negative
#
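# Interpretation:
# - The CountVectorizer created a vocabulary of 15 unique words from our texts
#   (single-character tokens such as "I" are dropped by the default tokenizer)
# - Each document is represented as a sparse vector of word counts (6 documents × 15 features)
# - The Naive Bayes model classified all training examples correctly, but this accuracy is
#   measured on the same data it was trained on, so it says nothing about generalization
# - Words outside the vocabulary (e.g., "excellent" and "disappointed") are ignored at
#   prediction time: "This is excellent" is labeled Positive only because "this" appeared in
#   a positive training text, and "I am disappointed" contains no known words, so the
#   prediction rests on the class priors alone (a tie here, resolved in favor of class 0)
# - In real applications, we would use larger training sets, a held-out test set, and more
#   sophisticated techniques like TF-IDF, word embeddings, or transformer models (BERT, GPT, etc.)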
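
When interpreting predictions on new text, it helps to check which words actually reach the classifier. The sketch below rebuilds the same vocabulary and lists, for each new sentence, the tokens the vectorizer knows and the ones it silently drops. The `coverage` dictionary is a name introduced here for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

texts = ['I love this product', 'Worst experience ever', 'Amazing quality',
         'Not good', 'Will buy again', 'Terrible support']
vectorizer = CountVectorizer().fit(texts)
analyze = vectorizer.build_analyzer()  # same tokenization the vectorizer applies

coverage = {}
for text in ['This is excellent', 'I am disappointed']:
    tokens = analyze(text)
    known = [t for t in tokens if t in vectorizer.vocabulary_]
    ignored = [t for t in tokens if t not in vectorizer.vocabulary_]
    coverage[text] = (known, ignored)
    print(f'"{text}": known={known}, ignored={ignored}')

# "This is excellent": known=['this'], ignored=['is', 'excellent']
# "I am disappointed": known=[], ignored=['am', 'disappointed']
```

A sentence with no known tokens becomes an all-zero feature vector, so its prediction carries no information from the text itself.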

## 5. Model Deployment and Serving

Deploying models allows you to serve predictions in real-time or batch mode. Common tools include Flask, FastAPI, and cloud services.

**Real-life use case:** Deploying a fraud detection model as a REST API for a banking application.
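
A served model is usually trained in one process and loaded in another, so the first deployment step is a save/load round trip. The sketch below uses `pickle` with a temporary directory. The wine data and random forest stand in for whatever model is actually being deployed.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

X, y = load_wine(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "trained_model.pkl"

    # Training side: write the fitted model to disk
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # Serving side: load it once at startup, then reuse it for every request
    with open(model_path, "rb") as f:
        loaded_model = pickle.load(f)

# The restored model should reproduce the original predictions exactly
same = np.array_equal(model.predict(X), loaded_model.predict(X))
print("Predictions match after reload:", same)
```

Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code. It is also safest to load the model with the same scikit-learn version that saved it.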

In [None]:
# Example: Simple Flask API for model prediction (conceptual, not executable in the notebook)
import pickle

import numpy as np
from flask import Flask, request, jsonify

app = Flask(__name__)

# In a real deployment, you would load your trained model from a file once at startup;
# the endpoint below expects the loaded model to be bound to the name clf
# model_file = 'trained_model.pkl'
# with open(model_file, 'rb') as f:
#     clf = pickle.load(f)

# Define a prediction endpoint
@app.route('/predict', methods=['POST'])
def predict():
    # Get JSON data from the request
    data = request.get_json(force=True)
    
    # Extract features from the request
    features = np.array(data['features']).reshape(1, -1)
    
    # Make prediction using the loaded model
    prediction = clf.predict(features)[0]
    
    # Return prediction as JSON response
    return jsonify({'prediction': int(prediction)})

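# Example JSON request that would be sent to this API (four values for illustration;
# the list length must match the number of features the model was trained on):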
'''
{
  "features": [5.1, 3.5, 1.4, 0.2]
}
'''

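# To run Flask's development server locally (debug=True is for development only;
# use a production server such as Gunicorn or uWSGI for real traffic):
# if __name__ == '__main__':
#     app.run(debug=True, host='0.0.0.0', port=5000)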

print("Note: This is a conceptual example of how to deploy a model as a REST API using Flask.")
print("In a real deployment scenario, you would:")
print("1. Save your trained model to disk using pickle or joblib")
print("2. Set up a proper server environment (e.g., Gunicorn, uWSGI)")
print("3. Implement error handling and input validation")
print("4. Consider containerization with Docker for easier deployment")
print("5. Deploy to a cloud platform like AWS, GCP, Azure, or Heroku")

# Output:
# Note: This is a conceptual example of how to deploy a model as a REST API using Flask.
# In a real deployment scenario, you would:
# 1. Save your trained model to disk using pickle or joblib
# 2. Set up a proper server environment (e.g., Gunicorn, uWSGI)
# 3. Implement error handling and input validation
# 4. Consider containerization with Docker for easier deployment
# 5. Deploy to a cloud platform like AWS, GCP, Azure, or Heroku
#
# Key Concepts:
# - RESTful API: A standard way to expose your ML model via HTTP endpoints
# - Serialization: Converting ML models to a format that can be saved and loaded
# - Containerization: Packaging code and dependencies for consistent deployment
# - Scaling: Handling multiple prediction requests efficiently
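
The input validation step listed above can be sketched as a small helper that checks the request payload before it reaches the model. `parse_features` and its error messages are hypothetical names introduced here, and the exact rules (for example, whether to accept numeric strings) are a design choice.

```python
import numpy as np


def parse_features(payload, n_features):
    """Return a (1, n_features) float array, or raise ValueError with a reason."""
    if not isinstance(payload, dict) or "features" not in payload:
        raise ValueError('request body must be a JSON object with a "features" key')
    values = payload["features"]
    if not isinstance(values, list) or len(values) != n_features:
        raise ValueError(f"expected a list of {n_features} numbers")
    # bool is a subclass of int in Python, so exclude it explicitly
    if not all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        raise ValueError("every feature must be a number")
    row = np.asarray(values, dtype=float)
    if not np.all(np.isfinite(row)):
        raise ValueError("features must be finite")
    return row.reshape(1, -1)


print(parse_features({"features": [5.1, 3.5, 1.4, 0.2]}, n_features=4).shape)  # (1, 4)
try:
    parse_features({"features": [5.1, "oops"]}, n_features=4)
except ValueError as err:
    print("Rejected:", err)  # Rejected: expected a list of 4 numbers
```

Inside a Flask route, one option is to catch the `ValueError` and return the message with a 400 status code instead of letting the request fail with a server error.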

## 6. Real-Life Use Cases and Best Practices

- **Healthcare:** Deep learning for medical image analysis
- **Finance:** NLP for automated document processing
- **Retail:** Recommendation systems using collaborative filtering and deep learning
- **Manufacturing:** Predictive maintenance with time series and anomaly detection
- **Best Practices:**
    - Always validate models with cross-validation
    - Monitor deployed models for data drift
    - Use explainable AI tools (e.g., SHAP, LIME) for model transparency
    - Document and version your models for reproducibility
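
As one possible way to monitor for data drift, a feature's training-time distribution can be compared with its recent production values using a two-sample Kolmogorov-Smirnov test. The sketch below uses synthetic data and a hypothetical alert threshold (`ALPHA`). Real monitoring would run a check like this per feature on logged inputs.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
# Synthetic stand-ins: one feature as seen at training time and in production
train_feature = rng.normal(loc=0.0, scale=1.0, size=1000)
live_same = rng.normal(loc=0.0, scale=1.0, size=1000)     # same distribution
live_shifted = rng.normal(loc=1.0, scale=1.0, size=1000)  # mean has drifted

ALPHA = 0.01  # hypothetical alert threshold on the p-value

results = {}
for name, live in [("same", live_same), ("shifted", live_shifted)]:
    result = ks_2samp(train_feature, live)
    results[name] = result
    drifted = result.pvalue < ALPHA
    print(f"{name}: KS statistic={result.statistic:.3f}, drift flagged={drifted}")
```

The shifted sample produces a much larger KS statistic than the unshifted one, which is the signal a drift alert would act on.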

## Practice Exercises

1. Try ensemble methods on a different dataset and compare results.
2. Build and tune a neural network for a regression problem.
3. Perform sentiment analysis on a set of real product reviews.
4. Deploy a simple model using FastAPI or Flask.
5. Explore explainable AI tools to interpret your model's predictions.