# Advanced Machine Learning and AI in Python

This notebook covers advanced machine learning and AI concepts using Python. You'll learn about ensemble methods, model selection, deep learning basics, natural language processing (NLP), and model deployment.

## Topics Covered:
1. Ensemble Methods (Bagging, Boosting, Random Forest, Gradient Boosting)
2. Model Selection and Hyperparameter Tuning
3. Introduction to Deep Learning (Neural Networks)
4. Natural Language Processing (NLP) Basics
5. Model Deployment and Serving
6. Real-Life Use Cases and Best Practices

## 1. Ensemble Methods

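Ensemble methods combine multiple models to improve performance and robustness. Bagging trains models independently on bootstrap samples of the data and aggregates their predictions (e.g., Random Forest), while boosting trains models sequentially so that each one focuses on the errors of its predecessors (e.g., Gradient Boosting, AdaBoost).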

**Real-life use case:** Credit card fraud detection systems use ensemble models to reduce false positives and improve accuracy.

In [None]:
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, AdaBoostClassifier
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load dataset
wine = load_wine(as_frame=True)
X, y = wine.data, wine.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Random Forest
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
rf_pred = rf.predict(X_test)
print('Random Forest Accuracy:', accuracy_score(y_test, rf_pred))

# Gradient Boosting
gb = GradientBoostingClassifier(n_estimators=100, random_state=42)
gb.fit(X_train, y_train)
gb_pred = gb.predict(X_test)
print('Gradient Boosting Accuracy:', accuracy_score(y_test, gb_pred))

# AdaBoost
ada = AdaBoostClassifier(n_estimators=100, random_state=42)
ada.fit(X_train, y_train)
ada_pred = ada.predict(X_test)
print('AdaBoost Accuracy:', accuracy_score(y_test, ada_pred))
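
To make the bagging idea concrete, the following is a minimal hand-rolled sketch rather than how `RandomForestClassifier` is implemented internally. It assumes plain decision trees as base learners, an illustrative ensemble size of 25, and a simple majority vote; the variable names are hypothetical.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same dataset and split as the cell above, loaded as NumPy arrays
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

rng = np.random.default_rng(42)
n_trees = 25  # illustrative ensemble size (an assumption, not a recommendation)
all_preds = []

for _ in range(n_trees):
    # Bootstrap sample: draw as many training rows as we have, with replacement
    idx = rng.integers(0, len(X_train), size=len(X_train))
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(X_train[idx], y_train[idx])
    all_preds.append(tree.predict(X_test))

# Majority vote across the trees for each test sample
all_preds = np.array(all_preds)  # shape: (n_trees, n_test_samples)
bagged_pred = np.array([np.bincount(col).argmax() for col in all_preds.T])

# Baseline: one tree trained on the full training split
single_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
single_acc = accuracy_score(y_test, single_tree.predict(X_test))
bagged_acc = accuracy_score(y_test, bagged_pred)

print('Single tree accuracy:', single_acc)
print('Bagged trees accuracy:', bagged_acc)
```

Random Forest adds one more source of randomness on top of this: each split considers only a random subset of the features.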

## 2. Model Selection and Hyperparameter Tuning

Choosing the right model and tuning its hyperparameters both have a large effect on performance. Common techniques include cross-validation, grid search, and randomized search.

**Real-life use case:** Tuning a loan-default model to maximize recall, that is, to minimize false negatives (defaults the model fails to flag).

In [None]:
from sklearn.model_selection import GridSearchCV, cross_val_score

# Grid search for Random Forest hyperparameters
param_grid = {
    'n_estimators': [50, 100],
    'max_depth': [None, 5, 10]
}
grid = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=3)
grid.fit(X_train, y_train)
print('Best parameters:', grid.best_params_)
print('Best cross-validated score:', grid.best_score_)

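# Cross-validation example: score the tuned configuration on the training
# split only, so the held-out test set stays untouched by model selection
best_rf = RandomForestClassifier(**grid.best_params_, random_state=42)
cv_scores = cross_val_score(best_rf, X_train, y_train, cv=5)
print('Cross-validation scores:', cv_scores)
print('Mean CV score:', cv_scores.mean())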
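
The text above also mentions randomized search and recall as a tuning target. A minimal sketch of both follows, assuming macro-averaged recall as the metric (the wine data has three classes) and illustrative parameter ranges that are not taken from the original notebook.

```python
from scipy.stats import randint
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Distributions to sample from (ranges are assumptions for illustration)
param_distributions = {
    'n_estimators': randint(20, 101),    # integers in [20, 100]
    'max_depth': [None, 5, 10],
    'min_samples_leaf': randint(1, 6),   # integers in [1, 5]
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=5,                 # number of sampled configurations
    scoring='recall_macro',   # optimize recall instead of accuracy
    cv=3,
    random_state=42,
)
search.fit(X_train, y_train)

print('Best parameters:', search.best_params_)
print('Best cross-validated macro recall:', search.best_score_)
# score() uses the same scorer that was passed as `scoring`
print('Test macro recall:', search.score(X_test, y_test))
```

Unlike grid search, the cost here is fixed by `n_iter` rather than by the size of the grid, which helps when there are many hyperparameters.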

## 3. Introduction to Deep Learning (Neural Networks)

Deep learning uses neural networks with multiple layers to learn complex patterns. Keras (with TensorFlow backend) is a popular library for building neural networks in Python.

**Real-life use case:** Image recognition in self-driving cars.

In [None]:
import tensorflow as tf
from tensorflow import keras
from sklearn.preprocessing import LabelBinarizer, StandardScaler

# One-hot encode the class labels for categorical cross-entropy
lb = LabelBinarizer()
y_train_bin = lb.fit_transform(y_train)
y_test_bin = lb.transform(y_test)

# Standardize the features (fit on the training split only); the wine
# features span very different ranges, which hampers gradient-based training
scaler = StandardScaler()
X_train_nn = scaler.fit_transform(X_train)
X_test_nn = scaler.transform(X_test)

# Build a simple feed-forward neural network
model = keras.Sequential([
    keras.Input(shape=(X_train_nn.shape[1],)),
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dense(16, activation='relu'),
    keras.layers.Dense(len(lb.classes_), activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train_nn, y_train_bin, epochs=20, batch_size=8, verbose=0)
loss, acc = model.evaluate(X_test_nn, y_test_bin, verbose=0)
print('Neural Network Test Accuracy:', acc)
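
To show what the dense layers compute, here is a rough NumPy sketch of a single forward pass through a network of the same shape (two hidden layers with 32 and 16 units, ReLU, softmax output). The weights are random and untrained, the input batch is synthetic, and the helper names are hypothetical; this is not how Keras implements the layers internally.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # Element-wise max(z, 0)
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row-wise max for numerical stability before exponentiating
    z = z - z.max(axis=1, keepdims=True)
    exp_z = np.exp(z)
    return exp_z / exp_z.sum(axis=1, keepdims=True)

n_features, n_classes = 13, 3  # dimensions of the wine dataset
X_batch = rng.normal(size=(4, n_features))  # synthetic batch of 4 samples

# Randomly initialized (untrained) weights and zero biases
W1, b1 = rng.normal(scale=0.1, size=(n_features, 32)), np.zeros(32)
W2, b2 = rng.normal(scale=0.1, size=(32, 16)), np.zeros(16)
W3, b3 = rng.normal(scale=0.1, size=(16, n_classes)), np.zeros(n_classes)

# Each Dense layer is a matrix multiply, a bias add, and an activation
h1 = relu(X_batch @ W1 + b1)
h2 = relu(h1 @ W2 + b2)
probs = softmax(h2 @ W3 + b3)

print('Output shape:', probs.shape)
print('Row sums:', probs.sum(axis=1))
```

Training adjusts the weight matrices so that these output probabilities match the one-hot labels more closely.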

## 4. Natural Language Processing (NLP) Basics

NLP enables computers to understand and process human language. Common tasks include text classification, sentiment analysis, and topic modeling.

**Real-life use case:** Sentiment analysis of customer reviews for brand monitoring.

In [None]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Example text data
texts = ['I love this product', 'Worst experience ever', 'Amazing quality', 'Not good', 'Will buy again', 'Terrible support']
labels = [1, 0, 1, 0, 1, 0]  # 1=positive, 0=negative

vectorizer = CountVectorizer()
X_text = vectorizer.fit_transform(texts)

clf = MultinomialNB()
clf.fit(X_text, labels)
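# Note: this scores the classifier on the same texts it was trained on,
# so the result is optimistic and not an estimate of real-world accuracy
pred = clf.predict(X_text)
print('Text Classification Training Accuracy:', accuracy_score(labels, pred))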
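
The accuracy above is measured on the training texts themselves. As a small sketch of how the same model could be applied to unseen text, the vectorizer and classifier can be chained in a scikit-learn `Pipeline`; the two new review strings below are made up for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ['I love this product', 'Worst experience ever', 'Amazing quality',
         'Not good', 'Will buy again', 'Terrible support']
labels = [1, 0, 1, 0, 1, 0]  # 1=positive, 0=negative

# The pipeline applies the fitted vocabulary to new text automatically
text_clf = make_pipeline(CountVectorizer(), MultinomialNB())
text_clf.fit(texts, labels)

# Hypothetical unseen reviews; words outside the training vocabulary are ignored
new_reviews = ['I love the quality', 'Terrible experience']
new_pred = text_clf.predict(new_reviews)
print('Predictions for new reviews:', new_pred)
```

With only six training sentences this is a toy model; a realistic evaluation would hold out a test set from a much larger corpus.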

## 5. Model Deployment and Serving

Deploying a model lets you serve predictions in real time or in batches. Common tools include Flask, FastAPI, and managed cloud services.

**Real-life use case:** Deploying a fraud detection model as a REST API for a banking application.

In [None]:
# Example: Simple Flask API for model prediction (conceptual, not executable here)
import numpy as np
from flask import Flask, request, jsonify

app = Flask(__name__)

# Assume clf is a trained model that accepts a numeric feature vector
@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json(force=True)
    features = np.array(data['features']).reshape(1, -1)
    prediction = clf.predict(features)[0]
    return jsonify({'prediction': int(prediction)})

# To run locally: app.run(debug=True)  (disable debug mode in production)
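
An API process usually loads a model that was trained and saved elsewhere. The following is a minimal sketch of that save-and-load step using `joblib`; the file name `model.joblib` and the temporary directory are assumptions for illustration, and a real service would load from a fixed, versioned path.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

# Train a small model to stand in for the one being deployed
X, y = load_wine(return_X_y=True)
model = RandomForestClassifier(n_estimators=50, random_state=42).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, 'model.joblib')  # hypothetical file name
    joblib.dump(model, model_path)          # training side: persist the model
    loaded_model = joblib.load(model_path)  # serving side: load it at startup

# The reloaded model should reproduce the original predictions
sample = X[:5]
original_pred = model.predict(sample)
loaded_pred = loaded_model.predict(sample)
print('Predictions match:', bool((original_pred == loaded_pred).all()))
```

Only load model files from trusted sources, since unpickling can execute arbitrary code, and keep the scikit-learn version consistent between training and serving.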

## 6. Real-Life Use Cases and Best Practices

- **Healthcare:** Deep learning for medical image analysis
- **Finance:** NLP for automated document processing
- **Retail:** Recommendation systems using collaborative filtering and deep learning
- **Manufacturing:** Predictive maintenance with time series and anomaly detection
- **Best Practices:**
    - Always validate models with cross-validation
    - Monitor deployed models for data drift
    - Use explainable AI tools (e.g., SHAP, LIME) for model transparency
    - Document and version your models for reproducibility
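
As one possible way to monitor data drift, a single feature's live distribution can be compared against its training-time distribution with a two-sample Kolmogorov-Smirnov test. The sketch below uses synthetic data; the `drift_report` helper and the 0.01 significance threshold are assumptions for illustration, not a standard.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Synthetic stand-ins for one feature's values
reference = rng.normal(loc=0.0, scale=1.0, size=500)     # training-time data
live_same = rng.normal(loc=0.0, scale=1.0, size=500)     # same distribution
live_shifted = rng.normal(loc=1.0, scale=1.0, size=500)  # mean has shifted

def drift_report(reference, live, alpha=0.01):
    """Hypothetical helper: flag drift when the KS test rejects equality."""
    result = ks_2samp(reference, live)
    return {
        'statistic': float(result.statistic),
        'p_value': float(result.pvalue),
        'drift': bool(result.pvalue < alpha),
    }

same_report = drift_report(reference, live_same)
shifted_report = drift_report(reference, live_shifted)
print('Unshifted feature:', same_report)
print('Shifted feature:', shifted_report)
```

In practice such a check would run per feature on a schedule, and with many features the threshold should account for multiple comparisons.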

## Practice Exercises

1. Try ensemble methods on a different dataset and compare results.
2. Build and tune a neural network for a regression problem.
3. Perform sentiment analysis on a set of real product reviews.
4. Deploy a simple model using FastAPI or Flask.
5. Explore explainable AI tools to interpret your model's predictions.