# Chapter 1: Introduction to Machine Learning
## Interactive Notebook

Welcome to the hands-on companion notebook for Chapter 1! This notebook demonstrates key concepts through practical examples.

### What You'll Do in This Notebook:
1. **Compare Traditional vs ML Programming** - See the difference in action
2. **Explore Supervised Learning** - Classification and regression examples
3. **Discover Unsupervised Learning** - Clustering and dimensionality reduction
4. **Try Reinforcement Learning** - Simple agent-environment interaction
5. **Master Python ML Libraries** - NumPy, Pandas, Matplotlib, Scikit-learn

Let's begin our machine learning journey! 🚀

## Section 1: Import Required Libraries

Let's start by importing all the Python libraries we'll need for our machine learning examples.

In [None]:
# Essential Machine Learning Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_iris, make_blobs  # load_boston was removed in scikit-learn 1.2
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score, mean_squared_error, classification_report
import warnings
warnings.filterwarnings('ignore')

# Set up plotting style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

print("✅ All libraries imported successfully!")
print(f"NumPy version: {np.__version__}")
print(f"Pandas version: {pd.__version__}")
print(f"Matplotlib version: {plt.matplotlib.__version__}")
# Import sklearn explicitly so the name is defined for the version check
import sklearn
print(f"Scikit-learn version: {sklearn.__version__}")

## Section 2: Traditional Programming vs Machine Learning

Let's see the fundamental difference between traditional programming and machine learning approaches with a practical example: **Email Spam Detection**.

In [None]:
# Traditional Programming Approach: Rule-based Spam Detection
def is_spam_traditional(email_text):
    """Traditional approach using hand-crafted rules"""
    spam_score = 0
    email_lower = email_text.lower()
    
    # Rule 1: Check for suspicious words
    spam_words = ['free', 'money', 'winner', 'urgent', 'act now', 'limited time']
    for word in spam_words:
        if word in email_lower:
            spam_score += 1
    
    # Rule 2: Check for excessive punctuation
    if email_text.count('!') > 3:
        spam_score += 1
    
    # Rule 3: Check for all caps
    caps_ratio = sum(1 for c in email_text if c.isupper()) / max(len(email_text), 1)  # max() avoids dividing by zero on empty text
    if caps_ratio > 0.5:
        spam_score += 2
    
    # Rule 4: Check for suspicious patterns
    if '$$$' in email_text or '!!!' in email_text:
        spam_score += 2
    
    return spam_score > 2  # Threshold for spam classification

# Test emails
test_emails = [
    "Hi there! How are you doing today?",
    "URGENT!!! You've WON $1000!!! ACT NOW!!!",
    "Meeting scheduled for tomorrow at 3 PM",
    "FREE MONEY!!! Limited time offer - click now!!!"
]

print("🔧 Traditional Programming Results:")
print("=" * 50)
for i, email in enumerate(test_emails, 1):
    result = is_spam_traditional(email)
    print(f"Email {i}: {'SPAM' if result else 'NOT SPAM'}")
    print(f"Content: {email[:50]}...")
    print()

In [None]:
# Machine Learning Approach: Learning from Examples
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Training data: emails with labels
training_emails = [
    ("Hi John, how was your meeting today?", 0),  # 0 = not spam
    ("FREE MONEY NOW!!! CLICK HERE!!!", 1),      # 1 = spam
    ("Reminder: Your appointment is tomorrow", 0),
    ("You've won $1000000! Act fast!", 1),
    ("Can we reschedule our lunch?", 0),
    ("URGENT: Claim your prize now!", 1),
    ("Thanks for the presentation slides", 0),
    ("Limited time offer - buy now!", 1),
    ("Meeting notes attached", 0),
    ("GET RICH QUICK - GUARANTEED!", 1)
]

# Separate texts and labels
texts, labels = zip(*training_emails)

# Create the two model components: a bag-of-words vectorizer and a Naive Bayes classifier
vectorizer = CountVectorizer()
classifier = MultinomialNB()

# Train the model
X_train = vectorizer.fit_transform(texts)
classifier.fit(X_train, labels)

# Test on the same emails we used before
print("🤖 Machine Learning Results:")
print("=" * 50)

X_test = vectorizer.transform(test_emails)
predictions = classifier.predict(X_test)
probabilities = classifier.predict_proba(X_test)

for i, (email, pred, prob) in enumerate(zip(test_emails, predictions, probabilities), 1):
    spam_prob = prob[1]  # probability of being spam
    print(f"Email {i}: {'SPAM' if pred else 'NOT SPAM'} (spam probability: {spam_prob:.2f})")
    print(f"Content: {email[:50]}...")
    print()

print("💡 Key Differences:")
print("Traditional: Fixed rules, hard to maintain")
print("ML: Learns patterns, adapts to new data")

## Section 3: Supervised Learning in Action

Supervised learning uses labeled examples to learn patterns. Let's explore both **classification** (predicting categories) and **regression** (predicting numbers).

In [None]:
# Supervised Learning Example 1: Classification (Iris Dataset)
print("🌸 Classification Example: Iris Flower Species")
print("=" * 50)

# Load the famous Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

print(f"Dataset shape: {X.shape}")
print(f"Features: {iris.feature_names}")
print(f"Classes: {iris.target_names}")

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train a Decision Tree classifier
classifier = DecisionTreeClassifier(max_depth=3, random_state=42)
classifier.fit(X_train, y_train)

# Make predictions
y_pred = classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print(f"\n📊 Results:")
print(f"Training samples: {len(X_train)}")
print(f"Test samples: {len(X_test)}")
print(f"Accuracy: {accuracy:.3f} ({accuracy*100:.1f}%)")

# Visualize results
plt.figure(figsize=(12, 4))

# Plot 1: Feature scatter plot
plt.subplot(1, 3, 1)
colors = ['red', 'green', 'blue']
for i, color in enumerate(colors):
    mask = y == i
    plt.scatter(X[mask, 0], X[mask, 1], c=color, label=iris.target_names[i], alpha=0.7)
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Sepal Width (cm)')
plt.title('Iris Dataset - Sepal Dimensions')
plt.legend()

# Plot 2: Feature importance
plt.subplot(1, 3, 2)
importance = classifier.feature_importances_
plt.bar(range(len(importance)), importance)
plt.xticks(range(len(importance)), [name.split()[0] for name in iris.feature_names], rotation=45)
plt.title('Feature Importance')
plt.ylabel('Importance')

# Plot 3: Confusion matrix-like visualization
plt.subplot(1, 3, 3)
unique_labels = np.unique(y_test)
correct_pred = y_test == y_pred
colors_pred = ['green' if correct else 'red' for correct in correct_pred]
plt.scatter(range(len(y_test)), y_test, c=colors_pred, alpha=0.7)
plt.xlabel('Test Sample')
plt.ylabel('True Class')
plt.title('Predictions (Green=Correct, Red=Wrong)')
plt.yticks(unique_labels, iris.target_names)

plt.tight_layout()
plt.show()

print(f"\n🎯 Sample Predictions:")
for i in range(min(5, len(X_test))):
    print(f"Sample {i+1}: {iris.target_names[y_test[i]]} → Predicted: {iris.target_names[y_pred[i]]}")
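
The third plot above only marks each test sample as right or wrong. A true confusion matrix also shows which classes get mixed up. The sketch below rebuilds the same split and tree so it can run on its own, then uses `sklearn.metrics.confusion_matrix`; the variable names are chosen for this example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42, stratify=iris.target
)

clf = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(iris.target_names)
print(cm)

# Correct predictions sit on the diagonal, so accuracy = trace / total
acc_from_cm = cm.trace() / cm.sum()
print(f"Accuracy from matrix: {acc_from_cm:.3f}")
```

Off-diagonal entries show the specific class pairs the tree confuses, which a single accuracy number hides.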

In [None]:
# Supervised Learning Example 2: Regression (Synthetic Housing Data)
print("🏠 Regression Example: Synthetic Housing Prices")
print("=" * 50)

# Create synthetic housing data (load_boston was removed in scikit-learn 1.2)
np.random.seed(42)
n_samples = 500

# Generate features
crime_rate = np.random.exponential(2, n_samples)
rooms = np.random.normal(6.5, 1, n_samples)
age = np.random.uniform(0, 100, n_samples)
distance = np.random.exponential(3, n_samples)

# Create target variable with realistic relationships.
# A zero baseline puts the mean price near 35, so the clip below only trims the tails.
price = (-3 * crime_rate + 8 * rooms - 0.1 * age - 2 * distance + 
         np.random.normal(0, 5, n_samples))
price = np.clip(price, 10, 50)  # Reasonable price range

# Combine features
X = np.column_stack([crime_rate, rooms, age, distance])
feature_names = ['Crime Rate', 'Avg Rooms', 'Building Age', 'Distance to Center']

print(f"Dataset shape: {X.shape}")
print(f"Features: {feature_names}")
print(f"Price range: ${price.min():.1f}k - ${price.max():.1f}k")

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=42
)

# Train Linear Regression model
regressor = LinearRegression()
regressor.fit(X_train, y_train)

# Make predictions
y_pred = regressor.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)

print(f"\n📊 Results:")
print(f"Mean Squared Error: {mse:.2f}")
print(f"Root Mean Squared Error: {rmse:.2f}k")
print(f"R² Score: {regressor.score(X_test, y_test):.3f}")

# Visualizations
plt.figure(figsize=(15, 4))

# Plot 1: Actual vs Predicted
plt.subplot(1, 3, 1)
plt.scatter(y_test, y_pred, alpha=0.6)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--', lw=2)
plt.xlabel('Actual Price ($k)')
plt.ylabel('Predicted Price ($k)')
plt.title('Actual vs Predicted Prices')
correlation = np.corrcoef(y_test, y_pred)[0, 1]
plt.text(0.05, 0.95, f'Correlation: {correlation:.3f}', transform=plt.gca().transAxes)

# Plot 2: Feature coefficients
plt.subplot(1, 3, 2)
coefficients = regressor.coef_
colors = ['red' if coef < 0 else 'green' for coef in coefficients]
bars = plt.bar(range(len(coefficients)), coefficients, color=colors, alpha=0.7)
plt.xticks(range(len(coefficients)), feature_names, rotation=45)
plt.title('Feature Coefficients')
plt.ylabel('Impact on Price')
plt.axhline(y=0, color='black', linestyle='-', alpha=0.3)

# Plot 3: Residuals
plt.subplot(1, 3, 3)
residuals = y_test - y_pred
plt.scatter(y_pred, residuals, alpha=0.6)
plt.axhline(y=0, color='red', linestyle='--')
plt.xlabel('Predicted Price ($k)')
plt.ylabel('Residuals')
plt.title('Residual Plot')

plt.tight_layout()
plt.show()

print(f"\n🏘️ Sample Predictions:")
for i in range(min(5, len(X_test))):
    print(f"House {i+1}: Actual ${y_test[i]:.1f}k → Predicted ${y_pred[i]:.1f}k")
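
The regression cell reports MSE, RMSE, and R² without showing how they are computed. As a small worked sketch, the block below calculates them by hand on four made-up values (not the housing data) and checks the results against scikit-learn.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical actual and predicted values, chosen so the arithmetic is easy to follow
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.5, 7.0, 8.0])

# MSE: average of the squared errors
mse = np.mean((y_true - y_hat) ** 2)

# RMSE: square root of MSE, in the same units as the target
rmse = np.sqrt(mse)

# R^2: 1 - (residual sum of squares / total sum of squares around the mean)
ss_res = np.sum((y_true - y_hat) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(f"MSE={mse:.3f}  RMSE={rmse:.3f}  R^2={r2:.3f}")
print("Matches sklearn:",
      np.isclose(mse, mean_squared_error(y_true, y_hat)),
      np.isclose(r2, r2_score(y_true, y_hat)))
```

An R² of 1 means perfect predictions, while 0 means the model does no better than always predicting the mean of the target.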

## Section 4: Unsupervised Learning Adventures

Unsupervised learning finds hidden patterns in data without labels. Let's explore **clustering** (finding groups) and **dimensionality reduction** (simplifying data).

In [None]:
# Unsupervised Learning Example 1: Clustering (Customer Segmentation)
print("👥 Clustering Example: Customer Segmentation")
print("=" * 50)

# Generate synthetic customer data
np.random.seed(42)
n_customers = 300

# Create customer segments with different spending patterns
# Segment 1: Budget customers (low income, low spending)
budget_customers = np.random.multivariate_normal([25, 20], [[50, 10], [10, 30]], 100)

# Segment 2: Premium customers (high income, high spending) 
premium_customers = np.random.multivariate_normal([70, 80], [[40, 20], [20, 50]], 100)

# Segment 3: Middle-tier customers
middle_customers = np.random.multivariate_normal([45, 50], [[30, 5], [5, 40]], 100)

# Combine all customers
X_customers = np.vstack([budget_customers, premium_customers, middle_customers])

# Apply K-means clustering (without knowing the true segments)
kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
cluster_labels = kmeans.fit_predict(X_customers)
centroids = kmeans.cluster_centers_

print(f"Dataset shape: {X_customers.shape}")
print(f"Number of clusters: {kmeans.n_clusters}")
print(f"Inertia (within-cluster sum of squares): {kmeans.inertia_:.2f}")

# Visualizations
plt.figure(figsize=(15, 5))

# Plot 1: Original data (without clustering)
plt.subplot(1, 3, 1)
plt.scatter(X_customers[:, 0], X_customers[:, 1], alpha=0.6, c='gray')
plt.xlabel('Annual Income ($k)')
plt.ylabel('Spending Score (1-100)')
plt.title('Customer Data (Before Clustering)')
plt.grid(True, alpha=0.3)

# Plot 2: Clustered data
plt.subplot(1, 3, 2)
colors = ['red', 'green', 'blue']
for i in range(3):
    mask = cluster_labels == i
    plt.scatter(X_customers[mask, 0], X_customers[mask, 1], 
               c=colors[i], alpha=0.6, label=f'Cluster {i+1}')

# Plot centroids
plt.scatter(centroids[:, 0], centroids[:, 1], 
           c='black', marker='x', s=200, linewidths=3, label='Centroids')
plt.xlabel('Annual Income ($k)')
plt.ylabel('Spending Score (1-100)')
plt.title('Customer Segmentation (K-Means)')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot 3: Cluster characteristics
plt.subplot(1, 3, 3)
cluster_stats = []
for i in range(3):
    mask = cluster_labels == i
    cluster_data = X_customers[mask]
    avg_income = cluster_data[:, 0].mean()
    avg_spending = cluster_data[:, 1].mean()
    cluster_stats.append([avg_income, avg_spending])

cluster_stats = np.array(cluster_stats)
x_pos = np.arange(3)
width = 0.35

plt.bar(x_pos - width/2, cluster_stats[:, 0], width, label='Avg Income', alpha=0.7)
plt.bar(x_pos + width/2, cluster_stats[:, 1], width, label='Avg Spending', alpha=0.7)
plt.xlabel('Cluster')
plt.ylabel('Value')
plt.title('Cluster Characteristics')
plt.xticks(x_pos, [f'Cluster {i+1}' for i in range(3)])
plt.legend()

plt.tight_layout()
plt.show()

# Analyze clusters
print(f"\n📊 Cluster Analysis:")
# K-means label numbers are arbitrary, so name each cluster by its rank in centroid income
segment_names = ['Budget Shoppers', 'Value Seekers', 'Premium Customers']
income_rank = np.argsort(np.argsort(centroids[:, 0]))  # 0 = lowest income, 2 = highest
for i in range(3):
    mask = cluster_labels == i
    cluster_data = X_customers[mask]
    print(f"\nCluster {i+1} ({segment_names[income_rank[i]]}):")
    print(f"  Size: {np.sum(mask)} customers")
    print(f"  Avg Income: ${cluster_data[:, 0].mean():.1f}k")
    print(f"  Avg Spending: {cluster_data[:, 1].mean():.1f}")
    print(f"  Income Range: ${cluster_data[:, 0].min():.1f}k - ${cluster_data[:, 0].max():.1f}k")
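
The clustering cell fixes `n_clusters=3` because the data was generated with three segments. With real data the number of groups is unknown. One common heuristic, sketched below on hypothetical toy blobs rather than the customer data, is to compare inertia and silhouette score across several values of k.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Hypothetical toy data: three well-separated blobs
X_demo, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=42,
)

inertias = {}
silhouettes = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, random_state=42, n_init=10).fit(X_demo)
    inertias[k] = model.inertia_
    if k >= 2:  # the silhouette score needs at least two clusters
        silhouettes[k] = silhouette_score(X_demo, model.labels_)

for k in inertias:
    sil = f"{silhouettes[k]:.3f}" if k in silhouettes else "n/a"
    print(f"k={k}: inertia={inertias[k]:10.1f}  silhouette={sil}")

best_k = max(silhouettes, key=silhouettes.get)
print(f"Best k by silhouette: {best_k}")
```

Inertia keeps shrinking as k grows, so look for the "elbow" where the drop levels off. The silhouette score instead peaks at the k that best separates the groups.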

In [None]:
# Unsupervised Learning Example 2: Dimensionality Reduction (PCA)
print("📉 Dimensionality Reduction Example: PCA on Iris Dataset")
print("=" * 50)

# Use the Iris dataset (4 dimensions)
iris = load_iris()
X_iris = iris.data
y_iris = iris.target

print(f"Original dimensions: {X_iris.shape}")
print(f"Features: {iris.feature_names}")

# Apply PCA to reduce to 2 dimensions
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_iris)

print(f"Reduced dimensions: {X_pca.shape}")
print(f"Explained variance ratio: {pca.explained_variance_ratio_}")
print(f"Total variance retained: {pca.explained_variance_ratio_.sum():.3f} ({pca.explained_variance_ratio_.sum()*100:.1f}%)")

# Visualizations
plt.figure(figsize=(16, 4))

# Plot 1: Original data (first 2 features)
plt.subplot(1, 4, 1)
colors = ['red', 'green', 'blue']
for i, color in enumerate(colors):
    mask = y_iris == i
    plt.scatter(X_iris[mask, 0], X_iris[mask, 1], c=color, label=iris.target_names[i], alpha=0.7)
plt.xlabel(iris.feature_names[0])
plt.ylabel(iris.feature_names[1])
plt.title('Original Data\n(First 2 Features)')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot 2: Original data (last 2 features)
plt.subplot(1, 4, 2)
for i, color in enumerate(colors):
    mask = y_iris == i
    plt.scatter(X_iris[mask, 2], X_iris[mask, 3], c=color, label=iris.target_names[i], alpha=0.7)
plt.xlabel(iris.feature_names[2])
plt.ylabel(iris.feature_names[3])
plt.title('Original Data\n(Last 2 Features)')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot 3: PCA transformed data
plt.subplot(1, 4, 3)
for i, color in enumerate(colors):
    mask = y_iris == i
    plt.scatter(X_pca[mask, 0], X_pca[mask, 1], c=color, label=iris.target_names[i], alpha=0.7)
plt.xlabel(f'PC1 ({pca.explained_variance_ratio_[0]:.1%} variance)')
plt.ylabel(f'PC2 ({pca.explained_variance_ratio_[1]:.1%} variance)')
plt.title('PCA Transformed Data\n(2 Principal Components)')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot 4: Explained variance
plt.subplot(1, 4, 4)
# Show explained variance for all possible components
pca_full = PCA()
pca_full.fit(X_iris)
cumsum_var = np.cumsum(pca_full.explained_variance_ratio_)

plt.bar(range(1, 5), pca_full.explained_variance_ratio_, alpha=0.7, label='Individual')
plt.plot(range(1, 5), cumsum_var, 'ro-', label='Cumulative')
plt.xlabel('Principal Component')
plt.ylabel('Explained Variance Ratio')
plt.title('Explained Variance by Component')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Show PCA components (loadings)
print(f"\n🧮 Principal Components (Feature Loadings):")
components_df = pd.DataFrame(
    pca.components_.T,
    columns=['PC1', 'PC2'],
    index=iris.feature_names
)
print(components_df.round(3))

print(f"\n💡 Interpretation:")
print(f"• PC1 explains {pca.explained_variance_ratio_[0]:.1%} of variance")
print(f"• PC2 explains {pca.explained_variance_ratio_[1]:.1%} of variance")
print(f"• Together they capture {pca.explained_variance_ratio_.sum():.1%} of information")
print(f"• We reduced 4D data to 2D while keeping most information!")
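
To see what `PCA` is doing under the hood, the sketch below reproduces the projection with plain NumPy: center the data, take its singular value decomposition, and project onto the first two right singular vectors. This is one standard way to compute PCA and is shown here for intuition, not as a description of scikit-learn's exact internals.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data

# Step 1: center each feature at zero
X_centered = X - X.mean(axis=0)

# Step 2: SVD; the rows of Vt are the principal directions
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Step 3: the variance along each direction is proportional to the squared singular value
explained_ratio = S**2 / np.sum(S**2)

# Step 4: project onto the first two directions
X_manual = X_centered @ Vt[:2].T

# Compare with scikit-learn (each component's sign is arbitrary, so compare magnitudes)
pca = PCA(n_components=2).fit(X)
X_sklearn = pca.transform(X)
print("Explained variance ratio (manual): ", explained_ratio[:2].round(4))
print("Explained variance ratio (sklearn):", pca.explained_variance_ratio_.round(4))
print("Projections match up to sign:", np.allclose(np.abs(X_manual), np.abs(X_sklearn), atol=1e-6))
```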

## Section 5: Reinforcement Learning Playground

Reinforcement learning is like teaching a computer to play a game by trial and error. Let's create a simple example where an agent learns to make optimal choices!

In [None]:
# Reinforcement Learning Example: Multi-Armed Bandit Problem
print("🎰 Reinforcement Learning Example: Multi-Armed Bandit")
print("=" * 50)

class MultiArmedBandit:
    """Simulation of a multi-armed bandit (like slot machines)"""
    def __init__(self, n_arms=4, seed=42):
        np.random.seed(seed)
        self.n_arms = n_arms
        # Each arm has a different reward probability (unknown to agent)
        self.true_rewards = np.random.uniform(0.1, 0.9, n_arms)
        print(f"🎯 True reward probabilities (hidden from agent): {self.true_rewards.round(3)}")
        
    def pull_arm(self, arm):
        """Pull an arm and get reward (1) or no reward (0)"""
        return 1 if np.random.random() < self.true_rewards[arm] else 0

class EpsilonGreedyAgent:
    """Agent that learns using epsilon-greedy strategy"""
    def __init__(self, n_arms, epsilon=0.1):
        self.n_arms = n_arms
        self.epsilon = epsilon  # Exploration rate
        self.counts = np.zeros(n_arms)  # How many times each arm was pulled
        self.values = np.zeros(n_arms)  # Estimated value of each arm
        self.total_reward = 0
        self.history = []
        
    def select_arm(self):
        """Choose which arm to pull (explore vs exploit)"""
        if np.random.random() < self.epsilon:
            # Explore: choose random arm
            return np.random.randint(self.n_arms)
        else:
            # Exploit: choose best arm so far
            return np.argmax(self.values)
    
    def update(self, arm, reward):
        """Update our knowledge after pulling an arm"""
        self.counts[arm] += 1
        # Running average: new_avg = old_avg + (new_value - old_avg) / count
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
        self.total_reward += reward
        self.history.append((arm, reward, self.values.copy()))

# Set up the experiment
bandit = MultiArmedBandit(n_arms=4, seed=42)
agent = EpsilonGreedyAgent(n_arms=4, epsilon=0.1)

n_rounds = 1000
print(f"\n🤖 Agent will play {n_rounds} rounds with epsilon={agent.epsilon}")
print(f"Epsilon-greedy strategy: {(1-agent.epsilon)*100:.0f}% exploitation, {agent.epsilon*100:.0f}% exploration")

# Run the simulation
rewards = []
arm_choices = []
cumulative_rewards = []

for round_num in range(n_rounds):
    # Agent chooses an arm
    chosen_arm = agent.select_arm()
    arm_choices.append(chosen_arm)
    
    # Pull the arm and get reward
    reward = bandit.pull_arm(chosen_arm)
    rewards.append(reward)
    
    # Agent updates its knowledge
    agent.update(chosen_arm, reward)
    cumulative_rewards.append(agent.total_reward)

print(f"\n📊 Results after {n_rounds} rounds:")
print(f"Total reward: {agent.total_reward}")
print(f"Average reward per round: {agent.total_reward/n_rounds:.3f}")
print(f"Agent's learned values: {agent.values.round(3)}")
print(f"Times each arm was pulled: {agent.counts.astype(int)}")

# Find the optimal strategy (for comparison)
optimal_arm = np.argmax(bandit.true_rewards)
optimal_reward_rate = bandit.true_rewards[optimal_arm]
print(f"\n🏆 Optimal strategy:")
print(f"Best arm: {optimal_arm} (reward rate: {optimal_reward_rate:.3f})")
print(f"Agent's performance: {(agent.total_reward/n_rounds)/optimal_reward_rate:.1%} of optimal")

# Visualizations
plt.figure(figsize=(16, 4))

# Plot 1: Learning curve
plt.subplot(1, 4, 1)
window = 50
moving_avg = np.convolve(rewards, np.ones(window)/window, mode='valid')
plt.plot(range(window-1, n_rounds), moving_avg, label='Agent Performance', linewidth=2)
plt.axhline(y=optimal_reward_rate, color='red', linestyle='--', 
            label=f'Optimal Rate ({optimal_reward_rate:.3f})', alpha=0.7)
plt.xlabel('Round')
plt.ylabel('Average Reward')
plt.title(f'Learning Curve\n(Moving Average, Window={window})')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot 2: Cumulative rewards
plt.subplot(1, 4, 2)
optimal_cumulative = np.arange(1, n_rounds+1) * optimal_reward_rate
plt.plot(cumulative_rewards, label='Agent', linewidth=2)
plt.plot(optimal_cumulative, '--', label='Optimal Strategy', alpha=0.7)
plt.xlabel('Round')
plt.ylabel('Cumulative Reward')
plt.title('Cumulative Rewards')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot 3: Arm selection over time
plt.subplot(1, 4, 3)
colors = ['red', 'green', 'blue', 'orange']
for arm in range(bandit.n_arms):
    arm_selections = [i for i, choice in enumerate(arm_choices) if choice == arm]
    if arm_selections:
        plt.scatter(arm_selections, [arm]*len(arm_selections), 
                   c=colors[arm], alpha=0.6, s=1, label=f'Arm {arm}')
plt.xlabel('Round')
plt.ylabel('Arm Chosen')
plt.title('Arm Selection Over Time')
plt.legend()
plt.yticks(range(bandit.n_arms))

# Plot 4: Final comparison
plt.subplot(1, 4, 4)
x_pos = np.arange(bandit.n_arms)
width = 0.35

bars1 = plt.bar(x_pos - width/2, bandit.true_rewards, width, 
               label='True Rewards', alpha=0.7, color='blue')
bars2 = plt.bar(x_pos + width/2, agent.values, width, 
               label='Learned Values', alpha=0.7, color='orange')

# Add pull counts as text
for i, count in enumerate(agent.counts):
    plt.text(i, max(bandit.true_rewards[i], agent.values[i]) + 0.05, 
             f'{int(count)}', ha='center', va='bottom', fontweight='bold')

plt.xlabel('Arm')
plt.ylabel('Reward Probability')
plt.title('True vs Learned Values\n(Numbers show pull counts)')
plt.legend()
plt.ylim(0, 1)

plt.tight_layout()
plt.show()

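# Show exploration vs exploitation decisions.
# A choice counts as exploration when it differs from the greedy arm under the
# estimates the agent held before that round (history[i-1]); random picks that
# happen to land on the greedy arm are counted as exploitation.
exploration_count = sum(1 for i, choice in enumerate(arm_choices) 
                       if i > 0 and choice != np.argmax(agent.history[i-1][2]))
print(f"\n🔍 Decision Analysis:")
print(f"Exploration decisions: {exploration_count} ({exploration_count/n_rounds:.1%})")
print(f"Exploitation decisions: {n_rounds - exploration_count} ({(n_rounds-exploration_count)/n_rounds:.1%})")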
print(f"Most pulled arm: {np.argmax(agent.counts)} (optimal: {optimal_arm})")

if np.argmax(agent.counts) == optimal_arm:
    print("🎉 Success! Agent learned to prefer the optimal arm!")
else:
    print("🤔 Agent didn't fully converge to optimal arm. Try more rounds or different epsilon!")
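
The agent's `update` method relies on the running-average rule `new_avg = old_avg + (new_value - old_avg) / count`. The short sketch below checks that this incremental rule gives the same answer as storing every reward and averaging at the end. The reward sequence is randomly generated for illustration and is unrelated to the bandit run above.

```python
import numpy as np

# Hypothetical 0/1 rewards from repeatedly pulling a single arm
rng = np.random.default_rng(0)
rewards_demo = rng.integers(0, 2, size=200)

# Incremental update: only the current estimate and the pull count are kept
estimate = 0.0
for count, reward in enumerate(rewards_demo, start=1):
    estimate += (reward - estimate) / count

# Batch version: needs the full reward history
batch_mean = rewards_demo.mean()

print(f"Incremental estimate: {estimate:.6f}")
print(f"Batch mean:           {batch_mean:.6f}")
```

The incremental form matters because the agent never stores past rewards for each arm. It keeps one number and one counter per arm.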

## Section 6: Python Libraries Mastery

Let's dive deeper into the essential Python libraries for machine learning. Understanding these tools is crucial for your ML journey!

In [None]:
# Python Libraries Deep Dive
print("🐍 Essential Python Libraries for Machine Learning")
print("=" * 60)

# 1. NumPy: Numerical Computing Foundation
print("\n📊 1. NumPy - Numerical Computing")
print("-" * 40)

# Create arrays and perform operations
arr_1d = np.array([1, 2, 3, 4, 5])
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
arr_random = np.random.randn(1000)  # 1000 random numbers

print(f"1D array: {arr_1d}")
print(f"2D array shape: {arr_2d.shape}")
print(f"Array operations: mean={np.mean(arr_random):.3f}, std={np.std(arr_random):.3f}")

# Mathematical operations
print(f"Broadcasting: [1,2,3] * 2 = {arr_1d[:3] * 2}")
print(f"Matrix multiplication: \n{arr_2d @ arr_2d.T}")

# 2. Pandas: Data Manipulation Powerhouse
print("\n🐼 2. Pandas - Data Manipulation")
print("-" * 40)

# Create sample dataset
np.random.seed(42)
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'Age': np.random.randint(22, 65, 5),
    'Salary': np.random.randint(40000, 120000, 5),
    'Department': np.random.choice(['Engineering', 'Sales', 'Marketing'], 5),
    'Performance': np.random.uniform(3.0, 5.0, 5).round(1)
}

df = pd.DataFrame(data)
print("Sample Dataset:")
print(df)

print(f"\nDataset Info:")
print(f"Shape: {df.shape}")
print(f"Data types:\n{df.dtypes}")
print(f"\nSummary Statistics:\n{df.describe()}")

# Data operations
high_performers = df[df['Performance'] > 4.0]
dept_salary = df.groupby('Department')['Salary'].mean()
print(f"\nHigh performers (>4.0): {len(high_performers)}")
print(f"Average salary by department:\n{dept_salary}")

# 3. Matplotlib: Visualization Magic
print("\n📈 3. Matplotlib - Data Visualization")
print("-" * 40)

fig, axes = plt.subplots(2, 3, figsize=(15, 8))

# Plot 1: Line plot
x = np.linspace(0, 10, 100)
y = np.sin(x) + np.random.normal(0, 0.1, 100)
axes[0, 0].plot(x, y, alpha=0.7)
axes[0, 0].set_title('Noisy Sine Wave')
axes[0, 0].grid(True, alpha=0.3)

# Plot 2: Histogram
axes[0, 1].hist(arr_random, bins=30, alpha=0.7, color='green')
axes[0, 1].set_title('Random Distribution')
axes[0, 1].axvline(np.mean(arr_random), color='red', linestyle='--', label='Mean')
axes[0, 1].legend()

# Plot 3: Scatter plot
axes[0, 2].scatter(df['Age'], df['Salary'], c=df['Performance'], 
                   cmap='viridis', s=100, alpha=0.7)
axes[0, 2].set_xlabel('Age')
axes[0, 2].set_ylabel('Salary')
axes[0, 2].set_title('Age vs Salary (colored by Performance)')

# Plot 4: Bar chart
dept_counts = df['Department'].value_counts()
axes[1, 0].bar(dept_counts.index, dept_counts.values, alpha=0.7)
axes[1, 0].set_title('Department Distribution')
axes[1, 0].tick_params(axis='x', rotation=45)

# Plot 5: Box plot
departments = df['Department'].unique()
salary_by_dept = [df[df['Department'] == dept]['Salary'].values for dept in departments]
axes[1, 1].boxplot(salary_by_dept)
axes[1, 1].set_xticklabels(departments)  # boxplot's `labels` keyword is deprecated in Matplotlib 3.9+
axes[1, 1].set_title('Salary Distribution by Department')
axes[1, 1].tick_params(axis='x', rotation=45)

# Plot 6: Heatmap-style correlation
numeric_df = df.select_dtypes(include=[np.number])
correlation = numeric_df.corr()
im = axes[1, 2].imshow(correlation, cmap='coolwarm', aspect='auto')
axes[1, 2].set_xticks(range(len(correlation.columns)))
axes[1, 2].set_yticks(range(len(correlation.columns)))
axes[1, 2].set_xticklabels(correlation.columns, rotation=45)
axes[1, 2].set_yticklabels(correlation.columns)
axes[1, 2].set_title('Correlation Matrix')

# Add correlation values
for i in range(len(correlation.columns)):
    for j in range(len(correlation.columns)):
        axes[1, 2].text(j, i, f'{correlation.iloc[i, j]:.2f}', 
                        ha='center', va='center')

plt.tight_layout()
plt.show()

# 4. Scikit-learn: Machine Learning Made Easy
print("\n🎯 4. Scikit-learn - Machine Learning")
print("-" * 40)

# Create a complete ML pipeline
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

# Load iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Create a pipeline (preprocessing + model)
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier(n_estimators=100, random_state=42))
])

# Cross-validation
cv_scores = cross_val_score(pipeline, X, y, cv=5)
print(f"Cross-validation scores: {cv_scores.round(3)}")
print(f"Average CV score: {cv_scores.mean():.3f} ± {cv_scores.std():.3f}")

# Train final model
pipeline.fit(X, y)

# Feature importance
feature_importance = pipeline.named_steps['classifier'].feature_importances_
print(f"\nFeature importance:")
for name, importance in zip(iris.feature_names, feature_importance):
    print(f"  {name}: {importance:.3f}")

# Make predictions on new samples
new_samples = np.array([[5.0, 3.0, 1.5, 0.2], [6.5, 3.0, 5.5, 2.0]])
predictions = pipeline.predict(new_samples)
probabilities = pipeline.predict_proba(new_samples)

print(f"\nPredictions on new samples:")
for i, (sample, pred, probs) in enumerate(zip(new_samples, predictions, probabilities)):
    print(f"Sample {i+1}: {sample}")
    print(f"  Predicted: {iris.target_names[pred]}")
    print(f"  Probabilities: {dict(zip(iris.target_names, probs.round(3)))}")

print(f"\n🎓 Library Summary:")
print("• NumPy: Fast numerical operations, array handling")
print("• Pandas: Data manipulation, analysis, file I/O")
print("• Matplotlib: Plotting, visualization, charts")
print("• Scikit-learn: Machine learning algorithms, evaluation tools")
print("\n🚀 These four libraries form the foundation of most ML projects!")
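
The NumPy part of the cell above shows broadcasting with a scalar, and the pipeline uses `StandardScaler`. The two ideas meet in the sketch below, which standardizes the columns of a small made-up matrix by broadcasting per-column statistics and checks the result against `StandardScaler`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix: 3 samples, 2 features on very different scales
X_small = np.array([[1.0, 200.0],
                    [2.0, 400.0],
                    [3.0, 600.0]])

col_mean = X_small.mean(axis=0)  # shape (2,): one mean per column
col_std = X_small.std(axis=0)    # population std (ddof=0), which StandardScaler also uses

# Broadcasting stretches the (2,) vectors across all 3 rows automatically
X_manual = (X_small - col_mean) / col_std

X_scaler = StandardScaler().fit_transform(X_small)

print(X_manual.round(3))
print("Matches StandardScaler:", np.allclose(X_manual, X_scaler))
```

After scaling, both columns have mean 0 and unit variance, so neither feature dominates just because of its units.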

## Conclusion and Next Steps

🎉 **Congratulations!** You've completed the Introduction to Machine Learning chapter.

### What You've Learned:
- ✅ **Traditional vs ML Programming**: Seen the fundamental difference in approaches
- ✅ **Supervised Learning**: Explored classification and regression with real examples
- ✅ **Unsupervised Learning**: Discovered clustering and dimensionality reduction
- ✅ **Reinforcement Learning**: Watched an agent learn through trial and error
- ✅ **Python ML Stack**: Mastered NumPy, Pandas, Matplotlib, and Scikit-learn

### Key Takeaways:
1. **ML learns patterns** from data rather than following explicit rules
2. **Different types of ML** solve different types of problems
3. **Python provides powerful tools** for every aspect of ML
4. **Visualization is crucial** for understanding data and results
5. **Practice makes perfect** - the more you experiment, the better you become

### What's Next?
In **Chapter 2: Data Preprocessing**, you'll learn:
- How to clean messy, real-world data
- Techniques for handling missing values
- Methods for splitting datasets effectively
- Best practices for data preparation

### Practice Exercises:
Try these to reinforce your learning:
1. **Modify the spam detector** to include new rules or features
2. **Experiment with different datasets** using the same ML techniques
3. **Change parameters** in the examples and observe the effects
4. **Create your own visualizations** using different plot types

### Resources for Continued Learning:
- 📚 Scikit-learn documentation: https://scikit-learn.org/
- 🎥 Machine Learning courses on Coursera and edX
- 📊 Practice datasets: Kaggle.com
- 💬 Community: Stack Overflow, Reddit r/MachineLearning

Happy learning! 🚀