# Lab 4 - Module 0: Lifting Data into Higher Dimensions

**Learning Objectives:**
- Understand why adding dimensions can make unsolvable problems solvable
- See XOR become separable when lifted to 3D
- Connect dimension-lifting to what activation functions and hidden layers do
- Build intuition for why neural networks need multiple layers

**Time:** ~12-15 minutes

---

**Remember from Lab 3 - Module 0:** You discovered that XOR cannot be separated by ANY straight line in 2D.

**Remember from Lab 3 - Module 4:** A single perceptron (which draws one line) cannot solve XOR, no matter what parameters you use.

**Today's Big Idea:** What if we could ADD A NEW DIMENSION that makes the problem solvable?

## 1. Setup: Recreate the XOR Problem

Let's bring back the XOR pattern from Lab 3 that gave us so much trouble!

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import plotly.graph_objects as go
import ipywidgets as widgets
from ipywidgets import interact, Dropdown

# Set random seed for reproducibility
np.random.seed(42)

# Create XOR dataset (4 clusters in corners)
n_per_corner = 25
corners_class0 = [[-1.5, -1.5], [1.5, 1.5]]  # Bottom-left and top-right
corners_class1 = [[-1.5, 1.5], [1.5, -1.5]]  # Top-left and bottom-right

X_class0 = []
X_class1 = []

for corner in corners_class0:
    points = np.random.randn(n_per_corner, 2) * 0.3 + corner
    X_class0.append(points)

for corner in corners_class1:
    points = np.random.randn(n_per_corner, 2) * 0.3 + corner
    X_class1.append(points)

X_class0 = np.vstack(X_class0)
X_class1 = np.vstack(X_class1)

# Combine into single dataset
X_2d = np.vstack([X_class0, X_class1])
y = np.hstack([np.zeros(len(X_class0)), np.ones(len(X_class1))])

print("✓ XOR dataset created!")
print(f"  Total points: {len(X_2d)}")
print(f"  Class 0 (blue): {np.sum(y==0)} points")
print(f"  Class 1 (red): {np.sum(y==1)} points")

## 2. Reminder: XOR is Not Separable in 2D

Let's visualize the XOR problem again to remind ourselves why it's impossible.
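
Before plotting, the claim can be checked numerically. The sketch below is not part of the lab's own code: it assumes the four ideal cluster centers stand in for the full dataset, and it tries many random lines `w1*x1 + w2*x2 + b = 0` (the names `corners`, `labels`, and `best_correct` are introduced here for illustration only).

```python
import numpy as np

# Four ideal XOR corners (the cluster centers from the setup cell) and their labels
corners = np.array([[-1.5, -1.5], [1.5, 1.5], [-1.5, 1.5], [1.5, -1.5]])
labels = np.array([0, 0, 1, 1])

# Hypothetical brute-force search: sample random lines and count correct corners
rng = np.random.default_rng(0)
best_correct = 0
for _ in range(20000):
    w1, w2, b = rng.uniform(-3, 3, size=3)
    pred = (w1 * corners[:, 0] + w2 * corners[:, 1] + b > 0).astype(int)
    best_correct = max(best_correct, int(np.sum(pred == labels)))

print(f"Best line found classifies {best_correct} of 4 corners correctly")
```

A random search is only evidence, not a proof, but it is consistent with the Lab 3 result: a single line can get at most three of the four corners right.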

In [None]:
# Plot XOR in 2D
fig, ax = plt.subplots(figsize=(8, 8), dpi=100)

ax.scatter(X_class0[:, 0], X_class0[:, 1], c='blue', s=100, alpha=0.7, 
          label='Class 0 (Blue)', edgecolors='k', linewidths=1.5)
ax.scatter(X_class1[:, 0], X_class1[:, 1], c='red', s=100, alpha=0.7, 
          label='Class 1 (Red)', edgecolors='k', linewidths=1.5)

ax.set_xlabel('x₁', fontsize=14, fontweight='bold')
ax.set_ylabel('x₂', fontsize=14, fontweight='bold')
ax.set_title('XOR Pattern in 2D\n(Impossible to Separate with One Line)', 
            fontsize=15, fontweight='bold')
ax.legend(fontsize=12, loc='upper right')
ax.grid(True, alpha=0.3)
ax.set_aspect('equal')
ax.axhline(y=0, color='gray', linewidth=0.5, linestyle='--', alpha=0.5)
ax.axvline(x=0, color='gray', linewidth=0.5, linestyle='--', alpha=0.5)

plt.tight_layout()
plt.show()

print("\n💡 The Problem:")
print("   Blue points are in opposite corners (bottom-left and top-right)")
print("   Red points are in opposite corners (top-left and bottom-right)")
print("   No single straight line can separate them!")

## 3. The Magic Trick: Add a Third Dimension

Here's the key insight: **What if we add a third dimension (x₃) that's computed from x₁ and x₂?**

We'll experiment with several different "features" (ways to compute x‚ÇÉ). Each represents a different way to combine the input coordinates:

### Feature Options:

1. **x₃ = x₁ × x₂** (Product) - Captures interaction between coordinates
2. **x₃ = x₁ + x₂** (Sum) - Simple addition
3. **x₃ = x₁² + x₂²** (Sum of squares) - Squared distance from origin
4. **x₃ = |x₁| + |x₂|** (Manhattan distance) - Sum of absolute values
5. **x₃ = max(x₁, x₂)** (Maximum) - Takes larger coordinate

Let's create all versions and see which ones help separate XOR!

In [None]:
# Create 3D versions of the data with different features

# Dictionary to store all feature transformations
features = {}

# Feature 1: Product (x₁ × x₂)
features['product'] = {
    'data': np.column_stack([X_2d, X_2d[:, 0] * X_2d[:, 1]]),
    'label': 'x₃ = x₁ × x₂',
    'description': 'Product captures interaction'
}

# Feature 2: Sum (x₁ + x₂)
features['sum'] = {
    'data': np.column_stack([X_2d, X_2d[:, 0] + X_2d[:, 1]]),
    'label': 'x₃ = x₁ + x₂',
    'description': 'Simple addition'
}

# Feature 3: Sum of squares (x₁² + x₂²)
features['sum_squares'] = {
    'data': np.column_stack([X_2d, X_2d[:, 0]**2 + X_2d[:, 1]**2]),
    'label': 'x₃ = x₁² + x₂²',
    'description': 'Squared distance from origin'
}

# Feature 4: Manhattan distance (|x₁| + |x₂|)
features['manhattan'] = {
    'data': np.column_stack([X_2d, np.abs(X_2d[:, 0]) + np.abs(X_2d[:, 1])]),
    'label': 'x₃ = |x₁| + |x₂|',
    'description': 'Manhattan distance'
}

# Feature 5: Maximum (max(x₁, x₂))
features['maximum'] = {
    'data': np.column_stack([X_2d, np.maximum(X_2d[:, 0], X_2d[:, 1])]),
    'label': 'x₃ = max(x₁, x₂)',
    'description': 'Maximum coordinate'
}

print("✓ Created 5 different 3D feature transformations!")
print("\nOriginal 2D shape:", X_2d.shape)
print("New 3D shape:     ", features['product']['data'].shape)
print("\nAvailable features:")
for i, (key, feat) in enumerate(features.items(), 1):
    print(f"  {i}. {feat['label']:25} - {feat['description']}")

## 4. Visualize: XOR in 3D Space

Now let's see what happens when we view the XOR data in 3D!

**Instructions:**
1. Choose a feature from the dropdown menu
2. **Rotate the plot** by clicking and dragging with your mouse
3. **Zoom** with scroll wheel or pinch gesture
4. Look for patterns - can you see if a flat plane could separate blue from red?

In [None]:
def plot_xor_3d_plotly(feature_type='product'):
    """
    Plot XOR in 3D using plotly for interactive rotation.
    
    Args:
        feature_type: which feature transformation to use
    """
    # Get the selected feature data
    X_3d = features[feature_type]['data']
    feature_label = features[feature_type]['label']
    
    # Create plotly figure
    fig = go.Figure()
    
    # Add Class 0 points (blue)
    fig.add_trace(go.Scatter3d(
        x=X_3d[y==0, 0],
        y=X_3d[y==0, 1],
        z=X_3d[y==0, 2],
        mode='markers',
        name='Class 0 (Blue)',
        marker=dict(
            size=8,
            color='blue',
            opacity=0.8,
            line=dict(color='black', width=1)
        )
    ))
    
    # Add Class 1 points (red)
    fig.add_trace(go.Scatter3d(
        x=X_3d[y==1, 0],
        y=X_3d[y==1, 1],
        z=X_3d[y==1, 2],
        mode='markers',
        name='Class 1 (Red)',
        marker=dict(
            size=8,
            color='red',
            opacity=0.8,
            line=dict(color='black', width=1)
        )
    ))
    
    # Add separating plane (different for each feature)
    # Create a mesh grid for the plane
    x_plane = np.linspace(-3, 3, 20)
    y_plane = np.linspace(-3, 3, 20)
    X_plane, Y_plane = np.meshgrid(x_plane, y_plane)
    
    # Determine the plane equation based on feature
    if feature_type == 'product':
        # For x₃ = x₁ × x₂, separating plane is x₃ = 0 (horizontal)
        Z_plane = np.zeros_like(X_plane)
        plane_color = 'green'
        plane_name = 'Separating Plane (x₃ = 0)'
    elif feature_type == 'sum':
        # For x₃ = x₁ + x₂, draw a reference plane at x₃ = 0 (it does NOT separate the classes)
        Z_plane = np.zeros_like(X_plane)
        plane_color = 'yellow'
        plane_name = 'Plane (x₃ = 0)'
    elif feature_type == 'sum_squares':
        # For x₃ = x₁² + x₂², try plane at mean
        threshold = np.mean(X_3d[:, 2])
        Z_plane = np.full_like(X_plane, threshold)
        plane_color = 'orange'
        plane_name = f'Plane (x₃ = {threshold:.2f})'
    elif feature_type == 'manhattan':
        # Similar to sum_squares
        threshold = np.mean(X_3d[:, 2])
        Z_plane = np.full_like(X_plane, threshold)
        plane_color = 'purple'
        plane_name = f'Plane (x₃ = {threshold:.2f})'
    else:  # maximum
        threshold = np.mean(X_3d[:, 2])
        Z_plane = np.full_like(X_plane, threshold)
        plane_color = 'cyan'
        plane_name = f'Plane (x₃ = {threshold:.2f})'
    
    # Add the plane
    fig.add_trace(go.Surface(
        x=X_plane,
        y=Y_plane,
        z=Z_plane,
        opacity=0.3,
        colorscale=[[0, plane_color], [1, plane_color]],
        showscale=False,
        name=plane_name,
        hoverinfo='skip'
    ))
    
    # Update layout
    fig.update_layout(
        title=dict(
            text=f'XOR in 3D Space<br><sub>{feature_label}</sub>',
            x=0.5,
            xanchor='center',
            font=dict(size=16)
        ),
        scene=dict(
            xaxis_title='x₁',
            yaxis_title='x₂',
            zaxis_title='x₃',
            camera=dict(
                eye=dict(x=1.5, y=1.5, z=1.3)
            ),
            aspectmode='cube'
        ),
        width=900,
        height=700,
        showlegend=True,
        legend=dict(x=0.7, y=0.9)
    )
    
    # Show the plot
    fig.show()
    
    # Provide feedback based on feature type
    print("\n" + "="*70)
    
    if feature_type == 'product':
        # Calculate accuracy: x₃ < 0 predicts Class 1 (red), x₃ > 0 predicts Class 0 (blue)
        pred = (X_3d[:, 2] < 0).astype(int)
        acc = np.mean(pred == y) * 100
        
        print("✅ EXCELLENT! With x₃ = x₁ × x₂:")
        print(f"   Accuracy: {acc:.1f}%")
        print("\n   🔵 Blue points (Class 0):")
        print("      • Bottom-left: (-)(-)  = + POSITIVE")
        print("      • Top-right:   (+)(+)  = + POSITIVE")
        print("      → Blue points have x₃ > 0 (above green plane)")
        print("\n   🔴 Red points (Class 1):")
        print("      • Top-left:     (-)(+)  = - NEGATIVE")
        print("      • Bottom-right: (+)(-)  = - NEGATIVE")
        print("      → Red points have x₃ < 0 (below green plane)")
        print("\n   💡 The green plane (x₃ = 0) perfectly separates them!")
        
    elif feature_type == 'sum':
        pred = (X_3d[:, 2] > 0).astype(int)
        acc = np.mean(pred == y) * 100
        print(f"⚠️ With x₃ = x₁ + x₂:")
        print(f"   Accuracy: {acc:.1f}%")
        print("\n   This feature doesn't separate XOR well.")
        print("   Blue clusters sit near x₃ = -3 and +3, with red in between near 0,")
        print("   so no flat plane can split them.")
        print("   Try switching to 'product' to see perfect separation!")
        
    elif feature_type == 'sum_squares':
        threshold = np.mean(X_3d[:, 2])
        pred = (X_3d[:, 2] > threshold).astype(int)
        acc = np.mean(pred == y) * 100
        print(f"⚠️ With x₃ = x₁² + x₂²:")
        print(f"   Accuracy: {acc:.1f}%")
        print("\n   This measures distance from origin.")
        print("   Both classes are at similar distances, so it doesn't help!")
        print("   All corner points are roughly equidistant from (0,0).")
        
    elif feature_type == 'manhattan':
        threshold = np.mean(X_3d[:, 2])
        pred = (X_3d[:, 2] > threshold).astype(int)
        acc = np.mean(pred == y) * 100
        print(f"⚠️ With x₃ = |x₁| + |x₂|:")
        print(f"   Accuracy: {acc:.1f}%")
        print("\n   Manhattan distance also doesn't help.")
        print("   All corners are at the same Manhattan distance!")
        
    else:  # maximum
        threshold = np.mean(X_3d[:, 2])
        pred = (X_3d[:, 2] > threshold).astype(int)
        acc = np.mean(pred == y) * 100
        print(f"⚠️ With x₃ = max(x₁, x₂):")
        print(f"   Accuracy: {acc:.1f}%")
        print("\n   A horizontal plane on the maximum coordinate doesn't separate XOR.")
        print("   Try 'product' to see the feature that actually works!")
    
    print("="*70)

# Create interactive widget with dropdown menu
print("Interactive 3D XOR Visualization")
print("="*70)
print("Choose different features to see which one makes XOR separable!")
print("Click and drag to rotate | Scroll to zoom | Hover for values\n")

interact(
    plot_xor_3d_plotly,
    feature_type=Dropdown(
        options=[
            ('Product: x₃ = x₁ × x₂', 'product'),
            ('Sum: x₃ = x₁ + x₂', 'sum'),
            ('Sum of Squares: x₃ = x₁² + x₂²', 'sum_squares'),
            ('Manhattan: x₃ = |x₁| + |x₂|', 'manhattan'),
            ('Maximum: x₃ = max(x₁, x₂)', 'maximum')
        ],
        value='product',
        description='Feature:'
    )
);

## 5. What Just Happened? The Math Behind the Magic

Let's understand WHY the product feature (x₃ = x₁ × x₂) works so well for XOR, while the others fail!

### XOR Pattern in 2D:
```
Point          x₁    x₂    Label
--------------------------------------
Bottom-left:  -1.5  -1.5   Class 0 (Blue)
Top-right:    +1.5  +1.5   Class 0 (Blue)
Top-left:     -1.5  +1.5   Class 1 (Red)
Bottom-right: +1.5  -1.5   Class 1 (Red)
```

### After Adding Different Features:

**1. Product Feature: x₃ = x₁ × x₂** ✅ *Perfect separation!*
```
Point          x₁    x₂    x₃ = x₁×x₂   Label       Insight
-------------------------------------------------------------------
Bottom-left:  -1.5  -1.5   +2.25        Class 0     Same sign → +
Top-right:    +1.5  +1.5   +2.25        Class 0     Same sign → +
Top-left:     -1.5  +1.5   -2.25        Class 1     Diff sign → -
Bottom-right: +1.5  -1.5   -2.25        Class 1     Diff sign → -
```
**Rule:** x₃ > 0 → Class 0 (Blue), x₃ < 0 → Class 1 (Red) ✓ Works perfectly!

**2. Sum Feature: x₃ = x₁ + x₂** ❌ *Doesn't help*
```
Point          x₁    x₂    x₃ = x₁+x₂   Label
---------------------------------------------------
Bottom-left:  -1.5  -1.5   -3.0         Class 0
Top-right:    +1.5  +1.5   +3.0         Class 0
Top-left:     -1.5  +1.5    0.0         Class 1
Bottom-right: +1.5  -1.5    0.0         Class 1
```
Class 1 sits at x₃ = 0, between the two Class 0 clusters at x₃ = -3 and x₃ = +3, so no single threshold on x₃ gives a clean separation!

**3. Sum of Squares: x₃ = x₁² + x₂²** ❌ *All similar*
```
Point          x₁    x₂    x₃ = x₁²+x₂²   Label
------------------------------------------------------
Bottom-left:  -1.5  -1.5   4.5            Class 0
Top-right:    +1.5  +1.5   4.5            Class 0
Top-left:     -1.5  +1.5   4.5            Class 1
Bottom-right: +1.5  -1.5   4.5            Class 1
```
All points equidistant from origin - useless for separation!
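
The same comparison can be run for all five features, including the two without a table. This is a minimal sketch, assuming the four ideal corner centers stand in for the clusters; `lifted`, `threshold_separable`, and `separable` are names made up for this example. It only tests horizontal planes x₃ = c, which is the kind of plane the 3D plot draws.

```python
import numpy as np

# Ideal corner centers and labels (assumed stand-ins for the noisy clusters)
corners = np.array([[-1.5, -1.5], [1.5, 1.5], [-1.5, 1.5], [1.5, -1.5]])
labels = np.array([0, 0, 1, 1])
x1, x2 = corners[:, 0], corners[:, 1]

# The five candidate third coordinates from Section 3
lifted = {
    "product": x1 * x2,
    "sum": x1 + x2,
    "sum_squares": x1**2 + x2**2,
    "manhattan": np.abs(x1) + np.abs(x2),
    "maximum": np.maximum(x1, x2),
}

def threshold_separable(x3, labels):
    """True if one cut x3 = c puts each class entirely on its own side."""
    a, b = x3[labels == 0], x3[labels == 1]
    return bool(a.max() < b.min() or b.max() < a.min())

separable = {name: threshold_separable(x3, labels) for name, x3 in lifted.items()}
for name, ok in separable.items():
    print(f"{name:12} separable by a plane x3 = c: {ok}")
```

Only `product` passes this check. A tilted plane is a different question, because it can also use x₁ and x₂: for example, 2·max(x₁, x₂) − x₁ − x₂ equals |x₁ − x₂|, which is large only for the red corners.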

### The Key Insight:
- **Class 0 (Blue):** Both coordinates have the **same sign** → product is **POSITIVE**
- **Class 1 (Red):** Coordinates have **opposite signs** → product is **NEGATIVE**

This captures the **interaction** between x₁ and x₂, which is exactly what defines XOR!

Now we can use a simple rule in 3D: **"If x₃ > 0, predict Blue; otherwise predict Red"**

This is just a **horizontal plane** at x₃ = 0, which is easy for a linear model!
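
To see what "easy for a linear model" can mean in practice, here is a hedged sketch that trains a classic perceptron on the lifted data. It regenerates noisy clusters with its own random generator (so the points differ from the setup cell), and the names `X`, `t`, `X_lift`, `w`, and `b` are chosen for this example rather than taken from the lab code.

```python
import numpy as np

rng = np.random.default_rng(0)
centers = np.array([[-1.5, -1.5], [1.5, 1.5], [-1.5, 1.5], [1.5, -1.5]])
center_labels = np.array([0, 0, 1, 1])

# 25 noisy points per corner, mirroring the setup cell (different random draws)
X = np.vstack([c + 0.3 * rng.standard_normal((25, 2)) for c in centers])
t = np.repeat(center_labels, 25)

# Lift to 3D with the product feature x3 = x1 * x2
X_lift = np.column_stack([X, X[:, 0] * X[:, 1]])

# Perceptron rule: adjust weights only when a point is misclassified
w = np.zeros(3)
b = 0.0
for epoch in range(1000):
    errors = 0
    for xi, ti in zip(X_lift, t):
        pred = int(w @ xi + b > 0)
        if pred != ti:
            w += (ti - pred) * xi
            b += (ti - pred)
            errors += 1
    if errors == 0:  # a full pass with no mistakes: a separating plane was found
        break

accuracy = np.mean((X_lift @ w + b > 0).astype(int) == t)
print(f"epochs used: {epoch + 1}, accuracy: {accuracy:.2f}")
```

Running the same loop on the raw 2D points instead of `X_lift` would never reach a mistake-free pass, which is the Lab 3 result seen from the training side.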

## 6. Project the 3D Plane Back to 2D

The really cool part: when you look at the 3D separating plane from above (bird's eye view), what does it look like in 2D?

Let's visualize this!

In [None]:
# Show side-by-side: Original 2D problem and what it looks like after lifting
fig = plt.figure(figsize=(16, 6), dpi=100)

# Left plot: Original 2D XOR (impossible)
ax1 = fig.add_subplot(121)
ax1.scatter(X_class0[:, 0], X_class0[:, 1], c='blue', s=100, alpha=0.7,
           label='Class 0 (Blue)', edgecolors='k', linewidths=1.5)
ax1.scatter(X_class1[:, 0], X_class1[:, 1], c='red', s=100, alpha=0.7,
           label='Class 1 (Red)', edgecolors='k', linewidths=1.5)
ax1.set_xlabel('x‚ÇÅ', fontsize=13, fontweight='bold')
ax1.set_ylabel('x‚ÇÇ', fontsize=13, fontweight='bold')
ax1.set_title('BEFORE: XOR in 2D\n(No straight line works)', 
             fontsize=13, fontweight='bold')
ax1.legend(fontsize=10)
ax1.grid(True, alpha=0.3)
ax1.set_aspect('equal')
ax1.axhline(y=0, color='gray', linewidth=0.5, linestyle='--', alpha=0.5)
ax1.axvline(x=0, color='gray', linewidth=0.5, linestyle='--', alpha=0.5)

# Right plot: Decision boundary based on x₃ = x₁ × x₂
ax2 = fig.add_subplot(122)

# Create a mesh grid
x1_range = np.linspace(-3, 3, 200)
x2_range = np.linspace(-3, 3, 200)
X1_mesh, X2_mesh = np.meshgrid(x1_range, x2_range)

# Compute x₃ = x₁ × x₂ for each point
X3_mesh = X1_mesh * X2_mesh

# Decision rule: x₃ > 0 → Class 0, x₃ < 0 → Class 1
decision = (X3_mesh > 0).astype(int)

# Plot decision regions (explicit levels give one band per value: 0 → red, 1 → blue)
ax2.contourf(X1_mesh, X2_mesh, decision, levels=[-0.5, 0.5, 1.5], alpha=0.3, colors=['red', 'blue'])

# Plot the decision boundary (where x₁ × x₂ = 0)
ax2.contour(X1_mesh, X2_mesh, X3_mesh, levels=[0], colors='green', linewidths=3)

# Plot data points
ax2.scatter(X_class0[:, 0], X_class0[:, 1], c='blue', s=100, alpha=0.7,
           label='Class 0 (Blue)', edgecolors='k', linewidths=1.5)
ax2.scatter(X_class1[:, 0], X_class1[:, 1], c='red', s=100, alpha=0.7,
           label='Class 1 (Red)', edgecolors='k', linewidths=1.5)

ax2.set_xlabel('x₁', fontsize=13, fontweight='bold')
ax2.set_ylabel('x₂', fontsize=13, fontweight='bold')
ax2.set_title('AFTER: Decision Boundary in 2D\n(Based on x₃ = x₁ × x₂)', 
             fontsize=13, fontweight='bold')
ax2.legend(fontsize=10)
ax2.grid(True, alpha=0.3)
ax2.set_aspect('equal')
ax2.set_xlim(-3, 3)
ax2.set_ylim(-3, 3)

plt.tight_layout()
plt.show()

print("\n🎯 The Big Reveal:")
print("   LEFT: Impossible to separate with a straight line")
print("   RIGHT: The 3D plane projects to a NON-LINEAR boundary in 2D!")
print("\n   The green lines show where x₁ × x₂ = 0")
print("   (These are the two coordinate axes: a pair of lines crossing at the origin)")
print("\n   💡 ONE FLAT PLANE in 3D becomes a boundary NO SINGLE LINE can draw in 2D!")

## 7. Connection to Neural Networks

**This is EXACTLY what hidden layers in neural networks do!**

### What You Just Did Manually:
1. Started with 2D data (x‚ÇÅ, x‚ÇÇ)
2. **Manually engineered** a new feature: x₃ = x₁ × x₂
3. Used a simple linear rule in the new 3D space (x₃ > 0 → Blue)
4. This gave you a non-linear boundary in the original 2D space (the two axes, which no single line can reproduce)
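
Step 4 can be made explicit with a little algebra. As a sketch, take a general plane in the lifted space with assumed weights $w_1, w_2, w_3$ and bias $b$ (symbols introduced here, not used elsewhere in this lab) and substitute $x_3 = x_1 x_2$:

$$
w_1 x_1 + w_2 x_2 + w_3 x_3 + b = 0, \quad x_3 = x_1 x_2
\;\;\Longrightarrow\;\;
w_3 x_1 x_2 + w_1 x_1 + w_2 x_2 + b = 0 .
$$

For $w_3 \neq 0$ this can be rearranged as

$$
\left(x_1 + \frac{w_2}{w_3}\right)\left(x_2 + \frac{w_1}{w_3}\right) = \frac{w_1 w_2}{w_3^2} - \frac{b}{w_3} .
$$

When the right-hand side is nonzero, the boundary in 2D is a hyperbola. The plane used in this module, $x_3 = 0$, corresponds to $w_1 = w_2 = b = 0$, so the right-hand side is zero and the boundary reduces to $x_1 x_2 = 0$: the two coordinate axes.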

### What Hidden Layers Do Automatically:
1. Start with input data (x‚ÇÅ, x‚ÇÇ)
2. **Automatically learn** new features through hidden neurons
3. Each hidden neuron creates a new "dimension" (like your x₃)
4. The output layer uses simple linear rules in this new space
5. Result: Curved boundaries in the original space!
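
The steps above can be sketched with a tiny 2-2-1 network. The weights below are picked by hand for illustration, not learned, and the ReLU activation is an assumption of this sketch (Module 1 may use a different activation and will arrive at its own weights through training).

```python
import numpy as np

def relu(z):
    """Rectified linear unit: keeps positive values, zeroes out the rest."""
    return np.maximum(z, 0.0)

# Ideal corner centers and labels (assumed stand-ins for the noisy clusters)
corners = np.array([[-1.5, -1.5], [1.5, 1.5], [-1.5, 1.5], [1.5, -1.5]])
labels = np.array([0, 0, 1, 1])

# Hand-picked (NOT learned) hidden layer: two neurons, each a line plus ReLU
W_hidden = np.array([[1.0, 1.0],      # h1 = relu(x1 + x2)
                     [-1.0, -1.0]])   # h2 = relu(-(x1 + x2))
b_hidden = np.zeros(2)

# Each column of H is one new "dimension" created by a hidden neuron
H = relu(corners @ W_hidden.T + b_hidden)

# Output layer: a plain linear rule in (h1, h2) space
w_out = np.array([-1.0, -1.0])
b_out = 1.5
pred = (H @ w_out + b_out > 0).astype(int)

print("hidden features:\n", H)
print("predictions:", pred, "labels:", labels)
```

Here h1 + h2 equals |x₁ + x₂|, which is 3 for the blue corners and 0 for the red ones, so one linear threshold in the hidden space is enough.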

### The Key Insight:

**You don't need to manually figure out which features to add!**
- Hidden layers **invent useful features automatically** during training
- They learn which combinations of inputs help separate the classes
- This is why neural networks are so powerful!

### Remember from Lab 3:
- **Activation functions** warp space to create curved boundaries
- But a **single perceptron** still creates only ONE boundary

### Coming in Module 1:
- **Multiple hidden neurons** create multiple new dimensions
- Combining these dimensions gives flexible, complex boundaries
- You'll see a 2-2-1 network solve XOR automatically!

## 8. Key Takeaways from Module 0

### 1. Dimension Lifting Makes Hard Problems Easy
- XOR is impossible to separate in 2D with a straight line
- Adding x₃ = x₁ × x₂ makes it perfectly separable in 3D
- A flat plane in 3D becomes a non-linear boundary in 2D
- **Not all features help equally!** You discovered which one works best.

### 2. Feature Engineering vs. Feature Learning
- **Manual (what you did):** You figured out that x₃ = x₁ × x₂ helps
- **Neural Networks:** Hidden layers learn useful features automatically
- This is why deep learning is powerful - no manual feature design needed!
- The network discovers the right features during training

### 3. Connection to Hidden Layers
- Each hidden neuron creates a new "dimension" (like the x₃ you added)
- More hidden neurons = more dimensions = more flexibility
- Output layer makes simple linear decisions in this richer space
- Complex curved boundaries in original space = simple planes in lifted space

### 4. Why We Need Multiple Layers
- **Single perceptron:** 1 boundary in original space → Can't solve XOR
- **Hidden layer:** Lifts data to higher dimensions → Simple rules work!
- This is the foundation of all deep learning
- Next module: See how a 2-2-1 network does this automatically!

In [None]:
# Summary
print("="*70)
print("Next: In Module 1, you'll build a 2-2-1 neural network and see")
print("how TWO hidden neurons create two new dimensions to solve XOR!")
print("="*70)

## Questions for Your Answer Sheet

**Q1.** In 2D, can you draw a straight line that separates the XOR pattern? Why or why not?

**Q2.** After adding the third dimension (x₃ = x₁ × x₂), describe what you observed in the 3D plot. Could you see how a flat plane separates the two classes?

**Q3.** When you view the 3D separating plane from directly above (bird's eye view), what shape does the decision boundary have in 2D? Is it a straight line?

**Q4.** How is adding a third dimension similar to what activation functions did in Lab 3? (Hint: Think about "transforming" or "warping" space)

## Next Steps

1. **Answer Q1-Q4** on your answer sheet
2. **Return to the LMS** and continue to Module 1
3. In Module 1, you'll see how a 2-2-1 neural network solves XOR by creating two new dimensions automatically!