# 1.1 - Introduction to AI, ML, and Deep Learning

Welcome to your first step in the AI journey! In this chapter, we'll explore the foundations of Artificial Intelligence, Machine Learning, and Deep Learning. You'll understand how these concepts relate to each other and build your first simple ML model.

## What You'll Learn

- What is Artificial Intelligence?
- The relationship between AI, ML, and DL
- Key ML paradigms (Supervised, Unsupervised, Reinforcement)
- Build a simple linear regression model
- The GenAI landscape in 2026

## Setup

First, let's install and import the libraries we'll need.

In [None]:
# Install required packages (uncomment if running in Colab)
# !pip install numpy matplotlib scikit-learn -q

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Set style for better-looking plots
plt.style.use('seaborn-v0_8-darkgrid')
plt.rcParams['figure.figsize'] = (10, 6)
plt.rcParams['font.size'] = 11

print(f"NumPy version: {np.__version__}")
print("Setup complete!")

---

## 1. What is Artificial Intelligence?

**Artificial Intelligence (AI)** is the field of computer science focused on creating systems that can perform tasks that typically require human intelligence.

### The AI Hierarchy

Think of AI as a set of nested concepts:

```
Artificial Intelligence (AI)          ← Broad field: Any intelligent behavior
└── Machine Learning (ML)             ← Learning from data
    └── Deep Learning (DL)            ← Neural networks with many layers
        └── GenAI / LLMs              ← Generative models (GPT, Claude, etc.)
```

**Key Insight:** All Deep Learning is Machine Learning, all Machine Learning is AI, but not all AI is Machine Learning!
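
To make the "not all AI is Machine Learning" point concrete, here is a minimal sketch that contrasts a hand-written rule (AI without ML) with a rule whose threshold is learned from labeled examples (ML). The data, the spam heuristic, and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: (number of exclamation marks, is_spam label).
labeled_messages = [(0, 0), (1, 0), (2, 0), (5, 1), (7, 1), (9, 1)]

def rule_based_is_spam(exclamation_count):
    """AI without ML: a human wrote this rule; nothing is learned from data."""
    return exclamation_count > 3

def learn_threshold(examples):
    """ML: pick the threshold that classifies the labeled examples best."""
    candidates = sorted(count for count, _ in examples)
    best_threshold, best_correct = None, -1
    for threshold in candidates:
        correct = sum((count > threshold) == bool(label) for count, label in examples)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold

learned_threshold = learn_threshold(labeled_messages)
print(f"Hand-written rule flags 4 exclamation marks: {rule_based_is_spam(4)}")
print(f"Threshold learned from data: {learned_threshold}")
print(f"Learned rule flags 4 exclamation marks: {4 > learned_threshold}")
```

Both functions make decisions, but only the second one changes its behavior when the data changes.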

## 2. Machine Learning Paradigms

Machine Learning can be categorized into three main paradigms based on how the model learns:

In [None]:
# Summarize the three ML paradigms
paradigms = {
    'Supervised Learning': {
        'description': 'Learning from labeled data (input → output pairs)',
        'examples': ['Classification', 'Regression'],
        'use_cases': ['Spam detection', 'House price prediction', 'Image recognition']
    },
    'Unsupervised Learning': {
        'description': 'Finding patterns in unlabeled data',
        'examples': ['Clustering', 'Dimensionality Reduction'],
        'use_cases': ['Customer segmentation', 'Anomaly detection', 'Data compression']
    },
    'Reinforcement Learning': {
        'description': 'Learning through trial and error with rewards',
        'examples': ['Q-Learning', 'Policy Gradients'],
        'use_cases': ['Game playing (AlphaGo)', 'Robotics', 'Self-driving cars']
    }
}

for paradigm, info in paradigms.items():
    print(f"\n{paradigm}")
    print(f"   {info['description']}")
    print(f"   Examples: {', '.join(info['examples'])}")
    print(f"   Use Cases: {', '.join(info['use_cases'])}")
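
The rest of this chapter focuses on supervised learning, so here is a small sketch of the unsupervised case: grouping customers by spend without any labels. It is a simplified one-dimensional k-means, assuming made-up spend values and arbitrary starting centroids; it is not how a production library implements clustering.

```python
# Hypothetical unlabeled data: yearly spend (in dollars) for eight customers.
spend = [120, 150, 130, 900, 950, 1000, 140, 980]

def kmeans_1d(values, centroids, iterations=10):
    """Minimal 1-D k-means: assign each value to its nearest centroid, then
    move each centroid to the mean of the values assigned to it."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# The starting centroids are an arbitrary choice for this sketch.
centroids, clusters = kmeans_1d(spend, centroids=[0, 500])
print(f"Cluster centers: {centroids}")
print(f"Low spenders:  {sorted(clusters[0])}")
print(f"High spenders: {sorted(clusters[1])}")
```

Nobody told the algorithm which customers were "low" or "high" spenders; the two groups emerge from the data alone.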

## 3. Your First ML Model: Linear Regression

Let's build a simple **supervised learning** model to predict house prices based on size. This is a classic regression problem.

### The Problem

Given the size of a house (in square feet), can we predict its price?

We'll use the formula: **Price = m × Size + b**

Where:
- **m** = slope (how much price increases per square foot)
- **b** = intercept (base price)

In [None]:
# Generate synthetic house data
np.random.seed(42)

# House sizes (square feet)
sizes = np.random.randint(500, 3500, 100).reshape(-1, 1)

# Prices with Gaussian noise (standard deviation of $20,000)
# True relationship: Price = 150 * Size + 50000 + noise
prices = 150 * sizes + 50000 + np.random.randn(100, 1) * 20000

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    sizes, prices, test_size=0.2, random_state=42
)

print(f"Training samples: {len(X_train)}")
print(f"Testing samples: {len(X_test)}")
print(f"\nSample data:")
print(f"Size: {X_train[0][0]:.0f} sq ft → Price: ${y_train[0][0]:,.0f}")
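
Conceptually, a train/test split shuffles the rows and holds some of them back for evaluation. The sketch below shows that idea with the standard library only; `simple_train_test_split` is a hypothetical helper written for this illustration, not scikit-learn's actual implementation.

```python
import random

def simple_train_test_split(rows, test_size=0.2, seed=42):
    """Shuffle a copy of the rows, then hold out the last `test_size` fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    split_at = len(shuffled) - n_test
    return shuffled[:split_at], shuffled[split_at:]

rows = list(range(100))  # stand-in for 100 houses
train_rows, test_rows = simple_train_test_split(rows)
print(f"Training rows: {len(train_rows)}, test rows: {len(test_rows)}")
```

The model never sees the held-out rows during training, so its score on them estimates how well it handles new houses.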

In [None]:
# Visualize the data
plt.figure(figsize=(12, 6))

plt.scatter(X_train, y_train, alpha=0.6, c='#3b82f6', 
           edgecolors='white', s=80, label='Training Data')
plt.scatter(X_test, y_test, alpha=0.6, c='#f59e0b', 
           edgecolors='white', s=80, label='Test Data')

plt.xlabel('House Size (sq ft)', fontsize=13, fontweight='bold')
plt.ylabel('Price ($)', fontsize=13, fontweight='bold')
plt.title('House Prices vs Size', fontsize=15, fontweight='bold', pad=20)
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print("Our goal: Find the best line that fits this data!")

### Training the Model

Now let's train a linear regression model to learn the relationship between size and price.

In [None]:
# Create and train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Get the learned parameters
slope = model.coef_[0][0]
intercept = model.intercept_[0]

print("Model Training Complete!\n")
print(f"Learned equation: Price = {slope:.2f} × Size + {intercept:,.0f}")
print(f"\nInterpretation:")
print(f"  • Base price: ${intercept:,.0f}")
print(f"  • Price per sq ft: ${slope:.2f}")
print(f"\nThis means each additional square foot adds ${slope:.2f} to the price!")
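
For a single feature, the least-squares line has a closed-form solution: the slope is the covariance of size and price divided by the variance of size, and the intercept follows from the means. The sketch below works through that arithmetic on hypothetical noise-free data; `fit_line` is an illustrative helper showing the math, not scikit-learn's internal code.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical noise-free data generated from Price = 150 * Size + 50000.
toy_sizes = [1000, 1500, 2000, 2500, 3000]
toy_prices = [150 * size + 50000 for size in toy_sizes]

toy_slope, toy_intercept = fit_line(toy_sizes, toy_prices)
print(f"slope = {toy_slope:.2f}, intercept = {toy_intercept:,.0f}")
# → slope = 150.00, intercept = 50,000
```

With noisy data, like the synthetic houses above, the recovered values land near the true ones rather than exactly on them.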

In [None]:
# Evaluate the model
train_score = model.score(X_train, y_train)
test_score = model.score(X_test, y_test)

print(f"Model Performance (R² Score):")
print(f"  Training: {train_score:.4f}")
print(f"  Testing:  {test_score:.4f}")
print("\nA score close to 1.0 means the model explains most of the variation in price!")
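
The R² score returned by `model.score` compares the model's squared errors with those of a baseline that always predicts the average price: R² = 1 - (residual sum of squares / total sum of squares). Here is a small worked sketch with hypothetical numbers; `r2_score_manual` is an illustrative helper, not a library function.

```python
def r2_score_manual(actual, predicted):
    """R² = 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical prices and predictions, in thousands of dollars.
actual_prices = [200, 250, 300, 350, 400]
good_predictions = [210, 240, 300, 360, 390]
mean_only = [300] * 5  # baseline: always predict the average price

r2_good = r2_score_manual(actual_prices, good_predictions)
r2_baseline = r2_score_manual(actual_prices, mean_only)
print(f"Good model: {r2_good:.3f}")      # → 0.984
print(f"Mean-only:  {r2_baseline:.3f}")  # → 0.000
```

A score of 0 means the model does no better than guessing the average, and a negative score means it does worse.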

In [None]:
# Visualize predictions
plt.figure(figsize=(12, 6))

# Plot data points
plt.scatter(X_train, y_train, alpha=0.6, c='#3b82f6', 
           edgecolors='white', s=80, label='Training Data')
plt.scatter(X_test, y_test, alpha=0.6, c='#f59e0b', 
           edgecolors='white', s=80, label='Test Data')

# Plot the learned line
X_line = np.linspace(sizes.min(), sizes.max(), 100).reshape(-1, 1)
y_line = model.predict(X_line)
plt.plot(X_line, y_line, 'r-', linewidth=3, label='Learned Model', alpha=0.8)

plt.xlabel('House Size (sq ft)', fontsize=13, fontweight='bold')
plt.ylabel('Price ($)', fontsize=13, fontweight='bold')
plt.title('Linear Regression: Predictions vs Actual', fontsize=15, fontweight='bold', pad=20)
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

### Making Predictions

Now let's use our trained model to predict prices for new houses!

In [None]:
# Predict prices for new houses
new_houses = np.array([[1000], [1500], [2000], [2500], [3000]])
predicted_prices = model.predict(new_houses)

print("Price Predictions for New Houses:\n")
print("Size (sq ft)  →  Predicted Price")
print("-" * 40)
for size, price in zip(new_houses, predicted_prices):
    print(f"{size[0]:>6,} sq ft  →  ${price[0]:>12,.0f}")

## 4. From ML to Deep Learning

Linear regression is simple, but it can only fit a straight line. What if the relationship isn't linear? That's where **Deep Learning** comes in!

### Key Differences

| Aspect | Traditional ML | Deep Learning |
|--------|---------------|---------------|
| **Model** | Simple (linear, trees) | Neural networks with many layers |
| **Features** | Manual engineering | Automatic learning |
| **Data Needed** | Works with small data | Needs large datasets |
| **Complexity** | Simple patterns | Complex patterns |
| **Examples** | Linear regression, SVM | CNNs, Transformers, LLMs |

### Neural Network Visualization

```
Input Layer   →   Hidden Layers      →   Output Layer
  (inputs)        (learned features)     (prediction)
```

Each connection has a **weight** that the network learns during training!
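
To see how weights let a network represent a nonlinear relationship, here is a minimal sketch of a forward pass through one hidden layer. The weights are hand-picked rather than trained, and are chosen so the network computes the absolute value of its input, a shape no single straight line can fit. The function names and the omission of bias terms are simplifying assumptions.

```python
def relu(value):
    """Rectified linear unit: pass positives through, clamp negatives to zero."""
    return max(0.0, value)

def tiny_network(x, hidden_weights, output_weights):
    """One input, one hidden layer of ReLU units, one linear output (no biases)."""
    hidden = [relu(w * x) for w in hidden_weights]
    return sum(w * h for w, h in zip(output_weights, hidden))

# Hand-picked weights (not trained) that make the network compute |x|.
hidden_weights = [1.0, -1.0]
output_weights = [1.0, 1.0]

for x in [-3.0, -1.0, 0.0, 2.0]:
    print(f"x = {x:>4}  ->  output = {tiny_network(x, hidden_weights, output_weights)}")
```

During real training, an optimizer adjusts these weights automatically to reduce prediction error instead of a person choosing them.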

## 5. The GenAI Landscape (2026)

Today's Generative AI ecosystem is built on deep learning, specifically **Transformer** architectures.

### Popular Models

| Category | Examples | Key Features |
|----------|----------|-------------|
| **Large Language Models** | GPT-4, Claude, Gemini | Text generation, reasoning, coding |
| **Small Language Models** | Phi-3, Gemma, Qwen | Efficient, edge-deployable |
| **Multimodal Models** | GPT-4V, Gemini Pro | Vision + Language |
| **Code Models** | Codex, StarCoder, CodeLlama | Specialized for programming |
| **Open Source** | Llama 3, Mistral, Mixtral | Community-driven, customizable |

### How LLMs Work (Simplified)

1. **Training**: Learn patterns from billions of text examples
2. **Tokenization**: Break text into pieces (tokens)
3. **Prediction**: Predict the next token based on context
4. **Generation**: Repeat to generate full responses
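
The four steps above can be sketched with a toy "bigram" model that counts which word follows which. This is a deliberately tiny stand-in: real LLMs use Transformer neural networks and subword tokenizers, and the corpus and function names here are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; real LLMs train on vastly more text.
corpus = "the cat sat on the mat . the cat sat on the rug ."

# Tokenization: split on whitespace (real tokenizers use subword pieces).
tokens = corpus.split()

# Training: count which token follows which.
next_token_counts = defaultdict(Counter)
for current, following in zip(tokens, tokens[1:]):
    next_token_counts[current][following] += 1

def predict_next(token):
    """Prediction: return the most frequent follower of `token`."""
    return next_token_counts[token].most_common(1)[0][0]

def generate(start, max_tokens=5):
    """Generation: repeatedly append the predicted next token."""
    output = [start]
    for _ in range(max_tokens):
        if not next_token_counts[output[-1]]:
            break  # no known follower for this token
        output.append(predict_next(output[-1]))
    return " ".join(output)

print(generate("the"))
# → the cat sat on the cat
```

Always picking the single most likely token makes the output loop quickly; real systems usually sample from the predicted probabilities instead.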

We'll dive deeper into these concepts in later modules!

---

## Key Takeaways

1. **AI** is the broad field; **ML** learns from data; **DL** uses neural networks
2. **Supervised learning** uses labeled data (like our house price example)
3. The core ML loop: **Data → Model → Training → Predictions**
4. **Deep Learning** adds layers and complexity for harder problems
5. **LLMs** are the frontier of GenAI, built on transformer architectures

## Next Steps

Continue to [Chapter 2: Your First Neural Network](chapter-2-first-neural-network.ipynb) to build a real neural network from scratch!

---

## Practice Exercise

Try modifying the code above to:
1. Change the relationship between size and price
2. Add more features (e.g., number of bedrooms)
3. Experiment with different train/test splits

---

*© 2026 MadeForAI. Learn more at [madeforai.github.io](https://madeforai.github.io/madeforai)*