# 🧪 Week 01 - Introduction to Machine Learning

## 🧠 What is Machine Learning?

Machine learning (ML) is a field of AI where computers learn from data instead of being explicitly programmed.  
There are **two main types**:

- **Supervised Learning** 🎯
- **Unsupervised Learning** 🔍

Of the two, supervised learning is the more widely used in real-world applications and has seen the most rapid innovation.

---

## 🎯 Supervised Learning

Supervised learning algorithms learn to map **inputs (x)** to **outputs (y)**, using example pairs (x, y).  
The model is trained on data where the correct outputs are known, so it can later predict the output for new, unseen inputs.
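
As a rough illustration of learning from (x, y) pairs, here is a minimal sketch using a 1-nearest-neighbor rule. This algorithm is chosen only because it is short; it is not the only way to do supervised learning. The data points and the names `training_pairs` and `predict` are made up for this example.

```python
# Toy training set of (x, y) pairs; the values are invented for illustration.
# Each x is a pair of numeric features, each y is the known correct label.
training_pairs = [
    ((1.0, 1.0), 0),
    ((1.5, 2.0), 0),
    ((5.0, 5.0), 1),
    ((6.0, 5.5), 1),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def predict(x_new, pairs):
    """Return the label of the training input that lies closest to x_new."""
    _, nearest_y = min(pairs, key=lambda pair: squared_distance(pair[0], x_new))
    return nearest_y


# Predict labels for inputs that were not in the training set.
print(predict((1.2, 1.4), training_pairs))  # 0
print(predict((5.5, 5.0), training_pairs))  # 1
```

The "training" here is just storing the labeled pairs; most algorithms instead fit parameters to the pairs and then discard the data.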

### 📌 Example Tasks:
- **Spam Detection**:  
  Input: email text → Output: spam (1) or not spam (0)
  
- **Speech Recognition**:  
  Input: audio clip → Output: transcript (text)

---

## 📈 Types of Supervised Learning

### 🔢 Regression
- Predicts **continuous numeric values**.
- Output: any number (infinitely many possible values).
- Examples: predicting house prices, temperature, or age.

### 🧮 Classification
- Predicts **discrete categories**.
- Output: a class label (e.g., 0 or 1).
- Example: Predicting if a tumor is **benign (0)** or **malignant (1)**.

Key difference:
> Regression → infinite possible outputs  
> Classification → limited number of classes
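
The difference can be sketched with two tiny models. The regression part fits a straight line by ordinary least squares, and the classification part places a threshold halfway between the two class means. Both are simple stand-ins picked for brevity, and all numbers, units, and function names below are assumptions made for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


def fit_threshold(xs, labels):
    """Midpoint between the class means (assumes class 1 has larger values)."""
    mean0 = mean([x for x, label in zip(xs, labels) if label == 0])
    mean1 = mean([x for x, label in zip(xs, labels) if label == 1])
    return (mean0 + mean1) / 2


def classify(x, threshold):
    """Return class 1 if x is above the threshold, otherwise class 0."""
    return 1 if x > threshold else 0


# Regression: house size (square meters) -> price (thousands of dollars).
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90.0 + intercept  # output can be any real number
print(predicted_price)

# Classification: tumor size (cm) -> benign (0) or malignant (1).
tumor_sizes = [1.0, 1.5, 2.0, 4.0, 4.5, 5.0]
tumor_labels = [0, 0, 0, 1, 1, 1]
threshold = fit_threshold(tumor_sizes, tumor_labels)
print(classify(1.2, threshold), classify(4.8, threshold))  # output is 0 or 1
```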

---

## ✅ Summary

| Task                   | Input             | Output             | Type           |
|------------------------|-------------------|--------------------|----------------|
| Spam detection         | Email text        | 0 or 1             | Classification |
| House price prediction | Features of house | Price ($)          | Regression     |
| Tumor diagnosis        | Medical data      | Benign / Malignant | Classification |

---




## 🔍 Unsupervised Learning

In **unsupervised learning**, the algorithm is given **data without associated labels (y)**.

For example, imagine a dataset with patient information:
- Tumor size
- Patient age

The dataset does not record whether each tumor is **benign** or **malignant**.

Here, we're not giving the algorithm the "correct answer", so there's **no supervision**. Instead, we ask the algorithm to **find patterns or structure in the data** on its own.

---

### 📊 Clustering

A common type of unsupervised learning is **clustering**, where the algorithm groups data points together based on a measure of similarity, such as the distance between their feature values.

Example:
- The algorithm might discover **two natural clusters** of patients, based on age and tumor size.
- These clusters might represent different biological subtypes, even though the algorithm was never told that such subtypes exist.
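
One way to sketch the patient example is with k-means, a common clustering algorithm (the notes above do not name a specific one, so this choice is an assumption). The patient values and the fixed starting centroids are invented so the run is deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def nearest_index(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            mean_point(c) if c else old for c, old in zip(clusters, centroids)
        ]
    assignments = [nearest_index(p, centroids) for p in points]
    return centroids, assignments


# Unlabeled patients as (age in years, tumor size in cm); no benign/malignant column.
patients = [(25, 1.0), (30, 1.2), (28, 0.8), (60, 4.0), (65, 4.5), (62, 3.8)]

# Start from two of the data points; real implementations usually pick these randomly.
centroids, assignments = k_means(patients, [patients[0], patients[3]])
print(assignments)  # [0, 0, 0, 1, 1, 1]
```

Because age spans a much larger numeric range than tumor size, it dominates the distance here; in practice, features are usually rescaled before clustering.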

---

### ⚠️ Anomaly Detection

Another important unsupervised learning technique is **anomaly detection**.

It identifies **rare or unusual patterns** in data that don't fit the norm.  

Used in:
- 🔐 **Fraud detection** (e.g., abnormal credit card transactions)
- 🚨 **Network security** (detecting intrusions)
- 🧪 **Medical diagnostics** (spotting outliers in patient data)
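
A very simple form of anomaly detection flags values that lie far from the mean, measured in standard deviations (a z-score). The sketch below assumes a single numeric feature, made-up transaction amounts, and an arbitrary cutoff of 2.0; the function name `find_anomalies` is hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(values, z_threshold=2.0):
    """Return the values whose z-score magnitude exceeds z_threshold."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > z_threshold]


# Credit card transaction amounts in dollars; one is far larger than the rest.
amounts = [12.0, 15.5, 9.9, 14.2, 11.0, 13.3, 10.5, 950.0, 12.8, 14.9]
print(find_anomalies(amounts))  # [950.0]
```

A large outlier inflates both the mean and the standard deviation, so this approach can miss anomalies in small or heavily contaminated datasets; more robust methods exist.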

---

### 🔽 Dimensionality Reduction

Dimensionality reduction compresses a dataset with **many features** into fewer dimensions while retaining as much **relevant information** as possible.

Useful for:
- Visualizing high-dimensional data in 2D or 3D
- Reducing noise and the risk of overfitting
- Speeding up learning algorithms
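
As a minimal sketch, the code below reduces 2D points to 1D using principal component analysis (PCA), one widely used technique; the notes above do not name a method, so this is an assumption. The first principal component is found with power iteration to keep the example dependency-free, whereas libraries typically use a matrix decomposition. The data is invented.

```python
def center(data):
    """Subtract the per-feature mean from every row."""
    means = [sum(col) / len(data) for col in zip(*data)]
    return [[v - m for v, m in zip(row, means)] for row in data]


def covariance_matrix(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_principal_component(cov, iterations=100):
    """Approximate the top eigenvector of cov by power iteration."""
    d = len(cov)
    v = [1.0] * d  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def project(centered, direction):
    """Coordinate of each row along the given unit direction."""
    return [sum(a * b for a, b in zip(row, direction)) for row in centered]


# Two correlated features: the points lie close to a straight line.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
centered = center(data)
direction = first_principal_component(covariance_matrix(centered))
reduced = project(centered, direction)  # one number per point instead of two
print(direction)
print(reduced)
```

Each point is now described by a single coordinate along the direction of greatest variance, which keeps most of the spread in this dataset.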

---

## ✅ Summary

| Technique                | Goal                       | Example Use Case            |
|--------------------------|----------------------------|-----------------------------|
| Clustering               | Group data into clusters   | Market segmentation         |
| Anomaly Detection        | Detect unusual data points | Fraud detection             |
| Dimensionality Reduction | Compress data efficiently  | Visualizing gene expression |

---

## Linear Regression
