
## What is Machine Learning?

**Machine Learning** is a field of Artificial Intelligence (AI) that gives computers the ability to **learn from data** and make **decisions or predictions** without being explicitly programmed for every task.

> **In simple terms:** You feed data to a machine, and it learns patterns from that data to make predictions or decisions.


## Example

Suppose you want to predict whether an email is **spam or not**:

- You give the model **hundreds of emails** labeled as "spam" or "not spam".
- It **learns patterns** (e.g., spammy words like "lottery", "free", "prize").
- Then, it can predict whether **new emails** are spam or not.
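
The three steps above can be sketched in a few lines of plain Python. This is only an illustration, assuming a tiny hand-written training set and a simple word-count (Naive Bayes style) scorer; the emails and the function names `train` and `predict` are made up for this example.

```python
import math
from collections import Counter

# Hypothetical, hand-written training emails; a real model needs far more data.
training_emails = [
    ("win a free lottery prize now", "spam"),
    ("claim your free prize today", "spam"),
    ("free lottery tickets waiting for you", "spam"),
    ("meeting moved to monday morning", "not spam"),
    ("please review the project report", "not spam"),
    ("lunch on monday with the team", "not spam"),
]

def train(emails):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    label_counts = Counter()
    for text, label in emails:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the label with the highest Naive Bayes log-score."""
    vocabulary = set(word_counts["spam"]) | set(word_counts["not spam"])
    total_emails = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        # Start from the log prior: how common this label is overall.
        score = math.log(label_counts[label] / total_emails)
        for word in text.lower().split():
            # Add-one (Laplace) smoothing keeps unseen words from zeroing the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

word_counts, label_counts = train(training_emails)
print(predict("free prize inside", word_counts, label_counts))          # spam
print(predict("project meeting on monday", word_counts, label_counts))  # not spam
```

Nobody wrote a rule saying "free" means spam; the scorer picked that up from the labeled examples.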

## Types of Machine Learning

Machine Learning is mainly divided into **three types**:

### 1. Supervised Learning

You train the model using **labeled data** (data with answers).

#### Examples:

- Predicting house prices based on features like area, number of rooms, location, etc.
- Email spam detection (as explained above).

#### Input → Features (e.g., area, rooms)

#### Output → Labels (e.g., price)

#### Algorithms:

- Linear Regression
- Logistic Regression
- Decision Trees
- Random Forest
- Support Vector Machines (SVM)
- K-Nearest Neighbors (KNN)


### 2. Unsupervised Learning

You train the model using **unlabeled data** (no output/answers provided). The model tries to find **patterns or groups**.

#### Examples:

- Grouping customers into segments based on their shopping behavior.
- Detecting unusual behavior in network traffic (Anomaly Detection).

#### Input → Just features, no labels

#### Algorithms:

- K-Means Clustering
- Hierarchical Clustering
- Principal Component Analysis (PCA)
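
To make the idea of finding groups without labels concrete, here is a minimal sketch of K-Means on one-dimensional data. The spending figures and the function name `k_means_1d` are hypothetical, and the starting centroids are simply chosen as the smallest and largest values, which is an assumption made to keep the example deterministic.

```python
def k_means_1d(values, centroids, max_iterations=100):
    """Cluster 1-D values around the given starting centroids (Lloyd's algorithm)."""
    centroids = list(centroids)
    for _ in range(max_iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # Converged: no centroid moved.
        centroids = new_centroids
    return centroids, clusters

# Hypothetical monthly spend per customer; note there are no labels.
monthly_spend = [12, 15, 14, 10, 95, 102, 98, 110]
centroids, clusters = k_means_1d(
    monthly_spend, centroids=[min(monthly_spend), max(monthly_spend)]
)
print(centroids)  # [12.75, 101.25]
print(clusters)   # [[12, 15, 14, 10], [95, 102, 98, 110]]
```

The algorithm separates low spenders from high spenders on its own; deciding what those two segments mean is left to the person reading the result.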

### 3. Reinforcement Learning

The model (called an **agent**) learns by **trial and error**, like training a dog. It takes actions, receives **rewards or penalties**, and learns the best strategy over time.

#### Examples:

- Teaching a robot to walk.
- Self-driving cars.
- Game-playing agents (like AlphaGo).

#### Agent → Takes actions in an environment

#### Reward → Feedback the agent learns from

#### Algorithms:

- Deep Q Networks (DQN)
- Q-Learning
- Policy Gradient Methods
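
As a rough illustration of the reward loop, the sketch below runs tabular Q-Learning on a made-up corridor of five cells, where the agent starts at cell 0 and earns a reward of 1 for reaching cell 4. The environment, the constants, and the function names are all assumptions chosen for this example, not part of any library.

```python
import random

N_STATES = 5          # States 0..4 along a corridor; state 4 is the goal.
ACTIONS = ["left", "right"]
ALPHA = 0.5           # Learning rate.
GAMMA = 0.9           # Discount factor for future rewards.

def step(state, action):
    """Hypothetical environment: move one cell, reward 1 on reaching the goal."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Explore with random actions; Q-Learning is off-policy, so it can
            # still learn the greedy strategy from this experience.
            action = rng.choice(ACTIONS)
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Update: nudge the estimate toward reward + discounted future value.
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q

q_table = train()
# Greedy policy: in each non-goal state, pick the action with the highest Q-value.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # ['right', 'right', 'right', 'right']
```

The agent is never told that moving right is correct. It only sees rewards, and the Q-table gradually ranks "right" above "left" in every cell.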

## Summary Table

| Type                   | Data Used      | Goal                      | Example                           |
| ---------------------- | -------------- | ------------------------- | --------------------------------- |
| Supervised Learning    | Labeled data   | Predict outcomes          | Predicting prices, spam detection |
| Unsupervised Learning  | Unlabeled data | Find hidden patterns      | Customer segmentation             |
| Reinforcement Learning | Experience     | Maximize reward over time | Game playing, robot walking       |

# What is Supervised Learning?

In **Supervised Learning**, we teach the machine using **labeled data**, where each input comes with the correct output.

> The model learns the relationship between **features** (inputs) and **labels** (outputs), so it can predict labels for new, unseen data.

## Real-World Examples

| Problem                            | Input (Features)              | Output (Label)   | Type           |
| ---------------------------------- | ----------------------------- | ---------------- | -------------- |
| House price prediction             | Area, rooms, location, etc.   | Price            | Regression     |
| Email spam classification          | Email text                    | Spam or Not Spam | Classification |
| Student marks prediction           | Study hours, attendance       | Marks            | Regression     |
| Tumor diagnosis (benign/malignant) | Size, shape, texture of tumor | Diagnosis label  | Classification |

## Two Main Types of Supervised Learning

### 1. Regression – Predicting a continuous value

- Output is **numeric** (e.g., 45.6, 100, 23.5)
- Example: Predicting house prices, temperature

### 2. Classification – Predicting categories or labels

- Output is **categorical** (e.g., Yes/No, Spam/Not Spam, Class A/B/C)
- Example: Email spam detection, disease diagnosis

## Basic Workflow of Supervised Learning

1. Collect Data – Use real-world datasets (for example, from Kaggle or `sklearn.datasets`)
2. Prepare Data – Clean and preprocess using pandas and NumPy
3. Split Data – Training set and testing set (commonly an 80%/20% split)
4. Choose a Model – Like Linear Regression, Decision Tree, etc.
5. Train the Model – Using the training data
6. Evaluate the Model – Using the testing data
7. Predict – Make predictions on new/unseen data
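
The seven steps can be walked through end to end with a small, self-contained sketch. It assumes a synthetic dataset of two well-separated groups and a hand-written K-Nearest Neighbors classifier; the helper names `make_points` and `knn_predict` are hypothetical and only serve this example.

```python
import math
import random
from collections import Counter

rng = random.Random(42)

# 1. Collect data: hypothetical 2-D points in two well-separated groups.
def make_points(center, label, n=10):
    return [
        ((center + rng.uniform(-0.5, 0.5), center + rng.uniform(-0.5, 0.5)), label)
        for _ in range(n)
    ]

dataset = make_points(1.0, "A") + make_points(5.0, "B")

# 2. Prepare data: shuffle so the split is not ordered by label.
rng.shuffle(dataset)

# 3. Split data: 80% for training, 20% for testing.
split_index = int(0.8 * len(dataset))
train_set, test_set = dataset[:split_index], dataset[split_index:]

# 4. Choose a model: k-nearest neighbors, written out by hand here.
def knn_predict(train_set, point, k=3):
    neighbors = sorted(train_set, key=lambda item: math.dist(item[0], point))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# 5. Train the model: KNN simply stores the training set, so there is no fitting step.

# 6. Evaluate the model on the held-out test set.
correct = sum(knn_predict(train_set, features) == label for features, label in test_set)
accuracy = correct / len(test_set)
print(f"Test accuracy: {accuracy:.2f}")

# 7. Predict for a new, unseen point.
new_prediction = knn_predict(train_set, (4.8, 5.1))
print(new_prediction)
```

In practice, scikit-learn covers most of these steps: `train_test_split` in `sklearn.model_selection` handles the split, and `KNeighborsClassifier` in `sklearn.neighbors` replaces the hand-written classifier.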

## Simple Example: Predicting Student Marks

Let’s say we have this dataset:

| Study Hours | Marks |
| ----------- | ----- |
| 1           | 50    |
| 2           | 55    |
| 3           | 65    |
| 4           | 70    |
| 5           | 80    |

Here:

- Input (feature) = Study Hours
- Output (label) = Marks

We can use **Linear Regression** to fit a line and predict how many marks a student might get if they study for 6 hours.
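
For a single feature, the best-fit line can be computed directly with the ordinary least squares formulas. The sketch below applies them to the table above; the function name `fit_line` is just a label chosen for this example.

```python
study_hours = [1, 2, 3, 4, 5]
marks = [50, 55, 65, 70, 80]

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope = covariance of x and y divided by variance of x.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(study_hours, marks)
predicted_marks = slope * 6 + intercept
print(f"marks = {slope} * hours + {intercept}")            # marks = 7.5 * hours + 41.5
print(f"Predicted marks for 6 hours: {predicted_marks}")  # 86.5
```

Each extra study hour adds about 7.5 marks according to this line, so 6 hours gives an estimate of 86.5. `LinearRegression` from `sklearn.linear_model` would fit the same line on this data.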


## What We Can Learn in Supervised Learning

1. **Regression Models**

   - Linear Regression
   - Polynomial Regression

2. **Classification Models**

   - Logistic Regression
   - K-Nearest Neighbors (KNN)
   - Decision Trees
   - Support Vector Machines (SVM)
   - Naive Bayes

3. **Model Evaluation**

   - Accuracy, Precision, Recall, F1-score (for classification)
   - MSE, RMSE, R² score (for regression)

4. **Tools/Libraries**

   - `scikit-learn`
   - `pandas`, `numpy`
   - `matplotlib`, `seaborn` for visualization
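
The evaluation metrics listed above are short formulas, and writing them out by hand can make them easier to remember. The sketch below assumes a made-up set of spam predictions for the classification metrics, and reuses the student marks table with predictions from its least-squares line (marks = 7.5 × hours + 41.5) for the regression metrics. The function names are hypothetical.

```python
import math

def classification_metrics(actual, predicted, positive=1):
    """Accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # true positives
    fp = sum(a != positive and p == positive for a, p in pairs)  # false positives
    fn = sum(a == positive and p != positive for a, p in pairs)  # false negatives
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def regression_metrics(actual, predicted):
    """MSE, RMSE and R² for numeric predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual error
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total variation
    mse = ss_res / len(actual)
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": 1 - ss_res / ss_tot}

# Hypothetical labels: 1 = spam, 0 = not spam.
actual_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted_labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
classification_report = classification_metrics(actual_labels, predicted_labels)
print(classification_report)

# Marks from the student table, with predictions from the line 7.5 * hours + 41.5.
actual_marks = [50, 55, 65, 70, 80]
predicted_marks = [7.5 * hours + 41.5 for hours in [1, 2, 3, 4, 5]]
regression_report = regression_metrics(actual_marks, predicted_marks)
print(regression_report)
```

With one missed spam email and one false alarm out of ten, accuracy is 0.8 while precision, recall, and F1 are all 0.75. For real projects, `sklearn.metrics` offers tested versions such as `accuracy_score`, `precision_score`, `recall_score`, `f1_score`, `mean_squared_error`, and `r2_score`.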