# Machine Learning - Full Topics Overview

---

## Phase 1: Core ML Foundations

### What is Machine Learning?

### Types of Machine Learning
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning (Briefly)

### Training vs Testing vs Validation Sets

### Underfitting vs Overfitting

### Bias-Variance Tradeoff

---

## Phase 2: Supervised Learning

### Linear Regression
- Concept & Equation
- Use Cases
- scikit-learn Example

### Loss Functions
- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)

### Gradient Descent
- Intuition & Update Rule
- Manual Numerical Example

### Logistic Regression
- Classification Use Case
- Sigmoid Function

### Evaluation Metrics
- Accuracy
- Precision
- Recall
- F1 Score
- Confusion Matrix

---

## Phase 3: More Supervised Algorithms

### K-Nearest Neighbors (KNN)

### Decision Trees

### Random Forests

### Naive Bayes

### Support Vector Machine (SVM)

---

## Phase 4: Unsupervised Learning

### Clustering Overview

### K-Means Clustering

### Hierarchical Clustering

### Dimensionality Reduction (PCA)

---

## Bonus (Advanced)

### Feature Engineering
- Handling Missing Data
- Encoding Categorical Variables
- Feature Scaling (Normalization/Standardization)

### Cross-Validation

### Hyperparameter Tuning
- GridSearchCV
- RandomizedSearchCV

---

# Tools Used
- scikit-learn
- NumPy
- Matplotlib / Seaborn (for visualization)


### Typical ML Flow:
1. Collect/Load Data
2. Preprocess Data (cleaning, splitting)
3. Train a Model
4. Make Predictions
5. Evaluate the Model
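
The five steps above can be sketched end to end with scikit-learn. This is a minimal illustration, not a recommended recipe: the Iris dataset, the `LogisticRegression` model, and the 75/25 split are all assumptions chosen for brevity, and real data usually needs more cleaning than a single split.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# 1. Collect/load data (Iris is an illustrative choice, not from these notes)
X, y = load_iris(return_X_y=True)

# 2. Preprocess: here only a train/test split, since Iris is already clean
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# 3. Train a model on the training portion only
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# 4. Make predictions on data the model has not seen
y_pred = model.predict(X_test)

# 5. Evaluate the predictions against the held-out labels
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2f}")
```

Keeping the test set out of `fit` is the point of step 2: the score in step 5 only estimates performance on new data if the model never saw that data during training.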

### ✅ Overfitting (vs Underfitting)

- **Overfitting**: Model learns the training data *too well* (including its noise) and performs poorly on new data.
- **Underfitting**: Model is *too simple* and fails to learn the patterns in the training data.

| Property             | Underfitting 🐢 | Overfitting 🐇 |
|----------------------|----------------|----------------|
| Training Accuracy    | Low            | Very High      |
| Test Accuracy        | Low            | Much lower than training |
| Model Complexity     | Too Simple     | Too Complex    |
| Generalization       | Poor           | Poor           |
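
One way to see both failure modes is to vary model complexity on the same data and compare training and test accuracy. The sketch below assumes a synthetic noisy dataset (`make_moons`) and uses decision tree depth as the complexity knob; both choices are illustrative and not taken from these notes.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, noisy two-class data (an assumption made for illustration)
X, y = make_moons(n_samples=500, noise=0.35, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# max_depth controls model complexity: 1 is very simple, None is unlimited
scores = {}
for label, depth in [("too simple", 1), ("moderate", 4), ("too complex", None)]:
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    scores[label] = (tree.score(X_train, y_train), tree.score(X_test, y_test))

for label, (train_acc, test_acc) in scores.items():
    print(f"{label:12s} train={train_acc:.2f} test={test_acc:.2f}")
```

The signal to look for is the gap: the unlimited-depth tree memorizes the training set, so its training accuracy is perfect while its test accuracy falls behind, whereas the depth-1 tree scores modestly on both.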


# 🧠 Machine Learning (ML) - Quick Notes

## What is Machine Learning?
- ML is a technique where computers **learn from data** instead of being explicitly programmed.
- The goal is to **find patterns** in data and make predictions or decisions.

---

## 🔍 Types of Machine Learning

### 1. **Supervised Learning**
- Learns from **labeled data** (input + output given).
- Model finds the mapping function from input to output.
- **Examples**: House price prediction, Spam detection

Covered so far:

- ✅ Supervised Learning (Linear & Logistic Regression)
- ✅ Evaluation Metrics


#### 🔹 More Algorithms (Supervised Learning)

Here are the topics we'll cover one by one:

- K-Nearest Neighbors (KNN)
- Decision Trees
- Random Forests
- Naive Bayes
- Support Vector Machine (SVM)



### 2. **Unsupervised Learning**
- Works with **unlabeled data** (only inputs).
- Finds hidden patterns or groups in the data.
- **Examples**: Customer segmentation, Anomaly detection
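
A small customer segmentation sketch shows what "no labels" means in practice. The customer numbers below are made up for illustration, and K-Means with two clusters is just one possible algorithm and setting.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customers: [annual spend, visits per month]; no labels are given
customers = np.array([
    [200, 2], [220, 3], [250, 2],      # low spenders
    [900, 10], [950, 12], [1000, 11],  # high spenders
])

# Ask for two groups; the algorithm decides which customer goes where
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(customers)

print("Cluster labels:", labels)
print("Cluster centers:\n", kmeans.cluster_centers_)
```

Only `X` is passed to `fit_predict`; there is no `y`. The cluster numbers themselves are arbitrary, so what matters is which customers share a label. With features on very different scales, scaling the inputs first is usually worth considering.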

### 3. **Reinforcement Learning**
- Learns through **trial and error** using rewards and penalties.
- **Examples**: Game playing (like Chess, Go), Robotics
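
Trial and error with rewards can be illustrated with a very small problem: an agent repeatedly picks one of three actions and learns which pays off most often. This is a simplified sketch (an epsilon-greedy strategy on a made-up three-action problem), far simpler than game playing or robotics, and every number in it is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

true_win_prob = np.array([0.2, 0.5, 0.8])  # hidden from the agent (made-up values)
estimates = np.zeros(3)                    # agent's running estimate per action
counts = np.zeros(3, dtype=int)            # how often each action was tried
epsilon = 0.1                              # fraction of steps spent exploring

for step in range(2000):
    if rng.random() < epsilon:
        action = int(rng.integers(3))       # explore: try a random action
    else:
        action = int(np.argmax(estimates))  # exploit: use the best estimate so far

    # Reward is 1 with the action's hidden probability, else 0
    reward = 1.0 if rng.random() < true_win_prob[action] else 0.0

    # Update the running average reward for the chosen action
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]

print("Estimated rewards:", estimates.round(2))
print("Times each action was chosen:", counts)
```

The agent is never told which action is best; it only sees rewards and gradually shifts its choices toward the action that has paid off most.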

---

# 🛠️ scikit-learn (sklearn)

- A popular Python library for implementing ML algorithms.
- Offers tools for **classification, regression, clustering, model evaluation**, etc.
- Very beginner-friendly and widely used in the industry.

### Commonly Used Functions:

#### 🔹 `.fit(X, y)`
- Trains the model on data.
- `X`: input features (independent variables)
- `y`: target/output (dependent variable)

#### 🔹 `.predict(X)`
- Makes predictions using the trained model.

#### 🔹 `.score(X, y)`
- Returns the model's default performance score on the given data: mean accuracy for classifiers, R² for regressors.
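
The three methods can be seen together on one regressor and one classifier. The tiny datasets below are invented for illustration; note that `.score` means something different for each kind of model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Regression: y is exactly 2*x + 1, so a straight line fits perfectly
X_reg = np.array([[1.0], [2.0], [3.0], [4.0]])
y_reg = np.array([3.0, 5.0, 7.0, 9.0])
reg = LinearRegression().fit(X_reg, y_reg)
reg_pred = reg.predict(np.array([[5.0]]))
r2 = reg.score(X_reg, y_reg)    # R² for regressors

# Classification: label is 1 when x is large, 0 when x is small
X_clf = np.array([[0.0], [1.0], [2.0], [8.0], [9.0], [10.0]])
y_clf = np.array([0, 0, 0, 1, 1, 1])
clf = LogisticRegression().fit(X_clf, y_clf)
clf_pred = clf.predict(np.array([[0.5], [9.5]]))
acc = clf.score(X_clf, y_clf)   # mean accuracy for classifiers

print("Regression prediction for x=5:", reg_pred[0])
print("R² on training data:", r2)
print("Class predictions for 0.5 and 9.5:", clf_pred)
print("Accuracy on training data:", acc)
```

Scoring on the training data, as done here for brevity, flatters the model; a held-out test set gives a more honest number.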

---

# Summary
- ML helps build models that can **learn from data**.
- scikit-learn simplifies model building with **easy syntax**.


### Goal: Predict house price using area (Linear Regression)


```python
from sklearn.linear_model import LinearRegression
import numpy as np

# Data: area vs price
area = np.array([[1000], [1500], [2000], [2500]])
price = np.array([300000, 400000, 500000, 600000])

# Create model
model = LinearRegression()

# Train (fit)
model.fit(area, price)

# Predict price for 1800 sq ft
prediction = model.predict([[1800]])
print("Predicted price:", prediction[0])
```

Output:

```
Predicted price: 460000.0
```
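
To see where 460000 comes from, the fitted line can be inspected directly. The four data points lie exactly on `price = 200 * area + 100000`, so the learned slope and intercept should match those values up to floating-point rounding. The sketch below refits the same data so that it runs on its own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data as the house price example
area = np.array([[1000], [1500], [2000], [2500]])
price = np.array([300000, 400000, 500000, 600000])
model = LinearRegression().fit(area, price)

slope = model.coef_[0]        # price increase per extra sq ft
intercept = model.intercept_  # predicted price at area = 0

# Reproduce the prediction by hand: price = slope * area + intercept
manual = slope * 1800 + intercept

print("Slope:", slope)
print("Intercept:", intercept)
print("Manual prediction for 1800 sq ft:", manual)
```

`coef_` and `intercept_` are the attributes scikit-learn sets after `fit`; computing `slope * 1800 + intercept` by hand gives the same value as `model.predict([[1800]])`.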
