# Deep Learning – Common Terminology

Before diving into neural networks, it’s important to understand some **core terms** that frequently appear in deep learning discussions.  
These concepts are explained visually and intuitively in the reference video, and the notes below follow the same flow.

---

## Perceptron

A **perceptron** is the **simplest form of a neural network**: it consists of just **one neuron**.

It:
- Takes multiple **inputs**
- Multiplies each input by its **weight** and sums the products
- Adds a **bias** to that sum
- Passes the result through an **activation function**

The perceptron then makes a **binary decision**, such as classifying an input as **0 or 1**, similar to the demonstration shown in the video.
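
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the video's code: the names `step` and `perceptron`, and the hand-picked weights that make the neuron behave like a logical AND, are assumptions chosen for the example.

```python
def step(z):
    """Step activation: 1 if the weighted sum is non-negative, else 0."""
    return 1 if z >= 0 else 0


def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the step function."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return step(z)


# Assumed, hand-picked parameters: the sum only reaches 0 when both inputs are 1.
AND_WEIGHTS = [1.0, 1.0]
AND_BIAS = -1.5

# Evaluate the four input pairs (0,0), (0,1), (1,0), (1,1).
and_outputs = [
    perceptron([a, b], AND_WEIGHTS, AND_BIAS) for a in (0, 1) for b in (0, 1)
]
print(and_outputs)  # [0, 0, 0, 1]
```

Only the pair `(1, 1)` pushes the weighted sum above zero, so it is the only input classified as 1.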

---

## Neural Network

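A **neural network** is a collection of **interconnected perceptrons (neurons)** organized into layers: an input layer, one or more **hidden layers**, and an output layer. The outputs of one layer become the inputs of the next.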

Each layer:
- Applies **weights**
- Adds **biases**
- Uses **activation functions** to transform inputs

A **deep neural network** contains **multiple hidden layers**, allowing it to learn and model **complex patterns**, as illustrated layer-by-layer in the video.
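
As a rough sketch of how layers stack, the snippet below runs a forward pass through one hidden layer and one output layer. The helper `dense`, the layer sizes, and all weight values are assumptions made up for illustration; a real network would learn these values during training.

```python
def relu(z):
    return max(0.0, z)


def identity(z):
    return z


def dense(inputs, weights, biases, activation):
    """One layer: each row of `weights` holds the weights of one neuron."""
    return [
        activation(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]


# Assumed parameters: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden_w = [[1.0, -1.0], [0.5, 0.5]]
hidden_b = [0.0, -0.5]
output_w = [[1.0, 2.0]]
output_b = [0.1]


def forward(x):
    """Feed the input through the hidden layer, then the output layer."""
    hidden = dense(x, hidden_w, hidden_b, relu)
    return dense(hidden, output_w, output_b, identity)


y = forward([2.0, 1.0])
print(y)
```

Adding more hidden layers means chaining more `dense` calls, each one consuming the previous layer's output.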

---

## Hyperparameters

**Hyperparameters** are values that are **set before training begins**.  
They are **not learned from the data**, but they strongly influence how learning happens.

Common hyperparameters include:

- **Learning rate** – Controls how large each weight update is
- **Number of epochs** – How many full passes are made over the training dataset
- **Batch size** – Number of samples processed before each weight update
- **Number of layers or neurons** – Defines the network’s structure

These are tuned experimentally, as shown in the training walkthrough in the video.
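
One way to picture "set before training" is a small configuration object that is fixed up front, plus a list of candidate settings to try. The `Hyperparameters` class and the specific values below are hypothetical, chosen only to illustrate the idea of experimental tuning.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)  # frozen: the values cannot change once training starts
class Hyperparameters:
    learning_rate: float
    epochs: int
    batch_size: int
    hidden_layers: tuple  # number of neurons in each hidden layer


# Assumed search space: every combination of these values is one experiment.
learning_rates = [0.1, 0.01]
batch_sizes = [16, 32]

candidates = [
    Hyperparameters(learning_rate=lr, epochs=10, batch_size=bs, hidden_layers=(8, 8))
    for lr, bs in product(learning_rates, batch_sizes)
]

for hp in candidates:
    print(hp)
```

Each candidate would be used for a separate training run, and the setting with the best validation result would be kept.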

---

## Learning Rate (η)

The **learning rate (η)** controls **how much the model adjusts its weights** after each training step.

- Too **high** → updates may overshoot the optimal solution, so the loss can oscillate or even diverge
- Too **low** → training becomes very slow and may stall before reaching a good solution

The video visually demonstrates how different learning rates affect convergence.
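
The effect can be reproduced on a toy problem. The sketch below minimises `f(w) = w**2` (gradient `2*w`) with the standard gradient descent update `w = w - lr * gradient`; the function, the starting point, and the three learning rates are assumptions picked for illustration.

```python
def gradient_descent(lr, steps=20, w=1.0):
    """Minimise f(w) = w**2, whose gradient is 2*w, starting from w."""
    for _ in range(steps):
        gradient = 2 * w
        w = w - lr * gradient  # the learning rate scales every update
    return w


# The optimum is w = 0; compare how close each learning rate gets in 20 steps.
for lr in (0.001, 0.1, 1.1):
    print(f"lr={lr}: w after 20 steps = {gradient_descent(lr):.4f}")
```

With the smallest rate, `w` has barely moved from its starting value; with `0.1` it is close to zero; with `1.1` each step overshoots the minimum by more than the previous one, so `w` grows instead of shrinking.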

---

## Training

**Training** is the phase where the model **learns from data**.

During training:
- The model makes predictions
- Errors are calculated by comparing predictions with the actual (target) outputs
- Weights and biases are updated to reduce these errors

This iterative learning process is shown clearly in the training loop explained in the video.
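
A minimal training loop, assuming a one-weight linear model and a toy dataset generated from `y = 2x + 1`, might look like this. The dataset, the mean squared error loss, and the hyperparameter values are illustrative assumptions rather than the setup used in the video.

```python
# Toy dataset generated from y = 2x + 1 (an assumption for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]

w, b = 0.0, 0.0  # the model starts untrained
learning_rate = 0.05
epochs = 1000


def mse(w, b):
    """Mean squared error of the model w*x + b over the dataset."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


initial_loss = mse(w, b)

for _ in range(epochs):
    # 1. Make predictions and 2. compare them with the targets.
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    # 3. Gradients of the mean squared error with respect to w and b.
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2 * sum(errors) / len(xs)
    # 4. Update the parameters in the direction that reduces the error.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_loss = mse(w, b)
print(f"w={w:.3f}, b={b:.3f}, loss {initial_loss:.3f} -> {final_loss:.6f}")
```

After enough iterations, `w` and `b` approach the values 2 and 1 that generated the data.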

---

## Backpropagation

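**Backpropagation** is the algorithm used to compute the **gradients** needed to **update weights** in a neural network. The update itself is then carried out by an optimizer such as gradient descent.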

It works by:
- Computing the **error (loss)** from a forward pass
- Propagating the error **backwards** through the network, layer by layer
- Calculating the **gradient** of the loss with respect to each weight using the **chain rule**

This allows the model to **learn from its mistakes**, exactly as visualized step-by-step in the video.
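
To make the chain rule concrete, here is a hand-written backward pass for a tiny network with one hidden neuron (sigmoid) and one linear output neuron. The network shape, the squared error loss, and all numeric values are assumptions for this sketch; the result is checked against a numerical gradient.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w1, w2, x):
    h = sigmoid(w1 * x)  # hidden neuron
    y_hat = w2 * h       # linear output neuron
    return h, y_hat


def loss_fn(w1, w2, x, y):
    _, y_hat = forward(w1, w2, x)
    return (y_hat - y) ** 2


def backprop(w1, w2, x, y):
    """Gradients of the loss with respect to w1 and w2 via the chain rule."""
    h, y_hat = forward(w1, w2, x)
    dL_dyhat = 2 * (y_hat - y)        # loss -> output
    dL_dw2 = dL_dyhat * h             # output -> w2 (y_hat = w2 * h)
    dL_dh = dL_dyhat * w2             # output -> hidden (error flows backwards)
    dL_dw1 = dL_dh * h * (1 - h) * x  # hidden -> w1 (sigmoid' = h * (1 - h))
    return dL_dw1, dL_dw2


w1, w2, x, y = 0.5, -0.3, 1.5, 1.0
grad_w1, grad_w2 = backprop(w1, w2, x, y)

# Numerical check: nudge each weight slightly and measure the change in loss.
eps = 1e-6
num_w1 = (loss_fn(w1 + eps, w2, x, y) - loss_fn(w1 - eps, w2, x, y)) / (2 * eps)
num_w2 = (loss_fn(w1, w2 + eps, x, y) - loss_fn(w1, w2 - eps, x, y)) / (2 * eps)
print(grad_w1, num_w1)
print(grad_w2, num_w2)
```

The analytic and numerical values should agree to several decimal places, which is a common sanity check for a backward pass.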

---

## Inference

**Inference** is the phase where a **trained model** is used to make predictions on **new, unseen data**.

- No learning happens here
- Weights remain fixed
- The model only performs forward computation

The video highlights this difference between training and inference clearly.
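
In code, inference is just the forward computation with fixed parameters. The sketch below assumes a single sigmoid neuron whose weights came from an earlier training run; the values in `TRAINED_WEIGHTS` and `TRAINED_BIAS` are hypothetical.

```python
import math

# Hypothetical parameters, assumed to come from a finished training run.
TRAINED_WEIGHTS = (2.0, -1.0)
TRAINED_BIAS = 0.5


def predict(features):
    """Forward pass only: no loss, no gradients, no weight updates."""
    z = sum(x * w for x, w in zip(features, TRAINED_WEIGHTS)) + TRAINED_BIAS
    probability = 1.0 / (1.0 + math.exp(-z))
    return 1 if probability >= 0.5 else 0


# New, unseen samples the model was not trained on.
new_samples = [(1.0, 0.5), (0.0, 2.0)]
predictions = [predict(sample) for sample in new_samples]
print(predictions)  # [1, 0]
```

The parameters are read but never modified, which is what separates this from the training loop.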

---

## Activation Function

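An **activation function** introduces **non-linearity** into the network, enabling it to learn complex relationships. Without it, stacked layers would collapse into a single linear transformation, no matter how many layers are used.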

Common activation functions include:

- **ReLU (Rectified Linear Unit)** – outputs 0 for negative inputs and the input itself otherwise
- **Sigmoid** – outputs values between 0 and 1
- **Tanh** – outputs values between -1 and 1

While early perceptrons used a **step function**, modern neural networks commonly use **ReLU** (or a variant of it) in their hidden layers, as shown in the video examples.
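
The functions named above are short enough to write out directly. This is a plain-Python sketch for comparing their outputs on a few sample inputs; the sample values are arbitrary.

```python
import math


def step(z):
    """Step function used by early perceptrons."""
    return 1.0 if z >= 0 else 0.0


def relu(z):
    """Zero for negative inputs, the input itself otherwise."""
    return max(0.0, z)


def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squashes any real number into the range (-1, 1)."""
    return math.tanh(z)


for z in (-2.0, 0.0, 2.0):
    print(
        f"z={z:+.1f}  step={step(z):.1f}  relu={relu(z):.1f}  "
        f"sigmoid={sigmoid(z):.3f}  tanh={tanh(z):.3f}"
    )
```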

---

## Epoch

An **epoch** means **one complete pass** through the entire training dataset.

- Models are trained for **multiple epochs**
- Each epoch helps refine the learned patterns
- More epochs generally improve learning up to a point; beyond it, the model can start to overfit the training data

This repeated learning cycle is demonstrated clearly in the training timeline shown in the video.
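
The relationship between epochs and batches can be sketched as two nested loops. The helper `iterate_batches`, the ten-item dataset, and the batch size of 4 are assumptions for illustration; no actual model is trained here.

```python
import random


def iterate_batches(dataset, batch_size, rng):
    """Yield the dataset in shuffled mini-batches; each sample appears once."""
    indices = list(range(len(dataset)))
    rng.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [dataset[i] for i in indices[start:start + batch_size]]


dataset = list(range(10))  # stand-in for ten training samples
rng = random.Random(0)     # fixed seed so the shuffling is repeatable
epochs, batch_size = 3, 4

batches_per_epoch = []
samples_seen = []
for epoch in range(epochs):  # one iteration = one full pass over the data
    batches = list(iterate_batches(dataset, batch_size, rng))
    batches_per_epoch.append(len(batches))
    samples_seen.append(sorted(s for batch in batches for s in batch))
    print(f"epoch {epoch + 1}: {len(batches)} batches")
```

With 10 samples and a batch size of 4, every epoch consists of 3 batches (sizes 4, 4, and 2), and every sample is visited exactly once per epoch.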