# Neural Networks – Fundamental Concepts and Linear Models

This notebook introduces the fundamental concepts of neural networks, combining theory, mathematical formulation, and short Python (NumPy) examples. It is intended for coursework and self-study.

## Biological Neuron

A biological neuron is the basic unit of the nervous system. It is responsible for receiving, processing, and transmitting information.

A biological neuron consists of:
- **Dendrites** – receive signals from other neurons
- **Cell body (Soma)** – processes and integrates signals
- **Axon** – transmits signals to other neurons

Information is transmitted as electrical impulses and chemical signals. Learning occurs by strengthening or weakening synaptic connections.

## Artificial Neuron

An artificial neuron is a **mathematical model** inspired by the biological neuron.

Its parts correspond to those of the biological neuron as follows:
- Inputs → signals received at the dendrites
- Weights → strength of the synaptic connections
- Output → processed signal sent along the axon

The artificial neuron performs a weighted sum of inputs followed by an activation function.

## Structure of an Artificial Neuron

An artificial neuron computes its output in two main steps:

1. **Linear combination**
   z = w₁x₁ + w₂x₂ + ... + wₙxₙ + b

2. **Activation function**
   a = f(z)

Where:
- x → input vector
- w → weight vector
- b → bias
- f( ) → activation function

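Because the weights and the bias are adjustable parameters, the neuron can learn from data: training changes them so that the output moves closer to the desired target.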

In [None]:
import numpy as np

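# Simple artificial neuron: linear combination only (no activation applied yet)
x = np.array([1.0, 2.0, 3.0])    # inputs
w = np.array([0.2, 0.5, -0.3])   # weights, one per input
b = 0.1                          # bias

z = np.dot(w, x) + b   # z = w·x + b
z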
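
The cell above stops at the linear combination. As a minimal sketch of the full two-step computation, the example below adds the activation step. The sigmoid is chosen here only for illustration, and the helper name `neuron` is hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b, activation=sigmoid):
    """Hypothetical helper: linear combination followed by an activation."""
    z = np.dot(w, x) + b    # step 1: linear combination
    return activation(z)    # step 2: activation function

x = np.array([1.0, 2.0, 3.0])
w = np.array([0.2, 0.5, -0.3])
b = 0.1

a = neuron(x, w, b)   # sigmoid of roughly 0.4
print(a)
```

Passing the activation as an argument keeps the linear step unchanged while allowing other functions to be swapped in.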

## Role of Weights and Bias

### Weights
- Control the **importance** of each input
- Learned during training
- Larger magnitude → stronger influence; the sign determines whether the input raises or lowers the output

### Bias
- Allows the neuron to **shift the activation** independently of the inputs
- Helps the model fit data better
- Without a bias, z = 0 whenever all inputs are 0, so the decision boundary is forced to pass through the origin

Together, weights and bias define the decision boundary.

In [None]:
# Effect of different weights
x = np.array([2.0, 3.0])
w1 = np.array([0.1, 0.1])
w2 = np.array([0.9, 0.9])
b = 0.5

np.dot(w1, x) + b, np.dot(w2, x) + b
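
The bias can be illustrated in the same way. The sketch below keeps the input and weights fixed and varies only the bias, assuming a step activation (1 if z ≥ 0, otherwise 0). The specific values are made up for illustration.

```python
import numpy as np

def step(z):
    """Step activation: 1 where z >= 0, otherwise 0."""
    return np.where(z >= 0, 1, 0)

x = np.array([2.0, 3.0])
w = np.array([0.5, -0.5])            # w·x = 1.0 - 1.5 = -0.5

biases = np.array([-1.0, 0.0, 1.0])  # three candidate bias values
outputs = step(np.dot(w, x) + biases)
print(outputs)                       # only the largest bias pushes z above 0
```

With identical inputs and weights, changing the bias alone moves the neuron from "off" to "on", which is what shifting the decision boundary means in practice.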

## Activation Functions and Non-Linearity

Activation functions introduce **non-linearity** into neural networks.

Why non-linearity is important:
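- Many real-world relationships between inputs and outputs are non-linear
- Linear models can only represent straight-line (hyperplane) relationships, so they cannot capture such patterns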

Common activation functions:
- Step function
- Sigmoid
- Tanh
- ReLU

Without activation functions, a deep network collapses to a single linear model, because a composition of linear transformations is itself a linear transformation.
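
This collapse can be checked numerically. The sketch below stacks two linear layers with no activation in between and compares them with one equivalent layer. The layer sizes and random values are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two stacked linear layers with no activation in between
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)

two_layers = W2 @ (W1 @ x + b1) + b2

# A single layer with combined parameters gives the same result:
# W2(W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
W_eq = W2 @ W1
b_eq = W2 @ b1 + b2
one_layer = W_eq @ x + b_eq

collapsed = np.allclose(two_layers, one_layer)
print(collapsed)
```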

In [None]:
# ReLU activation example
z = np.array([-3, -1, 0, 2, 4])
relu = np.maximum(0, z)
relu
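
The other activation functions listed above can be written just as compactly. This is a minimal sketch: the step threshold at z ≥ 0 is an assumed convention, and `np.tanh` is NumPy's built-in hyperbolic tangent.

```python
import numpy as np

def step(z):
    """Step function: 1 where z >= 0, otherwise 0."""
    return np.where(z >= 0, 1.0, 0.0)

def sigmoid(z):
    """Sigmoid: maps real numbers into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Tanh: maps real numbers into (-1, 1), centred on 0."""
    return np.tanh(z)

z = np.array([-3.0, -1.0, 0.0, 2.0, 4.0])
step_out = step(z)
sigmoid_out = sigmoid(z)
tanh_out = tanh(z)
print(step_out, sigmoid_out, tanh_out, sep="\n")
```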

## Single Neuron Model (Perceptron)

The perceptron is the simplest neural network model used for binary classification.

Decision rule:
- Output = 1 if (w·x + b) ≥ 0
- Output = 0 otherwise

Limitations:
- Works only for linearly separable data
- Cannot solve the XOR problem, because XOR is not linearly separable
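
The decision rule and its limitation can be sketched as follows. The AND weights are hand-picked rather than learned, and the coarse grid search over weights is only an illustration of the XOR limitation, not a proof.

```python
import itertools
import numpy as np

def perceptron(X, w, b):
    """Decision rule: 1 if w·x + b >= 0, otherwise 0 (applied to each row of X)."""
    return (X @ w + b >= 0).astype(int)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_and = np.array([0, 0, 0, 1])
y_xor = np.array([0, 1, 1, 0])

# Hand-picked (not learned) parameters that implement logical AND
and_pred = perceptron(X, np.array([1.0, 1.0]), -1.5)

# Try every (w1, w2, b) on a coarse grid: none reproduces XOR
grid = np.linspace(-2.0, 2.0, 9)
xor_solved = any(
    np.array_equal(perceptron(X, np.array([w1, w2]), b), y_xor)
    for w1, w2, b in itertools.product(grid, repeat=3)
)

print(and_pred, xor_solved)
```

AND is linearly separable, so a single line splits its two classes. No line separates the XOR classes, so no choice of weights and bias can succeed.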

## Linear Transformation in Neural Networks (Wx + b)

Linear transformation is the core computation in neural networks.

Matrix form for a single input vector x:
z = Wx + b

Each row of W holds the weights of one neuron, so one matrix product computes every neuron in the layer at once. Such matrix operations map well onto GPU hardware.

In [None]:
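# Matrix-based linear transformation for a batch of inputs.
# Each row of X is one sample and each column of W holds one neuron's weights,
# so the batch form is XW + b (the row-vector counterpart of z = Wx + b).
X = np.array([[1, 2], [3, 4]])            # 2 samples, 2 features
W = np.array([[0.2, 0.4], [0.6, 0.8]])    # 2 inputs -> 2 neurons
b = np.array([0.1, 0.2])                  # one bias per neuron
np.dot(X, W) + b                          # b is broadcast across the rows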

## From Single Neuron to Fully Connected (Linear) Layer

A fully connected layer consists of multiple neurons.

Characteristics:
- Each neuron connects to all inputs
- Each neuron has its own weights and bias
- Enables learning of multiple features

Fully connected layers are widely used in multilayer perceptrons.
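
To make the link between a single neuron and a layer concrete, the sketch below computes a layer's pre-activations twice: once neuron by neuron, and once as a single matrix product. The function names and the numbers are illustrative assumptions.

```python
import numpy as np

def dense_per_neuron(x, W, b):
    """Compute each neuron separately: one dot product per row of W."""
    return np.array([np.dot(W[j], x) + b[j] for j in range(W.shape[0])])

def dense_matrix(x, W, b):
    """Compute all neurons at once as z = Wx + b."""
    return W @ x + b

# 3 neurons, each with its own 2 weights (one row per neuron) and its own bias
W = np.array([[0.2, 0.6],
              [0.4, 0.8],
              [-0.5, 0.1]])
b = np.array([0.1, 0.2, 0.0])
x = np.array([1.0, 2.0])

z_loop = dense_per_neuron(x, W, b)
z_mat = dense_matrix(x, W, b)
print(z_loop, z_mat)
```

Both versions give the same values. The matrix form simply replaces the Python loop with one vectorised operation.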

## Layers in a Neural Network

### Input Layer
- Receives raw input features
- No trainable parameters

### Hidden Layer
- Performs feature extraction
- Applies linear and non-linear transformations

### Output Layer
- Produces final prediction
- Its size and activation depend on the task (classification or regression)
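
As a rough illustration of where the trainable parameters sit, the sketch below counts them for a fully connected network. The function name `count_parameters` and the layer sizes are hypothetical. The input layer contributes nothing by itself, because parameters belong to the connections into each later layer.

```python
def count_parameters(layer_sizes):
    """Count trainable parameters of a fully connected network.

    layer_sizes lists the width of every layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out   # weights plus one bias per neuron
    return total

# 4 input features, 8 hidden neurons, 3 outputs
n_params = count_parameters([4, 8, 3])
print(n_params)   # (4*8 + 8) + (8*3 + 3) = 67
```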

## Forward Pass: Flow of Information Through the Network

The forward pass describes how data flows through the neural network.

Steps involved:
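1. Input data enters the network at the input layer
2. Each layer applies its linear transformation (Wx + b)
3. The layer's activation function is applied to the result, which becomes the input to the next layer
4. The final layer's output is the network's prediction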

The forward pass is used during both training and inference.
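
The steps above can be sketched for a small network with one hidden layer. The weights are arbitrary values chosen so the arithmetic is easy to follow, and the `(W, b, activation)` layer representation is an assumption made for this example.

```python
import numpy as np

def relu(z):
    """ReLU activation: element-wise max(0, z)."""
    return np.maximum(0, z)

def identity(z):
    """No-op activation, used here for a regression-style output."""
    return z

def forward(x, layers):
    """Propagate x through a list of (W, b, activation) tuples."""
    a = x
    for W, b, activation in layers:
        z = W @ a + b        # linear transformation
        a = activation(z)    # activation function
    return a

layers = [
    (np.array([[1.0, -1.0], [0.5, 0.5]]), np.array([0.0, -1.0]), relu),  # hidden layer
    (np.array([[2.0, 1.0]]), np.array([0.5]), identity),                 # output layer
]

x = np.array([3.0, 1.0])
y = forward(x, layers)
print(y)   # hidden activations are [2, 1], so the output is 2*2 + 1*1 + 0.5 = 5.5
```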