# 1 Feedforward Neural Networks (FNNs)

1.1 Architecture: Input Layer → Hidden Layers (with activation functions such as ReLU, Sigmoid, Tanh) → Output Layer (with softmax for multi-class, sigmoid for binary, or linear activation for regression)

1.2 Description: Fully connected layers with no cycles or feedback loops, purely feedforward processing of data.

1.3 Applications: Classification, Regression, Function approximation.

1.4 Examples: Predicting house prices, Email spam detection.
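
The feedforward pass described above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the 2-3-1 layer sizes, the hand-picked weights, and the helper names `dense`, `relu`, and `forward` are assumptions made for the example. It uses ReLU in the hidden layer and a linear output, as would suit a regression task such as house-price prediction.

```python
def relu(values):
    # Element-wise ReLU: negative pre-activations become 0.
    return [max(0.0, v) for v in values]

def dense(x, weights, bias):
    # Fully connected layer: one row of weights per output unit.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def forward(x, layers):
    # Data flows strictly forward: no cycles, no feedback.
    h = x
    for weights, bias in layers[:-1]:
        h = relu(dense(h, weights, bias))
    weights, bias = layers[-1]
    return dense(h, weights, bias)  # linear output layer

# Hypothetical 2-3-1 network with hand-picked (not trained) parameters.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5]], [0.2]),
]
prediction = forward([1.0, 2.0], layers)
```

For classification, the last `dense` call would be followed by a softmax or sigmoid instead of being returned directly.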

## 1.5 Techniques

1.5.1 Weight Initialization (He, Xavier)

1.5.2 Activation Functions (ReLU, Sigmoid, Tanh)

1.5.3 Loss Functions (Cross-Entropy Loss for classification, Mean Squared Error for regression)

1.5.4 Optimization Algorithms (Stochastic Gradient Descent, Adam, RMSprop)

1.5.5 Regularization (L2 Regularization, Dropout)
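
As a rough sketch of the two initialization schemes named in 1.5.1, the snippet below draws Xavier (Glorot) uniform weights and He normal weights for a single layer. The function names and the 64-to-32 layer size are assumptions for illustration; deep learning libraries ship their own initializers, which may differ in detail.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng):
    # Xavier/Glorot: uniform in [-limit, limit], commonly paired with Tanh or Sigmoid.
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]

def he_normal(fan_in, fan_out, rng):
    # He: zero-mean Gaussian with std sqrt(2 / fan_in), commonly paired with ReLU.
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)]
            for _ in range(fan_out)]

rng = random.Random(0)              # fixed seed so the sketch is repeatable
w_tanh = xavier_uniform(64, 32, rng)
w_relu = he_normal(64, 32, rng)
```

Both schemes scale the weights by the layer width so that activations neither explode nor vanish as depth grows.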

# 2 Convolutional Neural Networks (CNNs)

2.1 Architecture: Convolutional Layers (with filters/kernels) → Pooling Layers (Max Pooling, Average Pooling) → Fully Connected Layers (Dense Layers)

2.2 Description: Uses learned convolutional filters with shared weights to detect local patterns and build up spatial hierarchies of features.

2.3 Applications: Image Classification, Object Detection, Image Segmentation.

2.4 Examples: Recognizing digits in MNIST dataset, Detecting objects in COCO dataset.

## 2.5 Layers

2.5.1 Convolutional Layer (with parameters like filter size, stride, padding)

2.5.2 Pooling Layer (Max Pooling, Average Pooling, Global Pooling)

2.5.3 Fully Connected Layer (Dense Layer)
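
To make the roles of filter size, stride, and padding concrete, here is a naive single-channel sketch of a convolutional layer followed by max pooling. It assumes square kernels and zero padding, and the names `conv_output_size`, `conv2d`, and `max_pool2d` are chosen for this example only. Like most deep learning libraries, it computes cross-correlation (the kernel is not flipped).

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    # Standard formula: floor((n + 2p - k) / s) + 1
    return (n + 2 * padding - kernel) // stride + 1

def conv2d(image, kernel, stride=1, padding=0):
    k = len(kernel)
    h, w = len(image), len(image[0])
    # Zero-pad the input on all four sides.
    padded = [[0.0] * (w + 2 * padding) for _ in range(h + 2 * padding)]
    for i in range(h):
        for j in range(w):
            padded[i + padding][j + padding] = image[i][j]
    out_h = conv_output_size(h, k, stride, padding)
    out_w = conv_output_size(w, k, stride, padding)
    # Slide the kernel over the padded input and take a weighted sum per window.
    return [[sum(padded[i * stride + a][j * stride + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(out_w)]
            for i in range(out_h)]

def max_pool2d(fmap, size=2, stride=2):
    out_h = conv_output_size(len(fmap), size, stride)
    out_w = conv_output_size(len(fmap[0]), size, stride)
    # Keep only the largest value in each window.
    return [[max(fmap[i * stride + a][j * stride + b]
                 for a in range(size) for b in range(size))
             for j in range(out_w)]
            for i in range(out_h)]

image = [[float(4 * r + c) for c in range(4)] for r in range(4)]   # 4x4 ramp
edge = [[1.0, 0.0, -1.0], [1.0, 0.0, -1.0], [1.0, 0.0, -1.0]]      # hypothetical filter
features = conv2d(image, edge, stride=1, padding=1)  # padding=1 keeps the 4x4 size
pooled = max_pool2d(features)                        # 2x2 after pooling
```

The flattened `pooled` map is what would be passed on to the fully connected layers.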

## 2.6 Techniques

2.6.1 Data Augmentation (Rotation, Scaling, Flipping, Cropping)

2.6.2 Transfer Learning (using pre-trained models like VGG, ResNet, Inception)

2.6.3 Fine-Tuning (adjusting a pre-trained model to a new task)

2.6.4 Batch Normalization

2.6.5 Dropout

2.6.6 Learning Rate Schedulers (Step Decay, Exponential Decay, Cyclical Learning Rates)
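
The three scheduler families in 2.6.6 can be written as small functions of the epoch or iteration count. This is a sketch under assumed default values (`drop`, `epochs_per_drop`, `decay_rate`, and `step_size` are illustrative, not recommended settings), and the cyclical variant shown is the simple triangular policy.

```python
import math

def step_decay(initial_lr, epoch, drop=0.5, epochs_per_drop=10):
    # Multiply the rate by `drop` once every `epochs_per_drop` epochs.
    return initial_lr * drop ** (epoch // epochs_per_drop)

def exponential_decay(initial_lr, epoch, decay_rate=0.05):
    # Smooth decay: lr = lr0 * exp(-k * epoch).
    return initial_lr * math.exp(-decay_rate * epoch)

def triangular_cyclical(base_lr, max_lr, iteration, step_size=100):
    # Rate rises linearly from base_lr to max_lr over `step_size`
    # iterations, falls back over the next `step_size`, then repeats.
    cycle = iteration // (2 * step_size)
    x = abs(iteration / step_size - 2 * cycle - 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

schedule = [step_decay(0.1, epoch) for epoch in range(30)]
```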

# 3 Recurrent Neural Networks (RNNs)

3.1 Architecture: Input → Recurrent Hidden Layers (simple units with a Tanh activation, or gated LSTM/GRU cells) → Output

3.2 Description: Processes sequential data by maintaining a hidden state that captures information from previous time steps.

3.3 Applications: Time Series Forecasting, Language Modeling, Sequence Classification.

3.4 Examples: Predicting stock prices, Text generation.
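
The hidden-state idea in 3.2 can be sketched as a simple (Elman-style) recurrence, h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h). The weight names, sizes, and values below are assumptions for illustration, and the output layer is omitted.

```python
import math

def rnn_step(x_t, h_prev, w_xh, w_hh, b_h):
    # New hidden state mixes the current input with the previous state.
    return [math.tanh(sum(w * x for w, x in zip(w_xh[i], x_t))
                      + sum(w * h for w, h in zip(w_hh[i], h_prev))
                      + b_h[i])
            for i in range(len(b_h))]

def rnn_forward(sequence, w_xh, w_hh, b_h):
    h = [0.0] * len(b_h)        # initial hidden state
    states = []
    for x_t in sequence:        # the same weights are reused at every time step
        h = rnn_step(x_t, h, w_xh, w_hh, b_h)
        states.append(h)
    return states

# Hypothetical parameters: 1 input feature, 2 hidden units.
w_xh = [[0.5], [-0.3]]
w_hh = [[0.1, 0.2], [0.0, 0.4]]
b_h = [0.0, 0.0]
states = rnn_forward([[1.0], [0.5], [-1.0]], w_xh, w_hh, b_h)
```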

## 3.5 Variants

### 3.5.1 Long Short-Term Memory (LSTM)
---------------------------------------
3.5.1.1 Description: Mitigates the vanishing gradient problem by using gating mechanisms and an additively updated cell state.

3.5.1.2 Gates: Input Gate, Forget Gate, Output Gate.

3.5.1.3 Applications: Speech Recognition, Machine Translation.

3.5.1.4 Examples: Google Translate (its 2016 neural machine translation system, GNMT, was built on LSTM layers).
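
A scalar sketch of one LSTM step shows how the three gates interact with the cell state. Everything is reduced to a single hidden unit to keep the example short; the parameter layout (a dict mapping each gate name to `(w_x, w_h, b)`) and the values are assumptions, not a library API.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b
    i = sigmoid(pre("input"))        # input gate: how much new content to write
    f = sigmoid(pre("forget"))       # forget gate: how much old cell state to keep
    o = sigmoid(pre("output"))       # output gate: how much of the cell to expose
    g = math.tanh(pre("candidate"))  # candidate content
    c = f * c_prev + i * g           # additive cell update eases gradient flow
    h = o * math.tanh(c)
    return h, c

# Hypothetical parameters; a positive forget bias favors remembering.
params = {
    "input": (0.5, 0.1, 0.0),
    "forget": (0.0, 0.0, 1.0),
    "output": (0.3, 0.2, 0.0),
    "candidate": (1.0, 0.5, 0.0),
}
h, c = 0.0, 0.0
for x in [1.0, 0.5, -1.0]:
    h, c = lstm_step(x, h, c, params)
```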

### 3.5.2 Gated Recurrent Units (GRU)
---------------------------------------
3.5.2.1 Description: Simplified variant of LSTM with two gates instead of three and no separate cell state.

3.5.2.2 Gates: Update Gate, Reset Gate.

3.5.2.3 Applications: The same sequence tasks as LSTM (e.g., speech recognition, machine translation), with fewer parameters per unit.

3.5.2.4 Examples: Time series prediction, Sentiment analysis.

## 3.6 Techniques

3.6.1 Backpropagation Through Time (BPTT)

3.6.2 Gradient Clipping (to handle exploding gradients)

3.6.3 Regularization (Dropout, L2 Regularization)

3.6.4 Sequence Padding and Masking

3.6.5 Attention Mechanisms (for focusing on specific parts of the sequence)
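
Gradient clipping (3.6.2) is commonly done by rescaling the whole gradient vector when its norm exceeds a threshold. The sketch below assumes the gradients have been flattened into one list; the function name and the `max_norm` value are illustrative.

```python
import math

def clip_by_global_norm(grads, max_norm):
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)              # small gradients pass through unchanged
    scale = max_norm / total_norm       # shrink, preserving the direction
    return [g * scale for g in grads]

clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
```

Because the direction is preserved, clipping limits the step size without changing which way the update points.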

# 4 Autoencoders

4.1 Architecture: Encoder (with convolutional or fully connected layers) → Bottleneck (Latent Space) → Decoder (mirroring the encoder structure)

4.2 Description: Learns to compress input data into a lower-dimensional representation and then reconstruct it from this representation.

4.3 Applications: Dimensionality Reduction, Anomaly Detection, Data Denoising.

4.4 Examples: Reducing dimensionality of image datasets, Detecting fraud.

## 4.5 Variants

### 4.5.1 Variational Autoencoders (VAEs)
-----------------------------------------
4.5.1.1 Description: Probabilistic approach to autoencoders in which the encoder outputs a distribution over latent variables that is regularized toward a chosen prior (typically a standard Gaussian).

4.5.1.2 Applications: Image Generation, Data Compression.

4.5.1.3 Examples: Generating new faces.

### 4.5.2 Denoising Autoencoders
---------------------------------
4.5.2.1 Description: Trained to reconstruct the clean input from a deliberately corrupted (noisy) version of it.

4.5.2.2 Applications: Image Denoising.

### 4.5.3 Sparse Autoencoders
--------------------------------------------
4.5.3.1 Description: Enforces a sparsity constraint on the hidden representation, so that only a few units are active for any given input, in order to learn useful features.

4.5.3.2 Applications: Feature Learning.

## 4.6 Techniques

4.6.1 Regularization (L1 for sparsity, L2 for weight decay)

4.6.2 Contractive Penalty (to enforce robustness to small input perturbations)

4.6.3 Reconstruction Loss (Mean Squared Error, Binary Cross-Entropy for binary data)

4.6.4 KL Divergence (for VAEs, to measure the difference between the encoder's learned latent distribution and the prior)
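
For the common case of a diagonal Gaussian encoder and a standard normal prior, the KL term in 4.6.4 has a closed form, and a VAE loss adds it to the reconstruction loss from 4.6.3. The sketch below assumes that setup; `log_var` (log variance), the `beta` weight, and the use of summed squared error are choices made for this example.

```python
import math

def gaussian_kl(mu, log_var):
    # KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions:
    # -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    # Reconstruction term (summed squared error) plus weighted KL term.
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return reconstruction + beta * gaussian_kl(mu, log_var)

loss = vae_loss([0.2, 0.8], [0.25, 0.7], mu=[0.1, -0.3], log_var=[0.0, -0.2])
```

The KL term is zero only when the encoder's output matches the prior exactly, which is what pulls the latent space toward a shape that can be sampled from.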


# Core Neural Network Concepts to Cover
## 1> Neurons

## 2> Layers

> 2.1> Input Layer

> 2.2> Hidden Layers

> 2.3> Output Layer

## 3> Weights

## 4> Bias

## 5> Activation Function

> 5.1> Sigmoid (for binary classification)

> 5.2> Softmax (for multi-class classification)

> 5.3> Tanh (Hyperbolic Tangent)

> 5.4> ReLU (Rectified Linear Unit)

> 5.5> LeakyReLU
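
The five activation functions listed above can be written in a few lines each. This is a minimal scalar sketch; the `alpha=0.01` slope for LeakyReLU is a commonly used default assumed here, and the sigmoid and softmax are written in numerically stable forms.

```python
import math

def sigmoid(z):
    # Squashes to (0, 1); branches avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def softmax(logits):
    # Turns a vector of scores into probabilities that sum to 1.
    m = max(logits)                       # subtract the max for stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def tanh(z):
    return math.tanh(z)                   # squashes to (-1, 1), zero-centered

def relu(z):
    return max(0.0, z)                    # passes positives, zeroes negatives

def leaky_relu(z, alpha=0.01):
    return z if z > 0 else alpha * z      # small slope keeps negative units trainable

probs = softmax([2.0, 1.0, 0.1])
```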

## 6> Loss Function (Cost Function)

> 6.1> Binary Crossentropy

> 6.2> Categorical Crossentropy

> 6.3> Mean Squared Error (MSE)

> 6.4> Mean Absolute Error (MAE)

> 6.5> Huber Loss

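
Each loss listed above is a short formula. The sketch below works on plain lists; the clipping constant `eps` and the Huber threshold `delta=1.0` are assumed defaults. Binary cross-entropy, MSE, MAE, and Huber are averaged over samples, while the categorical version handles a single one-hot sample.

```python
import math

def binary_crossentropy(y_true, p_pred, eps=1e-12):
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)   # keep log() away from 0
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

def categorical_crossentropy(y_onehot, probs, eps=1e-12):
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_onehot, probs))

def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def huber(y_true, y_pred, delta=1.0):
    # Quadratic for small errors, linear for large ones (less outlier-sensitive than MSE).
    total = 0.0
    for a, b in zip(y_true, y_pred):
        err = abs(a - b)
        total += 0.5 * err ** 2 if err <= delta else delta * (err - 0.5 * delta)
    return total / len(y_true)

example = huber([0.0, 0.0], [0.5, 3.0])
```
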
## 7> Optimizer

> 7.1> Gradient Descent

> 7.2> Stochastic Gradient Descent (SGD)

> 7.3> Adam (Adaptive Moment Estimation)

> 7.4> RMSprop (Root Mean Square Propagation)

> 7.5> Adagrad (Adaptive Gradient Algorithm)

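
To show how an adaptive optimizer differs from plain gradient descent, the sketch below applies both a vanilla gradient step and an Adam step to a one-parameter toy problem, minimizing f(w) = (w - 3)^2. The toy objective, the step counts, and the learning rates are assumptions for illustration; the Adam hyperparameters `beta1=0.9`, `beta2=0.999`, and `eps=1e-8` are the commonly cited defaults.

```python
import math

def sgd_step(w, grad, lr):
    return w - lr * grad

def adam_step(w, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    state["t"] += 1
    # Exponential moving averages of the gradient and its square.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    # Bias correction compensates for the zero initialization.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)

def grad_f(w):
    return 2.0 * (w - 3.0)      # derivative of (w - 3)^2

w_sgd = 0.0
for _ in range(100):
    w_sgd = sgd_step(w_sgd, grad_f(w_sgd), lr=0.1)

w_adam, state = 0.0, {"t": 0, "m": 0.0, "v": 0.0}
for _ in range(500):
    w_adam = adam_step(w_adam, grad_f(w_adam), state, lr=0.1)
```
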
## 8> Learning Rate

## 9> Epoch: One complete pass through the entire training dataset.

## 10> Batch Size

## 11> Training Set

## 12> Validation Set

## 13> Test Set

## 14> Forward Propagation

## 15> Backward Propagation (Backpropagation)

## 16> Gradient

## 17> Dropout

## 18> Regularization

> 18.1> L1 Regularization (Lasso)

> 18.2> L2 Regularization (Ridge)
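
The two penalties differ in how they pull weights toward zero, which a short sketch makes visible. The strength `lam` and the example weights are assumptions; in practice the penalty is added to the data loss before taking gradients.

```python
def l1_penalty(weights, lam):
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    return lam * sum(w * w for w in weights)

def l1_grad(w, lam):
    # Subgradient: constant-size pull toward zero, which tends to yield exact zeros.
    return lam * ((w > 0) - (w < 0))

def l2_grad(w, lam):
    # Pull proportional to the weight: large weights shrink most, few reach zero.
    return 2.0 * lam * w

weights = [0.5, -2.0, 0.0]
penalties = (l1_penalty(weights, 0.1), l2_penalty(weights, 0.1))
```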

## 19> Overfitting

## 20> Underfitting

## 21> Early Stopping
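
Early stopping is usually implemented with a patience counter on the validation loss. The sketch below replays a list of made-up validation losses instead of training a model; the function name, `patience=3`, and `min_delta` are assumptions for the example.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-based."""
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, waited = loss, epoch, 0   # improvement: reset
        else:
            waited += 1
            if waited >= patience:                           # no improvement for too long
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Hypothetical validation curve that bottoms out and then drifts upward.
stop_epoch, best_epoch = early_stopping_epoch([1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.6])
```

In a real training loop the weights from `best_epoch` would be saved and restored, since the final epochs are the ones that began to overfit.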

## 22> Normalization/Standardization
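
A minimal sketch of both rescaling schemes for a single feature follows. The function names and the sample values are assumptions; the point it illustrates is that the statistics are computed on the training set and then reused for validation and test data.

```python
import math

def fit_standardizer(train_values):
    mean = sum(train_values) / len(train_values)
    var = sum((v - mean) ** 2 for v in train_values) / len(train_values)
    std = math.sqrt(var) or 1.0     # fall back to 1.0 for constant features
    return mean, std

def standardize(values, mean, std):
    # Standardization: zero mean, unit variance (on the data used for fitting).
    return [(v - mean) / std for v in values]

def min_max_scale(values, lo, hi):
    # Normalization: maps [lo, hi] onto [0, 1].
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in values]

train, test = [10.0, 20.0, 30.0], [25.0]
mean, std = fit_standardizer(train)
train_z = standardize(train, mean, std)
test_z = standardize(test, mean, std)   # reuse the training statistics
```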

## 23> Number of Epochs: The number of complete passes through the training dataset during training (a hyperparameter; see 9> Epoch).

## 24> Mini-batch Gradient Descent
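
Mini-batch gradient descent ties together several concepts from this list: epochs, batch size, learning rate, and gradients. The sketch below fits a one-variable linear model to noise-free synthetic data generated from y = 2x + 1; the data, the hyperparameters, and the function names are all assumptions chosen so the example converges quickly.

```python
import random

def minibatches(data, batch_size, rng):
    shuffled = data[:]
    rng.shuffle(shuffled)                   # new sample order every epoch
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]

def train_linear(data, epochs=200, batch_size=4, lr=0.05, seed=0):
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):                 # one epoch = one full pass over the data
        for batch in minibatches(data, batch_size, rng):
            # Gradient of the mean squared error, averaged over the batch only.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

data = [(k / 10, 2 * (k / 10) + 1) for k in range(-10, 11)]   # 21 points in [-1, 1]
w, b = train_linear(data)
```

Setting `batch_size=1` gives stochastic gradient descent, and `batch_size=len(data)` gives full-batch gradient descent.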