# Layer Types Reference


<small>🟢 Fundamental 🟡 Common 🟠 Specialized 🔴 Advanced</small>

**Linear Layers**

| Layer  | PyTorch API | Use case | Found in |
|---|---|---|---|
| 🟢 Linear (Dense) | [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html) | Dimensionality increase / decrease (e.g., 256 &rarr; 10 for classification) | Any neural network (MLPs, final layers) |
| 🟢 Convolutional | [nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html) | Extract patterns from images (e.g., edges) using kernels | Any CNN (ResNet, VGG, EfficientNet) |
| 🟡 TransposedConv2d | [nn.ConvTranspose2d](https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose2d.html) | Learnable upsampling for image generation | GANs, U-Net decoders, FSRCNN |

<details>
<summary><small>More Linear</small></summary>

| Layer  | PyTorch API | Use case | Found in |
|---|---|---|---|
| 🟠 Conv1d | [nn.Conv1d](https://pytorch.org/docs/stable/generated/torch.nn.Conv1d.html) | Extract patterns from sequential data | Audio processing, signal analysis |
| 🟠 Conv3d | [nn.Conv3d](https://pytorch.org/docs/stable/generated/torch.nn.Conv3d.html) | Extract patterns from 3D data | Video processing, 3D object detection |
| 🟠 GroupedConv2d | [nn.Conv2d(groups=n)](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html) | Process channels in groups for efficiency | ResNeXt, MobileNet, EfficientNet |
| 🟠 DilatedConv2d | [nn.Conv2d(dilation=n)](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html) | Sample pixels with gaps to see larger area without more parameters | DeepLab, semantic segmentation |
| 🟠 TransposedConv1d | [nn.ConvTranspose1d](https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose1d.html) | Learnable upsampling for 1D data | Audio generation, speech synthesis |

</details>
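
The shape behaviour of the layers above is easiest to see by running them. The following is a minimal sketch, not a recommended architecture: the batch size, channel counts, and kernel settings are arbitrary values chosen for illustration.

```python
import torch
from torch import nn

# Linear: map 256 features to 10 class scores (the table's example).
x_vec = torch.randn(8, 256)                 # batch of 8 feature vectors
linear = nn.Linear(256, 10)
logits = linear(x_vec)                      # shape (8, 10)

# Conv2d: 3x3 kernels with padding=1 keep the spatial size.
x_img = torch.randn(8, 3, 32, 32)           # batch of 8 RGB images, 32x32
conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
features = conv(x_img)                      # shape (8, 16, 32, 32)

# ConvTranspose2d: kernel_size=2, stride=2 doubles height and width.
up = nn.ConvTranspose2d(16, 8, kernel_size=2, stride=2)
upsampled = up(features)                    # shape (8, 8, 64, 64)

# Grouped conv: each of the 4 groups sees only 16 / 4 = 4 input channels,
# so the weight tensor is 4x smaller than for a dense convolution.
grouped = nn.Conv2d(16, 32, kernel_size=3, padding=1, groups=4)
grouped_out = grouped(features)             # shape (8, 32, 32, 32)

# Dilated conv: same 3x3 parameter count, but taps are spaced 2 pixels apart.
dilated = nn.Conv2d(16, 32, kernel_size=3, padding=2, dilation=2)
dilated_out = dilated(features)             # shape (8, 32, 32, 32)
```

Grouped and dilated convolutions are ordinary `nn.Conv2d` modules with the `groups` or `dilation` argument set, which is why the table links both to the same API page.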


**Recurrent and Attention Layers**

| Layer  | PyTorch API | Use case | Found in |
|---|---|---|---|
| 🟢 RNN | [nn.RNN](https://pytorch.org/docs/stable/generated/torch.nn.RNN.html) | Simple sequence processing | Teaching, very simple tasks |
| 🟢 LSTM | [nn.LSTM](https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html) | Process sequences with long-term dependencies | Traditional NLP, time series |
| 🟠 Multihead Attention | [nn.MultiheadAttention](https://pytorch.org/docs/stable/generated/torch.nn.MultiheadAttention.html) | Modern sequence processing with global context | Transformers (BERT, GPT) |

<details>
<summary><small>More Recurrent and Attention</small></summary>

| Layer  | PyTorch API | Use case | Found in |
|---|---|---|---|
| 🟡 GRU | [nn.GRU](https://pytorch.org/docs/stable/generated/torch.nn.GRU.html) | Efficient sequence processing | Lighter NLP models, small datasets |
| 🟠 Transformer Encoder | [nn.TransformerEncoder](https://pytorch.org/docs/stable/generated/torch.nn.TransformerEncoder.html) | Bidirectional context processing | BERT, ViT, modern NLP |
| 🟠 Transformer Decoder | [nn.TransformerDecoder](https://pytorch.org/docs/stable/generated/torch.nn.TransformerDecoder.html) | Autoregressive sequence generation | GPT, translation models |

</details>
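
As a rough illustration of how these sequence layers are called, the sketch below feeds the same batch through an LSTM, a GRU, self-attention, and a small Transformer encoder. All sizes are arbitrary, and `batch_first=True` is an assumption made here so that every input is laid out as `(batch, seq_len, features)`.

```python
import torch
from torch import nn

x = torch.randn(4, 12, 32)                  # (batch, seq_len, features)

# LSTM returns per-step outputs plus the final hidden and cell states.
lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
lstm_out, (h_n, c_n) = lstm(x)              # lstm_out: (4, 12, 64); h_n, c_n: (1, 4, 64)

# GRU has the same interface but a single hidden state.
gru = nn.GRU(input_size=32, hidden_size=64, batch_first=True)
gru_out, gru_h = gru(x)                     # gru_out: (4, 12, 64); gru_h: (1, 4, 64)

# Self-attention: query, key, and value are all the same tensor.
attn = nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
attn_out, attn_weights = attn(x, x, x)      # attn_out: (4, 12, 32)
                                            # attn_weights: (4, 12, 12), averaged over heads

# TransformerEncoder stacks attention + feed-forward blocks.
enc_layer = nn.TransformerEncoderLayer(
    d_model=32, nhead=4, dim_feedforward=64, batch_first=True
)
encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
encoded = encoder(x)                        # shape (4, 12, 32)
```

The recurrent layers process the sequence step by step, whereas attention lets every position look at every other position in a single call, which is what the `(12, 12)` weight matrix per sample reflects.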

**Normalization Layers**

| Layer  | PyTorch API | Use case | Found in |
|---|---|---|---|
| 🟢 LayerNorm | [nn.LayerNorm](https://pytorch.org/docs/stable/generated/torch.nn.LayerNorm.html) | Stabilize transformer/MLP training | Transformers, modern architectures |
| 🟡 BatchNorm2d | [nn.BatchNorm2d](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html) | Stabilize CNN training (also available for 1d and 3d) | Most CNNs (ResNet, DenseNet) |

<details>
<summary><small>More Normalization</small></summary>

| Layer  | PyTorch API | Use case | Found in |
|---|---|---|---|
| 🟡 GroupNorm | [nn.GroupNorm](https://pytorch.org/docs/stable/generated/torch.nn.GroupNorm.html) | Alternative to BatchNorm for small batches | When batch size is limited (detection, segmentation), diffusion U-Nets |
| 🟠 InstanceNorm2d | [nn.InstanceNorm2d](https://pytorch.org/docs/stable/generated/torch.nn.InstanceNorm2d.html) | Normalize per instance | Style transfer, GANs, image gen |

</details>
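
The normalization layers differ mainly in which axes they compute statistics over. The sketch below checks this on random data; the tensor sizes and the choice of 4 groups are illustrative assumptions, and the layers are left in their default training mode so that BatchNorm uses batch statistics.

```python
import torch
from torch import nn

x = torch.randn(8, 16, 10, 10) * 3 + 5      # (batch, channels, H, W), shifted and scaled

# BatchNorm2d: one mean/variance per channel, computed over (batch, H, W).
bn_out = nn.BatchNorm2d(16)(x)
bn_channel_mean = bn_out.mean(dim=(0, 2, 3))        # close to 0 for each of 16 channels

# InstanceNorm2d: one mean/variance per sample and per channel, over (H, W).
in_out = nn.InstanceNorm2d(16)(x)
in_plane_mean = in_out.mean(dim=(2, 3))             # close to 0, shape (8, 16)

# GroupNorm: per sample, over groups of channels (here 4 groups of 4 channels).
gn_out = nn.GroupNorm(num_groups=4, num_channels=16)(x)
gn_group_mean = gn_out.reshape(8, 4, -1).mean(dim=-1)   # close to 0, shape (8, 4)

# LayerNorm: per token, over the last (feature) dimension.
tokens = torch.randn(8, 20, 16) * 3 + 5     # (batch, seq_len, features)
ln_out = nn.LayerNorm(16)(tokens)
ln_token_mean = ln_out.mean(dim=-1)                 # close to 0, shape (8, 20)
```

Only BatchNorm mixes information across samples in the batch, which is why the other three are preferred when batches are small or when behaviour must be identical at training and inference time.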



**Non-Linear Activation**

| Layer  | PyTorch API | Use case | Found in |
|---|---|---|---|
| 🟢 ReLU | [nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html) | Basic non-linearity | Most neural networks |
| 🟢 Softmax | [nn.Softmax](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html) | Turn scores into probabilities (each value lies between 0 and 1, and they sum to 1) | Classification outputs |
| 🟡 Sigmoid | [nn.Sigmoid](https://pytorch.org/docs/stable/generated/torch.nn.Sigmoid.html) | Squash each value into the 0 to 1 range | Binary tasks, gates |
| 🟡 Tanh | [nn.Tanh](https://pytorch.org/docs/stable/generated/torch.nn.Tanh.html) | Squash each value into the -1 to 1 range | GAN generators, RNNs |

<details>
<summary><small>More Activation</small></summary>

| Layer  | PyTorch API | Use case | Found in |
|---|---|---|---|
| 🟡 LeakyReLU | [nn.LeakyReLU](https://pytorch.org/docs/stable/generated/torch.nn.LeakyReLU.html) | Prevent dying ReLU | GANs, deep networks |
| 🟠 GELU | [nn.GELU](https://pytorch.org/docs/stable/generated/torch.nn.GELU.html) | Smooth non-linearity | Transformers (BERT, GPT) |
| 🟠 SiLU/Swish | [nn.SiLU](https://pytorch.org/docs/stable/generated/torch.nn.SiLU.html) | Modern ReLU alternative (also see SwiGLU) | EfficientNet, mobile networks |
| 🟠 LogSoftmax | [nn.LogSoftmax](https://pytorch.org/docs/stable/generated/torch.nn.LogSoftmax.html) | Log-probabilities computed in a numerically stable way | With NLLLoss, classification |

</details>
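
A few of the claims in the tables above can be checked directly. This is a small sketch with made-up logits and targets: it shows that Softmax rows sum to 1, that LogSoftmax followed by NLLLoss gives the same loss as `nn.CrossEntropyLoss` on raw logits, and how ReLU and LeakyReLU treat negative inputs.

```python
import torch
from torch import nn

logits = torch.tensor([[2.0, 1.0, 0.1],
                       [0.5, 0.5, 3.0]])    # 2 samples, 3 classes (made-up values)
targets = torch.tensor([0, 2])              # correct class index per sample

# Softmax over the class dimension: every row becomes a probability distribution.
probs = nn.Softmax(dim=1)(logits)
row_sums = probs.sum(dim=1)                 # each entry is 1 (up to float error)

# LogSoftmax + NLLLoss is equivalent to CrossEntropyLoss on the raw logits.
log_probs = nn.LogSoftmax(dim=1)(logits)
nll = nn.NLLLoss()(log_probs, targets)
ce = nn.CrossEntropyLoss()(logits, targets)

# ReLU zeroes negatives; LeakyReLU keeps a small slope (0.1 here) instead.
x = torch.tensor([-2.0, 0.0, 2.0])
relu_out = nn.ReLU()(x)                     # tensor([0., 0., 2.])
leaky_out = nn.LeakyReLU(0.1)(x)            # tensor([-0.2, 0., 2.])
```

Because `nn.CrossEntropyLoss` applies the log-softmax internally, a classifier trained with it normally outputs raw logits and applies Softmax only when probabilities are needed.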



**Regularization**

| Layer  | PyTorch API | Use case | Found in |
|---|---|---|---|
| 🟢 Dropout | [nn.Dropout](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html) | Prevent overfitting by randomly zeroing activations during training | Most networks during training |
| 🟡 Dropout2d | [nn.Dropout2d](https://pytorch.org/docs/stable/generated/torch.nn.Dropout2d.html) | Randomly zero entire feature-map channels | CNNs, spatial tasks |
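
Dropout behaves differently in training and evaluation mode, which is a common source of confusion. The sketch below uses an all-ones input and `p=0.5` (both arbitrary choices for illustration) so that the effect is easy to read off.

```python
import torch
from torch import nn

torch.manual_seed(0)                        # only to make the run repeatable

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

# Training mode: each element is zeroed with probability p,
# and survivors are scaled by 1 / (1 - p) = 2 to keep the expected value.
drop.train()
train_out = drop(x)                         # contains only 0.0 and 2.0

# Evaluation mode: dropout is a no-op.
drop.eval()
eval_out = drop(x)                          # identical to x

# Dropout2d drops whole channels, so each (H, W) plane is all 0 or all 2.
drop2d = nn.Dropout2d(p=0.5)
drop2d.train()
fmap = torch.ones(4, 8, 5, 5)               # (batch, channels, H, W)
fmap_out = drop2d(fmap)
```

Calling `model.eval()` on a full model switches every dropout layer it contains to the no-op behaviour shown for `eval_out`.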



**Other / Special**

| Layer  | PyTorch API | Use case | Found in |
|---|---|---|---|
| 🟢 MaxPool2d | [nn.MaxPool2d](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html) | Downsample by keeping the maximum of each window | Traditional CNNs (VGG) |
| 🟢 AvgPool2d | [nn.AvgPool2d](https://pytorch.org/docs/stable/generated/torch.nn.AvgPool2d.html) | Downsample by averaging each window | Modern CNNs (ResNet) |

<details>
<summary><small>More Special</small></summary>

| Layer  | PyTorch API | Use case | Found in |
|---|---|---|---|
| 🟢 Flatten | [nn.Flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html) | Flatten each sample to a 1D vector (the batch dimension is kept by default) | Between Conv and Linear |
| 🟢 Embedding | [nn.Embedding](https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html) | Convert categorical data to dense vectors by learning a lookup table | NLP, recommendation systems |
| 🟡 AdaptiveAvgPool2d | [nn.AdaptiveAvgPool2d](https://pytorch.org/docs/stable/generated/torch.nn.AdaptiveAvgPool2d.html) | Average-pool to a fixed output size regardless of input size | Modern CNNs, ResNet |
| 🟡 Upsample | [nn.Upsample](https://pytorch.org/docs/stable/generated/torch.nn.Upsample.html) | Simple resize/upscale | GANs, image generation |
| 🟠 PixelShuffle | [nn.PixelShuffle](https://pytorch.org/docs/stable/generated/torch.nn.PixelShuffle.html) | Efficient upsampling | Super-resolution, ESRGAN |
| 🔴 Unfold | [nn.Unfold](https://pytorch.org/docs/stable/generated/torch.nn.Unfold.html) | Extract sliding image patches (rectangular areas in a 2D grid) as columns | Custom convolution ops |
| 🔴 Fold | [nn.Fold](https://pytorch.org/docs/stable/generated/torch.nn.Fold.html) | Combine patches back into an image (overlapping values are summed) | Custom convolution ops |

</details>
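
To make the shape effects of these utility layers concrete, here is a minimal sketch. The tensor sizes, the vocabulary size of 100, and the non-overlapping 4x4 patches are assumptions chosen so that each result is easy to verify by hand.

```python
import torch
from torch import nn

x = torch.randn(2, 8, 16, 16)               # (batch, channels, H, W)

pooled = nn.MaxPool2d(kernel_size=2)(x)     # (2, 8, 8, 8): halves H and W
gap = nn.AdaptiveAvgPool2d(1)(x)            # (2, 8, 1, 1): global average pooling
flat = nn.Flatten()(gap)                    # (2, 8): batch dimension is kept

resized = nn.Upsample(scale_factor=2, mode="nearest")(x)   # (2, 8, 32, 32)

# PixelShuffle(r) trades channels for resolution: C / r^2 channels, H*r x W*r pixels.
shuffled = nn.PixelShuffle(2)(x)            # (2, 2, 32, 32)

# Embedding: a learnable lookup table from integer ids to dense vectors.
emb = nn.Embedding(num_embeddings=100, embedding_dim=16)
ids = torch.tensor([[1, 5, 7], [2, 0, 99]])
vectors = emb(ids)                          # (2, 3, 16)

# Unfold: extract 4x4 patches with stride 4 (no overlap), one column per patch.
unfold = nn.Unfold(kernel_size=4, stride=4)
patches = unfold(x)                         # (2, 8 * 4 * 4, 16) = (2, 128, 16)

# Fold: put the patches back; without overlap this reconstructs the input.
fold = nn.Fold(output_size=(16, 16), kernel_size=4, stride=4)
restored = fold(patches)                    # (2, 8, 16, 16)
```

With overlapping patches (stride smaller than the kernel size), `nn.Fold` sums the overlapping values, so `restored` would no longer equal `x` without dividing by the overlap count.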