46 changes: 46 additions & 0 deletions article-neural-networks/README.md
@@ -0,0 +1,46 @@
# ANN, CNN, RNN, and Transformers — A Simple Guide

This is a short article I wrote to explain, in plain language, how deep learning models evolved over time.

The core idea: every new model was created because the previous one had a real problem. Once you see this chain of problems and fixes, picking the right model for your task becomes much easier.

## What's inside

The article walks through the main neural network families step by step:

- **Activation functions** → why they matter, and why ReLU and GELU became the go-to choices
- **CNNs** → how they made image learning possible
- **RNNs** → how they added memory for sequences (and why they forget too fast)
- **LSTM and GRU** → how gates fixed the memory problem
- **Transformers** → why "attention" changed the whole field
- **BERT** → a quick look at how big pretrained models work today
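
As a small taste of the activation-function topic, here is a minimal sketch in plain Python (not taken from the article itself) that compares ReLU with the exact, erf-based form of GELU. The function names `relu` and `gelu` are chosen here for illustration only.

```python
import math


def relu(x: float) -> float:
    # ReLU passes positive inputs through unchanged and zeroes out the rest.
    return max(0.0, x)


def gelu(x: float) -> float:
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  gelu={gelu(x):.4f}")
```

ReLU cuts off hard at zero, while GELU bends smoothly and lets small negative values through slightly.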

You also get:

- 2 comparison tables (activations + architectures) so you can pick a model fast
- 3 small code examples (PyTorch + HuggingFace) showing how each idea looks in real code
- 7 images to help visualize the concepts

## Who is this for

- Engineers who want to understand *why* each model exists, not just memorize names
- Students who are just starting with deep learning
- Anyone who wants a clean, simple overview before diving into papers or courses

You don't need a math background. The article uses simple words, short sentences, and real examples.


## Main takeaway

Deep learning didn't appear all at once. Each model fixed a real weakness of the one before it:

- ANNs were too heavy for images → CNNs
- CNNs ignored order → RNNs
- RNNs forgot too fast → LSTM and GRU
- LSTMs were too slow because they process a sequence one step at a time → Transformers

If you keep this chain in mind, it becomes much easier to see which model fits your task, and why.
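
To make the first link in the chain concrete, here is a rough sketch in plain Python that counts the weights a fully connected layer needs on an image versus a small convolution. The image size (224×224 RGB), the 1000 hidden units, and the 64 filters of size 3×3 are assumptions picked for illustration, not numbers from the article.

```python
def dense_params(in_features: int, out_features: int) -> int:
    # One weight per (input, output) pair, plus one bias per output.
    return in_features * out_features + out_features


def conv2d_params(in_channels: int, out_channels: int, kernel_size: int) -> int:
    # Each filter has kernel_size x kernel_size weights per input channel,
    # plus one bias per filter. The count does not depend on image size.
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


if __name__ == "__main__":
    pixels = 224 * 224 * 3  # flattened RGB image (assumed size)
    print("dense layer :", dense_params(pixels, 1000))
    print("conv layer  :", conv2d_params(3, 64, 3))
```

The dense layer needs roughly 150 million parameters for a single layer, while the convolution needs fewer than two thousand because it reuses the same small filters across the whole image.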

You can also read my Medium article on this topic here:

[Open Medium Article](https://medium.com/p/6ad62a95f98f?postPublishedType=initial)
460 changes: 460 additions & 0 deletions article-neural-networks/article.md

Large diffs are not rendered by default.

Binary file added article-neural-networks/images/CNN.png
Binary file added article-neural-networks/images/LSTM_GRU.png
Binary file added article-neural-networks/images/RNN.png
Binary file added article-neural-networks/images/all.png
Binary file added article-neural-networks/images/tranformers_1.png
Binary file added article-neural-networks/images/transformers_2.png