# **Level 1: The Origins — Intro to LLMs & Chatbots**

## **Section 1: Fundamentals of AI Models**

---

# **Part 2: Neural Networks**

# Neural Networks Explained (For Absolute Beginners)

Imagine you're trying to bake the perfect cake. You have several ingredients — flour, sugar, eggs, butter — and depending on how much of each you add, your cake turns out differently. You might experiment with different amounts:

* A little more sugar? The cake gets sweeter.
* Less flour? The cake might be too soft.
* More eggs? It holds together better.

Eventually, through trial and error, you learn the right balance of ingredients to get the perfect cake.

A Neural Network works in a very similar way. But instead of ingredients for a cake, it adjusts **numbers called weights and biases** to get the "perfect result" — like recognizing a picture, understanding text, or predicting what word comes next.

---

## Think of it Like a Group of Decision-Makers

Picture a group of people in a company working together to make a decision:

1. **The First Group (Input Layer)**: These people collect raw information. Maybe they look at sales numbers, customer reviews, or market trends.

2. **The Middle Groups (Hidden Layers)**: These groups analyze the information in different ways. One group might focus on customer satisfaction, another on product quality, another on pricing.

3. **The Final Group (Output Layer)**: This group looks at all the analysis and makes the final decision — like "Should we launch the product?" or "What price should we set?"

In a Neural Network:

* Each "person" is like a **neuron**, doing a small part of the thinking.
* The "connections" between them carry information, and each connection has a **weight**, representing how much importance that connection has.
* Each neuron also has a **bias**, a built-in leaning of its own that nudges its "opinion" up or down no matter what information comes in. This gives the network more flexibility.

---

## A Simple Example: Deciding If You Should Bring an Umbrella

Let's build a mini neural network in your head for a familiar task: Should you bring an umbrella today?

### You consider:

* **Is it cloudy?** (Input 1)
* **Did the weather app say it might rain?** (Input 2)
* **Do you hear thunder?** (Input 3)

Your brain weighs each of these:

* If it's cloudy, that's important — maybe weight = 0.5
* If the app says rain, even more important — weight = 0.8
* If you hear thunder, that's a big clue — weight = 1.0

You multiply each input by its weight, add them up, maybe adjust with a small **bias**, and based on the total, you decide:

* If the result is high enough → "Yes, bring an umbrella."
* If not → "No, you probably don't need it."

This is basically what a neural network does — but instead of weather inputs, it might process numbers representing pixels in an image, words in a sentence, or sounds in a voice recording.
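The umbrella decision can be sketched in a few lines of Python. The three weights come from the example above; the bias value and the cut-off for "high enough" are assumptions chosen only for illustration, as are the function names.

```python
# Weights from the umbrella example; BIAS and THRESHOLD are assumed values.
WEIGHTS = {"cloudy": 0.5, "app_says_rain": 0.8, "thunder": 1.0}
BIAS = -0.2       # assumption: a small nudge against carrying an umbrella
THRESHOLD = 0.5   # assumption: how high the total must be to answer "yes"


def umbrella_score(inputs):
    """Multiply each input (0 = no, 1 = yes) by its weight, add them up, add the bias."""
    total = sum(WEIGHTS[name] * value for name, value in inputs.items())
    return total + BIAS


def should_bring_umbrella(inputs):
    """Answer "yes" only when the score is above the threshold."""
    return umbrella_score(inputs) > THRESHOLD


cloudy_only = {"cloudy": 1, "app_says_rain": 0, "thunder": 0}
cloudy_and_app = {"cloudy": 1, "app_says_rain": 1, "thunder": 0}

print(should_bring_umbrella(cloudy_only))     # False (score is about 0.3)
print(should_bring_umbrella(cloudy_and_app))  # True  (score is about 1.1)
```

Changing a weight, the bias, or the threshold changes which situations lead to a "yes", and those are exactly the numbers a network tunes as it learns.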

---

## How Does a Neural Network Learn?

When you're young, you make guesses about the world, sometimes wrong:

* You might think every cloud means rain. You learn that's not always true.
* Over time, with experience, your brain adjusts how much importance you give to each clue.

A neural network learns similarly:

1. It starts with random weights.
2. It makes a guess (e.g., "this is a cat").
3. It checks if it was right (using data with correct answers).
4. If it was wrong, it adjusts the weights a little.
5. Repeat this process thousands or millions of times.

Eventually, just like you get better at deciding when to carry an umbrella, the neural network gets better at recognizing cats, translating languages, or generating responses.
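The five steps above can be sketched with a single artificial neuron learning the umbrella decision. This is a minimal illustration, not how large networks are trained: it uses the classic perceptron update rule, and the toy labels, learning rate, and function names are all assumptions made for the example.

```python
import random

# Every combination of (cloudy, app_says_rain, thunder), each 0 or 1.
# Assumed toy labels: an umbrella was needed whenever the app warned of rain
# or thunder was heard.
EXAMPLES = [
    ((c, a, t), 1 if (a or t) else 0)
    for c in (0, 1) for a in (0, 1) for t in (0, 1)
]
LEARNING_RATE = 0.1  # assumed size of "a little"


def guess(weights, bias, inputs):
    """Weighted sum plus bias; answer 1 ("bring it") if the total is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train(examples, max_rounds=1000, seed=0):
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(3)]  # step 1: random start
    bias = rng.uniform(-1, 1)
    for _ in range(max_rounds):                       # step 5: repeat
        mistakes = 0
        for inputs, correct in examples:
            error = correct - guess(weights, bias, inputs)  # steps 2 and 3
            if error != 0:                                  # step 4: adjust a little
                mistakes += 1
                weights = [w + LEARNING_RATE * error * x
                           for w, x in zip(weights, inputs)]
                bias += LEARNING_RATE * error
        if mistakes == 0:  # every example answered correctly, so stop early
            break
    return weights, bias


weights, bias = train(EXAMPLES)
correct_count = sum(guess(weights, bias, x) == y for x, y in EXAMPLES)
print(f"{correct_count} of {len(EXAMPLES)} examples answered correctly")
```

Printing `weights` after training shows what the neuron learned: with these labels, the weights for the app and for thunder must end up positive, while clouds alone are never enough to trigger a "yes".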

---

## Real-World Example: Recognizing Handwritten Numbers

One famous early use of neural networks was teaching computers to recognize handwritten numbers (0 to 9).

* The computer sees an image of a handwritten "3".
* It turns the image into numbers (representing the pixels).
* The neural network processes those numbers through its layers.
* It makes a guess: "I think this is a 3."
* If it's wrong, it adjusts the weights.
* Over many examples, it learns to recognize the digits with high accuracy.
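To see what "turning the image into numbers" means, here is a tiny made-up example. The 5×5 grid is an assumption for illustration (real handwritten-digit images are larger and usually use shades of gray rather than just 0 and 1).

```python
# A made-up 5x5 "image" of the digit 3: 1 = ink, 0 = blank paper.
image = [
    [1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 1, 0],
]

# Read the grid row by row into one long list: this list is what the
# network's input layer receives.
pixels = [value for row in image for value in row]

print(len(pixels))  # 25
print(pixels[:5])   # [1, 1, 1, 1, 0]
```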

---

## Summary

Neural Networks are like groups of decision-makers, or recipes for decision-making, that adjust themselves over time to get better at a task. They process information, give more or less importance to different pieces, and "learn" by adjusting how they combine that information — just like you refine your judgment about the weather or how to bake a better cake.

Once trained, these networks power AI systems that can:

* Recognize faces or objects in photos.
* Understand and generate human language.
* Estimate outcomes such as stock-price movements or the likelihood of a medical condition.
* Even chat with you, as chatbots like ChatGPT do.

---

In the previous part, we established that Artificial Intelligence often refers to computer systems capable of performing tasks that typically require human intelligence. One of the most fundamental building blocks behind modern AI systems, especially in fields like image recognition, language understanding, and autonomous systems, is the **Neural Network**.

To understand Neural Networks properly, let's build the idea from the ground up.

---

### **What is a Neural Network?**

A Neural Network is a computational model inspired by how the human brain works. In simple terms, it is a system of interconnected units, called **neurons**, that process information collectively and in parallel to perform tasks like recognizing patterns, making predictions, or generating outputs.

In AI, these neurons are purely mathematical objects, not biological cells. The overall structure only loosely mimics how biological neurons pass signals to each other.

---

### **Basic Structure of a Neural Network**

A typical Neural Network is organized into layers:

* **Input Layer**: Receives raw data (for example, pixel values for images, or tokenized text for language tasks).
* **Hidden Layers**: Perform transformations and computations on the data. There can be one or many hidden layers.
* **Output Layer**: Produces the final result, like a class label, a probability, or a generated response.

Each layer is made up of **neurons** (also called **nodes** or **units**), and these neurons are connected to neurons in the next layer. Each connection has an associated **weight**, which determines the strength or importance of that connection.
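One way to picture this structure is as a table of weights between each pair of neighbouring layers. The sketch below assumes a small, fully connected network (every neuron linked to every neuron in the next layer); the layer sizes and the helper name `make_connections` are made up for illustration.

```python
import random

random.seed(42)  # fixed seed so the random weights are repeatable

# Assumed toy sizes: 3 inputs, one hidden layer of 4 neurons, 2 outputs.
layer_sizes = [3, 4, 2]


def make_connections(n_from, n_to):
    """One weight per connection: a row for each receiving neuron,
    holding one weight for each sending neuron."""
    return [[random.uniform(-1, 1) for _ in range(n_from)] for _ in range(n_to)]


# One table of weights sits between each pair of neighbouring layers.
connections = [
    make_connections(n_from, n_to)
    for n_from, n_to in zip(layer_sizes, layer_sizes[1:])
]

for index, table in enumerate(connections):
    rows, cols = len(table), len(table[0])
    print(f"layer {index} -> layer {index + 1}: {rows} x {cols} = {rows * cols} weights")

total_weights = sum(len(table) * len(table[0]) for table in connections)
print("total connection weights:", total_weights)  # 3*4 + 4*2 = 20
```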

---

### **Weights, Biases, and Parameters**

To grasp how Neural Networks learn and make predictions, it’s essential to understand these components:

* **Weights**: Every connection between two neurons has a weight — a numeric value that represents the strength or importance of that connection. During training, these weights are adjusted so the network improves its performance.

* **Biases**: In addition to weights, each neuron often has a bias: a numeric value added to the weighted sum of the neuron's inputs, which lets the model shift the output and fit the data better.

* **Parameters**: Collectively, all the weights and biases in a neural network are referred to as its parameters. These are the values the network learns during training.

To make this concrete, suppose a neuron receives two inputs:

* Input 1 = 0.8, with a weight of 0.4
* Input 2 = 0.5, with a weight of 0.6
* The neuron also has a bias of 0.2

The neuron's computation looks like this:

**(0.8 × 0.4) + (0.5 × 0.6) + 0.2 = 0.32 + 0.3 + 0.2 = 0.82**

This result is then passed through an **activation function**, which determines whether the neuron "fires" or how much it contributes to the next layer. Common activation functions include **ReLU**, **Sigmoid**, and **Tanh**, each introducing non-linearity, which allows the network to model complex patterns.
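The same arithmetic, followed by the three activation functions named above, can be written out as a short sketch. The function names `relu` and `sigmoid` are defined here for the example; `math.tanh` comes from Python's standard library.

```python
import math

inputs = [0.8, 0.5]
weights = [0.4, 0.6]
bias = 0.2

# Weighted sum plus bias, matching the worked arithmetic above.
z = sum(x * w for x, w in zip(inputs, weights)) + bias
print(round(z, 2))  # 0.82


def relu(value):
    """ReLU: pass positive values through unchanged, turn negatives into 0."""
    return max(0.0, value)


def sigmoid(value):
    """Sigmoid: squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-value))


print(round(relu(z), 2))       # 0.82, unchanged because z is positive
print(round(sigmoid(z), 3))    # roughly 0.694
print(round(math.tanh(z), 3))  # roughly 0.675; tanh squashes into -1 to 1
```

Each function turns the same raw total into a different output, which is why the choice of activation function affects how a network behaves.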

---

### **How Neural Networks Learn**

Neural Networks improve their performance through a process called **training**, which involves:

1. Feeding the network input data (like images or text).
2. Generating an output (initially random or poor).
3. Comparing the output to the correct answer using a **loss function**, which measures how wrong the prediction is.
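4. Adjusting the weights and biases: **backpropagation** works out how much each parameter contributed to the error, and an optimization algorithm like **gradient descent** uses that information to nudge each parameter in the direction that reduces the loss.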

Over many iterations, this process fine-tunes the parameters so the network gets better at the task.
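The training steps above can be sketched with the smallest possible "network": a single neuron with one weight and one bias. For a model this small, the gradients can be written out by hand; backpropagation is the procedure that computes the same kind of gradients across many layers. The toy data, learning rate, and function names are assumptions made for this example.

```python
# Assumed toy data: the "correct answers" follow y = 2x + 1.
DATA = [(x, 2.0 * x + 1.0) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
LEARNING_RATE = 0.05  # assumed step size


def loss(w, b):
    """Mean squared error: the average squared gap between prediction and answer."""
    return sum((w * x + b - y) ** 2 for x, y in DATA) / len(DATA)


def gradient_step(w, b):
    """One round of gradient descent on the weight and the bias."""
    # How the loss changes if w or b is increased slightly.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in DATA) / len(DATA)
    grad_b = sum(2 * (w * x + b - y) for x, y in DATA) / len(DATA)
    # Move each parameter a small step in the direction that lowers the loss.
    return w - LEARNING_RATE * grad_w, b - LEARNING_RATE * grad_b


w, b = 0.0, 0.0  # a deliberately poor starting point
loss_before = loss(w, b)
for _ in range(200):
    w, b = gradient_step(w, b)
loss_after = loss(w, b)

print(f"loss before: {loss_before:.3f}, loss after: {loss_after:.6f}")
print(f"learned weight: {w:.3f}, learned bias: {b:.3f}")  # close to 2 and 1
```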

---

### **Illustrative Example: Recognizing Handwritten Digits**

Imagine building a Neural Network to recognize handwritten numbers from 0 to 9, like those in the MNIST dataset.

* The **input layer** takes the pixel values of the image (say, 28×28 = 784 inputs).
* The **hidden layers** perform mathematical operations on these inputs.
* The **output layer** has 10 neurons, one per digit from 0 through 9, each giving the estimated probability that the image shows that digit.
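A rough sketch of these three layers, with untrained random weights, shows how 784 pixel values become 10 probabilities. Several details are assumptions not stated above: a single hidden layer of 16 neurons, ReLU in the hidden layer, and a softmax function to turn the output scores into probabilities. The random "image" is only a stand-in for real MNIST pixels.

```python
import math
import random

random.seed(0)

N_INPUTS, N_HIDDEN, N_OUTPUTS = 28 * 28, 16, 10  # hidden size is an assumption


def random_layer(n_in, n_out):
    """Random weights (one row per neuron) and zero biases for one layer."""
    weights = [[random.gauss(0.0, 0.05) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def layer_output(weights, biases, inputs):
    """Weighted sum plus bias for every neuron in the layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Turn raw scores into probabilities that add up to 1."""
    largest = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


hidden_w, hidden_b = random_layer(N_INPUTS, N_HIDDEN)
output_w, output_b = random_layer(N_HIDDEN, N_OUTPUTS)

image = [random.random() for _ in range(N_INPUTS)]  # stand-in for pixel values

hidden = [max(0.0, h) for h in layer_output(hidden_w, hidden_b, image)]  # ReLU
probabilities = softmax(layer_output(output_w, output_b, hidden))

n_parameters = (N_INPUTS * N_HIDDEN + N_HIDDEN) + (N_HIDDEN * N_OUTPUTS + N_OUTPUTS)
print(len(image), len(hidden), len(probabilities))  # 784 16 10
print(n_parameters)  # 12730 weights and biases in total
```

Because the weights are random, the 10 probabilities here are meaningless guesses. Training is what makes the highest probability line up with the correct digit.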

During training:

* The network processes an image and predicts a digit.
* The loss function compares the prediction to the correct digit.
* The weights and biases are adjusted to reduce the error.
* Over time, the network "learns" to recognize patterns in the images that correspond to different digits.
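To illustrate how a loss function "compares the prediction to the correct digit", the sketch below uses cross-entropy, a common choice for classification. The text above does not name a specific loss, so treat this as one plausible option. The helper `mostly_sure_of` and the example probabilities are made up for illustration.

```python
import math


def cross_entropy(probabilities, correct_digit):
    """Small when the correct digit gets a high probability, large when it gets a low one."""
    return -math.log(probabilities[correct_digit])


def mostly_sure_of(favourite, confidence=0.91):
    """Hypothetical helper: put `confidence` on one digit, spread the rest evenly."""
    rest = (1.0 - confidence) / 9
    return [confidence if digit == favourite else rest for digit in range(10)]


correct_digit = 3
predictions = {
    "confident and right": mostly_sure_of(3),
    "completely unsure": [0.1] * 10,
    "confident and wrong": mostly_sure_of(8),
}

for name, probabilities in predictions.items():
    print(f"{name}: loss = {cross_entropy(probabilities, correct_digit):.3f}")
```

The loss grows as the network puts less probability on the correct digit, and that growing number is the error signal the training process tries to reduce.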

---

### **Deep Neural Networks**

If a network has many hidden layers, it is called a **Deep Neural Network**. The "deep" part refers to the number of layers stacked between input and output, not to how complicated any single layer is. Deep networks have shown great success in tasks that require recognizing complex structures, such as:

* Understanding natural language.
* Detecting objects in images.
* Generating human-like text.

The architectures behind systems like ChatGPT, as well as modern image recognition and speech recognition models, are all based on deep neural networks.

---

### **Summary**

Neural Networks form the backbone of modern AI systems. At their core, they process inputs through layers of interconnected neurons, with each connection carrying an adjustable weight and each neuron an adjustable bias. Through training, they learn to adjust these parameters to solve tasks like classification, prediction, and generation.

In the next part of this lecture, we will build on this foundation by exploring **Tokens and Tokenization**: how raw text data is prepared for AI models, a crucial step before neural networks like Transformers can process language.