# Introduction to Large Language Models (LLMs)

---

## Definition
Large Language Models (LLMs) are **deep learning models trained on massive text corpora** to understand, process, and generate human-like language.  
They can perform a wide range of NLP tasks, such as text generation, summarization, translation, and question answering.

---

## Key Concepts

### 1. Transformers
- **Definition:** A neural network architecture that processes all tokens of a sequence in parallel, unlike recurrent networks, which process one token at a time.  
- **Purpose:** Efficiently captures dependencies between all tokens in a sequence, regardless of distance.  
- **Components:**
  - **Encoder:** Processes input sequences and generates contextual embeddings.  
  - **Decoder:** Generates the output sequence one token at a time. In sequence-to-sequence models it also conditions on the encoder's output; decoder-only models such as GPT have no encoder.  
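
As a rough illustration of how the two components differ, the sketch below builds the visibility pattern each one typically uses: an encoder lets every position look at every other position, while a decoder restricts each position to itself and earlier positions so that generation stays left to right. The causal restriction is not stated in the list above; it is how the original Transformer decoder is usually described. The function names `full_mask`, `causal_mask`, and `show` are hypothetical helpers, not part of any library.

```python
def full_mask(n):
    """Encoder-style visibility: every position may look at every position."""
    return [[True] * n for _ in range(n)]


def causal_mask(n):
    """Decoder-style visibility: position i may look at positions 0..i only."""
    return [[j <= i for j in range(n)] for i in range(n)]


def show(mask, tokens):
    """Render a mask as rows of 'x' (visible) and '.' (hidden)."""
    return "\n".join(
        f"{tok:>6} " + " ".join("x" if visible else "." for visible in row)
        for tok, row in zip(tokens, mask)
    )


tokens = ["the", "cat", "sat", "down"]

print("Encoder (bidirectional):")
print(show(full_mask(len(tokens)), tokens))
print("Decoder (causal):")
print(show(causal_mask(len(tokens)), tokens))
```

Every row of either mask can be evaluated independently of the others, which is what allows a Transformer to process all positions of a sequence in parallel during training.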

### 2. Attention Mechanism
- **Definition:** A mechanism that allows the model to focus on the most relevant parts of the input sequence when generating an output.  
- **Importance:** Helps the model capture **long-range dependencies** and contextual relationships between words.  
- **Types:**
  - **Self-Attention:** Each token attends to every token in the same sequence, itself included.  
  - **Cross-Attention:** In sequence-to-sequence models, output tokens attend to input tokens.
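
The two attention types can be sketched with one function, assuming the standard scaled dot-product formulation. This is a simplified, pure-Python version: real Transformers first pass the inputs through learned query, key, and value projections and run several attention heads in parallel, both of which are omitted here. The embeddings are made-up numbers, and `attention` and `softmax` are illustrative names.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        mixed = [sum(w * v[d] for w, v in zip(weights, values))
                 for d in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Toy 2-dimensional embeddings (made-up numbers, not from a trained model).
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # "input" sequence, 3 tokens
target = [[0.0, 2.0]]                          # "output" sequence, 1 token

# Self-attention: queries, keys, and values all come from the same sequence.
_, self_weights = attention(source, source, source)

# Cross-attention: queries come from the output, keys and values from the input.
_, cross_weights = attention(target, source, source)

print("self-attention weights: ", [[round(w, 3) for w in row] for row in self_weights])
print("cross-attention weights:", [[round(w, 3) for w in row] for row in cross_weights])
```

Each row of weights sums to 1 and says how much one query token draws from each key token. The only difference between the two calls is where the queries come from.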

---

## Evolution of LLMs

| Model | Developer | Year | Key Idea |
|-------|-----------|------|---------|
| **GPT** | OpenAI | 2018 | Autoregressive Transformer for text generation |
| **BERT** | Google | 2018 | Bidirectional encoder for understanding context |
| **T5** | Google | 2019 | Text-to-text framework for NLP tasks |
| **GPT-2 / GPT-3 / GPT-4** | OpenAI | 2019–2023 | Larger models with few-shot and zero-shot capabilities |
| **Other LLMs** | Meta, Anthropic, etc. | 2023+ | Open-weight and proprietary LLMs for research and industry |
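
To make the zero-shot and few-shot distinction in the table concrete, here is a minimal sketch of how the two kinds of prompt differ. Zero-shot gives the model only a task description and the query; few-shot adds a handful of worked examples before the query. The `Input:`/`Output:` layout and the `build_prompt` helper are assumptions chosen for illustration, not a format any particular model requires.

```python
def build_prompt(task, query, examples=()):
    """Assemble a plain-text prompt; with no examples this is zero-shot."""
    lines = [task, ""]
    for source, target in examples:
        lines += [f"Input: {source}", f"Output: {target}", ""]
    # The final "Output:" is left open for the model to complete.
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)


task = "Classify the sentiment of each input as positive or negative."
examples = [
    ("I loved this film.", "positive"),
    ("The plot was dull.", "negative"),
]

zero_shot = build_prompt(task, "The acting was superb.")
few_shot = build_prompt(task, "The acting was superb.", examples)

print(zero_shot)
print("-----")
print(few_shot)
```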

**Summary:**  
LLMs are built on the Transformer architecture and use attention mechanisms to model context. Successive models, from GPT and BERT through T5 to GPT-4, have become progressively better at generating and comprehending human-like text across diverse NLP tasks.


```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the pre-trained T5 tokenizer and model (downloaded from the Hugging Face Hub on first use)
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Input text
text = "Transformers are neural network architectures that have revolutionized NLP. They use attention mechanisms to process sequences efficiently."

# Prepend the task prefix that tells T5 to summarize
input_text = "summarize: " + text
inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)

# Generate the summary with beam search (4 beams, 10 to 50 tokens)
summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_length=50,
    min_length=10,
    length_penalty=2.0,
    num_beams=4,
    early_stopping=True,
)

# Convert the generated token IDs back to text
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

print("Summary:\n", summary)
```


Summary:
 Transformers are neural network architectures that have revolutionized NLP. they use attention mechanisms to process sequences efficiently.
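
The `summarize: ` prefix in the code above is how T5's text-to-text framework selects a task: every task is phrased as an input string, and the prefix says which one is meant. The sketch below shows that idea with a small lookup table. Only `summarize: ` appears in this notebook; the other two prefixes are the ones reported for the original T5 training mixture and should be treated as assumptions to verify against the checkpoint you use. `TASK_PREFIXES` and `to_text_to_text` are hypothetical names.

```python
# Task name -> prefix string. Only "summarize: " is taken from this notebook;
# the other prefixes are assumptions based on the original T5 training setup.
TASK_PREFIXES = {
    "summarize": "summarize: ",
    "translate_en_de": "translate English to German: ",
    "acceptability": "cola sentence: ",
}


def to_text_to_text(task, text):
    """Format raw text as a T5-style input string for the given task."""
    if task not in TASK_PREFIXES:
        raise ValueError(f"unknown task: {task!r}")
    return TASK_PREFIXES[task] + text


for task in TASK_PREFIXES:
    print(repr(to_text_to_text(task, "The house is wonderful.")))
```

Because every task shares the same string-in, string-out interface, the tokenizer and `generate` calls stay the same and only the prefix changes.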
