# Lesson 1: Introduction to Large Language Models and Transformers

## Course: Development of Agentic AI Systems for Advertising Campaign Analysis

---

### Lesson Objectives

In this lesson we will explore:
1. What Large Language Models (LLMs) are and how they work
2. The Transformer architecture and its key components
3. Overview of the main available LLM models
4. Practical applications in business contexts
5. **NEW: Text embeddings and semantic search**

---

## Part 1: What are Large Language Models

Large Language Models are artificial intelligence models trained on enormous amounts of text to understand and generate natural language. Unlike traditional rule-based systems, LLMs learn linguistic patterns directly from data.

### Main characteristics:

- **Context understanding**: LLMs analyze the meaning of words based on surrounding context
- **Coherent generation**: They produce fluid and grammatically correct text
- **Multitask**: They can perform many tasks (summarization, classification, question answering) without task-specific retraining
- **Few-shot learning**: They pick up new tasks from a handful of examples supplied in the prompt
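
Few-shot learning in practice usually means placing a few worked examples directly in the prompt. The sketch below only assembles such a prompt as a string; the helper name `build_few_shot_prompt` and the labeled examples are illustrative assumptions, and no model is called.

```python
def build_few_shot_prompt(examples, new_input):
    """Assemble a few-shot prompt from (input, label) pairs plus one unlabeled input."""
    lines = ["Classify each campaign note by channel."]
    for text, label in examples:
        lines.append(f"Note: {text}")
        lines.append(f"Channel: {label}")
    # The final entry is left open so the model completes the label
    lines.append(f"Note: {new_input}")
    lines.append("Channel:")
    return "\n".join(lines)


# Hypothetical labeled examples, invented for illustration
examples = [
    ("Spots aired during the evening news", "Linear TV"),
    ("Pre-roll ads on the broadcaster's streaming app", "BVOD"),
]

prompt = build_few_shot_prompt(examples, "Sponsored posts in the feed")
print(prompt)
```

The model sees two solved cases and one open case, and is expected to continue the pattern.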

## Part 2: The Transformer Architecture

The Transformer architecture, introduced in 2017 in the paper "Attention Is All You Need" (Vaswani et al.), is the foundation of virtually all modern LLMs.

### Fundamental Components

#### 1. Tokens
Tokens are the basic units of processing. Text is divided into tokens that can represent entire words, parts of words, or single characters.

**Tokenization example:**

In [None]:
# Install the tiktoken library for tokenization (used by OpenAI)
# !pip install tiktoken

import tiktoken

# Create an encoder for GPT-4
encoding = tiktoken.encoding_for_model("gpt-4")

# Example text
text = "Analysis of television advertising campaign performance."

# Tokenize the text
tokens = encoding.encode(text)

print(f"Original text: {text}")
print(f"\nNumber of tokens: {len(tokens)}")
print(f"\nToken IDs: {tokens}")

# Decode each token
print("\nDecoded tokens:")
for i, token_id in enumerate(tokens):
    token_text = encoding.decode([token_id])
    print(f"  {i+1}. '{token_text}' (ID: {token_id})")

#### 2. Self-Attention

The self-attention mechanism allows the model to consider all words in a sentence simultaneously, assigning different "attention weights" to each word based on their contextual relevance.

**Conceptual example:**

In the sentence: *"The advertising campaign reached the predicted target"*

- When the model processes "target", it pays particular attention to "campaign" and "advertising"
- When processing "predicted", it focuses on "target" and "reached"

This mechanism allows the model to capture long-range relationships in text.
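
The weighting described above can be sketched as scaled dot-product attention in NumPy. This is a minimal single-head version without masking; the random matrices stand in for token embeddings and learned projection weights, so the resulting numbers carry no linguistic meaning.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Return (output, weights) for single-head attention without masking."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of every query with every key
    weights = softmax(scores, axis=-1)   # each row becomes a distribution over tokens
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_model = 7, 8                  # 7 tokens, toy embedding size

# Random stand-ins for token embeddings and the learned projection matrices
X = rng.normal(size=(seq_len, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

output, weights = scaled_dot_product_attention(X @ W_q, X @ W_k, X @ W_v)

print(weights.shape)         # (7, 7): one row of attention weights per token
print(weights.sum(axis=1))   # each row sums to 1 (up to rounding)
print(output.shape)          # (7, 8): one context-aware vector per token
```

Each output row is a weighted average of all value vectors, which is how a token's representation comes to depend on the rest of the sentence.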

**Simplified visualization of the mechanism:**

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Toy word list for the heatmap (hand-picked, not the exact sentence above)
words = ['The', 'campaign', 'reached', 'the', 'target', 'goal']

# Hand-written attention weights for illustration (each row sums to 1)
# Each row shows how much a word "pays attention" to the others
attention_matrix = np.array([
    [0.1, 0.3, 0.1, 0.1, 0.2, 0.2],  # The
    [0.2, 0.3, 0.1, 0.1, 0.1, 0.2],  # campaign
    [0.1, 0.2, 0.2, 0.3, 0.1, 0.1],  # reached
    [0.1, 0.2, 0.2, 0.2, 0.1, 0.2],  # the
    [0.1, 0.1, 0.1, 0.1, 0.2, 0.4],  # target
    [0.1, 0.3, 0.1, 0.2, 0.2, 0.1],  # goal
])

# Visualize the matrix
plt.figure(figsize=(8, 6))
sns.heatmap(attention_matrix, 
            xticklabels=words, 
            yticklabels=words,
            annot=True, 
            fmt='.2f',
            cmap='YlOrRd',
            cbar_kws={'label': 'Attention Weight'})
plt.title('Example of Attention Matrix (Self-Attention)', fontsize=14, pad=20)
plt.xlabel('Words (Key)', fontsize=12)
plt.ylabel('Words (Query)', fontsize=12)
plt.tight_layout()
plt.show()

print("\nInterpretation:")
print("- More intense colors indicate greater attention")
print("- 'target' pays a lot of attention to 'goal' (0.4)")
print("- 'campaign' is relevant to many other words")

#### 3. Context Window

The context window is the maximum amount of text a model can process in a single request, counting both the prompt and the generated output. It is measured in tokens, not words or characters.

**Context window comparison:**

| Model | Context Window | Approximate Equivalent |
|-------|----------------|------------------------|
| GPT-3.5 | 4,096 tokens | ~3,000 words |
| GPT-4 | 8,192 tokens | ~6,000 words |
| GPT-4 Turbo | 128,000 tokens | ~96,000 words |
| Claude 3 | 200,000 tokens | ~150,000 words |
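
The "Approximate Equivalent" column is consistent with a rule of thumb of roughly 0.75 English words per token. The sketch below applies that ratio and also checks whether a prompt leaves room for the reply, since the window is shared between input and output. The helper names and the 500-token default reserve are assumptions made for illustration.

```python
WORDS_PER_TOKEN = 0.75  # rough rule of thumb for English text, not an exact constant

def approx_words(num_tokens):
    """Convert a token budget into an approximate English word count."""
    return int(num_tokens * WORDS_PER_TOKEN)

def fits_in_window(prompt_tokens, window_size, reserved_output_tokens=500):
    """Check that the prompt plus the space reserved for the reply fits in the window."""
    return prompt_tokens + reserved_output_tokens <= window_size

for window in (4_096, 128_000, 200_000):
    print(f"{window:>7,} tokens ~ {approx_words(window):,} words")

# A 4,000-token prompt fits a 4,096-token window only if the reply is very short
print(fits_in_window(4_000, 4_096, reserved_output_tokens=500))   # False
print(fits_in_window(4_000, 4_096, reserved_output_tokens=50))    # True
```

The ratio varies by language and content type; numeric tables and non-English text typically use more tokens per word.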

**Practical example:**

In [None]:
# Calculate how much text fits in different context windows

example_document = """
Quarterly Campaign Report - Q1 2024

Executive Summary:
The Spring campaign for Brand XYZ achieved significant reach across multiple 
channels. Total investment of €500,000 delivered 15.2M impressions with an
average frequency of 4.2.

Key Performance Indicators:
- Overall Reach: 62% (4.1M unique contacts)
- Frequency: 4.2
- Total GRPs: 260
- Average CPP: €1,923
- CPM: €32.89

Channel Distribution:
Linear TV: 55% of budget (€275,000)
BVOD: 30% of budget (€150,000)
Social Media: 15% of budget (€75,000)
"""

# Tokenize the document
tokens = encoding.encode(example_document)
num_tokens = len(tokens)
num_words = len(example_document.split())

print(f"Document Analysis:")
print(f"- Number of characters: {len(example_document)}")
print(f"- Number of words: {num_words}")
print(f"- Number of tokens: {num_tokens}")
print(f"- Token/word ratio: {num_tokens/num_words:.2f}")

# Check compatibility with different models
print("\nContext Window Compatibility:")
models = {
    'GPT-3.5': 4096,
    'GPT-4': 8192,
    'GPT-4-Turbo': 128000,
    'Claude 3': 200000
}

for model_name, window_size in models.items():
    fits = "✓" if num_tokens < window_size else "✗"
    percentage = (num_tokens / window_size) * 100
    print(f"{fits} {model_name}: {percentage:.2f}% of context window used")

## Part 3: Overview of Main Large Language Models

### Comparative Table

| Feature | GPT-4 | Claude 3 | Llama 3 70B | Mistral Large |
|---------|-------|----------|-------------|---------------|
| **Type** | Proprietary | Proprietary | Open weights | Hybrid |
| **Context** | 128K (Turbo) | 200K | 8K | 32K |
| **Parameters** | Undisclosed | Undisclosed | 70B | Undisclosed |
| **Cost** | $$$ | $$$ | Free* | $$ |
| **Deploy** | API | API | Local/Cloud | API/Local |

*Self-hosting requires infrastructure

### Model Selection Criteria

1. **Context window size**: Evaluate typical document length
2. **Budget**: Consider cost per token and expected volume
3. **Privacy requirements**: For sensitive data, consider on-premise solutions
4. **Multilingual capabilities**: Verify support for required languages
5. **Latency requirements**: Consider response time needs
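
The budget criterion comes down to simple arithmetic: tokens per request, times price per token, times request volume. The sketch below shows the calculation; the prices are placeholders invented for illustration and do not reflect any vendor's actual rates, and the function name is hypothetical.

```python
def estimate_monthly_cost(requests_per_day, input_tokens, output_tokens,
                          price_in_per_1k, price_out_per_1k, days=30):
    """Estimate monthly API spend from per-request token counts and per-1K-token prices."""
    cost_per_request = ((input_tokens / 1000) * price_in_per_1k
                        + (output_tokens / 1000) * price_out_per_1k)
    return requests_per_day * days * cost_per_request

# Placeholder prices, NOT real vendor pricing: check the provider's current price list
cost = estimate_monthly_cost(
    requests_per_day=10_000,
    input_tokens=800,
    output_tokens=200,
    price_in_per_1k=0.01,
    price_out_per_1k=0.03,
)
print(f"Estimated monthly cost: ${cost:,.2f}")
```

Input and output tokens are priced separately here because many providers charge different rates for each.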

## Part 4: Practical Application - Campaign Assistant

In [None]:
# Simple Campaign Assistant (without LLM, rule-based)

class CampaignAssistant:
    def __init__(self):
        self.campaign_data = {
            'reach': 0.58,  # 58%
            'frequency': 4.1,
            'unique_contacts': 3_800_000,
            'grp': 237,
            'period': '6 weeks'
        }
    
    def process_query(self, query):
        query_lower = query.lower()
        
        if 'reach' in query_lower:
            return f"The campaign reach is {self.campaign_data['reach']:.0%} ({self.campaign_data['unique_contacts']:,} unique contacts)"
        
        elif 'frequency' in query_lower:
            return f"The average frequency is {self.campaign_data['frequency']}"
        
        elif 'grp' in query_lower:
            return f"Total GRPs: {self.campaign_data['grp']}"
        
        elif 'period' in query_lower or 'duration' in query_lower or 'how long' in query_lower:
            return f"The campaign duration is {self.campaign_data['period']}"
        
        else:
            return "I can help you with information about reach, frequency, GRP, or campaign period."

# Test the assistant
assistant = CampaignAssistant()

queries = [
    "What is the reach of the campaign?",
    "Tell me the frequency",
    "How long did the campaign last?",
    "What about the budget?"  # Question the assistant cannot answer
]

for query in queries:
    print(f"Q: {query}")
    print(f"A: {assistant.process_query(query)}")
    print()

## Part 5: Introduction to Text Embeddings 🆕

### What are Embeddings?

Embeddings are numerical vector representations of text that capture semantic meaning. Words or sentences with similar meanings have similar embedding vectors.
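
"Similar vectors" is usually measured with cosine similarity, the cosine of the angle between two vectors. The sketch below computes it by hand on three tiny vectors that were made up for illustration; real embedding models produce hundreds of dimensions.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine of the angle between two vectors: 1 = same direction, 0 = orthogonal."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hand-made 3-dimensional "embeddings", invented for illustration only
tv_campaign = np.array([0.9, 0.1, 0.2])
tv_commercial = np.array([0.8, 0.2, 0.1])
social_post = np.array([0.1, 0.9, 0.3])

sim_related = cosine_sim(tv_campaign, tv_commercial)
sim_unrelated = cosine_sim(tv_campaign, social_post)

print(f"TV campaign vs TV commercial: {sim_related:.3f}")
print(f"TV campaign vs social post:   {sim_unrelated:.3f}")
```

The two TV-related vectors point in nearly the same direction and score close to 1, while the social media vector scores much lower.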

### Key Characteristics:

- **Dense vectors**: Typically 384-1536 dimensions
- **Semantic similarity**: Similar texts have similar embeddings
- **Use cases**: 
  - Semantic search
  - Document clustering
  - Recommendation systems
  - Question answering systems

### Why are they important for our project?

Embeddings will allow us to:
1. Search through campaign documentation semantically
2. Find similar campaigns based on descriptions
3. Build a knowledge base for the agentic system
4. Enable more natural query understanding

### Installing Required Libraries

In [None]:
# Install required libraries
# !pip install sentence-transformers scikit-learn numpy pandas

### Generating Embeddings with Sentence Transformers

In [None]:
from sentence_transformers import SentenceTransformer
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Load a pre-trained embedding model
# We'll use 'all-MiniLM-L6-v2' which is fast and efficient
model = SentenceTransformer('all-MiniLM-L6-v2')

# Example campaign descriptions
campaign_descriptions = [
    "TV advertising campaign targeting adults 25-54 with focus on prime time slots",
    "Digital video campaign on streaming platforms for young adults 18-34",
    "Multi-channel campaign combining linear TV and BVOD for brand awareness",
    "Social media advertising campaign targeting female audience 25-44",
    "Prime time television campaign for product launch reaching broad audience"
]

# Generate embeddings
embeddings = model.encode(campaign_descriptions)

print(f"Number of campaigns: {len(campaign_descriptions)}")
print(f"Embedding dimensions: {embeddings.shape[1]}")
print(f"\nEmbedding for first campaign (first 10 dimensions):")
print(embeddings[0][:10])

### Semantic Similarity Search

In [None]:
# Function to find similar campaigns
def find_similar_campaigns(query, campaign_descriptions, embeddings, top_k=3):
    """
    Find the most similar campaigns to a query using semantic search.
    
    Args:
        query: Search query string
        campaign_descriptions: List of campaign description strings
        embeddings: Pre-computed embeddings of campaigns
        top_k: Number of most similar results to return
    
    Returns:
        List of dicts with keys 'campaign' (description) and 'similarity' (cosine score)
    """
    # Generate embedding for the query
    query_embedding = model.encode([query])
    
    # Calculate cosine similarity
    similarities = cosine_similarity(query_embedding, embeddings)[0]
    
    # Get top k most similar
    top_indices = np.argsort(similarities)[::-1][:top_k]
    
    results = []
    for idx in top_indices:
        results.append({
            'campaign': campaign_descriptions[idx],
            'similarity': similarities[idx]
        })
    
    return results

# Test semantic search
test_queries = [
    "streaming video advertising for millennials",
    "television commercials during evening hours",
    "social advertising for women"
]

for query in test_queries:
    print(f"\n{'='*60}")
    print(f"Query: {query}")
    print('='*60)
    
    results = find_similar_campaigns(query, campaign_descriptions, embeddings, top_k=3)
    
    for i, result in enumerate(results, 1):
        print(f"\n{i}. Similarity: {result['similarity']:.3f}")
        print(f"   Campaign: {result['campaign']}")

### Visualizing Embeddings with Dimensionality Reduction

In [None]:
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# Reduce embeddings from 384 dimensions to 2 for visualization
pca = PCA(n_components=2)
embeddings_2d = pca.fit_transform(embeddings)

# Create visualization
plt.figure(figsize=(12, 8))

# Plot campaign embeddings
plt.scatter(embeddings_2d[:, 0], embeddings_2d[:, 1], 
           s=200, alpha=0.6, c='steelblue', edgecolors='black', linewidth=2)

# Add labels for each point
for i in range(len(campaign_descriptions)):
    # Short IDs (C1, C2, ...) keep the plot readable; the legend below maps them to full text
    plt.annotate(f"C{i+1}", 
                xy=(embeddings_2d[i, 0], embeddings_2d[i, 1]),
                xytext=(5, 5), textcoords='offset points',
                fontsize=12, fontweight='bold')

plt.title('Campaign Embeddings Visualization (PCA)', fontsize=16, pad=20)
plt.xlabel('First Principal Component', fontsize=12)
plt.ylabel('Second Principal Component', fontsize=12)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Print legend
print("\nCampaign Legend:")
for i, desc in enumerate(campaign_descriptions, 1):
    print(f"C{i}: {desc}")

print("\nNote: Campaigns closer together in the plot have more similar semantic meaning.")

### Building a Simple Campaign Knowledge Base

In [None]:
import pandas as pd

class CampaignKnowledgeBase:
    """
    A simple knowledge base for campaign information using embeddings.
    This will be useful for our agentic system to retrieve relevant information.
    """
    
    def __init__(self, model_name='all-MiniLM-L6-v2'):
        self.model = SentenceTransformer(model_name)
        self.documents = []
        self.embeddings = None
        self.metadata = []
    
    def add_documents(self, documents, metadata=None):
        """
        Add documents to the knowledge base.
        
        Args:
            documents: List of text documents
            metadata: Optional list of metadata dictionaries for each document
        """
        self.documents.extend(documents)
        
        if metadata:
            self.metadata.extend(metadata)
        else:
            self.metadata.extend([{} for _ in documents])
        
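        # Embed only the new documents and append them to the existing matrix
        new_embeddings = self.model.encode(documents)
        if self.embeddings is None:
            self.embeddings = new_embeddings
        else:
            self.embeddings = np.vstack([self.embeddings, new_embeddings])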
        
        print(f"Added {len(documents)} documents. Total documents: {len(self.documents)}")
    
    def search(self, query, top_k=3):
        """
        Search for relevant documents using semantic similarity.
        
        Args:
            query: Search query string
            top_k: Number of results to return
        
        Returns:
            List of dictionaries with document text, metadata, and similarity score
        """
        if not self.documents:
            return []
        
        # Generate query embedding
        query_embedding = self.model.encode([query])
        
        # Calculate similarities
        similarities = cosine_similarity(query_embedding, self.embeddings)[0]
        
        # Get top k results
        top_indices = np.argsort(similarities)[::-1][:top_k]
        
        results = []
        for idx in top_indices:
            results.append({
                'document': self.documents[idx],
                'metadata': self.metadata[idx],
                'similarity': float(similarities[idx])
            })
        
        return results
    
    def get_stats(self):
        """Get statistics about the knowledge base."""
        return {
            'num_documents': len(self.documents),
            'embedding_dimensions': self.embeddings.shape[1] if self.embeddings is not None else 0
        }

# Create knowledge base
kb = CampaignKnowledgeBase()

# Add campaign documentation
campaign_docs = [
    "Reach represents the percentage of unique individuals who saw the advertisement at least once during the campaign period. It is measured as a deduplicated count.",
    "Frequency indicates the average number of times each person in the target audience was exposed to the advertisement during the campaign.",
    "GRP (Gross Rating Points) is calculated as Reach × Frequency and represents the total weight of the campaign delivery.",
    "Impacts (impressions) refer to the total number of times the advertisement was displayed. Unlike reach, impacts are additive across time periods.",
    "BVOD (Broadcaster Video On Demand) refers to streaming video content from traditional broadcasters, measured on both big screens (TV) and small screens (mobile/tablet).",
    "The target audience can be defined by demographic filters including sex (M/F) and age breaks (03-14, 15-24, 25-34, 35-44, 45-54, 55-64, 65+).",
    "CPM (Cost Per Mille) is the cost of reaching 1,000 impressions and is calculated as (Total Cost / Total Impacts) × 1000.",
    "Prime time refers to the evening broadcast period from 20:00 to 22:30, typically achieving the highest viewership and impact delivery."
]

# Add metadata for each document
metadata = [
    {'category': 'metrics', 'topic': 'reach'},
    {'category': 'metrics', 'topic': 'frequency'},
    {'category': 'metrics', 'topic': 'grp'},
    {'category': 'metrics', 'topic': 'impacts'},
    {'category': 'channels', 'topic': 'bvod'},
    {'category': 'targeting', 'topic': 'demographics'},
    {'category': 'metrics', 'topic': 'cost'},
    {'category': 'timing', 'topic': 'daypart'}
]

kb.add_documents(campaign_docs, metadata)

# Test the knowledge base
print(f"\nKnowledge Base Stats: {kb.get_stats()}")
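
The metric definitions stored in the knowledge base can be checked with a few lines of arithmetic. The sketch below uses the figures from the Q1 2024 example report earlier in this lesson. CPP is assumed here to mean cost per rating point (total cost divided by GRPs), which is consistent with the report's numbers; the function names are illustrative.

```python
def grp(reach_pct, frequency):
    """Gross Rating Points: reach (in percent) multiplied by average frequency."""
    return reach_pct * frequency

def cpm(total_cost, impressions):
    """Cost per thousand impressions."""
    return total_cost / impressions * 1000

def cpp(total_cost, total_grps):
    """Cost per rating point: total cost divided by GRPs (assumed definition)."""
    return total_cost / total_grps

# Figures from the Q1 2024 example report
total_cost = 500_000        # euros
impressions = 15_200_000
reach_pct = 62
frequency = 4.2

print(f"GRPs: {grp(reach_pct, frequency):.1f}")       # 260.4 (the report rounds to 260)
print(f"CPM:  €{cpm(total_cost, impressions):.2f}")   # €32.89
print(f"CPP:  €{cpp(total_cost, 260):,.0f}")          # €1,923
```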

### Testing the Knowledge Base with Queries

In [None]:
# Test queries
test_queries = [
    "How do I calculate the total campaign weight?",
    "What is the difference between reach and impressions?",
    "How can I target young adults in my campaign?",
    "What are the best time slots for maximum audience?"
]

print("Testing Campaign Knowledge Base\n")
print("="*80)

for query in test_queries:
    print(f"\n📋 Query: {query}")
    print("-"*80)
    
    results = kb.search(query, top_k=2)
    
    for i, result in enumerate(results, 1):
        print(f"\n  Result {i} (Similarity: {result['similarity']:.3f})")
        print(f"  Category: {result['metadata'].get('category', 'N/A')}")
        print(f"  Text: {result['document']}")
    
    print()

### Practical Application: Enhanced Campaign Assistant with Embeddings

In [None]:
class EnhancedCampaignAssistant:
    """
    Campaign assistant that combines rule-based responses with semantic search.
    This demonstrates how we'll use embeddings in our agentic system.
    """
    
    def __init__(self, knowledge_base):
        self.kb = knowledge_base
        self.campaign_data = {
            'reach': 0.58,
            'frequency': 4.1,
            'unique_contacts': 3_800_000,
            'grp': 237,
            'impressions': 15_580_000,
            'budget': 500_000,
            'period': '6 weeks'
        }
    
    def process_query(self, query):
        """
        Process user query by combining data lookup with knowledge base search.
        """
        query_lower = query.lower()
        
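        # Conceptual questions ("how do I calculate...", "explain...") go to the
        # knowledge base even when they mention a metric name. Simple keyword
        # heuristic for this demo; an LLM-based router would be more robust.
        conceptual_cues = ['how do', 'explain', 'difference', 'calculate', 'what is']
        metrics = ['reach', 'frequency', 'grp', 'impressions', 'impacts', 'budget']
        
        is_conceptual = any(cue in query_lower for cue in conceptual_cues)
        if not is_conceptual and any(metric in query_lower for metric in metrics):
            # Query asks for a specific campaign figure
            response = self._get_metric_data(query_lower)
        else:
            # Use knowledge base for conceptual questions
            response = self._search_knowledge_base(query)
        
        return response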
    
    def _get_metric_data(self, query):
        """Get specific metric data from campaign."""
        if 'reach' in query:
            return f"Campaign reach: {self.campaign_data['reach']:.0%} ({self.campaign_data['unique_contacts']:,} unique contacts)"
        elif 'frequency' in query:
            return f"Average frequency: {self.campaign_data['frequency']}"
        elif 'grp' in query:
            return f"Total GRPs: {self.campaign_data['grp']}"
        elif 'impressions' in query or 'impacts' in query:
            return f"Total impressions: {self.campaign_data['impressions']:,}"
        elif 'budget' in query:
            return f"Campaign budget: €{self.campaign_data['budget']:,}"
    
    def _search_knowledge_base(self, query):
        """Search knowledge base for relevant information."""
        results = self.kb.search(query, top_k=2)
        
        if not results:
            return "I don't have information about that. Please try rephrasing your question."
        
        # Format response with most relevant result
        best_result = results[0]
        response = f"Based on the knowledge base (similarity: {best_result['similarity']:.2f}):\n\n"
        response += best_result['document']
        
        if len(results) > 1 and results[1]['similarity'] > 0.5:
            response += f"\n\nAdditionally:\n{results[1]['document']}"
        
        return response

# Create enhanced assistant
enhanced_assistant = EnhancedCampaignAssistant(kb)

# Test with various queries
test_queries = [
    "What was the reach of the campaign?",
    "How do I calculate GRP?",
    "Explain the difference between reach and impressions",
    "What is BVOD?",
    "Tell me about the campaign frequency"
]

print("Enhanced Campaign Assistant with Embeddings\n")
print("="*80)

for query in test_queries:
    print(f"\n❓ Question: {query}")
    print("-"*80)
    response = enhanced_assistant.process_query(query)
    print(f"💡 Answer: {response}")
    print()

### Key Takeaways: Embeddings for Agentic Systems

1. **Semantic Understanding**: Embeddings enable our system to understand the meaning of queries, not just keywords

2. **Knowledge Retrieval**: We can build a knowledge base that the agent can query intelligently

3. **Scalability**: As we add more campaign documentation, embeddings allow efficient semantic search

4. **Foundation for RAG**: This is the basis for Retrieval-Augmented Generation, which we'll use in our agentic system
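
To make the RAG connection concrete, the sketch below shows one possible way to turn retrieved passages into a prompt for an LLM. It is only string assembly under stated assumptions: the function name `build_rag_prompt`, the prompt wording, and the 0.3 similarity cut-off are all illustrative, the `retrieved` list is a hand-written stand-in for search results, and no model is called.

```python
def build_rag_prompt(question, retrieved, min_similarity=0.3):
    """Combine retrieved passages and a user question into a single prompt string."""
    # Keep only passages above the similarity cut-off (the threshold is an arbitrary choice)
    passages = [r["document"] for r in retrieved if r["similarity"] >= min_similarity]
    if not passages:
        context = "(no relevant passages found)"
    else:
        context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, 1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Stand-in for semantic search output (dicts with 'document' and 'similarity' keys)
retrieved = [
    {"document": "Frequency is the average number of exposures per person.", "similarity": 0.71},
    {"document": "Prime time runs from 20:00 to 22:30.", "similarity": 0.12},
]

rag_prompt = build_rag_prompt("What does frequency mean?", retrieved)
print(rag_prompt)
```

Filtering by similarity keeps weakly related passages out of the prompt, which saves tokens and reduces the chance of the model answering from irrelevant context.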

### Next Steps in the Course:

- Integrate embeddings with LLMs for more sophisticated responses
- Build a vector database for efficient large-scale search
- Implement RAG (Retrieval-Augmented Generation) patterns
- Use embeddings to help agents decide which tools to use

---

## Practical Exercises

### Exercise 1: Tokenization Analysis

**Objective:** Understand how different texts are tokenized.

**Task:** 
1. Take the following advertising campaign text
2. Calculate the number of tokens using the GPT-4 encoder
3. Analyze the token/word ratio
4. Determine if the text can be processed by models with different context windows

In [None]:
# EXERCISE 1

exercise_text = """
The multi-channel campaign for brand XYZ featured a coordinated implementation
across linear television, streaming video, and social media platforms. The flight
period extended for 6 consecutive weeks, with a budget allocation of 60% on linear TV,
30% on BVOD, and 10% on social media.

Preliminary results show:
- Overall reach: 58% (3.8M unique contacts)
- Average frequency: 4.1
- Total GRPs: 237
- Average CPP: €1,250

Performance by time slot shows a peak during prime time (20:00-22:30)
with 45% of impressions concentrated in this time window.
"""

# Your code here:
# 1. Tokenize the text
# 2. Calculate number of tokens, words, characters
# 3. Calculate token/word ratio
# 4. Check compatibility with context windows: GPT-3.5 (4K), GPT-4 (8K), Claude (200K)

# Solution:


### Exercise 2: Assistant Extension

**Objective:** Create an enhanced conversational assistant.

**Task:**
Extend the `CampaignAssistant` class by adding:
1. The ability to answer budget questions
2. The ability to calculate CPM (Cost Per Thousand)
3. The ability to compare two campaigns

In [None]:
# EXERCISE 2

class ExtendedCampaignAssistant(CampaignAssistant):
    def __init__(self):
        super().__init__()
        # Add budget data
        self.campaign_data['budget'] = 150000  # euros
    
    def process_query(self, query):
        # Your code here:
        # Extend the method to handle:
        # - Budget questions
        # - CPM calculation (impressions can be estimated as unique_contacts * frequency)
        # - Comparison of two campaigns
        
        pass

# Test your extended assistant
# assistant_extended = ExtendedCampaignAssistant()
# print(assistant_extended.process_query("What is the campaign budget?"))
# print(assistant_extended.process_query("What is the CPM?"))

### Exercise 3: Model Comparison

**Objective:** Understand when to choose one model over another.

**Task:**
For each of the following scenarios, indicate which LLM model you would recommend and why:

1. Analysis of monthly reports of 50 pages each to extract KPIs
2. Customer service chatbot with 10,000 requests per day
3. Sensitive data analysis system that must remain on-premise
4. Creative content generation for multilingual social campaigns
5. Quick prototype for client demo

In [None]:
# EXERCISE 3
# Fill in your answers in the dictionary below

exercise_3_answers = {
    'scenario_1': {
        'recommended_model': '',  # Insert model
        'rationale': ''  # Explain why
    },
    'scenario_2': {
        'recommended_model': '',
        'rationale': ''
    },
    'scenario_3': {
        'recommended_model': '',
        'rationale': ''
    },
    'scenario_4': {
        'recommended_model': '',
        'rationale': ''
    },
    'scenario_5': {
        'recommended_model': '',
        'rationale': ''
    }
}

# Example answer for scenario_1:
# exercise_3_answers['scenario_1'] = {
#     'recommended_model': 'Claude 3',
#     'rationale': 'Context window of 200K tokens allows analyzing very long documents in a single call'
# }

### Exercise 4: Building Your Own Knowledge Base 🆕

**Objective:** Create a knowledge base for a specific domain.

**Task:**
1. Create a knowledge base with at least 5 documents about advertising concepts
2. Add appropriate metadata for each document
3. Test it with 3 different queries
4. Analyze the similarity scores

In [None]:
# EXERCISE 4

# Create your own knowledge base
my_kb = CampaignKnowledgeBase()

# Your documents here
my_documents = [
    # Add your documents
]

my_metadata = [
    # Add your metadata
]

# Add documents to knowledge base
# my_kb.add_documents(my_documents, my_metadata)

# Test with queries
# test_queries = [
#     "Your query 1",
#     "Your query 2",
#     "Your query 3"
# ]

# for query in test_queries:
#     print(f"\nQuery: {query}")
#     results = my_kb.search(query, top_k=2)
#     for result in results:
#         print(f"Similarity: {result['similarity']:.3f}")
#         print(f"Document: {result['document']}")

### Further Reading Resources:

**Core Papers:**
- **Original Transformer Paper**: "Attention Is All You Need" (Vaswani et al., 2017)
- **Sentence-BERT**: "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks" (Reimers & Gurevych, 2019)

**Documentation:**
- **OpenAI Documentation**: https://platform.openai.com/docs/
- **Anthropic Claude Documentation**: https://docs.anthropic.com/
- **Hugging Face NLP Course**: https://huggingface.co/learn/nlp-course/
- **Sentence Transformers**: https://www.sbert.net/

**Additional Topics:**
- Vector databases (Pinecone, Weaviate, ChromaDB)
- RAG (Retrieval-Augmented Generation)
- Prompt engineering techniques

---

**End of Lesson 1**