# Lesson 33: Compose Test Cases for All Public APIs

## Introduction (5 minutes)

Welcome to our lesson on composing test cases for all public APIs in our RAG system. In this roughly one-hour session, we'll cover why API testing matters, the main types of test cases, and how to implement them for each component of our RAG system.

## Lesson Objectives

By the end of this lesson, you will be able to:
1. Understand the importance of API testing in RAG systems
2. Identify different types of test cases for API testing
3. Write test cases for the main components of our RAG system
4. Implement automated tests using Python and pytest

## 1. Importance of API Testing in RAG Systems (10 minutes)

API testing is crucial for ensuring the reliability, performance, and correctness of our RAG system. It helps us:

- Verify that each component works as expected
- Ensure proper integration between components
- Detect errors and edge cases early in the development process
- Maintain system stability during updates and modifications

For our RAG system, we'll focus on testing the following main components:
1. Embedding model API
2. Vector database API
3. Language model API
4. RAG proxy service API

## 2. Types of Test Cases for API Testing (10 minutes)

Let's explore different types of test cases we should consider:

1. Functional tests: Verify that the API behaves correctly for valid inputs
2. Input validation tests: Check how the API handles invalid or unexpected inputs
3. Performance tests: Assess the API's response time and resource usage
4. Error handling tests: Ensure the API returns appropriate error messages
5. Integration tests: Verify that different components work together correctly
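
As a rough sketch of how the functional, input validation, and error handling categories look in pytest, the example below tests a hypothetical `validate_query` helper. It is not part of our RAG system; it stands in for any public API entry point, and its 512-character limit is an assumption. `pytest.mark.parametrize` runs one test body over several inputs, which keeps input validation tests compact.

```python
import pytest


def validate_query(query, max_chars=512):
    """Hypothetical stand-in for a public API entry point.

    Returns the query with surrounding whitespace removed, or raises
    ValueError for input the API should reject.
    """
    if not isinstance(query, str):
        raise ValueError("query must be a string")
    cleaned = query.strip()
    if not cleaned:
        raise ValueError("query must not be empty")
    if len(cleaned) > max_chars:
        raise ValueError(f"query exceeds {max_chars} characters")
    return cleaned


# Functional test: valid input produces the expected output.
def test_valid_query_is_trimmed():
    assert validate_query("  What is RAG?  ") == "What is RAG?"


# Input validation tests: one test body, several invalid inputs.
@pytest.mark.parametrize("bad_query", ["", "   ", None, 42, "x" * 513])
def test_invalid_query_is_rejected(bad_query):
    with pytest.raises(ValueError):
        validate_query(bad_query)


# Error handling test: the error message tells the caller what went wrong.
def test_error_message_is_informative():
    with pytest.raises(ValueError, match="must not be empty"):
        validate_query("")
```

Performance and integration tests usually need real or simulated dependencies, so they are typically kept in separate test functions or files from checks like these.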

## 3. Writing Test Cases for RAG System Components (30 minutes)

Let's write test cases for each main component of our RAG system:

### 3.1 Embedding Model API Tests

```python
import pytest
from embedding_model import EmbeddingModel

@pytest.fixture
def embedding_model():
    return EmbeddingModel()

def test_embedding_generation(embedding_model):
    text = "This is a test sentence."
    embedding = embedding_model.generate_embedding(text)
    assert len(embedding) == 384  # Assuming 384-dimensional embeddings
    assert all(isinstance(x, float) for x in embedding)

def test_embedding_similarity(embedding_model):
    text1 = "The cat sat on the mat."
    text2 = "A feline rested on a rug."
    similarity = embedding_model.compute_similarity(text1, text2)
    assert 0 <= similarity <= 1  # Similarity should be between 0 and 1

def test_empty_input(embedding_model):
    with pytest.raises(ValueError):
        embedding_model.generate_embedding("")
```
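
The range check in `test_embedding_similarity` assumes `compute_similarity` returns a score normalized to [0, 1]. If it returns raw cosine similarity instead, the valid range is [-1, 1], and exact float comparisons need a tolerance. The sketch below uses a hypothetical pure-Python `cosine_similarity` helper (an assumption for illustration, not the `EmbeddingModel` implementation) to show both points.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Hypothetical helper: cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def test_identical_vectors_score_one():
    v = [0.1, 0.2, 0.3]
    # Compare with a tolerance instead of ==, since float rounding
    # can leave the result slightly off 1.0.
    assert math.isclose(cosine_similarity(v, v), 1.0, rel_tol=1e-9)


def test_opposite_vectors_score_minus_one():
    v = [0.1, 0.2, 0.3]
    negated = [-x for x in v]
    # Raw cosine similarity can be negative, so a [0, 1] range check
    # would fail for this pair.
    assert math.isclose(cosine_similarity(v, negated), -1.0, rel_tol=1e-9)


def test_orthogonal_vectors_score_zero():
    score = cosine_similarity([1.0, 0.0], [0.0, 1.0])
    assert math.isclose(score, 0.0, abs_tol=1e-12)
```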

### 3.2 Vector Database API Tests

```python
import pytest
from vector_db import VectorDB

@pytest.fixture
def vector_db():
    # Open a connection for each test and close it afterwards
    db = VectorDB()
    db.connect()
    yield db
    db.disconnect()

def test_insert_and_query(vector_db):
    text = "Test document"
    embedding = [0.1] * 384  # Mock embedding
    doc_id = vector_db.insert(text, embedding)
    assert isinstance(doc_id, int)

    results = vector_db.query(embedding, top_k=1)
    assert len(results) == 1
    assert results[0]['id'] == doc_id
    assert results[0]['text'] == text

def test_query_empty_db(vector_db):
    # Assumes the fixture provides an empty database for each test
    results = vector_db.query([0.1] * 384, top_k=5)
    assert len(results) == 0

def test_delete_document(vector_db):
    text = "Document to delete"
    embedding = [0.2] * 384
    doc_id = vector_db.insert(text, embedding)
    assert vector_db.delete(doc_id) is True
    results = vector_db.query(embedding, top_k=1)
    # The deleted document must not come back, even if other documents exist
    assert all(result['id'] != doc_id for result in results)
```
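
These tests rely on each test starting from an empty database, which a shared real instance may not guarantee. One option, sketched here, is to run fast unit tests against an in-memory fake. `InMemoryVectorDB` is hypothetical: it only mirrors the `insert`, `query`, and `delete` calls used in this section, and it ranks results by squared Euclidean distance as a simplifying assumption.

```python
import itertools


class InMemoryVectorDB:
    """Hypothetical test double mirroring the VectorDB calls used in this section."""

    def __init__(self):
        self._docs = {}                     # doc_id -> (text, embedding)
        self._next_id = itertools.count(1)  # Sequential integer IDs

    def insert(self, text, embedding):
        doc_id = next(self._next_id)
        self._docs[doc_id] = (text, list(embedding))
        return doc_id

    def delete(self, doc_id):
        # True if the document existed and was removed, False otherwise.
        return self._docs.pop(doc_id, None) is not None

    def query(self, embedding, top_k=5):
        def distance(item):
            _, (_, stored) = item
            return sum((a - b) ** 2 for a, b in zip(stored, embedding))

        nearest = sorted(self._docs.items(), key=distance)[:top_k]
        return [{"id": doc_id, "text": text} for doc_id, (text, _) in nearest]


def test_each_instance_starts_empty():
    db = InMemoryVectorDB()
    assert db.query([0.1] * 384, top_k=5) == []


def test_nearest_document_is_returned_first():
    db = InMemoryVectorDB()
    far_id = db.insert("far", [1.0] * 384)
    near_id = db.insert("near", [0.1] * 384)
    results = db.query([0.1] * 384, top_k=2)
    assert [r["id"] for r in results] == [near_id, far_id]


def test_delete_removes_document():
    db = InMemoryVectorDB()
    doc_id = db.insert("Document to delete", [0.2] * 384)
    assert db.delete(doc_id) is True
    assert db.delete(doc_id) is False  # Deleting twice reports a miss
    assert db.query([0.2] * 384, top_k=1) == []
```

A fake like this checks the calling code's logic; tests against the real database are still needed to confirm the actual API contract.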

### 3.3 Language Model API Tests

```python
import pytest
from language_model import LanguageModel

@pytest.fixture
def language_model():
    return LanguageModel()

def test_text_generation(language_model):
    prompt = "Once upon a time"
    generated_text = language_model.generate(prompt, max_length=50)
    # Assumes generate() returns the prompt followed by its continuation
    assert len(generated_text) > len(prompt)
    assert generated_text.startswith(prompt)

def test_text_completion(language_model):
    prompt = "The capital of France is"
    completion = language_model.complete(prompt)
    assert "Paris" in completion

def test_long_input(language_model):
    long_prompt = "a" * 1000  # Very long input
    generated_text = language_model.generate(long_prompt, max_length=100)
    # Assumes max_length caps the returned text in characters
    assert len(generated_text) <= 100

def test_invalid_input(language_model):
    with pytest.raises(ValueError):
        language_model.generate("")
```
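
Assertions such as `"Paris" in completion` depend on what a real model happens to generate, so they can fail intermittently. A common way to keep unit tests deterministic is to replace the model with a stub and reserve real-model checks for a separate, slower suite. The sketch below uses `unittest.mock.Mock` from the standard library; `answer_question` is a hypothetical caller, not part of our codebase.

```python
from unittest.mock import Mock


def answer_question(model, question: str) -> str:
    """Hypothetical caller: builds a prompt and returns the model's completion."""
    prompt = f"Question: {question}\nAnswer:"
    return model.complete(prompt).strip()


def test_answer_uses_model_completion():
    # The stub returns a fixed completion, so the test is deterministic.
    stub_model = Mock()
    stub_model.complete.return_value = " Paris\n"

    answer = answer_question(stub_model, "What is the capital of France?")

    assert answer == "Paris"
    # Verify the exact prompt that was sent to the model.
    stub_model.complete.assert_called_once_with(
        "Question: What is the capital of France?\nAnswer:"
    )


def test_model_errors_propagate():
    # side_effect makes the stub raise, simulating a model failure.
    failing_model = Mock()
    failing_model.complete.side_effect = RuntimeError("model unavailable")

    try:
        answer_question(failing_model, "anything")
    except RuntimeError as exc:
        assert "model unavailable" in str(exc)
    else:
        raise AssertionError("expected RuntimeError")
```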

### 3.4 RAG Proxy Service API Tests

```python
import time

import pytest
from rag_proxy_service import RAGProxyService

@pytest.fixture
def rag_service():
    return RAGProxyService()

def test_query_processing(rag_service):
    query = "What is the capital of France?"
    response = rag_service.process_query(query)
    assert isinstance(response, str)
    assert len(response) > 0
    assert "Paris" in response

def test_query_with_context(rag_service):
    query = "What is the population of Tokyo?"
    context = "Tokyo is the capital of Japan and has a population of approximately 14 million people."
    response = rag_service.process_query_with_context(query, context)
    assert "14 million" in response

def test_invalid_query(rag_service):
    with pytest.raises(ValueError):
        rag_service.process_query("")

def test_query_performance(rag_service):
    query = "What is the theory of relativity?"
    # perf_counter is a monotonic clock, so it suits measuring elapsed time
    start_time = time.perf_counter()
    rag_service.process_query(query)
    elapsed = time.perf_counter() - start_time
    assert elapsed < 5  # Assuming response should be under 5 seconds
```
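
A single timed call is sensitive to warm-up and background load. A slightly more robust sketch, assuming repeated calls are acceptable, times several runs with `time.perf_counter` and checks the median. `measure_latency` is a hypothetical helper, and the stand-in `fake_process_query` replaces `rag_service.process_query` so the example runs on its own.

```python
import statistics
import time


def measure_latency(func, *args, runs=5):
    """Hypothetical helper: return the duration of each call in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        func(*args)
        durations.append(time.perf_counter() - start)
    return durations


def fake_process_query(query):
    """Stand-in for rag_service.process_query; sleeps briefly to simulate work."""
    time.sleep(0.01)
    return f"answer to: {query}"


def test_median_latency_under_budget():
    durations = measure_latency(
        fake_process_query, "What is the theory of relativity?"
    )
    assert len(durations) == 5
    # The median is less affected by one slow outlier than a single measurement.
    assert statistics.median(durations) < 5  # Same 5-second budget as the test above
```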

## 4. Implementing Automated Tests (5 minutes)

To run these tests automatically, we'll use pytest. Here's how to set it up:

1. Install pytest:
   ```
   pip install pytest
   ```

2. Create a `tests` directory in your project and place your test files there.

3. Run the tests:
   ```
   pytest tests/
   ```

You can also integrate these tests into your CI/CD pipeline for automatic testing on each commit or pull request.
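
In CI it often helps to separate fast unit tests from slow tests that call real models or databases. One possible arrangement, sketched below as a `tests/conftest.py`, registers a custom `slow` marker through pytest's `pytest_configure` hook. The marker name and its description are assumptions for illustration.

```python
# tests/conftest.py (sketch; the "slow" marker name is an assumption)


def pytest_configure(config):
    """Register a custom marker so pytest does not warn about an unknown mark."""
    config.addinivalue_line(
        "markers",
        "slow: test calls a real model or database (deselect with -m 'not slow')",
    )
```

Tests decorated with `@pytest.mark.slow` can then be left out on every commit with `pytest -m "not slow" tests/` and run in full on a schedule or before a release.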

## Conclusion and Next Steps (5 minutes)

In this lesson, we've explored the importance of API testing in RAG systems and composed test cases for all the main components of our system. We've written functional tests, input validation tests, and even some basic performance tests.

For the next lesson, we'll focus on packaging our RAG system as a Docker image and deploying it to a cloud environment. This will involve containerizing our application and setting up cloud infrastructure.

Are there any questions about the test cases we've written or the testing process in general?

## Additional Resources

1. pytest documentation: https://docs.pytest.org/
2. "API Testing Best Practices" article: https://www.katalon.com/resources-center/blog/api-testing-best-practices/
3. "Python Testing with pytest" book by Brian Okken
4. "Effective Python Testing with Pytest" tutorial: https://realpython.com/pytest-python-testing/

For the next lesson, please review basic Docker concepts and familiarize yourself with cloud deployment principles.