# Assignment: Building a REST API for a RAG Pipeline with FastAPI

## Objective
This assignment focuses on turning a Retrieval-Augmented Generation (RAG) pipeline into a working REST API using FastAPI. You will integrate previously built RAG components (document indexing, retrieval, and LLM generation) into a web service so that external applications can interact with your chatbot.

## Part 1: Setup and RAG Pipeline Integration (20 Marks)

1.  **Environment Setup:**
    * Create a new Python virtual environment.
    * Install the required libraries: `fastapi`, `uvicorn`, `pydantic`, `sentence-transformers`, `faiss-cpu`, `transformers`, and either `torch` or `tensorflow` as the backend for the LLM.
    * Provide a `requirements.txt` file.

2.  **RAG Pipeline Review/Re-implementation:**
    * **Option A (Recommended):** Reuse the core components (document indexing, retrieval, and a basic LLM integration) from your previous RAG assignment.
    * **Option B:** If you haven't completed the previous RAG assignment, implement a simplified version for this assignment:
        * **Document Corpus:** A small collection of text documents (e.g., 10-20 short paragraphs about a specific topic).
        * **Embedding Model:** `sentence-transformers/all-MiniLM-L6-v2`.
        * **Vector Store:** `FAISS`.
        * **LLM:** A basic pre-trained language model (e.g., `distilgpt2` or `gpt2` from Hugging Face Transformers) for text generation, *without* fine-tuning.
    * Ensure your RAG pipeline is encapsulated in a Python class or a set of functions that can be easily imported and used.
    * Demonstrate that your RAG pipeline can successfully answer a few sample queries before proceeding to the API part.
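
One possible way to encapsulate the pipeline so that it can be imported by the API is sketched below. To stay dependency-free, the sketch uses a toy bag-of-words embedding with a brute-force cosine search as stand-ins for `all-MiniLM-L6-v2` and FAISS, and a template function in place of the LLM. The names `SimpleRAG`, `bow_embed`, `cosine`, and `template_generate` are hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def bow_embed(text: str) -> Vector:
    """Toy embedding: L2-normalised word counts (stand-in for a sentence encoder)."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def cosine(a: Vector, b: Vector) -> float:
    """Dot product of two normalised sparse vectors, i.e. their cosine similarity."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


def template_generate(query: str, context: List[str]) -> str:
    """Placeholder generator; a real pipeline would prompt an LLM here."""
    return f"Q: {query} | Context: {' '.join(context)}"


class SimpleRAG:
    """Bundles indexing, retrieval, and generation behind one importable object."""

    def __init__(
        self,
        docs: List[str],
        embed: Callable[[str], Vector] = bow_embed,
        generate: Callable[[str, List[str]], str] = template_generate,
    ) -> None:
        self.docs = list(docs)
        self.embed = embed
        self.generate = generate
        # Indexing step: embed every document once, up front.
        self.index = [self.embed(doc) for doc in self.docs]

    def retrieve(self, query: str, k: int = 2) -> List[Tuple[str, float]]:
        """Return the k most similar documents with their similarity scores."""
        q = self.embed(query)
        scored = [(doc, cosine(q, vec)) for doc, vec in zip(self.docs, self.index)]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]

    def answer(self, query: str, k: int = 2) -> dict:
        """Run retrieval followed by generation and return both results."""
        hits = self.retrieve(query, k)
        docs = [doc for doc, _ in hits]
        return {
            "answer": self.generate(query, docs),
            "retrieved_docs": docs,
            "similarity_scores": [score for _, score in hits],
        }
```

Because the embedding and generation functions are passed in as arguments, the toy stand-ins can later be swapped for a real encoder and LLM without changing the code that calls `answer`.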

In [None]:
# Your code for environment setup, requirements.txt, and RAG pipeline integration/re-implementation here.
# Demonstrate your RAG pipeline with a few sample queries and responses.

## Part 2: FastAPI Application (40 Marks)

1.  **FastAPI Application Structure:**
    * Create a FastAPI application instance.
    * Define a `main.py` (or similar) file that initializes your RAG pipeline and starts the FastAPI app.

2.  **Pydantic Models:**
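    * Define a Pydantic `Request` model for the incoming user query. It should have a `query` field (string).
    * Define a Pydantic `Response` model for the API's output. In code, more specific class names such as `ChatRequest` and `ChatResponse` avoid shadowing FastAPI's own `Request` and `Response` classes. The model should include at least: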
        * `answer` (string): The LLM-generated response.
        * `retrieved_docs` (`List[str]`): A list of the top `k` retrieved document chunks (or their identifiers).
        * (Optional, for extra credit) `similarity_scores` (`List[float]`): The similarity scores for the retrieved documents.

3.  **API Endpoint for Chat:**
    * Create a `POST` endpoint (e.g., `/chat` or `/ask`) that accepts your `Request` model.
    * Inside this endpoint:
        * Extract the `query` from the request.
        * Call your RAG pipeline's retrieval function with the query.
        * Call your RAG pipeline's generation function with the query and retrieved context.
        * Return the response using your `Response` model.

4.  **Health Check Endpoint (Optional, for extra credit):**
    * Implement a simple `GET` endpoint (e.g., `/health`) that returns a status message (e.g., `{"status": "ok"}`) to indicate the API is running.

In [None]:
# Your code for the FastAPI application, Pydantic models, and API endpoints here.
# You will typically put this in a separate Python file (e.g., `main.py` or `app.py`) for the actual API, but for the assignment, you can demonstrate the core logic here.

## Part 3: Running and Testing the API (30 Marks)

1.  **Run FastAPI Application:**
    * Provide instructions on how to run your FastAPI application using `uvicorn`.
        * Example: `uvicorn main:app --reload` (here `main` refers to the module `main.py` and `app` to the FastAPI instance defined in it).

2.  **Test with cURL/Requests (20 Marks):**
    * Once the API is running, use `curl` commands or a Python `requests` script to test your `/chat` endpoint.
    * Provide at least **3 distinct test queries**.
    * For each test, show:
        * The `curl` command or Python `requests` code.
        * The JSON response received from the API.
        * A brief analysis of the response (e.g., is the `answer` relevant? Are the `retrieved_docs` correct?).

3.  **Explore FastAPI Docs (10 Marks):**
    * Access the automatically generated OpenAPI (Swagger UI) documentation at `http://127.0.0.1:8000/docs` (or your chosen host/port).
    * Take a screenshot of your `/chat` endpoint within the Swagger UI.
    * Briefly describe how the interactive documentation helps in testing and understanding your API.
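
A test script for the chat endpoint could be organised along the lines of the sketch below. It assumes the API is served at `http://127.0.0.1:8000` with a `/chat` route that accepts `{"query": ...}`; the helper names `ask` and `run_queries` are hypothetical. The `post` parameter is an illustrative design choice that lets the helpers be exercised with a fake transport; when it is omitted, `requests.post` is used.

```python
import json
from typing import Any, Callable, Dict, List, Optional

BASE_URL = "http://127.0.0.1:8000"  # assumed host/port of the running API


def ask(
    query: str,
    base_url: str = BASE_URL,
    post: Optional[Callable[..., Any]] = None,
) -> Dict[str, Any]:
    """POST a query to the /chat endpoint and return the decoded JSON body."""
    if post is None:
        import requests  # imported lazily so a fake `post` can be injected

        post = requests.post
    response = post(f"{base_url}/chat", json={"query": query}, timeout=30)
    response.raise_for_status()  # surface 4xx/5xx responses as exceptions
    return response.json()


def run_queries(
    queries: List[str],
    post: Optional[Callable[..., Any]] = None,
) -> List[Dict[str, Any]]:
    """Send each query in turn and print the request/response pair."""
    results = []
    for query in queries:
        body = ask(query, post=post)
        print(f"Query: {query}")
        print(json.dumps(body, indent=2))
        results.append(body)
    return results
```

With the server running, calling `run_queries` with three distinct questions prints each JSON response for inclusion in the notebook. A comparable `curl` call would be `curl -X POST http://127.0.0.1:8000/chat -H "Content-Type: application/json" -d '{"query": "What is FAISS?"}'`.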

In [None]:
# Provide instructions on running the API.
# Your curl commands/requests script, responses, and analysis here.
# (For the screenshot, you'll need to insert an image or describe where to find it.)

## Part 4: Error Handling and Edge Cases (Bonus - 10 Marks)

1.  **Basic Error Handling:**
    * Implement basic error handling for your API, for example for cases where the LLM fails to generate a response or no relevant documents are found.
    * Use FastAPI's `HTTPException` or custom exception handlers.
    * Demonstrate an error response (e.g., by sending an empty query or a query that intentionally triggers an error condition you've set up).

2.  **Input Validation (Optional, for extra credit):**
    * Add more robust input validation using Pydantic (e.g., minimum/maximum query length).
    * Demonstrate how invalid input is handled by the API.

In [None]:
# Your code for error handling and demonstration here.

## Submission Guidelines

* Submit this Jupyter Notebook (.ipynb file) with all cells executed and outputs visible.
* Submit a separate Python file (e.g., `main.py`) containing your full FastAPI application code.
* Ensure your code is well-commented and easy to understand.
* Provide a `requirements.txt` file listing all dependencies.
* Include a brief `README.md` file (optional but recommended) explaining how to run your API and any specific instructions.
* Make sure your notebook and API run without errors in the specified environment.