# Part 1 Semantic Search Engine: Demonstration Notebook

This notebook tests and demonstrates the FastAPI-based semantic search engine.

**Workflow:**
1.  Read all compatible files (`.txt`, `.pdf`, `.docx`) from a local folder named `documents`.
2.  Upload these files to the `/index` endpoint to be processed and indexed.
3.  Run several search queries against the `/search` endpoint and display the results.

### Before You Begin, Please Complete These Two Steps:

**1. Create Your `documents` Folder**
* In the same directory as this notebook, create a folder named **`documents`**.
* Place the files you want to search inside this folder. The API can handle `.txt`, `.pdf`, and `.docx` files.
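If you have no files at hand yet, the following minimal sketch creates the folder and drops one small placeholder text file into it so the rest of the notebook has something to index. The helper name `create_sample_document`, the file name `sample.txt`, and the sample sentence are all illustrative assumptions, not part of the project.

```python
import os

def create_sample_document(folder: str = "documents") -> str:
    """Create `folder` if needed and write one placeholder .txt file into it."""
    os.makedirs(folder, exist_ok=True)  # no error if the folder already exists
    path = os.path.join(folder, "sample.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        # Placeholder content; replace with your own documents.
        f.write("Semantic search retrieves text by meaning rather than exact keywords.\n")
    return path
```

Calling `create_sample_document()` from the notebook's directory writes `documents/sample.txt`. Calling it again overwrites that one file and leaves any other files in the folder untouched.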

**2. Launch the FastAPI Server**
* This notebook acts only as a **client**. The API server must already be running for any of the cells below to work.
* Open a **separate terminal**, navigate to your project directory (the one from which the `main` module can be imported), and run the following command:
    ```bash
    uvicorn main:app --reload
    ```
* Keep that terminal open while you use this notebook.
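Before running the cells below, it can help to confirm that the server is actually reachable. The sketch below is one way to do that, under two assumptions: the server listens on the default `http://127.0.0.1:8000`, and FastAPI's automatic `/docs` page has not been disabled. The helper name `server_is_up` is hypothetical. It uses only the standard library; `requests.get` with a `timeout` argument would work just as well.

```python
import urllib.error
import urllib.request

def server_is_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if any HTTP response comes back from base_url + '/docs'."""
    try:
        with urllib.request.urlopen(f"{base_url}/docs", timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status
        # (for example, the docs page is disabled).
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar errors.
        return False
```

If `server_is_up("http://127.0.0.1:8000")` returns `False`, check that the `uvicorn` terminal is still open and shows no startup errors.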

In [None]:
import requests
import json
import os

# The base URL of your running FastAPI application
BASE_URL = "http://127.0.0.1:8000"
DOCS_FOLDER = "documents"

print("Setup complete. Ready to proceed.")

**Step 1 - Index Local Documents**

In [None]:
print(f"--- Reading and indexing all files from the '{DOCS_FOLDER}' folder ---")

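# Extensions the API accepts, per the prerequisites above
SUPPORTED_EXTENSIONS = (".txt", ".pdf", ".docx")

# Check that the documents folder exists and is a directory
if not os.path.isdir(DOCS_FOLDER):
    print(f"❌ Error: The '{DOCS_FOLDER}' directory was not found. Please complete the prerequisite steps.")
else:
    # Collect only regular files with a supported extension
    # (skips subfolders and unrelated files such as .DS_Store)
    file_paths = [
        os.path.join(DOCS_FOLDER, name)
        for name in sorted(os.listdir(DOCS_FOLDER))
        if name.lower().endswith(SUPPORTED_EXTENSIONS)
        and os.path.isfile(os.path.join(DOCS_FOLDER, name))
    ]

    if not file_paths:
        print(f"❌ Error: No .txt, .pdf, or .docx files found in the '{DOCS_FOLDER}' folder.")
    else:
        upload_files = []
        try:
            # Open each file in binary read mode. Opening inside the try block
            # ensures the finally clause closes whatever was opened, even if a later open() fails.
            for file_path in file_paths:
                upload_files.append(('files', (os.path.basename(file_path), open(file_path, 'rb'))))

            # Make the POST request to the /index endpoint
            response = requests.post(f"{BASE_URL}/index", files=upload_files)
            response.raise_for_status()  # Raise an exception for bad status codes
            print("\n✅ Indexing successful!")
            print(response.json())
        except (OSError, requests.exceptions.RequestException) as e:
            print(f"\n❌ Error during indexing: {e}")
        finally:
            # Important: Close the files after the request is made
            for _, (_, file_obj) in upload_files:
                file_obj.close()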
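An alternative to closing the file handles by hand is `contextlib.ExitStack`, which closes every registered file when its `with` block exits, whether or not an error occurred. The sketch below is one possible arrangement, not the project's own code. The helper name `open_for_upload` is hypothetical, and the `'files'` field name is taken from the cell above.

```python
import os
from contextlib import ExitStack

def open_for_upload(paths, stack):
    """Open each path in binary mode and register it with `stack` for closing.

    Returns a list in the shape that `requests.post(..., files=...)` accepts.
    """
    return [
        ("files", (os.path.basename(p), stack.enter_context(open(p, "rb"))))
        for p in paths
    ]

# Usage sketch (left as comments so this block runs without a server):
# with ExitStack() as stack:
#     upload_files = open_for_upload(file_paths, stack)
#     response = requests.post(f"{BASE_URL}/index", files=upload_files)
# All handles are closed here, even if the request raised an exception.
```

This removes the need for a separate `finally` clause, at the cost of one extra import.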

**Step 2 - Run Search Queries**

In [None]:
print("\n--- Performing Search Queries ---")

# TODO: Customize these queries to match the content of your documents!
queries = [
    "what is the main theme of the documents?",
    "find information about technology",
    "tell me about history",
    "what is the most recent event mentioned?"
]

for query in queries:
    print(f"\n🔎 Searching for: '{query}'")
    try:
        response = requests.get(f"{BASE_URL}/search", params={'query': query})
        response.raise_for_status()
        results = response.json()
        print("✅ Results found:")
        print(json.dumps(results, indent=2))
    except requests.exceptions.RequestException as e:
        print(f"❌ Error during search: {e}")
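To debug a query outside the notebook, for instance in a browser or with `curl`, it helps to see the URL that such a request corresponds to. The sketch below builds it with the standard library. The helper name `build_search_url` is hypothetical, and the result should match what `requests` sends for `params={'query': ...}`, though that equivalence is an assumption rather than something verified against the server.

```python
from urllib.parse import urlencode

def build_search_url(base_url: str, query: str) -> str:
    """Return the /search URL with `query` percent-encoded as a query string."""
    return f"{base_url}/search?{urlencode({'query': query})}"

print(build_search_url("http://127.0.0.1:8000", "tell me about history"))
# prints: http://127.0.0.1:8000/search?query=tell+me+about+history
```

Spaces become `+` and reserved characters such as `?` are percent-encoded, so queries containing punctuation are transmitted intact.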